A total of 3,367 SSRS DataSet queries were processed resulting in over a 116,000 possible references to database objects. The regular expression above would pull out rows similar to the following: database_nmĪ picture of the SSIS Data Flow Task using RegExtractor in our production environment is shown below. Select top 1 eph2.Rate, eph2.RateChangeDateįrom HumanResources.EmployeePayHistory eph2
Select e.EmployeeID, current_rate.Rate, current_rate.RateChangeDate RegExtractor will process a SQL statement and output three columns for any strings that match the regular expression: database_nm, schema_nm, and object_nm.Īs an example, let's assume the following query has been used in a report titled Employee Pay to extract a list of employees from AdventureWorks and their current pay rate: This will cast a wide net and pull out many strings that are not actually database objects. So, the regular expression used in this case looks for any string that could resemble a database object in the form of: database_name.schema_name.object_name. The captured matches are output as new columns in the data flow." The component "allows you to define a regular expression with captured matches (backreferences) that you can apply to a column in your data flow. Step 3: Extract Any String that Resembles a Database ObjectĪ colleague of mine, Eric Just, developed an open source SSIS Data Flow Task component named RegExtractor ( ). Now that each DataSet's query has been extracted, it's time to mine that text to pull out any string that resembles a database object. R.Path,- report as DataSet,- Name attribute of DataSet elementĭS.Item.value('(./Query/CommandText)', 'varchar(max)') as CommandText,- DataSet's queryĭS.Item.value('(./Query/DataSourceName)', 'varchar(100)') as DataSource- ConnectionĬross apply r.Content.nodes('/Report/DataSets/DataSet') as DS(Item) - then, join the datasets as rows WITH XMLNAMESPACES (DEFAULT '') - setup namespace The XML value() method is then used to pull out pieces of each DataSet's XML notably the source query in the CommandText column. The T-SQL below shows using the XML nodes() method to return each DataSet in a report as a row of data. The table that now stores the report data is called metadata.reports. Now that a copy of the report catalog exists in our EDW complete with an XML column storing the report's RDL, SQL Server's built-in XML capabilities make it relatively easy to extract a report's T-SQL. Step 2: Extract Queries from SSRS Reports Note: Microsoft promotes using web services to interact with the report catalog rather than hitting the database directly. Select c.ItemID, c.Path, c.CreationDate, c.ModifiedDate, c.Content The source query for pulling report data from our Report Server is simply the following: All other columns in the destination table are typed the same as the source table. The destination table on our EDW server uses the XML data type for the Content column, which enables the use of XML methods. This column is typed as image and stores the RDL XML file for the report. Of note is the Content column in the report server's dbo.catalog table. SSIS is used to extract report data from the report server's catalog table and store it on our EDW server. Step 1: Extract Report Catalog from Report Server