In the previous post we used Biml to create a very basic package, but with no tasks. Now we’ll take it a step further and start adding the steps or tasks within our package. To recap, this is what our SSIS package should look like at the end:

 

package designer 1

Step 1: Add variables

The first Execute SQL task in the package will execute a stored procedure and assign the output of that proc to variables. These variables will be used to filter our source data.

Adding variables to a package with Biml is not difficult, and the Biml code looks like this:

<Variables>
    <Variable Name="Data_Source_Load_Start_Date" DataType="String">2022-01-01</Variable>
    <Variable Name="Data_Source_Load_Start_Day" DataType="String">1</Variable>
    <Variable Name="Data_Source_Load_Start_Month" DataType="String">1</Variable>
    <Variable Name="Data_Source_Load_Start_Year" DataType="String">2022</Variable>
 </Variables>

As you can see from the above: Four variables are added, representing different date parts. In this particular project I do it as a standard step in all staging packages so that I have the option to filter data either by year, month, date etc. and is a by-product of having data at different levels of detail that may require different types of filters.

Yes, it’s the real world and not all data sources are perfect, standardized or consistent…

I am also deliberately storing the date parts as text/string variables, which makes it easier to inject into SSIS expressions (next post) without the need for a conversion. I have defined default values for the variables, but you of course do not need to do this for string variables if you don’t want to.

Step 2: The container

All of the tasks in my SSIS package are contained within a single sequence container. In Biml code a sequence container is seen as a task…in fact, anything that can be added from the SSIS toolbox to the Control Flow designer is a task and therefore the code will look like the following:

<Tasks>
    <Container Name="Sequence Container - Package" DelayValidation="true" ConstraintMode="Linear">
    </Container>
</Tasks>

The “Linear” ConstraintMode property let’s the Biml compiler know that all tasks within this container should be executed in series (one after the other with a precedence constraint). See the language reference for a more comprehensive explanation and other options.

Step 3: The Execute SQL tasks

Execute SQL tasks have two main components: The query that needs to be executed and the results returned by that query. If you don’t need to do anything with the results, or if the query does not return any results then the second part can be omitted. With that in mind, our first task that executes the stored procedure and map the results to the package variables will look like this:

<ExecuteSQL Name="Execute SQL Task - Set Load Start Date Variables" ConnectionName="BI_Admin" DelayValidation="true" ResultSet="SingleRow">
    <DirectInput>exec [BI_Admin].[ETL].[Get_Derived_Date_Parameters];</DirectInput>
    <Results> 
        <Result Name="Data_Source_Load_Start_Date" VariableName="User.Data_Source_Load_Start_Date" />
        <Result Name="Data_Source_Load_Start_Day" VariableName="User.Data_Source_Load_Start_Day" />
        <Result Name="Data_Source_Load_Start_Month" VariableName="User.Data_Source_Load_Start_Month" />
        <Result Name="Data_Source_Load_Start_Year" VariableName="User.Data_Source_Load_Start_Year" />
    </Results>
</ExecuteSQL>

The <DirectInput> tag is pretty self-explanatory and executes the stored procedure (in this case). It could also have been a SQL query.

The <Results> element contains a <Result> child element for each field returned in our query. I found this a little confusing at first, but the Name property’s value should reflect the name of the field that is returned from your query or stored proc.

The VariableName property is then used to map the returned field’s value to the relevant variable, and note that we don’t use the “User::<variable>” notation that we see elsewhere in SSIS, but we refer to the scope of the variable with the “User.” prefix. I’m guessing that the change in syntax is because of some other dependencies and/or potential clashes between the Biml compiler and SSIS itself.

Note that in this particular case I am not using variable values as input to the stored procedure, but using the result set of the stored proc to set the SSIS package variables’ values. Your Biml code will look different if you need to do it the other way around.

And now for the second Execute SQL task that truncates the staging table:

<ExecuteSQL Name="Execute SQL Task - RLM - Truncate Staging Table" ConnectionName="Staging" DelayValidation="true" ResultSet="None">
    <DirectInput>truncate table [Staging].[RLM].[PD8001];</DirectInput>
</ExecuteSQL>

Note that you have to specify “None” for the ResultSet property to indicate that you’re not expecting a result set from the query and/or stored procedure.

We’re almost there…and in the next blog post we’ll add the final piece to our package, the Data Flow task. Here’s what the entire Biml code will look like after today’s additions:

<Biml xmlns="http://schemas.varigence.com/biml.xsd">
 
    <Connections>
        <OleDbConnection Name="BI_Admin" CreateInProject="true" ConnectionString="Data Source=<>;Initial Catalog=BI_Admin;Provider=SQLNCLI11.1;Integrated Security=SSPI;Auto Translate=False;" />
        <OleDbConnection Name="Staging" CreateInProject="true" ConnectionString="Data Source=<>;Initial Catalog=Staging;Provider=SQLNCLI11.1;Integrated Security=SSPI;Auto Translate=False;" />
        <OdbcConnection Name="RLM" CreateInProject="true" ConnectionString="<>" DelayValidation="true" />
    </Connections>
 
    <Packages>
 
        <Package Name="Staging_PD8001" DelayValidation="true" ProtectionLevel="DontSaveSensitive">
            
            <Variables> 
                <Variable Name="Data_Source_Load_Start_Date" DataType="String">2022-01-01</Variable> 
                <Variable Name="Data_Source_Load_Start_Day" DataType="String">1</Variable> 
                <Variable Name="Data_Source_Load_Start_Month" DataType="String">1</Variable> 
                <Variable Name="Data_Source_Load_Start_Year" DataType="String">2022</Variable> 
            </Variables>

            <Tasks> 

                <Container Name="Sequence Container - Package" DelayValidation="true" ConstraintMode="Linear"> 
                    <Tasks>
                        <ExecuteSQL Name="Execute SQL Task - Set Load Start Date Variables" ConnectionName="BI_Admin" DelayValidation="true" ResultSet="SingleRow"> 
                            <DirectInput>exec [BI_Admin].[ETL].[Get_Derived_Date_Parameters];</DirectInput> 
                            <Results> 
                                <Result Name="Data_Source_Load_Start_Date" VariableName="User.Data_Source_Load_Start_Date" /> 
                                <Result Name="Data_Source_Load_Start_Day" VariableName="User.Data_Source_Load_Start_Day" /> 
                                <Result Name="Data_Source_Load_Start_Month" VariableName="User.Data_Source_Load_Start_Month" /> 
                                <Result Name="Data_Source_Load_Start_Year" VariableName="User.Data_Source_Load_Start_Year" /> 
                            </Results> 
                        </ExecuteSQL>

                        <ExecuteSQL Name="Execute SQL Task - RLM - Truncate Staging Table" ConnectionName="Staging" DelayValidation="true" ResultSet="None"> 
                            <DirectInput>truncate table [Staging].[RLM].[PD8001];</DirectInput> 
                        </ExecuteSQL>
                    </Tasks>
                </Container> 

            </Tasks>

        </Package>
 
    </Packages>
 
</Biml>

 

Other posts in this series:

My first #Biml project

My first #Biml project: Starting small

My first #Biml project: Data flow tasks

 

February 26, 2018

4 thoughts on “My first #Biml project: Adding tasks

Leave a Reply

%d bloggers like this: