Introduction to Business Intelligence Markup Language (BIML)

Business Intelligence Markup Language, or BIML, is a dialect of XML that describes Business Intelligence objects such as ETL Packages, Cubes, and even Facts and Dimensions (here we will focus mainly on ETL Packages).  This post will serve as an
intro to this very powerful (and mostly free!) BI development tool.  In this post we will just create a singular SSIS package, but once we dive deeper into BIML’s capabilities we will be able to build dynamic and automated ETL design patterns.  I hope to turn this into a series of posts on BIML.
Unless otherwise noted, we will assume SQL Server 2014 and VS 2012, although much of this applies at least back to SQL Server 2008.  
Getting Started
First, in order to use BIML, you will need a tool to translate the actual BIML code into the desired BI object.  There are two main options for compiling BIML code – the free Visual Studio add-on, BIDS Helper, and the proprietary IDE, Mist, by Varigence.  We will focus on FREE 🙂
·
Assuming you already have SQL Server and Visual Studio installed, download and install the awesome VS add-in BIDS Helper (while there, check out the other great features it adds).
·
Once BIDS Helper is installed, if you start a new SSIS project, you will now have a new option when you right click on the project in the solution explorer
Now you will get a new file called BimlScript.biml under the “Miscellaneous” folder in the solution explorer.  If you open the file, you’ll see the XML root node of your new BIML file.
<Biml xmlns=http://schemas.varigence.com/biml.xsd>
</Biml>
Before going any further, lets set up our demo tables, Employee and EmployeeDest – both with identical columns – and insert a few rows.
/****** Object:  Table [dbo].[Employee]    Script Date: 11/1/2015 4:35:35 PM ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[Employee](
       [EmployeeId]
[int] IDENTITY(1,1) NOT NULL,
       [FirstName]
[nvarchar](50) NULL,
       [LastName]
[nvarchar](100)
NULL,
       [Notes]
[nvarchar](255)
NULL,
       [CreateDate]
[datetime] NULL,
 CONSTRAINT [PK_Employee] PRIMARY
KEY CLUSTERED
(
       [EmployeeId]
ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE =
OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
/****** Object:  Table [dbo].[EmployeeDest]    Script Date: 11/1/2015 4:35:43 PM ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[EmployeeDest](
       [EmployeeId]
[int] IDENTITY(1,1) NOT NULL,
       [FirstName]
[nvarchar](50) NULL,
       [LastName]
[nvarchar](100)
NULL,
       [Notes]
[nvarchar](255)
NULL,
       [CreateDate]
[datetime] NULL,
 CONSTRAINT [PK_EmployeeDest] PRIMARY
KEY CLUSTERED
(
       [EmployeeId]
ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE =
OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
INSERT INTO [dbo].[Employee]
           ([FirstName]
           ,[LastName])
     VALUES
           (‘Erik’,‘H.’),
                 (‘Jeff’,‘K.’),
                 (‘Ed’,‘B.’)
GO
Generate a Simple Package
We will start simple and just create a single package, but you can take a look at the  vast array of possibilities at the Varigence website (which is unfortunately always  painfully slow) – https://www.varigence.com/Documentation/Language/Element/AstPackageNode
.
As you would with an SSIS package made via the standard UI, we will start by setting up our connection managers.  We do this with a <Connections> and <Connection> tag as seen below.  As you can see, thanks to BIDS Helper, we also have some intellisense to help us along.
  <Connections>
    <Connection Name=MyConnection ConnectionString=Data Source=localhost;Initial Catalog=Sandbox;
Provider=SQLNCLI11.1; Integrated Security = SSPI;
/>
  </Connections>
With our connection defined, we now want to add the BIML to generate our actual package(s).  As with connections, we start with a <Packages> node, then <Package>, and within <Package> we have <Tasks>.  The <Tasks> tag loosely  translates to the Control Flow tab of an SSIS package in the UI.  So all of the nodes underneath a task, such as ExecuteSQL as shown below, would be Control Flow  objects.
<Biml xmlns=http://schemas.varigence.com/biml.xsd>
  <Connections>
    <Connection Name=MyConnection ConnectionString=Data Source=localhost;Initial Catalog=Sandbox;
Provider=SQLNCLI11.1; Integrated Security = SSPI;
/>
  </Connections>
  <Packages>
    <Package Name=MyFirstPackage ConstraintMode=Linear> <!–Name and
attributes of package. Linear tells BIML to execute the components in sequence.
–>
      <Tasks> <!–“Tasks”
are roughly equivalent to components on the Control Flow
–>
        <ExecuteSQL Name=Select Employees ConnectionName=MyConnection> <!–This
would be on the Control Flow
–>
          <DirectInput>
            SELECT COUNT(1) AS TotalEmployees
FROM Employee
          </DirectInput>
        </ExecuteSQL>
      </Tasks>
    </Package>
  
  </Packages>
  </Biml>
So if we take the above code and put it in our BimlScript.biml file (and assuming we  have a DB named Sandbox and the sample tables created), we can then generate our first package.  Right click on the BIML file and select “Generate SSIS Packages”  as shown below.
Assuming no errors, you should now have an SSIS package called  MyFirstPackage.dtsx which looks like the below:

 

 
Simple, and useless!
Adding a Data Flow
Now let’s add some Data Flow components to our BIML file.  As you might expect, within the <Tasks> tag, we now would add a <DataFlow> tag to add a Data Flow to our SSIS Control Flow.  In BIML, all of the components within a Data Flow are called “Transformations” and thus go in… a <Transformations> tag.  Here’s a quick review:
In order to add a Data Flow and an OleDbSource to our package, within the <Tasks>  tag, we add
<Dataflow Name=Extract Employee Data>
          <Transformations>
            <OleDbSource Name=Employee Source ConnectionName=MyConnection>
              <DirectInput>
                SELECT [EmployeeId]
                ,[FirstName]
                ,[LastName]
                ,[Notes]
                ,[CreateDate]
                FROM [Sandbox].[dbo].[Employee]
              </DirectInput>
            </OleDbSource>
          </Transformations>
 </Dataflow>
after our <ExecuteSQL> tag.  You’ll notice that within the <OleDbSource> node there is another node for <DirectInput>.  This is exactly what would go in the OLE  DB Source Editor SQL Command box in the SSIS UI.  This is a good point to note that, as with all XML, BIML IS CASE SENSITIVE.
We could generate the package again at this point, and successfully re-create the MyFirstPackage.dtsx package (note that BIDS Helper helpfully will ask you if you want to overwrite an existing package when you re-generate it), but it wouldn’t do much yet.
Add a Derived Column and OLEDB Destination
For the last part of this demo, lets add a simple Derived Column transform and then our destination.  So after our </OleDbSource> tag, we can add a <DerivedColumn> tag.  As you can see from the screenshot, the intellisense lists the wide variety of components that you can use with BIML.
Within the <DerivedColumns> tag, we can have <Columns> and then each individual <Column> which contains the value that would go into the Derived Column component’s Expression box in the SSIS UI.  You can add multiple columns in one
<DerivedColumns> tag.  Also note that in this case I set it to ReplaceExisting = “true”, just as you can in the UI. At this point, our full BimlScript file should look like
this
<Biml xmlns=http://schemas.varigence.com/biml.xsd>
  <Connections>
    <Connection Name=MyConnection ConnectionString=Data Source=localhost;Initial Catalog=Sandbox;
Provider=SQLNCLI11.1; Integrated Security = SSPI;
/>
  </Connections>
  <Packages>
    <Package Name=MyFirstPackage ConstraintMode=Linear> <!–Name and
attributes of package
–>
      <Tasks> <!–“Tasks”
are roughly equivalent to components on the Control Flow
–>
        <ExecuteSQL Name=Count Employees ConnectionName=MyConnection> <!–This
would be on the Control Flow
–>
          <DirectInput>
            SELECT COUNT(1) AS TotalEmployees
FROM Employee
          </DirectInput>
        </ExecuteSQL>
        <Dataflow Name=Extract Employee Data>
          <Transformations>
            <OleDbSource Name=Employee Source ConnectionName=MyConnection>
              <DirectInput>
                SELECT [EmployeeId]
                ,[FirstName]
                ,[LastName]
                ,[Notes]
                ,[CreateDate]
                FROM [Sandbox].[dbo].[Employee]
              </DirectInput>
            </OleDbSource>
            <DerivedColumns Name=Set Columns>
              <Columns>
                <Column Name=CreateDate DataType=DateTime ReplaceExisting=true> <!–The value inside the <Column> tags is what would be
in the Expression box on a Derived Column component
–>
                  GETDATE()
                </Column>
                <Column Name=Notes DataType=String ReplaceExisting=true>
                  “Added via ETL”
                </Column>
              </Columns>
            </DerivedColumns>
            <OleDbDestination Name=Insert Employees ConnectionName=MyConnection>
              <ExternalTableOutput Table=dbo.EmployeeDest></ExternalTableOutput>
<!–
Destination target table–>
            </OleDbDestination>
          </Transformations>
        </Dataflow>
   
      </Tasks>
    </Package>
  </Packages>
  </Biml>
And if we re-generate the package, and run it, hopefully it’s all green for you also.
Summary and Next Steps
Obviously this post just scratches the surface of BIML and doesn’t even begin to touch on its applications for dynamic automated development.  But like with everything, with BIML it’s good to start small and build on what you’ve learned.  In this post we covered the basic syntax of BIML, and created a very simple SSIS package to move data from one table to another with a simple transformation.
One of the best things about BIML is that packages created with it are identical
to any package created in the SSIS UI – other than maybe straighter lines.  If you generate packages via BIML, a future developer does not need to know a thing about BIML to do maintenance work on your BIML generated package.  
In future posts, I will cover more components and uses for BIML and the added power that BIMLScript – .Net code embedded within BIML – gives us to generate
true dynamic packages.  
Erik

Leave a Reply

Your email address will not be published. Required fields are marked *