*This is an old post of mine from my employer’s old blog site that I had not moved here. I remembered about it when I was trying to do something similar and figured I’d post it. Today I would probably do it in C# using a timer or Thread.sleep.
For a recent project, my colleague (@craigblevin) and I found ourselves reloading a QA environment of a Data Warehouse from scratch. In this case, our source data consisted of all flat files – upwards of 70 gigs worth. In normal operation, our ETL easily handled the incremental data, but we found that reloading all of the data at once choked up our minimally resourced SQL Server. So we broke the files up into more manageable chunks and ran them one at a time (in this case going back to modify our DW ETLs themselves to process one file at a time was not an expedient option). After spending the better part of a weekend taking turns checking on the progress, manually moving a file, and starting the SQL job again, we realized we could build another SSIS package to take care of this for us. But how to make sure the previous run was complete before copying over the next file and restarting the job?
Set up a For Each File Loop
The first step was to set up a simple For Each File Loop in SSIS. In this case, as depicted below, we wanted to move 1 file at a time from our staging directory into our processing folder.
Once the file was in the processing folder, our package fired the SQL Job to start the deployed SSIS packages to load the DW.
Check for the job to finish
Once the deployed job was running, we had to know when it was completed so the package could start the next file sequentially. For this we created a For Loop which contained an Execute SQL Task which checked our custom DW Audit tables to see if the job was completed. The SQL Task simply set variable @IsLoadComplete == 1 when the last run was complete, and the For Loop continued until @IsLoadComplete = 1.
This led to a problem though – the Loop Until Complete container would execute the SQL Task continuouslyuntil the loop condition was met. We did not want to constantly query the server tens of thousands of times until the job was complete.
Have SSIS wait a specified period of time
Since SSIS does not have a native “Wait” task, our quick solution was to create a nested For Loop that does nothing with an evaluation expression that checks the current time until its 10 minutes after the start time of the container. This causes the “Check” SQL statement to run only once per 10 minutes.
EvalExpression: DATEADD(“mi”, 10, @[System::ContainerStartTime]) > GETDATE()
(The finished product is below (ignore the warning on the send mail task).
How does it perform?
In our case, since we were just loading a QA environment and wanted a quick and dirty solution, we were not concerned with performance. We successfully allowed this design to run in VS on our local medium powered machine overnight while processing multiple files without issue. However, a quick check of CPU usage did show a 30% spike in CPU usage – so in the end this solution won’t win any performance awards.
In hindsight, was there a better way? (hint: yes)
While writing this post, and checking the CPU usage, I started thinking there had to be a more efficient way. Since I am not a programmer by nature, I went straight for available SSIS components instead of opening up a script task. But a quick search for “SSIS Script Task Wait” found several examples (such as Mike Davis) of doing this with a Script Task – and sure enough using a Script Task with code such as below, performs the same function with much less CPU usage.
Credit to http://www.bidn.com/blogs/MikeDavis/ssis/158/make-an-ssis-package-delay-or-wait-for-data for the below Script Task code.
Public Sub Main()
Dimsec As Double= Convert.ToDouble(Dts.Variables("intDelayTime").Value)
Dimms AsInt32 = Convert.ToInt32(sec * 1000)
Dts.TaskResult = ScriptResults.Success
So if we had to do it again, replacing the “Do Nothing” For Loop with a Script Task such as the above would probably be less resource intensive and accomplish the same thing!