
Setting Up a Raspberry Pi 3 as a Print Server

Since the dawn of the digital age, man has been trapped in a never-ending struggle with printers.  Total disaster.  I hate printers.  But this post isn't about ranting about how terrible printers are in general – it's about getting one to work wirelessly.  Since my wife and I switched exclusively to laptops, and I decommissioned (read: the drive failed and it has sat in various states of disassembly for 3 years) my last desktop that acted as a file and print server in our home office, we searched for a "wireless printer" that actually works consistently.  We've gone through several from various brands, and while they "work" part of the time, we've never found one that was reliable and consistently stayed connected to our home network.  The latest, a Brother MFC-L2740DW multi-function laser, had the nasty habit of going "offline" to save power approximately every 24 hours.  Searches online yielded tons of posts describing the same issue, with apparently no solution other than literally turning it off and turning it back on again.  Sure, you have to physically go to a printer to pick up what you printed, but wtf?  My thermostat can stay connected 24/7 but a several hundred dollar printer can't?  Sorry, I said I wasn't going to rant about this…

So the solution I decided to try was to set up another dedicated print server to physically attach the printer to, hopefully keeping it "online" all the time and letting the server manage the wireless connectivity instead of the printer itself.  But I did not want to have to power and manage an entire desktop just so the printer is always at the ready.  Enter the Raspberry Pi.  For those unaware, Raspberry Pi is the name of a popular company that makes a series of small single-board computers of the same name.  For this project I purchased a Raspberry Pi 3 Model B along with a kit containing a power supply, a microSD card pre-loaded with Raspbian (a Debian-based Linux distribution for the Raspberry Pi), a tiny little case, and even tiny little heatsinks.  The total cost for the kit was ~$50.  The Raspberry Pi 3 itself is about $35.

(Raspberry Pi 3 Model B)
This was my first project with a Raspberry Pi, and the first time I had dabbled with Linux at all in a number of years.  So I had to fumble my way through getting it set up, connecting to it remotely from Windows, getting the printer working on the Pi, and finally getting our Windows machines to print through it.  Though in the end it was pretty simple, hopefully this post will help you get past some of the bumps I hit along the way.
Pi Set Up
Getting the RPi up and running was simple enough.  The kit I got was from https://www.loverpi.com/ and it came with an 8GB microSD card preloaded with Raspbian as well as a bunch of software.  All I had to do was put in the SD card, dig up an old USB keyboard and mouse, hook it up via HDMI to a TV, and plug it in.  It booted right up to the GUI, and connecting to the built-in Wi-Fi was straightforward.

Enabling VNC
First order of business was getting VNC set up so I could stop using a terrible old USB keyboard while sitting on the floor in front of the TV.  From what I had read, RealVNC should have been included with the version of Raspbian that came preloaded, and all I had to do was go to Menu > Preferences > Raspberry Pi Configuration, click on Interfaces, and enable VNC.  But when I went to this screen, the VNC option wasn't there.  So I figured I would try updating Raspbian to get the latest packages.  Again, it's been a while since I've used Linux, and it was never very much, so I'll show the terminal commands I used, but I'm not going to get into exactly what everything does here – because I probably don't even really know.
sudo apt-get update
sudo apt-get upgrade

After updating Raspbian (which took a while because it downloaded a lot of updates) and rebooting the RPi, the above VNC option WAS there for me to enable.  Then I just had to download the RealVNC Viewer for Windows – https://www.realvnc.com/download/viewer/ – onto my laptop, and I was able to connect remotely to the RPi just by entering its IP address in the VNC Viewer address bar.
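
As an alternative, newer Raspbian builds should also let you enable VNC from the terminal via the raspi-config tool – I didn't take this route myself, so consider it an untested option:

sudo raspi-config

Then enable VNC under the Interfacing Options menu.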

Now I was able to safely put the RPi into its tiny little clear case, connect it to the printer via USB cable, and hide it away in the office.  I also attached an mSATA USB adapter with a spare mSATA drive for some extra storage, but that's not required for this project.

Setting up CUPS

So next up was getting the printer working locally with the RPi.  For that I turned to CUPS (Common UNIX Printing System), because that's what the internet told me to use to set up a Linux system as a print server.  CUPS was already installed as part of the pre-loaded software on Raspbian, and I knew it had been updated earlier, so I should have just been able to go to its web interface and configure the printer.  Wrong.

To administer CUPS, you point your web browser (on the RPi) to http://localhost:631/ to get the admin screen shown below.
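
As a side note, if you'd rather administer CUPS from another machine instead of VNCing into the Pi, the cupsctl tool can open the web interface up to the network – I stuck with doing it locally, so treat this as an untested aside:

sudo cupsctl --remote-admin --remote-any

After that you should be able to browse to port 631 at the Pi's IP address from your laptop.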

But clicking on “Adding Printers and Classes” popped up a username/password prompt, which the instructions said to log into with your root user account – but even that yielded: “Unable to add printer: Forbidden”.

This was a spot where Linux knowledge would probably have helped.  Basically, the CUPS software creates a user group called lpadmin on the machine, and a user must be a member of it to do anything via the web interface – but it doesn't automatically add anyone to this group, not even the root user.  Though it's a different Linux distribution, I found the Ubuntu CUPS help page helpful: https://help.ubuntu.com/lts/serverguide/cups.html

So as instructed in that link, I had to add my user – on Raspbian, the default pi account – to the lpadmin group.

sudo usermod -aG lpadmin pi
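
To double-check that the change took, you can list the pi user's groups – lpadmin should now appear in the output (you may need to log out and back in for it to take effect):

groups pi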

NOW I was able to click on Adding Printers and Classes > Add Printer and select my already recognized printer.

You then have to select the make and model of your printer to pick the correct driver.  The Brother MFC-L2740DW was not listed.  I was able to find a Debian printer driver for this model on Brother's support page, and although it installed on Raspbian, it is built for a different architecture and would not print.  Many people online seemed to have luck selecting similar printer models, so I gave the Brother MFC-9600 – CUPS+Gutenprint v5.2.10 driver a shot, and amazingly it worked!
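
As an aside, you should also be able to list the drivers CUPS knows about from the terminal rather than scrolling through the web UI – handy when hunting for a similar model:

sudo lpinfo -m | grep -i brother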

I was able to click the Maintenance drop-down and successfully print a test page.
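
You can also check the queue from the terminal – lpstat shows printer status, and lp can send a quick test job.  The queue name below is just an example; use whatever name CUPS assigned when you added the printer:

lpstat -p -d
echo "test from the Pi" | lp -d Brother_MFC_L2740DW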

Adding the Printer to Windows Machines

After getting the printer set up on the RPi, the last thing to do was to add the new network printer to our Windows laptops.  Viewing the Network tab in Windows did not list the RPi on the network.  What I needed to do was allow file and print services to be shared from the RPi to Windows.  To do this, I needed to install Samba, which is explained simply enough here – http://www.deviceplus.com/how-tos/raspberrypi-guide/raspberrypi_entry_006/

Install Samba:

sudo apt-get install samba

Edit the Samba config file with the nano text editor to set your workgroup and enable WINS support:

sudo nano /etc/samba/smb.conf


Scroll down through the config file and set the attributes below, then press CTRL+O to “WriteOut” (aka save) the changes and CTRL+X to exit nano.

workgroup = your_workgroup_name
wins support = yes
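
Optionally, you can sanity-check your edits before restarting – testparm comes with Samba and will flag any syntax errors in smb.conf:

testparm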
After making these changes we need to restart Samba.

sudo service smbd restart
Now when we go to the Network screen in Windows, we can see the RPi on the network!

And within the RPi on the network we can find, and add, our printer!

In order to properly add the printer to Windows, the Windows machines need to have the necessary printer drivers installed so you can select them when you right-click > Connect on the printer you want.

Summary

After following the above steps, you should be up and running with a Raspberry Pi print server that you can connect to wirelessly from a Windows machine.  With this setup, the printer and the print server should be able to remain on and ready for use while consuming only a very small amount of power.  I have not yet tried to set up the scanning functions of the multi-function printer, but with Samba now set up, I should be able to easily share files and folders as well, and be able to scan directly to the RPi and access the files from any other machine on the network.  Along the same lines, the mSATA adapter mentioned above can be used as shared storage via the RPi as a file share.

Extract Multiple Access DBs (with different schemas) via SSIS Script Task into SQL Server

*Another post moved over from my old blog.  Since I wrote this, a coworker modified the code to extract data from multiple Excel workbooks and tabs in a similar fashion.  At some point I’ll try to add that addendum.

Problem:

We’ve all heard it before – “Well, I get the crucial financial numbers from this Access database…”.  Or even worse, “we have hundreds of Access databases all over the place with varying schemas”.  On a recent project I was faced with the latter – several hundred Access DBs (mdb files) that had accumulated over a number of years, in various places across the network, all with different schemas.  In the client’s defense, some of this was an outside third party’s doing, and they had since implemented an enterprise-class solution for the process going forward.  Nevertheless, I was tasked with extracting all of this legacy Access data into SQL Server.

   🙁

 

I figured I would just be able to use a For Each Loop within SSIS to extract each mdb file and drop it into SQL Server.  I quickly realized, however, that because every Access DB could potentially have a different schema, I would need to dynamically handle the schemas in each source file.  I briefly thought about using BIML (see my recent posts about Business Intelligence Markup Language) to dynamically create the SSIS packages, but I really didn't want hundreds of packages or components since this was a one-and-done job.  So, for better or for worse, I turned to the trusty Script Task.

Solution:

So what I needed to do was:

  1. Connect to each Access DB file in different locations across the network
  2. Create the table schema for that specific mdb file in SQL Server
  3. Copy the data from the Access file to SQL Server
  4. Run some other processing stored procedures not related to this post

Turns out there was also a 1b) – map each file location as a network drive using a separate Active Directory account. There was no need to keep the data after the last processing step was complete, so I was able to drop the created SQL Server tables when I was done.

Here is an overview of the control flow of my completed SSIS package:

First I use an Execute SQL Task to select all of the file names and locations of the mdb files from a pre-populated table (thankfully I had a spreadsheet of all the file names and locations). To set up the For Each Loop to loop through all the rows returned in the Execute SQL Task, we need to first create a User Variable of type Object – here I called it “mdbsToProcess”. Then in the Execute SQL Task properties, we set the ResultSet property to Full Result Set.

Then we click on the Result Set option and set the result set to the object variable we created.

Next I use a For Each Loop to loop through each of the mdb files I need to process. The For Each Loop is set up as follows to use the mdbsToProcess variable, and loop through each record to get information we need for each mdb file.

For the final code sample – click here. I’ll walk through it below. This is just a copy/paste from the SSIS Script Task, so please ignore all the SSIS default stuff.

After the Script Component generates and loads all of the tables from the mdb file into SQL Server, I simply get the current time, execute some other stored procedures to process the data unrelated to this post, then log the results.

Code Explanation:

By no means am I an App Dev guy, so feel free to improve upon this and let me know.  But below is an overview of the more interesting parts of the Script Task code.  First we set up all our variables (it's quick and dirty, so some are hard-coded and others are passed in from SSIS).


public void Main()
{
    string FullFilePath = Dts.Variables["User::FullFilePathFromDB"].Value.ToString(); //Full network file path and name of mdb file
    string TargetServer = Dts.Variables["$Package::TargetServer"].Value.ToString(); //Target SQL Server
    string TargetDB = Dts.Variables["$Package::TargetDB"].Value.ToString(); //Target SQL DB
    string tableName = null; //name of table in source
    string TargetTableName = null; //name of table being created/populated
    string DropTableSQL = null;
    string NetworkDrivePassword = "MyNetworkDrivePassword";
    string NetworkDriveUser = "MyNetworkDriveUserName";
    string NetworkSharePath = @"\\MyNetworkPath\Share$"; //verbatim string so the backslashes aren't treated as escapes

    int CountCreatedTables = 0; //Used in log file

    string SourceConnString = ("Provider=Microsoft.Jet.OLEDB.4.0;" + "Data Source=" + FullFilePath);
    string TargetConnString = ("Data Source=" + TargetServer + ";Password=MyPassword;User ID=MyUserName;Initial Catalog=" + TargetDB + ";");

    string command;
    command = "/C NET USE " + NetworkSharePath + " /delete"; //If share mapped, disconnect to prevent multi "user" error

    //WriteToLog("Executing Cmd Line: " + command);
    ExecuteCommand(command, 5000);

    command = "/C NET USE " + NetworkSharePath + " /user:" + NetworkDriveUser + " " + NetworkDrivePassword; //grant access

    //WriteToLog("Executing Cmd Line: " + command);
    ExecuteCommand(command, 5000);

 

Then we execute two NET USE commands – the first to delete any existing mapped network drive with the given name, and the second to map it with the given credentials.  This is only necessary if the credentials you run the SSIS package under do not have access to the network shares.  I found I had to delete the mapping first if it existed, or I could run into a "multi user" error.  These commands are executed by the ExecuteCommand() method, which just runs the NET USE commands via cmd.exe.
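
ExecuteCommand() itself isn't shown here – it's in the full code sample – but a minimal sketch of a method like it would be something along these lines (a sketch, not the exact code from the sample):

public void ExecuteCommand(string command, int timeout)
{
    //Run the given arguments through a hidden cmd.exe and wait up to "timeout" ms for it to finish
    var processInfo = new System.Diagnostics.ProcessStartInfo("cmd.exe", command)
    {
        CreateNoWindow = true,
        UseShellExecute = false
    };
    using (var process = System.Diagnostics.Process.Start(processInfo))
    {
        process.WaitForExit(timeout);
    }
}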

Next we use the GetSchema method on the source connection (aka our Access file's OleDbConnection) to populate a DataTable with every table within the mdb file.  Note that we specify the "Tables" schema object type.

OleDbConnection SourceConn = new OleDbConnection(SourceConnString);
SqlConnection TargetConn = new SqlConnection(TargetConnString);

SourceConn.Open();
WriteToLog("Connect to File # " + Dts.Variables["User::FileNumber"].Value + " in " + FullFilePath);
DataTable DBTabsTables = SourceConn.GetSchema("Tables");
SourceConn.Close();

Now that we have a table of all of the Access file's tables, we iterate through each row in the DBTabsTables DataTable and generate our DROP and CREATE scripts, which we pass to the CreateTable() method.  In this case I specified the tables I wanted, but you can remove the IF statement if you want them all.

foreach (DataRow row in DBTabsTables.Rows) //For each table in mdb file
{
    tableName = row["table_name"].ToString();

    if (tableName == "AccessTable1" || tableName == "AccessTable2" || tableName == "AccessTable3") //only get specified tables from Access file. Or remove IF to get all tables
    {
        DropTableSQL = "IF OBJECT_ID('dbo." + tableName + "') IS NOT NULL BEGIN DROP TABLE " + tableName + " END; "; //build drop table if exists SQL

        CreateTable(SourceConn, tableName, TargetConn, DropTableSQL, tableName); //For the initial create, we want to use the source tableName
        ++CountCreatedTables;
        BulkLoadTable(SourceConn, tableName, TargetConn, tableName); //For the initial bulk load, we want to use the source tableName
    }
}

 

Within CreateTable() we again see that we use GetSchema on the connection object, but this time with the "Columns" object type and another variable called "restrictions".  If you look a few lines from the top of the code snippet below, you can see that "restrictions" is a 4-value array of strings.  You can read about the options here, but I am passing the name of the current table within the loop into the 3rd position in the array (remember, it's 0-based), which is then used in the GetSchema call to restrict it to only 1 table.  So now we have a DataTable called ColumnsDataTable which contains 1 record for each column in the designated table.

public void CreateTable(OleDbConnection SourceConn, string tableName, SqlConnection TargetConn, string DropTablesSQL, string TargetTableName)
{
    string[] restrictions = new string[4];
    restrictions[2] = tableName; //restrict which table information is returned by GetSchema

    string accessTableFieldName;
    DataRow myRow;

    SourceConn.Open();
    DataTable ColumnsDataTable = SourceConn.GetSchema("Columns", restrictions); //Fill DataTable with columns information
    SourceConn.Close();

 

When using the schema information to build the CREATE scripts for the new SQL tables, you need to be careful about data type differences between OLE DB and SQL Server.  There is probably a better way to do this, but I used the below switch statement to brute-force change the data type names to their SQL counterparts.  (The SortedColumnsDataTable referenced below is the ColumnsDataTable from above sorted into column order – that step is in the full code sample but elided here.)

//For every row in the table
for (int i = 0; i < SortedColumnsDataTable.Rows.Count; i++)
{
    //Get column name and type
    myRow = SortedColumnsDataTable.Rows[i];
    accessTableFieldName = "[" + myRow["column_name"] + "] ";

    switch (((OleDbType)myRow["data_type"]).ToString()) //Change OleDbType to SQL data types
    {
        case "Boolean": accessTableFieldName += "bit";
            break;
        case "Currency": accessTableFieldName += "money";
            break;
        case "Date":
        case "DBDate":
        case "DBTimeStamp": accessTableFieldName += "datetime";
            break;
        case "VarWChar":
        case "WChar":
        case "VarChar":
        case "Char": accessTableFieldName += "nvarchar(" + myRow["character_maximum_length"] + ")";
            break;
        case "UnsignedTinyInt": accessTableFieldName += "int";
            break;
        case "Double": accessTableFieldName += "float";
            break;
        default: accessTableFieldName += ((OleDbType)myRow["data_type"]).ToString();
            break;
    }

Lastly, in BulkLoadTable() we use the SqlBulkCopy class to copy the data from the mdb file to the newly created SQL Server table.  Be sure to set the batch size and timeout length to something that can handle the amount of data you are transferring.

public void BulkLoadTable(OleDbConnection SourceConn, string tableName, SqlConnection TargetConn, string TargetTableName) //use SqlBulkCopy to load data from source to target
{
    OleDbCommand SelectFromCmd = new OleDbCommand("SELECT * FROM " + tableName, SourceConn);
    OleDbDataReader rdr;

    SourceConn.Open();
    TargetConn.Open();

    rdr = SelectFromCmd.ExecuteReader();

    SqlBulkCopy sbc = new SqlBulkCopy(TargetConn);
    sbc.DestinationTableName = TargetTableName;
    sbc.BatchSize = 50000;
    sbc.BulkCopyTimeout = 120;
    sbc.WriteToServer(rdr);
    sbc.Close();

    rdr.Close();
    SourceConn.Close();
    TargetConn.Close();

    WriteToLog("Load " + TargetTableName);
}

 

Summary

In this post we discussed how to use SSIS and a Script Task to iterate through any number of Access mdb files in order to import the data into SQL Server.  The code used can handle Access DBs with varying schemas.  This type of process can be very useful when you are dealing with a large number of legacy Access DBs and the Import Wizard isn't practical.  Hopefully this post will help someone else upgrade/migrate a legacy business process into a more enterprise-ready solution – not allow an "Access for the Enterprise" application to continue to exist!

Erik

@ErikH_BI

 

Having an SSIS Component “Wait” Until a Condition is met or a Time Period Passes

*This is an old post of mine from my employer's old blog site that I had not moved here.  I remembered it when I was trying to do something similar and figured I'd post it.  Today I would probably do it in C# using a timer or Thread.Sleep.

Background
For a recent project, my colleague (@craigblevin) and I found ourselves reloading a QA environment of a Data Warehouse from scratch. In this case, our source data consisted of all flat files – upwards of 70 gigs worth. In normal operation, our ETL easily handled the incremental data, but we found that reloading all of the data at once choked up our minimally resourced SQL Server. So we broke the files up into more manageable chunks and ran them one at a time (in this case going back to modify our DW ETLs themselves to process one file at a time was not an expedient option). After spending the better part of a weekend taking turns checking on the progress, manually moving a file, and starting the SQL job again, we realized we could build another SSIS package to take care of this for us. But how to make sure the previous run was complete before copying over the next file and restarting the job?

Set up a For Each File Loop

The first step was to set up a simple For Each File Loop in SSIS. In this case, as depicted below, we wanted to move 1 file at a time from our staging directory into our processing folder.

Once the file was in the processing folder, our package fired the SQL Job to start the deployed SSIS packages to load the DW.

Check for the job to finish

Once the deployed job was running, we had to know when it was completed so the package could start the next file sequentially.  For this we created a For Loop which contained an Execute SQL Task that checked our custom DW audit tables to see if the job was completed.  The SQL Task simply set the variable @IsLoadComplete to 1 when the last run was complete, and the For Loop continued until @IsLoadComplete == 1.

This led to a problem though – the Loop Until Complete container would execute the SQL Task continuously until the loop condition was met.  We did not want to query the server tens of thousands of times until the job was complete.

Have SSIS wait a specified period of time

Since SSIS does not have a native "Wait" task, our quick solution was to create a nested For Loop that does nothing, with an evaluation expression that checks the current time until it's 10 minutes after the start time of the container.  This causes the "Check" SQL statement to run only once per 10 minutes.

EvalExpression: DATEADD(“mi”, 10, @[System::ContainerStartTime]) > GETDATE()

The finished product is below (ignore the warning on the Send Mail task).

How does it perform?

In our case, since we were just loading a QA environment and wanted a quick and dirty solution, we were not concerned with performance.  We successfully let this design run in VS on our medium-powered local machine overnight, processing multiple files without issue.  However, a quick check of CPU usage did show a 30% spike – so in the end this solution won't win any performance awards.

In hindsight, was there a better way? (hint: yes)

While writing this post, and checking the CPU usage, I started thinking there had to be a more efficient way.  Since I am not a programmer by nature, I went straight for the available SSIS components instead of opening up a Script Task.  But a quick search for "SSIS Script Task Wait" found several examples (such as this one from Mike Davis) of doing this with a Script Task – and sure enough, using a Script Task with code such as the below performs the same function with much less CPU usage.

Credit to http://www.bidn.com/blogs/MikeDavis/ssis/158/make-an-ssis-package-delay-or-wait-for-data for the below Script Task code.

Public Sub Main()
    '
    Dim sec As Double = Convert.ToDouble(Dts.Variables("intDelayTime").Value)
    Dim ms As Int32 = Convert.ToInt32(sec * 1000)
    System.Threading.Thread.Sleep(ms)
    '
    Dts.TaskResult = ScriptResults.Success
End Sub
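
And since I said at the top that today I would do this in C# – the equivalent C# Script Task body, assuming the same intDelayTime variable, would be something like:

public void Main()
{
    //Read the delay (in seconds) from the SSIS variable and sleep for that long
    double sec = Convert.ToDouble(Dts.Variables["intDelayTime"].Value);
    int ms = Convert.ToInt32(sec * 1000);
    System.Threading.Thread.Sleep(ms);

    Dts.TaskResult = (int)ScriptResults.Success;
}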

So if we had to do it again, replacing the “Do Nothing” For Loop with a Script Task such as the above would probably be less resource intensive and accomplish the same thing!