We are currently storing a large number of certificate documents in our SQL server instance, which have accumulated over the years. Our client wanted a bulk download of these certificates, and needed a once off solution. This post details the steps I used to accomplish the export, along with some of the issues.
I was not keen to manually save all the files myself, so BCP to the rescue. This is simple command-line application used to copy data to and from SQL server using a user defined format.
The files are stored in a varbinary(max) field and are generally a few 100’s of kilobytes, most of them being pdf’s. However some are saved as jpg or png’s as well, which complicated the export somewhat. Fortunately the previous developers were smart enough to include the filename in the database table, so extraction of the files automatically is possible.
If you wish to use BCP from SSMS, then the first step is to enable xp_cmdshell. This feature is disabled by default as it allows someone to run arbitrary batch scripts from SQL server.
Using BCP from SSMS makes it much faster to test and debug your script, but remember to switch it off afterwards.
The next step is to get a list of the files we want to export:
I suggest you use paging if you have a large number of files, otherwise SSMS will fall over.
We can read these files into a cursor as we have to perform some additional processing, just in case the filenames are invalid or you wish to include extra identifiers in the filename written to disk:
Now we want to generate our BCP commands:
Opening a cmd prompt you can now execute the following BCP command:
It will prompt you for additional file formatting information, as shown:
(Including the -x flag as well will generate the .fmt file as xml.)
Saving the formatting file that is generated will allow to you to specify the format of each of the expected filetypes in your table. Since I was dealing with pdf, png and jpg files I generated the appropriate formatting file for each type.
Our bcp command generation now looks like this:
(The settings files are attached for reference at the end of the post. You will notice the pdf file is saved in xml, and the other 2 in fixed width formats, I could not make them the same. More explanation as to the structure and use of the formatting file can be found here and here.)
Our bcp command now looks like this:
Running this command from the command line or via “EXEC xp_cmdshell” in SSMS will now produce the following output:
And our file has been successfully saved to disk.
Some Closing Thoughts
BCP is a very useful tool to solve a specific problem. However I would not recommend this as the “normal” way to get data in and out of SQL server, as the overhead is not worth it unless you are transferring large amounts of data. It is also worth remembering that it is a very good idea to store the filename of the originally imported file to ensure you can read the data later correctly.
Another tip is to ensure you have the correct file format for your file type, otherwise BCP will append the wrong number of bytes to your file on export making it unreadable.
This post was originally published on Entelect’s internal Tech Blog, Yoda.