We are currently storing a large number of certificate documents in our SQL server instance, which have accumulated over the years. Our client wanted a bulk download of these certificates, and needed a once off solution. This post details the steps I used to accomplish the export, along with some of the issues.

I was not keen to manually save all the files myself, so BCP to the rescue. This is simple command-line application used to copy data to and from SQL server using a user defined format.

The files are stored in a varbinary(max) field and are generally a few 100’s of kilobytes, most of them being pdf’s. However some are saved as jpg or png’s as well, which complicated the export somewhat. Fortunately the previous developers were smart enough to include the filename in the database table, so extraction of the files automatically is possible.

If you wish to use BCP from SSMS, then the first step is to enable xp_cmdshell. This feature is disabled by default as it allows someone to run arbitrary batch scripts from SQL server.

-- To allow advanced options to be changed.
EXEC sp_configure 'show advanced options', 1
GO
-- To update the currently configured value for advanced options.
RECONFIGURE
GO
-- To enable the feature.
EXEC sp_configure 'xp_cmdshell', 1
GO
-- To update the currently configured value for this feature.
RECONFIGURE
GO

Using BCP from SSMS makes it much faster to test and debug your script, but remember to switch it off afterwards.

The next step is to get a list of the files we want to export:

select
f.FileId, f.Filename, LOWER(SUBSTRING(f.filename, len(f.filename) - 2, 3)) as FileType
from dbo.Files f
ORDER BY f.FileId
OFFSET 0 ROWS
FETCH NEXT 500 ROWS ONLY;

I suggest you use paging if you have a large number of files, otherwise SSMS will fall over.

We can read these files into a cursor as we have to perform some additional processing, just in case the filenames are invalid or you wish to include extra identifiers in the filename written to disk:

set @fixedFileName = @Filename

set @fixedFileName = REPLACE(@fixedFileName, '/', '')
set @fixedFileName = REPLACE(@fixedFileName, '\\', '')
set @fixedFileName = REPLACE(@fixedFileName, ':', '')
set @fixedFileName = REPLACE(@fixedFileName, '%', '')

set @fixedFileName = RTRIM(LTRIM(@fixedFileName));

SET @currentFile = 'C:\temp\FileExport\' + @fixedFileName

Now we want to generate our BCP commands:

SET @bcpCommand = 'bcp "SELECT Data From FileExport.dbo.Files Where FilesId = ''' + convert(varchar(50), @FileId) + '''" queryout "'

SET @bcpCommand = @bcpCommand + @currentFile + '" -T '

print @bcpCommand

Opening a cmd prompt you can now execute the following BCP command:

bcp "SELECT FileData From FileExport.dbo.Files Where FileId = '1'" queryout "C:\temp\FileExport\1-File1.pdf" -T -x

It will prompt you for additional file formatting information, as shown:

Generate FMT File

(Including the -x flag as well will generate the .fmt file as xml.)

Saving the formatting file that is generated will allow to you to specify the format of each of the expected filetypes in your table. Since I was dealing with pdf, png and jpg files I generated the appropriate formatting file for each type.

Our bcp command generation now looks like this:

IF (@FileType = 'pdf')
BEGIN
SET @bcpCommand = @bcpCommand + '-f C:\Temp\FileExport\Settings\PDFExportFormat.fmt'
END
ELSE IF (@FileType = 'png')
BEGIN
SET @bcpCommand = @bcpCommand + '-f C:\Temp\FileExport\Settings\PNGExportFormat.fmt'
END
ELSE IF (@FileType = 'jpg')
BEGIN
SET @bcpCommand = @bcpCommand + '-f C:\Temp\FileExport\Settings\JPGExportFormat.fmt'
END

(The settings files are attached for reference at the end of the post. You will notice the pdf file is saved in xml, and the other 2 in fixed width formats, I could not make them the same. More explanation as to the structure and use of the formatting file can be found here and here.)

Our bcp command now looks like this:

bcp "SELECT FileData From FileExport.dbo.Files Where FileId = '1'" queryout "C:\temp\FileExport\1-File1.pdf"
        -T -f "C:\Temp\FileExport\Settings\PDFExportFormat.fmt"

Running this command from the command line or via “EXEC xp_cmdshell” in SSMS will now produce the following output:

File Export

And our file has been successfully saved to disk.

File Export Output

Some Closing Thoughts

BCP is a very useful tool to solve a specific problem. However I would not recommend this as the “normal” way to get data in and out of SQL server, as the overhead is not worth it unless you are transferring large amounts of data. It is also worth remembering that it is a very good idea to store the filename of the originally imported file to ensure you can read the data later correctly.

Another tip is to ensure you have the correct file format for your file type, otherwise BCP will append the wrong number of bytes to your file on export making it unreadable.

The full sql file can be found here and a copy of the export settings can be found here.



This post was originally published on Entelect’s internal Tech Blog, Yoda.