Pop Rivett and the FTP directory

Dr Pop Rivett diagnoses URL-Aphasia in an anxious and exhausted patient and divulges a technique for synchronising a local directory with a remote FTP directory, all in T-SQL!


“Oh Dr Rivett! The poor wee man is quite exhausted I fear. I’m so glad you can come to help. My husband has been up at the PC for the past forty-eight hours trying to work out how to import data files from a remote FTP site. I’m beginning to fear that he has done himself an injury.”

“Ahem, Mrs Muggins, we in the medical profession recognise this condition. URL-Aphasia, it is called: an inability to remember good websites. Mr Muggins must know when to turn for help. There is bound to be someone on SQL Server Central or Simple Talk who has done it before and knows how to help.”

“Oh Doctor, can you perhaps write us a sick-note, for poor Mr Muggins is almost a broken man?”

“Not a bit of it. I can give you the solution myself” (rustling around in his doctor’s bag) “Here is Dr ‘Pop’ Rivett’s patent medicine to ensure that local directories are synchronised with remote FTP sites. You can adapt it to all sorts of problems. Mr Muggins will be right as rain by the morning, and raring to get stuck into a day’s work.”

So often, one needs to get a datafeed into SQL Server from an FTP site. It is common, for example, to have to fetch log files from a website via FTP and load them into a SQL Server table for analysis and reporting. The actual BCP process to import the file is pretty well understood, but the process of synchronising a local directory with a remote FTP directory is less well understood. Something similar would be required if you were implementing a log-shipping process across the internet. It is also very handy for archiving the contents of websites in cases where files are added but not changed.

In this example, we’ll just use the ordinary FTP utility that is part of Windows. It is a command-line utility which hasn’t changed much for years. The advantage is that it exists on every server and one does not need to do any installation. It is crude, but sufficient.

We’ll:

  1. find out if any new files exist on the FTP site that are not in the local directory
  2. get the file(s)

Firstly, we need a routine that tells us what files are on the remote FTP site. This is done using FTP to put the resulting list into a file that we can subsequently read into a SQL Server table. It is done in a stored procedure as follows:
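The original procedure is not reproduced here, so what follows is only a sketch of one way it could be written. The procedure name s_GetFTPFileList, the convention that the caller creates a #FTPFileList temporary table for it to fill, and every server name, credential and path are illustrative assumptions; xp_cmdshell must also be enabled on the instance.

```sql
-- Sketch: list the files in a remote FTP directory into the caller's #FTPFileList table.
-- All names, paths and credentials below are placeholders.
CREATE PROCEDURE s_GetFTPFileList
    @FTPServer   varchar(128),
    @FTPUser     varchar(128),
    @FTPPassword varchar(128),
    @FTPPath     varchar(256),   -- remote directory, e.g. '/logs'
    @WorkDir     varchar(256)    -- local working directory, e.g. 'C:\temp'
AS
-- Expects the caller to have created: CREATE TABLE #FTPFileList (FileName varchar(256))
BEGIN
    DECLARE @cmd varchar(1000), @script varchar(300), @listing varchar(300)

    SELECT  @script  = @WorkDir + '\ftpcmd.txt',
            @listing = @WorkDir + '\ftplist.txt'

    -- Build the ftp command script, one echoed line at a time
    SELECT @cmd = 'echo open ' + @FTPServer + ' > ' + @script
    EXEC master..xp_cmdshell @cmd, no_output
    SELECT @cmd = 'echo user ' + @FTPUser + ' ' + @FTPPassword + ' >> ' + @script
    EXEC master..xp_cmdshell @cmd, no_output
    SELECT @cmd = 'echo cd ' + @FTPPath + ' >> ' + @script
    EXEC master..xp_cmdshell @cmd, no_output
    -- "ls . <localfile>" writes a bare listing of the current remote directory to a local file
    SELECT @cmd = 'echo ls . ' + @listing + ' >> ' + @script
    EXEC master..xp_cmdshell @cmd, no_output
    SELECT @cmd = 'echo quit >> ' + @script
    EXEC master..xp_cmdshell @cmd, no_output

    -- Run ftp.exe non-interactively: -n suppresses auto-login, -i turns off prompting,
    -- -s: takes the commands from the script file we just wrote
    SELECT @cmd = 'ftp -n -i -s:' + @script
    EXEC master..xp_cmdshell @cmd, no_output

    -- Read the listing file back into the caller's table, one file name per row
    SELECT @cmd = 'type ' + @listing
    INSERT INTO #FTPFileList (FileName)
    EXEC master..xp_cmdshell @cmd

    DELETE FROM #FTPFileList WHERE FileName IS NULL
END
GO
```

Having the caller create #FTPFileList means the synchronisation procedure later on can call this routine and then query the table directly, without nesting INSERT…EXEC statements.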

Then we’ll need a very similar routine in order to get a file from a remote FTP site:
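Again, this is a sketch rather than the original routine, with placeholder names and paths. It follows the same pattern as the listing procedure; the only real difference is that the ftp script issues a get instead of a directory listing.

```sql
-- Sketch: fetch a single file from a remote FTP directory into a local directory.
-- All names, paths and credentials below are placeholders.
CREATE PROCEDURE s_GetFTPFile
    @FTPServer   varchar(128),
    @FTPUser     varchar(128),
    @FTPPassword varchar(128),
    @FTPPath     varchar(256),   -- remote directory
    @FileName    varchar(256),   -- file to fetch
    @LocalDir    varchar(256),   -- local download directory, e.g. 'C:\downloads'
    @WorkDir     varchar(256)    -- working directory for the ftp script
AS
BEGIN
    DECLARE @cmd varchar(1000), @script varchar(300)

    SELECT @script = @WorkDir + '\ftpget.txt'

    SELECT @cmd = 'echo open ' + @FTPServer + ' > ' + @script
    EXEC master..xp_cmdshell @cmd, no_output
    SELECT @cmd = 'echo user ' + @FTPUser + ' ' + @FTPPassword + ' >> ' + @script
    EXEC master..xp_cmdshell @cmd, no_output
    SELECT @cmd = 'echo cd ' + @FTPPath + ' >> ' + @script
    EXEC master..xp_cmdshell @cmd, no_output
    -- binary mode avoids corrupting anything that is not plain text
    SELECT @cmd = 'echo binary >> ' + @script
    EXEC master..xp_cmdshell @cmd, no_output
    SELECT @cmd = 'echo get ' + @FileName + ' ' + @LocalDir + '\' + @FileName + ' >> ' + @script
    EXEC master..xp_cmdshell @cmd, no_output
    SELECT @cmd = 'echo quit >> ' + @script
    EXEC master..xp_cmdshell @cmd, no_output

    SELECT @cmd = 'ftp -n -i -s:' + @script
    EXEC master..xp_cmdshell @cmd, no_output
END
GO
```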
Now we can create a stored procedure that downloads any new file from the FTP site that we haven’t got locally. Normally, you will want to build all sorts of logging into the system to check on progress, but here is the routine in sufficient detail to get you started.
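A sketch of how that synchronisation step might be put together, under the same assumptions as above (the two sketched procedures, placeholder names and paths, and no logging):

```sql
-- Sketch: download every file that exists on the FTP site but not in the local directory.
-- Procedure and parameter names are placeholders; real code would log each step.
CREATE PROCEDURE s_GetNewFTPFiles
    @FTPServer   varchar(128),
    @FTPUser     varchar(128),
    @FTPPassword varchar(128),
    @FTPPath     varchar(256),   -- remote directory
    @LocalDir    varchar(256),   -- local download directory, e.g. 'C:\downloads'
    @WorkDir     varchar(256)    -- working directory for the ftp scripts
AS
BEGIN
    DECLARE @cmd varchar(1000), @FileName varchar(256)

    -- What is on the remote site?
    CREATE TABLE #FTPFileList (FileName varchar(256))
    EXEC s_GetFTPFileList @FTPServer, @FTPUser, @FTPPassword, @FTPPath, @WorkDir

    -- What have we already got locally? (dir /b gives bare file names)
    CREATE TABLE #LocalFileList (FileName varchar(256))
    SELECT @cmd = 'dir /b ' + @LocalDir
    INSERT INTO #LocalFileList (FileName)
    EXEC master..xp_cmdshell @cmd
    DELETE FROM #LocalFileList WHERE FileName IS NULL

    -- Fetch each remote file that is not already present locally
    DECLARE NewFiles CURSOR LOCAL FAST_FORWARD FOR
        SELECT r.FileName
        FROM   #FTPFileList r
        WHERE  NOT EXISTS (SELECT * FROM #LocalFileList l WHERE l.FileName = r.FileName)

    OPEN NewFiles
    FETCH NEXT FROM NewFiles INTO @FileName
    WHILE @@FETCH_STATUS = 0
    BEGIN
        EXEC s_GetFTPFile @FTPServer, @FTPUser, @FTPPassword,
                          @FTPPath, @FileName, @LocalDir, @WorkDir
        FETCH NEXT FROM NewFiles INTO @FileName
    END
    CLOSE NewFiles
    DEALLOCATE NewFiles
END
GO
```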

If you are using this process for importing data into tables, you may want to fetch files into the download directory only if they appear in neither the download directory nor an archive directory. This would be a simple addition.
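One way that addition might look, reusing the temporary tables from the synchronisation sketch above and assuming a placeholder archive path, is to list the archive directory as well and exclude its files from the comparison:

```sql
-- Sketch: only treat a remote file as "new" if it is in neither the download
-- directory nor the archive directory. 'C:\archive' is a placeholder path.
CREATE TABLE #ArchiveFileList (FileName varchar(256))
INSERT INTO #ArchiveFileList (FileName)
EXEC master..xp_cmdshell 'dir /b C:\archive'
DELETE FROM #ArchiveFileList WHERE FileName IS NULL

SELECT r.FileName
FROM   #FTPFileList r
WHERE  NOT EXISTS (SELECT * FROM #LocalFileList   l WHERE l.FileName = r.FileName)
  AND  NOT EXISTS (SELECT * FROM #ArchiveFileList a WHERE a.FileName = r.FileName)
```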