SQL Backup

SQL Backup used to speed up backups of ever-growing 380 GB database

As the online news service called NewsScape moves into its sixth year, it exhibits a lot of the characteristics of a child at the same age: It has a huge appetite for information, grows by leaps and bounds each year, and makes those who oversee it wonder if they can maintain control as it gets bigger.

The 380 GB database that sustains NewsScape is growing at more than 50 GB a year, creating a need for a fast and reliable backup solution. And, because NewsScape is a public service company, that solution must be reasonably priced.

Maintained by man and dog

NewsScape was created to showcase the database searching and handling capabilities of Enformatica, a U.K. company that was sold to USP Networks a couple of years ago. It is maintained by "a man and a dog – an aging golden retriever named Sambucus," according to the man, Andrew Clarke. The site is used primarily by journalists and researchers who need to check the history of a company, issue or individual.

Clarke describes NewsScape as "an ad-free, moderated news-Google with a long memory." The site has four Perl processes that act as web crawlers, accessing 3,000 major news sources on the Internet and collecting and analyzing an equal number of news stories each day. The stories are analyzed by proprietary software that scans the words and looks for language that "deviates from the Zeitgeist," according to Clarke.

When the tsunami hit, for example, the database was crammed with stories using similar words to describe the disaster. NewsScape's virtual editor combed these stories and looked for the ones that included unique clusters of words not found in the mass coverage. These were the stories posted on NewsScape.
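The selection idea described above can be illustrated with a small sketch. It is not NewsScape's proprietary software, whose details are not given here; it is a minimal stand-in that assumes a story is "distinctive" when a large share of its words appear in no other story in the same batch. The function names (tokenize, distinctiveness, pick_distinctive) are hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase a story and return its set of distinct words."""
    return set(re.findall(r"[a-z']+", text.lower()))


def distinctiveness(stories):
    """Score each story by the share of its words found in no other story.

    Assumption: this simple ratio stands in for the real system's notion
    of language that deviates from the mass coverage.
    """
    word_sets = [tokenize(s) for s in stories]
    # Document frequency: the number of stories in which each word occurs.
    doc_freq = Counter(w for ws in word_sets for w in ws)
    scores = []
    for ws in word_sets:
        if not ws:
            scores.append(0.0)
            continue
        unique = sum(1 for w in ws if doc_freq[w] == 1)
        scores.append(unique / len(ws))
    return scores


def pick_distinctive(stories, top_n=1):
    """Return the top_n stories ranked by distinctiveness, highest first."""
    scores = distinctiveness(stories)
    ranked = sorted(range(len(stories)), key=lambda i: scores[i], reverse=True)
    return [stories[i] for i in ranked[:top_n]]
```

Given a batch of near-identical disaster reports plus one piece with its own vocabulary, pick_distinctive would surface the odd one out. A production system would presumably also weigh word clusters rather than single words, as the article suggests.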

Besides worldwide news, NewsScape has a fondness for companies in trouble – recalls, executive indictments, and lawsuits are all grist for the editorial mill. The service also favors unusual human-interest pieces, such as the one about an ex-exotic dancer using eBay to sell a prosthetic breast that was involved in an injury claim at a strip club.

Backing up the beast

The NewsScape database saves the entire body of text for every story it collects, so visitors can search for a keyword and receive a list of articles in just a few seconds. Since stories are generated every day, and the web crawlers never stop, the NewsScape database is the equivalent of the plant in "Little Shop of Horrors" – it keeps getting fed, and consequently keeps getting bigger.

Currently, two twin-Pentium systems, each with 800 GB of storage, house the database. Thanks to cheap memory, a public service organization such as Clarke's doesn't have to worry too much about the cost of storage space. What had him worried for a while, however, was backing up the behemoth.

"SQL Server's backup is slow and the end file for our site was just too huge to store easily," says Clarke. "We had tried other solutions, but they were too expensive to justify on a public-service site."

When Red Gate Software introduced SQL Backup 3.0, it seemed to be a good fit: It was inexpensive and provided key features such as compression to save hardware resources, speed (about two to three times faster than native SQL Server backup), and security, with 128-bit Rijndael encryption.

Clarke tried it on the NewsScape database and liked the results. SQL Backup 3.0 compressed more than 44 million pages of data, including some tables a billion rows long, by 63 percent. Database backup and restoration took just over three days, compared to about a week for native SQL Server backup.
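As a rough cross-check of these figures, the arithmetic can be sketched as follows. It assumes the standard 8 KB SQL Server data page, treats "more than 44 million" as exactly 44 million, and rounds "just over three days" and "about a week" to 3 and 7 days. The function names are hypothetical.

```python
PAGE_BYTES = 8 * 1024  # SQL Server stores data in 8 KB pages


def raw_size_gb(pages):
    """Uncompressed size of the given number of data pages, in decimal GB."""
    return pages * PAGE_BYTES / 1e9


def compressed_size_gb(pages, compression):
    """Size after shrinking the data by the given fraction (0.63 = 63 percent)."""
    return raw_size_gb(pages) * (1 - compression)


def speedup(native_days, tool_days):
    """How many times faster the tool is than the native backup."""
    return native_days / tool_days


if __name__ == "__main__":
    pages = 44_000_000  # assumption: lower bound quoted in the article
    print(f"raw data:        {raw_size_gb(pages):.1f} GB")
    print(f"after 63% cut:   {compressed_size_gb(pages, 0.63):.1f} GB")
    print(f"speedup (7 vs 3): {speedup(7, 3):.1f}x")
```

Under these assumptions the pages amount to roughly 360 GB, in line with the 380 GB database, the compressed backup comes to roughly 133 GB, and the speedup of about 2.3 times sits inside the quoted two-to-three-times range.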

Clarke expects that the log-shipping feature within SQL Backup 3.0 will be a major advantage to him. He says he will use this feature to keep the secondary server up to date with changes on the primary one. If the primary server goes down, he can easily exchange servers and switch the schedule of log shipping within SQL Backup. The switchover is automatic and can be done remotely.

"The log-shipping feature in SQL Backup will give us some capabilities that we would otherwise need to upgrade to the Enterprise Edition of SQL Server to receive," says Clarke. "It could save us a few thousand pounds."

Databases keep on growing

While most IT departments have more resources than a man and a dog, database backup automation at an affordable price is always welcome. As Clarke points out, government-mandated regulations and business growth on the Internet are making very large, quickly expanding databases a fact of life in business. Ultimately, organizations with these types of databases have a choice: Manage the database or have it manage you.

At NewsScape there is really no option but to automate. The management team isn't going to get any bigger, and the database will just keep on growing.