Database Geek of the Week – Hilary Cotter

An interview with Hilary Cotter, a specialist in SQL Server full-text search and replication.

Hilary Cotter is the first Database Geek of the Week who lives in my neck of the woods, in Central N.J. I met him at one of the user group-related events (a dinner gathering of geeks, as I recall) and was recently reacquainted with him when he spoke in the class just before mine at the Philadelphia Code Camp.

Hilary specializes in two areas of SQL Server with which I have had only passing experience: SQL Server full-text search and replication. He has written a hefty book on replication called A Guide to SQL Server 2000 Transactional and Snapshot Replication, and a companion volume on merge replication is in the works for sometime this year. Hilary is simultaneously working on a book about Microsoft search technologies.

In listening to Hilary’s talk on replication, I learned just how little I understood about the topic. It is always great to listen to someone in love with his or her subject. If Hilary will be speaking at a Code Camp or another conference in your area, I encourage you to stop in.

Hilary and I had the following exchange via email:

Doug: Tell me a little about how you got involved in database-related work. Was it a natural fit? An accident?

Hilary: I started my career as a programmer, but got tired of the late nights and maintaining other people’s code, not to mention my own code (yeech!). I became very interested in how systems interacted and drifted into network administration, with a heavy scripting and database component. I then realized that the network administration field was getting crowded, so I moved into Internet technologies. Since the web is database dependent, I worked more and more with databases. The way I view it, I was always doing database work, but over time I did less and less other stuff and more and more database stuff.

I am also very interested in unstructured data and document repositories and how they relate to databases. How do your file system, your Outlook PST, and Exchange differ from an RDBMS, for instance? Why doesn’t Exchange host OLTP applications? Why doesn’t SQL host messaging applications? Why is WinFS in limbo?

Doug: How did you become interested in replication? I love the idea of replication, but my success with it has been less than perfect.

Hilary: It was customer demand. I was once wrestling with a replication problem and had to speak with a PSS support engineer about it. The guy understood replication intimately, and listening to him describe how it works was captivating. So I explored it. I have yet to run across another technology that, with every layer, leaves me stunned with the intelligence and thought behind it. It has a real “wow” factor. Merge replication in particular keeps me picking my jaw up off the floor. The architects behind merge replication had to wrestle with some very complex problems, and they have come up with simple, elegant solutions to these problems.

There is something fascinating about keeping various pools of data in differing degrees of synchronicity. Service Broker interests me for similar reasons. As hardware becomes more and more powerful, the realm of what’s possible starts to become practical and, perhaps more important, affordable. This has stretched databases into the realm of document repositories on the one end of the spectrum and data islands on the other (consider your PDA a data island).

Doug: Have you played with SQL Server 2005? Especially with respect to replication, have you seen anything cool?

Hilary: Yes, I have. For one reason or another I had a hard time installing the early SQL 2005 betas. The recent CTP and beta have been much better. In SQL 2005, Microsoft has addressed some particular issues with replication. I like the peer-to-peer replication features of transactional replication. I like to think that this will be popular, but it’s hard to say how it will be received or what the uptake will be. The ability to sync merge subscriptions over the Internet through HTTPS is interesting, but we saw an earlier incarnation of it with SQL CE, so it’s not really new.

Doug: Without violating any non-disclosure agreements, can you talk about the work you are doing these days?

Hilary: I am working for a large multimedia publishing company. The company is a heavy search-and-replication consumer and I get a chance to play with terabytes of data. The replication design that will be implemented is very challenging, but it’s in a state of flux. Over time, it may be replaced by a very simple replication topology.

Doug: Microsoft is listed among the companies for which you’ve worked. What did you do there?

Hilary: I worked for Microsoft Consulting Services as a SQL developer for a large financial client.

Doug: When you program in .NET, do you favor VB.NET, C# or some other language?

Hilary: C#. I had heard early on that C# and VB.NET were completely symmetrical, but that C# was the language to program in. Although I purchased some books on VB.NET, to this day I have yet to code a single application in it. It’s been exclusively C#.

Doug: What do you think about using VB.NET or C# for stored procedures, functions and triggers?

Hilary: From what I understand about the CLR, it will have limited use except for perhaps mathematical or financial calculations. There may be some advanced text functionality that you can do with it, but nothing I do leaps out and screams, “Use the CLR!” which would mean using C# or VB.NET. I believe the work I do that could benefit from the .NET languages is best done on the client side. There was always the ability to do merge conflict resolvers using COM objects, so it might be possible to use the CLR for that. Again, it’s not something that jumps up, bites me on the nose and says, “Use me! Use me!”

Doug: In addition to using CLR languages for developing code, CLR types can be used for storing data. Are there any implications for replication in using CLR types?

Hilary: You can replicate functions that use the CLR. I am unaware of any implications other than enabling the subscriber for the CLR and ensuring the subscriber is SQL 2005 or above.

Doug: I do some mobile development work. Has any of your work involved SQL Server 2005 Mobile, formerly SQL Server CE? Have you worked with replication from and to SQL Server Mobile?

Hilary: I have played with SQL CE 2 but never worked with SQL CE as part of my job, which I hope will change soon. I have often wanted to reproduce the problems I see with replication and SQL CE on the newsgroups, but have been unable to do so, which is puzzling to me.

I have not used SQL Server Mobile yet. Before I was able to try it, my four-year-old son broke my Pocket PC while playing the Jawbreaker game.

Getting back to the evolution of databases, I think PDAs will play a greater and greater role in our lives, but their visibility may not be high. I see a time when they will be integrated into credit/debit cards, grocery carts, vehicles, etc. I notice that Firestone uses them as point-of-sale terminals.

Doug: Have you recently read any good programming or database-related books?

Hilary: I read SQL Server 2005 New Features by Michael Otey, which I thought was a perfect overview of SQL 2005. I have been reading chapters of A First Look at SQL 2005 for Developers, which is great. Karen Watterson referred me to perhaps the best overview of XML and SQL 2005 on the web (http://www.vldb.org/conf/2004/IND5P2.PDF), and Michael Rys’s white papers are excellent as well. I recently finished Steve McConnell’s Code Complete about software development, which I enjoyed, although I found it similar to Jon Louis Bentley’s Programming Pearls, which I browse from time to time. I am also struggling with An Introduction to Support Vector Machines.

Doug: Can you think of any particularly cool tip or trick to help developers struggling with replication?

Hilary: Few DBAs are aware of agent profiles. You can build different profiles for different aspects of your agent. You can create a debugging profile, for instance, as well as high-performance and recovery profiles. Simply right-click on your agent and select Agent Profiles, and then click New profile. Frequently the default agent properties are the least performant.

I can’t suggest anything to ease the replication experience other than playing with the product. The problem with replication is that it can break in so many ways. An unreliable network connection, poor database design, or a tight security policy can each cause your replication solution to fail. Frequently it’s hard for the novice to figure out the cause of the problem. I urge people to post their questions on microsoft.public.sqlserver.replication, where Paul Ibison or I will answer them.

###

Do you know someone who deserves to be a Database Geek of the Week? Or perhaps that someone is you? Send me an email at editor@simple-talk.com and include “Database Geek of the Week suggestion” in the subject line.