Book Review: The Art of XSD – SQL Server XML schemas

The 14 chapters of "The Art of XSD", written by MVP Jacob Sebastian, take the reader step by step from the basics of XML Schema design all the way to advanced topics on SQL Server XML Schema Collections. Reviewer Hima Bindu Vejella gives it an 8/10 rating, and gives us an excellent distilled description of what the book has to offer.


Are you interested in developing XML Web Services as an integral part of your enterprise applications? Are you thinking of publishing your website or company news in the form of XML feeds? Are you planning to develop an application that consumes XML web services from third-party providers such as Google, Amazon or Energy Star? If you’re about to embark on any of these projects, then The Art of XSD will be an indispensable resource to you. The goal of the book is simple: to teach you everything you need to know to design and create an XML schema, which is naturally an essential component of any XML-based service. With this in hand, SQL Server developers with no prior knowledge of XSD can build powerful XML Schema Collections and create XSDs for Web Applications and Web services. Of course, if you’re new to using XML, the obvious first question is:

Why Use an XML Schema?

An XML Schema is itself an XML document, but what makes it special is that it describes or validates another XML document, defining the latter’s structure. If you’re wondering why you should use XML at all, it’s because XML has grown and matured over the past decade into a potentially powerful method for exchanging data, to the extent that Intel actually designed hardware with the primary intent of pushing the language’s adoption. More to the point, XML schemas support various datatypes, help in the understanding of secure data communications, are highly extensible, and are particularly well-suited to data representation.
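To make the idea of "an XML document that describes another XML document" concrete, here is a minimal sketch. It is not taken from the book; the element names (note, to, body) are hypothetical and chosen only for illustration.

```xml
<!-- Hypothetical schema: an ordinary XML document that describes <note> -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="note">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="to" type="xs:string"/>
        <xs:element name="body" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>

<!-- A separate instance document that the schema above would accept -->
<note>
  <to>Jacob</to>
  <body>Thanks for the book.</body>
</note>
```

A document that dropped the body element, or placed it before the to element, would be rejected by a validating processor.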

By using an XSD (XML Schema Definition) to define the nature and structure of data which is to be shared between any number of services, it is possible to create dynamic, data-driven web applications for situations where information is changing rapidly. In addition, strong typing, standardized null representation, and XML-based language support make XSD more powerful than DTD (Document Type Definition, the older alternative method of defining document structure). XSD becomes significantly more important when frequently-updated data needs to be exchanged in XML format with other internet-based applications, or when your application needs to consume published data sources such as weather reports, stock exchange movements or currency conversion data, all of which are updated very frequently, sometimes several times a second. This is partly because the Schema (and an understanding of Schema) will make both incoming and outgoing feeds easy to interpret, and partly because you will be able to define and know (respectively) how rapidly the data feeds are changing. XSD plays a vital role in data exchange, describing information about data types, as well as the rules and the constraints that are being used to validate the values of elements and attributes.
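As a rough illustration of strong typing and standardized null representation, the sketch below describes a single entry in a hypothetical currency feed. The element names are assumptions made for this example, not something taken from the book.

```xml
<!-- Hypothetical schema for one entry in a currency-rate feed -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="rate">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="currency" type="xs:string"/>
        <!-- Strongly typed as a decimal; nillable allows an explicit null -->
        <xs:element name="value" type="xs:decimal" nillable="true"/>
        <xs:element name="updated" type="xs:dateTime"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>

<!-- Instance: the value is explicitly null rather than empty or missing -->
<rate xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <currency>EUR</currency>
  <value xsi:nil="true"/>
  <updated>2010-06-01T12:00:00Z</updated>
</rate>
```

A DTD has no equivalent of xs:decimal or xsi:nil, which is the kind of gap the paragraph above refers to.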

What the Book Covers

Author Jacob Sebastian assumes no prior knowledge of XML on the part of the reader, and writes in simple English, using intuitive step-by-step examples. Jacob strives to explain the Art of XSD from its most basic principles so as to ensure his readers know everything they need to at every step. While the images and screenshots were not always legible (and admittedly it is a challenge to squeeze screenshots from modern monitors into compact books), his exhaustive and methodical approach makes the book thorough yet accessible, and well suited to an XSD beginner. After reading the book, you will have an excellent understanding of what XML and XSD are and why they’re relevant, as well as all the ins and outs of XML Schema components and designs. The importance of XML in real-time live applications is made clear throughout the book, and with Jacob’s guidance you’ll be an XSD artist in no time.

XML Schema Design

The book starts with the basics, covering “When do we need an XML Schema?”, “Why do we need XSD?” and “How to write a document to define the basic structure of an XML document?” quickly and clearly. Other XML Schema languages such as DTD and XDR are explained to some extent, and comparisons between them and XSD are clearly made. Next up, to make sure you’re prepared to start writing XML Schemas immediately, Jacob describes XML Schema support in SQL Server 2000, 2005 and 2008. For the record, that includes support for XML, OpenXML, SQLXML, Querying Data over HTTP, XML Views, FOR XML, XML Data Type, XQuery Support and Lax Validation Support. The Schema Validation enhancements in SQL Server 2008 are explained in detail, keeping the book relevant and useful to current readers.

When it comes to actually writing an XML Schema, Namespaces are as essential as aliases are in T-SQL, and are clearly necessary when you need to avoid ambiguity and differentiate between similarly named elements. While there is a distinction between Default Namespaces and XSD Namespaces (which Jacob illuminates), it suffices to say for now that they give the document structure and add contextual meaning to the elements of the schema. Once you’ve wrapped your head around that, you need to start adding elements to your schema; and after you’ve structured and populated your schemas, the next step will be to use a SQL Server XML Schema Collection to store the definition of your documents so that you can perform Validations on them. That sounds simple enough, yet it turns out that SQL Server does not allow for easy alteration of Schema Collections, so dropping and recreating collections is recommended if you need to make any adjustments. Given how essential Schema Collections are throughout the rest of the book, Jacob takes care to provide a very clear explanation of how to use the SQL Server Management Studio XML/XSD editor for designing XML Schema quickly and painlessly. With the essential steps of XML Schema creation covered, it’s time to test your knowledge.
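The way namespaces remove ambiguity can be sketched with a small instance document. The namespace URIs and element names below are hypothetical, chosen for this example only.

```xml
<!-- Two elements share the local name "name", but belong to different
     namespaces, so a processor can tell them apart. -->
<order xmlns="urn:example:orders"
       xmlns:cust="urn:example:customers">
  <!-- Unprefixed: falls into the default namespace urn:example:orders -->
  <name>Sleigh bells</name>
  <!-- Prefixed: belongs to urn:example:customers -->
  <cust:name>North Pole Corporation</cust:name>
</order>
```

The prefix plays much the same role as a table alias in a T-SQL join, where two tables may both have a column called Name.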

At this point, and then again throughout the book, Jacob presents practical exercises relating to the fictional “North Pole Corporation”, walking you through how to build XSD schemas for a ‘real world’ Order Processing Application using the techniques covered in the book. While these ‘lab’ sessions are invaluable, the book would have benefited from having each chapter finish with a more targeted exercise to cement the reader’s understanding of the concepts just covered. This would have been an elegant supplement to Jacob’s concise summaries at the end of each chapter.

XML Schema Components

Getting to grips with the components of XML Schemas is, at a primitive level, about understanding that the basic building blocks of a schema are elements and attributes, which need to be declared, and can be grouped or structurally controlled with Occurrence and Order Indicators. Elements & Attributes are the essential pillars for the construction of XML Schemas, and so it is good to see that Global & Local Element Declarations, the 14 Element Declaration Parameters, as well as mandatory & optional attributes are thoroughly explained. Elements and attributes can also be associated with XML data-types, which allows for effective validation of the XML documents your schema describes. Jacob breaks down all of these definitions and permutations, explaining how they provide reusability and their role in defining XSD, as well as providing code samples to clearly demonstrate how these all fit together.
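A minimal sketch of how Order and Occurrence Indicators and mandatory or optional attributes look in practice is given below. The order structure is hypothetical and is not one of the book's own samples.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="order">
    <xs:complexType>
      <!-- Order indicator: children must appear in exactly this sequence -->
      <xs:sequence>
        <xs:element name="customer" type="xs:string"/>
        <!-- Occurrence indicators: at least one item, no upper limit -->
        <xs:element name="item" type="xs:string"
                    minOccurs="1" maxOccurs="unbounded"/>
        <!-- minOccurs="0" makes the element optional -->
        <xs:element name="note" type="xs:string" minOccurs="0"/>
      </xs:sequence>
      <!-- Attribute declarations follow the content model -->
      <xs:attribute name="id" type="xs:int" use="required"/>
      <xs:attribute name="priority" type="xs:string" use="optional"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

Swapping xs:sequence for xs:choice or xs:all changes the ordering rule without touching the individual element declarations.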

The question of whether information should be placed in an Element or an Attribute is one that keeps coming up. Occasionally there IS no question, because only an element will do, and yet often there simply is no right answer, as they both perform much the same way. That being said, there are some restrictions on Attributes which may push you towards using Elements. Thankfully, entire chapters are dedicated to stepping through the internal structures of Element and Attribute Declarations, complete with descriptions of various parameters, and pointing out the difference between Global and Local Declarations (the former is, unsurprisingly, top-level; the latter is declared within the deeper structure of the document and has a more limited scope).
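The difference between Global and Local Declarations can be sketched as follows; the names (customer, address, name) are illustrative assumptions rather than the book's own example.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Global declaration: a direct child of xs:schema, reusable via ref -->
  <xs:element name="address" type="xs:string"/>

  <xs:element name="customer">
    <xs:complexType>
      <xs:sequence>
        <!-- Local declaration: exists only inside customer -->
        <xs:element name="name" type="xs:string"/>
        <!-- Reference to the global declaration above -->
        <xs:element ref="address"/>
      </xs:sequence>
      <!-- An attribute holds a single simple value and may appear only
           once per element, which is one reason to prefer an element
           for anything repeating or structured -->
      <xs:attribute name="id" type="xs:int"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
```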

XSD supports 44 built-in data-types, each of which has both a Value Space and Lexical Space characteristic. These data-types can be broken down into Primitive (i.e. base types of XSD) or Derived (i.e. built on the backs of Primitive data-types), and are essential for efficient data description and validation. Jacob describes the properties of all 19 Primitive Data-types, as well as the 25 Derived Data-types, complete with demonstration snippets and enhancements to the data-types in SQL Server 2008. Given that, by this point, a lot of new ground has been covered, this chapter then finishes off with a case study and guidelines, with the help of a lab to cement understanding.
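As a small, hedged illustration of the Primitive versus Derived split, the hypothetical product element below mixes both kinds of built-in type.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="product">
    <xs:complexType>
      <xs:sequence>
        <!-- Primitive built-in types -->
        <xs:element name="price" type="xs:decimal"/>
        <xs:element name="released" type="xs:date"/>
        <!-- Derived built-in types: xs:positiveInteger ultimately derives
             from xs:decimal (via xs:integer), and xs:token from xs:string -->
        <xs:element name="quantity" type="xs:positiveInteger"/>
        <xs:element name="code" type="xs:token"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

The Value Space versus Lexical Space distinction shows up with price: the strings 1.5 and 1.50 are two lexical forms of the same decimal value.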

In addition to Attribute groups & Element groups, Order & Occurrence indicators, and Local & Global Attribute declarations, the book also gives detailed breakdowns and explanations of Simple & Complex types. Simple Types (which can be derived by List, Union or Restriction) can be either Global (named) or Local (anonymous), whereas Complex Types are categorized as simple content, element-only content, mixed content and empty content. Jacob focuses on deriving complex types from simple types, with examples of all the possible ways of doing it, and the result is crystal clear. Of course, if this all sounds like a bit of a confusing mess (and it can, when boiled down like this), then don’t worry; as you get deeper into the book, Jacob visualizes the relationships between components to make understanding easier. I found the explanations of the attributes and their usage in creating XSD to be detailed and useful, and this part of the book certainly serves as an excellent reference guide for learning about the various components of XSD.
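The three ways of deriving a simple type, plus one complex type built on a simple type, can be sketched like this. All type and element names here are hypothetical.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Restriction: a global (named) simple type limited to two values -->
  <xs:simpleType name="SizeType">
    <xs:restriction base="xs:string">
      <xs:enumeration value="small"/>
      <xs:enumeration value="large"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- List: a whitespace-separated list of SizeType values -->
  <xs:simpleType name="SizeListType">
    <xs:list itemType="SizeType"/>
  </xs:simpleType>

  <!-- Union: either a SizeType value or a positive integer -->
  <xs:simpleType name="SizeOrNumberType">
    <xs:union memberTypes="SizeType xs:positiveInteger"/>
  </xs:simpleType>

  <!-- Complex type with simple content: a decimal value that also
       carries an attribute; the complex type itself is local (anonymous) -->
  <xs:element name="price">
    <xs:complexType>
      <xs:simpleContent>
        <xs:extension base="xs:decimal">
          <xs:attribute name="currency" type="xs:string" use="required"/>
        </xs:extension>
      </xs:simpleContent>
    </xs:complexType>
  </xs:element>
</xs:schema>
```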

XSD Validation

XML Schemas are fundamentally quite simple documents, so the key to making the most of them is in having a deep understanding of the various XML Schema components. As you may have gathered by now, Jacob makes sure that the book is able to provide that level of understanding. Now that you’ve been introduced to the menagerie of XML Schema components, you need to start thinking about validating the format of your Schemas. Although XSD validation is touched upon at several points, it’s only now that you have a complete understanding (or at any rate, complete reference) of XSD that you’re in a position to really dig deeply into the topic. The pattern restrictions used to validate values in XML rely on Regular Expression (Regex) patterns. Anyone who is familiar with Regex patterns will feel comfortable with this aspect of XSD, particularly bearing in mind that Jacob has provided a comprehensive breakdown of the Regex meta characters which you’ll need to be specifically aware of in XSD. Given that this is broaching new territory (as far as the book is concerned), Jacob wisely finishes these explanations off with one of his “Lab Sessions”.
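A pattern restriction might look like the sketch below. The order-number format is an assumption invented for this example.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:simpleType name="OrderNumberType">
    <xs:restriction base="xs:string">
      <!-- Three uppercase letters, a hyphen, then exactly five digits,
           for example ABC-12345 -->
      <xs:pattern value="[A-Z]{3}-\d{5}"/>
    </xs:restriction>
  </xs:simpleType>

  <xs:element name="orderNumber" type="OrderNumberType"/>
</xs:schema>
```

One difference from most Regex dialects is worth knowing: an XSD pattern always has to match the entire value, so the usual ^ and $ anchors are not needed and are not treated as anchors.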

Advanced Topics

By this point, you should be comfortable enough with the fundamentals of effective schema design to be interested in some of the more advanced aspects of the practice. For starters, while a schema declaration only needs a namespace attribute, there are a host of other attributes (such as targetNamespace and attributeFormDefault) which are useful because they control the behavior of the schema processor when it validates XML instances against the Schema collection. (Just to clarify the terminology, a schema processor is simply a software tool which validates an XML document against an XSD document.) True to form, Jacob provides an excellent breakdown of these various attributes and clearly explains their effects and the rules you should bear in mind when using them.
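The effect of these schema-level attributes is easiest to see alongside an instance document. This is a minimal sketch, and the namespace URI urn:example:orders is hypothetical.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="urn:example:orders"
           xmlns="urn:example:orders"
           elementFormDefault="qualified"
           attributeFormDefault="unqualified">
  <xs:element name="order">
    <xs:complexType>
      <xs:sequence>
        <!-- Local element: qualified, so it lives in the target namespace -->
        <xs:element name="item" type="xs:string"/>
      </xs:sequence>
      <!-- Local attribute: unqualified, so it carries no namespace -->
      <xs:attribute name="id" type="xs:int"/>
    </xs:complexType>
  </xs:element>
</xs:schema>

<!-- A matching instance: elements are in the namespace, the attribute
     is written without a prefix -->
<order xmlns="urn:example:orders" id="1">
  <item>Sleigh bells</item>
</order>
```

Changing elementFormDefault to unqualified would mean that item must appear without a namespace in the instance, which is a common source of unexpected validation failures.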

At the end of the book, Jacob dedicates a chapter to talking about SQL Server Schema Collections, why they are used and how to alter the Schemas they hold (which, unsurprisingly, consists of more than just designing XSD). Given that Schema Collections are referenced throughout the book, this information might have been better placed earlier, but the brief coverage in Chapter 2 is generally enough to see you through to this point. Ultimately, Jacob’s goal was probably to ensure the reader knew enough about XSDs to be able to use SQL Server Schema Collections effectively. Essentially, Schema Collections are just system objects, much like tables or indexes, designed to store XML schemas. Naturally, given that this is the interface between two different technologies, there are workarounds and frustrations (such as the fact that SQL Server will not preserve the value of your ID attributes), and Jacob does sterling work to cover as much of this territory as possible and offer expert advice.

Final words

Each chapter in the book takes you another step closer to creating your own XML Schemas, starting with the fundamentals and working through all the critical aspects of XSD, and ultimately providing you with a solid grasp of the basics and some snippets of more advanced material. Prior knowledge of XSD is not required to read this book, as you will learn by following comprehensive examples at every stage of your journey. The book unfortunately does not cover Extending XML Schemas or XML Schema 1.1, and it would have been particularly helpful if the author had highlighted XML Schema best practices. Nevertheless, starting from the absolute basics, ‘The Art of XSD’ explains both what an XML Schema is and how to write one, ensuring you have the essential knowledge of the building blocks of XSD, and know how to declare and work with them. By the time you’re finished reading, you’ll be an accomplished XSD user, comfortable with integrating XML and SQL Server, and ready to employ these schemas in your real-time applications.

The material is well written, and the author succeeds in contextualizing the reader’s learning process by using realistic scenarios. Each chapter will leave you ready to tackle the next, and there’s no doubt in my mind that this book will give you the extensive base you need in order to progress to the next level of expertise in XSD. It’s a good book and an excellent reference which every web developer should keep in their arsenal.

To read an early chapter of Jacob’s book, see his article on ‘Introduction to XML Schema’.

If you’d like to add “The Art of XSD” to your reference arsenal, the paperback is available from Amazon and Amazon (UK). Alternatively, if you’d like to download a PDF version of the book, or read more of the details about the book, see The Art of XSD on Simple-Talk.