Use OData to Execute RESTful CRUD Operations on Big Data in the Cloud

Microsoft's OData, the Open Data Protocol, allows consumers of data to get the metadata of data sources, to perform queries and, if necessary, to create, change or delete data items. It now provides the API for many of the available services on the internet, including the Open Government Data Initiative.

The Open Data Protocol (OData) has become the Web 2.0 counterpart of ODBC, Microsoft’s Open Database Connectivity API, by enabling create, read, update, and delete (CRUD) operations with HTTP POST, GET, PUT, and DELETE methods on structured and semi-structured data formatted as AtomPub <feed> documents with <entry> elements.
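The mapping of CRUD operations onto HTTP methods can be sketched in Python as follows. This is a minimal illustration rather than a full client: the service root is the OData.org sample service cited later in this article, the crud_request helper and the example payload are hypothetical, a JSON body is assumed for brevity, and the requests are only constructed, never sent.

```python
import json
import urllib.request

# Assumed service root: the OData.org sample service cited later in this article.
SERVICE_ROOT = "http://services.odata.org/OData/OData.svc"

# CRUD operation -> HTTP method, as described in the paragraph above.
CRUD_TO_HTTP = {"create": "POST", "read": "GET", "update": "PUT", "delete": "DELETE"}


def crud_request(operation, resource_path, payload=None):
    """Build (but do not send) an HTTP request for one CRUD operation.

    Hypothetical helper, for illustration only.
    """
    method = CRUD_TO_HTTP[operation]
    body = json.dumps(payload).encode("utf-8") if payload is not None else None
    request = urllib.request.Request(
        f"{SERVICE_ROOT}/{resource_path}", data=body, method=method
    )
    if body is not None:
        # JSON payload is an assumption; an AtomPub <entry> body is also possible.
        request.add_header("Content-Type", "application/json")
    return request


read = crud_request("read", "Categories(2)")
create = crud_request("create", "Categories", {"ID": 99, "Name": "Example"})
print(read.get_method(), read.full_url)
print(create.get_method(), create.full_url)
```

Passing a request built this way to urllib.request.urlopen would execute it against a live service, which most public sample services permit only for reads.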

OData promises to be the lingua franca for managing both structured and semi-structured big data generated by today’s cloud-based Web applications, such as real-time social analytics. For example, Microsoft’s new Codename “Social Analytics” project streams filtered tweets from Twitter, posts from Facebook and blogs, as well as Q&A from StackOverflow.com as a real-time OData feed from the Windows Azure Marketplace’s DataMarket. Another Microsoft incubator project, Codename “Data Explorer”, enables Web app developers to create mashups of Excel spreadsheets, data files, and SQL Server databases with OData-formatted big data from the DataMarket.

Other major big-data generators have joined the OData bandwagon. Netflix publishes its entire video catalog as OData (editor’s note: while this was true at time of writing, it is no longer the case), and eBay uses OData for publishing auction details with Windows Azure. SAP uses OData in its SAP NetWeaver Gateway to enable connectivity to SAP applications from multiple programming languages. Visoft released a Major Ruby OData Update v0.1.0 on December 7, 2011. Microsoft encourages third-party adoption of OData by releasing it under the Open Specification Promise and supporting OData.org, the official source for OData resources. Microsoft announced on December 6, 2011 a new self-service Publishing Portal which automates the process of licensing OData-formatted information from third parties. The Portal also supports paid or free access to SQL Azure databases.

Several publicly available, Web-based applications enable browsing unprotected OData sources. Figure 1 shows a preview of Fabrice Marguerie’s Silverlight Sesame OData Browser displaying part of the OData entries for 15 Netflix movies. The Media column displays thumbnail promotional images (BoxArt). Columns not visible in Figure 1 enable displaying additional title-related information, such as Synopsis, AverageRating, ReleaseYear, Type, URL (for a descriptive Web page), and so on. The XML tab displays the underlying 100-kB OData XML document, which you can download from my Windows Live SkyDrive account. Other free OData browsers include the OData Explorer and LINQPad v.4.37.1+.

Figure 1: Netflix Movies in the Sesame OData Browser

OData appeals to today’s Web-oriented and mobile device application developers by implementing the Web-friendly Representational State Transfer (REST) architectural style for distributed applications. RESTful Web services can be located by a Uniform Resource Identifier (URI) and implement standard HTTP verbs (GET, POST, PUT, and DELETE) to retrieve, create, update, and delete data items. In 2007, Microsoft’s Pablo Castro, whom I call the “father of OData,” championed OData as a Web-friendly alternative to SOAP-based XML Web services in an incubator project called “Astoria.” (Astoria, Oregon, is one of the two most cloudy cities in the conterminous U.S. with 239 cloudy days per year.) Castro described the project’s initial design goals as:

  • Web friendly, just plain HTTP
  • Uniform patterns for varying schemas
  • Focus on data, not formats
  • Stay high-level, abstract the store

The initial implementation adopted Microsoft’s Entity Data Model v1 to define entity property data types and associations, and supported plain old XML (POX), RDF+XML, and JavaScript Object Notation (JSON) formats.

In 2008, Project “Astoria” adopted the IETF-standard Atom Publishing (AtomPub) Protocol, which is based on the open-source Atom Syndication Format, a Google-sponsored substitute for Dave Winer’s Really Simple Syndication (RSS) protocol. Key features of the AtomPub format are the capability to access any entity by a Uniform Resource Identifier (URI) and related entities by navigating a graph of associations. To simplify creating “Astoria” data sources, .NET Framework v3.5 SP1 added support for ADO.NET Entity Framework entities in Windows Communication Foundation (WCF) contracts. The System.Data.Services.Client namespace made it easier to write C# and Visual Basic OData client applications, called OData consumers. .NET v4.0 added WebHttp services for building non-SOAP (REST) Web services. The OData.org site provides links to client libraries for Drupal, Java, JavaScript, Joomla, Objective-C, PHP, Ruby, Silverlight, and Windows Phone 7, as well as server libraries for Java, MySQL, PHP, and Ruby.

Microsoft has leveraged its investment in OData development and evangelism by adopting the protocol for Windows Azure table storage, as well as data interchange with SharePoint 2010 and SharePoint Online lists, SQL Server Reporting Services, Dynamics CRM 2011, Team Foundation Server 2010, and (in the future) Microsoft Pinpoint. An earlier cloud-based SQL Azure OData provider from SQL Azure Labs has been retired, but Julie Lerman’s Data Service in the Cloud MSDN article describes how to create your own. You can download a sample OData document from one of my SQL Azure databases here or open it in your browser at https://odata.sqlazurelabs.com/OData.svc/v0.1/jc650b4zaf/AdventureWorksLTAZ2008R2 (Editor’s note: link has been deprecated).

Note: To view OData documents in Internet Explorer 7 and later, open the Internet Options dialog, click the Content tab, click Settings in the Feeds and Web Slices section, and clear the Turn On Feed Reading View check box.

The C# source code for my Codename “Social Analytics” WinForms Client Sample App (see Figure 2) is available from the Social Analytics folder of my SkyDrive account. The grid displays important information about each <item> element; the graph displays trends in “buzz” and sentiment about Windows 8 from Twitter, Facebook, and Stack Overflow. This app shows developers how to write an OData client to download a <feed> document with a large number (100,000+) of <item> elements. Microsoft Codename “Social Analytics” downloads a maximum of 500 items per query; paging requires calling the Skip(n * 500) and Take(500) methods when issuing LINQ queries against an instance of the VancouverSliceContext object that’s defined by a Service Reference to Microsoft’s VancouverWindows8 OData provider. (Vancouver was the original codename for “Social Analytics.”)
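The paging arithmetic described above can be illustrated with a short Python sketch. It is not the sample app's C# code; the page_options helper is hypothetical and only computes the $skip and $top values that Skip(n * 500) and Take(500) would produce for successive pages.

```python
PAGE_SIZE = 500  # maximum items returned per query, per the text above


def page_options(total_items, page_size=PAGE_SIZE):
    """Yield ($skip, $top) pairs that cover total_items in fixed-size pages."""
    page = 0
    while page * page_size < total_items:
        # Page n skips the n * page_size items already downloaded.
        yield page * page_size, page_size
        page += 1


pages = list(page_options(1200))
for skip, top in pages:
    print(f"$skip={skip}&$top={top}")
```

In practice the total item count is often unknown in advance, so a client would more likely keep requesting pages until one comes back with fewer than 500 items.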

Figure 2: Codename “Social Analytics” Client UI

OData URI Conventions specify the construction of queries to return particular resources by appending collection names to a Service Root URI, which is identified by an .svc extension as in http://services.odata.org/OData/OData.svc. The Service Root URI returns a service document that lists the collections exposed by the service: Products, Categories, and Suppliers in OData.org’s example.
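As a rough illustration of reading such a service document, the Python sketch below extracts the collection names from an AtomPub service document. The embedded XML is a hand-written approximation of what the sample service returns, not a captured response, and the list_collections helper is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hand-written approximation of an AtomPub service document (an assumption,
# not a captured response from the OData.org sample service).
SAMPLE_SERVICE_DOCUMENT = """\
<service xmlns="http://www.w3.org/2007/app"
         xmlns:atom="http://www.w3.org/2005/Atom">
  <workspace>
    <atom:title>Default</atom:title>
    <collection href="Products"><atom:title>Products</atom:title></collection>
    <collection href="Categories"><atom:title>Categories</atom:title></collection>
    <collection href="Suppliers"><atom:title>Suppliers</atom:title></collection>
  </workspace>
</service>
"""

# Namespace of the Atom Publishing Protocol elements.
APP_NS = {"app": "http://www.w3.org/2007/app"}


def list_collections(service_document_xml):
    """Return the href of every <collection> in the service document."""
    root = ET.fromstring(service_document_xml)
    collections = root.findall("app:workspace/app:collection", APP_NS)
    return [collection.get("href") for collection in collections]


print(list_collections(SAMPLE_SERVICE_DOCUMENT))
```

Each href is relative to the Service Root URI, so appending it to the root yields the URI of the corresponding collection.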

Adding a $metadata suffix (http://services.odata.org/OData/OData.svc/$metadata) returns a detailed schema of each service entity. Appending /CollectionName to the Service Root URI, such as http://services.odata.org/OData/OData.svc/Categories, returns the entire collection as <entry> elements of a <feed> document. Append (key), where key is the value of the item's key property, to return a particular collection member, as in http://services.odata.org/OData/OData.svc/Categories(2). The $expand query option returns related items, such as products within a category (http://services.odata.org/OData/OData.svc/Categories?$expand=Products). Other query options include $orderby=PropertyName for sorting (http://services.odata.org/OData/OData.svc/Products?$orderby=Rating%20desc), $top=n to return the first n items of a sorted collection, which corresponds to the LINQ Take(n) method described above, and $filter=expression to filter by equality (PropertyName eq value) and other logical expressions. Arithmetic operators and string functions add flexibility to queries. LINQ queries executed against WCF contexts that implement the IQueryable interface translate C# LINQ query syntax, such as the following (from the Social Analytics project):

to the corresponding OData query string:

https://api.datamarket.azure.com/Vancouver/VancouverWindows8/ContentItems()?$filter=CalculatedToneId%20ne%20null&$orderby=PublishedOn%20desc&$skip=0&$top=500&$expand=ContentItemType&$select=Id,ContentItemTypeId,ContentItemType/Name,Title,PublishedOn,CalculatedToneId,ToneReliability
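The URI conventions described above can also be composed by hand. The following Python sketch assembles OData URIs from a collection name, an optional key, and query options; the odata_uri helper is hypothetical and is not part of any OData client library.

```python
from urllib.parse import quote

SERVICE_ROOT = "http://services.odata.org/OData/OData.svc"


def odata_uri(service_root, collection, key=None, options=None):
    """Compose an OData URI from a collection, optional key, and query options.

    Hypothetical helper, for illustration only.
    """
    uri = f"{service_root}/{collection}"
    if key is not None:
        uri += f"({key})"
    if options:
        # Percent-encode values (spaces become %20) but keep "," and "/" readable.
        pairs = [
            f"${name}={quote(str(value), safe=',/')}"
            for name, value in options.items()
        ]
        uri += "?" + "&".join(pairs)
    return uri


print(odata_uri(SERVICE_ROOT, "Categories", key=2))
print(odata_uri(SERVICE_ROOT, "Categories", options={"expand": "Products"}))
print(odata_uri(SERVICE_ROOT, "Products", options={"orderby": "Rating desc", "top": 5}))
print(odata_uri(SERVICE_ROOT, "Products", options={"filter": "Rating eq 3"}))
```

A LINQ provider performs the same kind of composition automatically, which is why the generated query string above carries one option per LINQ operator.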

The OData provider is responsible for parsing and translating the query string to the data source’s native query language, such as T-SQL for SQL Server. .NET providers use WCF and the Entity Framework to handle translation chores.
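To give a feel for what such a translation involves, here is a deliberately simplified Python sketch that turns a few query options into a SQL string. It is a toy, not how WCF or the Entity Framework work: the to_sql and filter_to_sql helpers are hypothetical, only single "Property operator value" filters are handled, the OFFSET ... FETCH paging clause assumes SQL Server 2012 or later, and the string concatenation is not safe against SQL injection.

```python
from urllib.parse import parse_qs

# A subset of the query string shown earlier in the article.
QUERY = (
    "$filter=CalculatedToneId%20ne%20null&$orderby=PublishedOn%20desc"
    "&$skip=0&$top=500"
)

# A few OData comparison operators and their SQL equivalents.
OPERATORS = {"eq": "=", "ne": "<>", "gt": ">", "lt": "<"}


def filter_to_sql(expression):
    """Translate one 'Property operator value' filter into a SQL predicate."""
    prop, op, value = expression.split(" ", 2)
    if value == "null":
        # SQL tests for NULL with IS [NOT] NULL rather than = or <>.
        if op not in ("eq", "ne"):
            raise ValueError(f"unsupported null comparison: {op}")
        return f"{prop} IS NULL" if op == "eq" else f"{prop} IS NOT NULL"
    return f"{prop} {OPERATORS[op]} {value}"


def to_sql(table, query_string):
    """Build a SQL statement from $filter, $orderby, $skip and $top (toy version)."""
    options = {name: values[0] for name, values in parse_qs(query_string).items()}
    sql = f"SELECT * FROM {table}"
    if "$filter" in options:
        sql += f" WHERE {filter_to_sql(options['$filter'])}"
    if "$orderby" in options:
        sql += f" ORDER BY {options['$orderby']}"
        if "$top" in options:
            # OFFSET ... FETCH requires an ORDER BY clause in T-SQL.
            skip = int(options.get("$skip", 0))
            top = int(options["$top"])
            sql += f" OFFSET {skip} ROWS FETCH NEXT {top} ROWS ONLY"
    return sql


print(to_sql("ContentItems", QUERY))
```

A real provider parses the filter into an expression tree and issues parameterized queries rather than concatenating strings.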

OData is rapidly becoming a ubiquitous protocol for a wide variety of data sources and programming languages. If eliminating data silos by increasing compatibility of diverse data formats is one of your organization’s goals, begin the New Year by becoming an OData expert.