In their everyday lives, people are surrounded by technology that uses Geographic Information Systems (GIS), from viewing images on Google Earth, to navigating their cars with the aid of a Global Positioning System (GPS, a real-time GIS).
In each of these examples a spatial query is implicitly performed when the user picks a location. Where is my nearest petrol station? Where is the closest town? The ability to perform spatial queries, and display data spatially, is the essence and power of GIS.
SQL Server 2008 brings with it two new spatial data types, allowing spatial data to be stored, queried and rendered from our relational database tables, and opening the door to the development of GIS-enabled SQL Server applications.
In this article, I’m going to steer clear of the technical details of writing spatial queries, and instead attempt to give you a broad overview of what GIS is capable of, and how you might find a use for it in your work. I’ll discuss SQL Server 2008 offers in terms of spatial capabilities, special considerations for accurate use of spatial data, and how you can get started making maps with your data, using SQL 2008, available sources of GIS data and third-party software.
Basic Vector Mapping
GIS is usually used for creating maps, for presentation as internal reports or for publishing on the internet. The simplest use is for displaying on a map the locations of, for example, your company’s hire shops and depots. The map may include roads, county boundaries, or the outlines of European countries, whichever is more appropriate for your company.
All this data is stored as a collection of nodes (locations stored as values of x and y) and in the case of roads and boundaries, information is also stored on the vertices (the lines connecting the nodes). This enables the data to be drawn rapidly at any scale (although this general rule breaks down in some GIS, when you have 100-millions of nodes).
You can differentiate, with a label or symbol, between stores that hire out survey equipment, versus fencing, or other tools, as shown in Figure 1.
A benefit of using GIS over a normal drafting package is that you can add a location by entering its exact coordinates, rather than just estimating its position on the map. If a store changes location, or starts to stock different equipment, then the location marker and symbol will change automatically when the data is updated. An added bonus of GIS is that standard mapping features such as north-arrows, scale bars and grid lines (showing latitude/longitude or grid coordinates) are automatically produced.
Mapping Using Raster Data
In addition to vector data, you can also add raster information to your map. Raster data is image data, so it is defined by pixels rather than nodes and vertices. This is nicely illustrated if you zoom-in on a dataset. For vector data, as you zoom in, the lines and edges of features remain smooth whereas for raster data those edges become increasingly pixelated.
Commonly used raster data includes imagery and digital elevation models, obtained from satellites, aerial surveys or scanned paper documents. This raster data can be overlaid on to the vector data representing your factory locations.
Digital elevation data allows you to add the z-dimension to your data, so you could, for example, calculate the average height of your stores above sea level. Contours can be drawn and the slope calculated so that you don’t accidentally plan to build a new factory on a cliff face. Figure 2 shows a typical raster map:
Uses of raster data can get quite specialized. Raster data lends itself well to complex analysis because it presents a grid-based arrangement suited to implementing numerical solutions of various complex algorithms. For example, hydrologists use flow-routing algorithms on digital elevation models to predict the flow rates during a flood.
Stacked rasters can be used to represent spectral data (in a three-band image, one raster might represent ‘red’, one ‘green’ and one ‘blue’). LandSat images are one example of this kind of data. Certain spectra (combinations of RGB values) are known to be characteristic of certain land uses, such as rainforest, urban area or salt-marsh. Using these known spectral characteristics, GIS can be used to perform image segmentation, allowing you to map areas that represent rainforest, without actually having to visit them. This is the main way in which ecologists map the destruction of the rainforests in Indonesia and Amazonia.
Applications of GIS
As individuals have become more aware of the power of GIS, so have businesses, intent on analysing the locations of their customers or suppliers, calculating the most efficient travel for employees or deliveries, or displaying on the internet a map of how to get to their location. Complexity varies from a simple map showing the locations of their branches, with opening times, up to high end uses including serving GIS over the web. The former just requires a single person to make the map in a GIS package on a workstation, whereas the latter requires a dedicated GIS server, experts to maintain the server and all the staff to be trained in basic use of GIS.
You have collected information on your customers regarding where they live and you have a loyalty card, so you know whether they are buying or hiring spanners, cherry pickers or fencing, how often and how many. Using a postcode/zipcode database, you can change these addresses into geographic locations and plot them on a map. This simple procedure will tell you if your customers are primarily from one particular location.
To get added value, spatial queries can be used to dig deeper into the data. For example, you can find out what distance customers travel to buy your goods, either in a straight line or, if you have information on the road network, then the actual distance. You can look for spatial trends in buying habits, such as whether people from urban areas are spending more than those from rural areas. You can test how successful putting an advert in a local newspaper has been, by seeing if sales from that area increased after the advert. There are many, many more possibilities.
Environmental Analysis and Risk Ranking
When planning to build a new factory, it is possible that you’ll be required by law to produce an environmental impact analysis, quantifying the risk that your factory poses, especially if it is likely to cause pollution of the landscape, either visually or by emission. As a first step, you can obtain vector files outlining the environmentally sensitive, and legally-protected, sites for your country, such as national parks and nature reserves. Most countries publish these through their Environmental Agency. You can then calculate the proximity of your factory using a spatial query.
Any effluent from your site might affect local watercourses or groundwater, so you might, from survey organisations, obtain vector maps of the geology (telling you about groundwater) and maps of the streams / lakes / ponds / rivers, as shown in Figure 3:
Any fumes or effluent from your proposed site might pose a hazard to nearby housing, schools, or other public buildings. Once you have their locations, you can assess the potential environmental hazard by defining a scoring system, based on the proximity of the feature. This is all done automatically, within the GIS or within SQL 2008, using a spatial update query with an embedded conditional statement. If the government introduces more strict regulations at a later date, or if more nature reserves are added for example, you can just adjust the scoring scheme, and then just recalculate the score.
The mobile GIS device with which most people are familiar is, of course, the GPS in their car. However, uses of mobile GIS extend well beyond this. Mobile GIS can, for example, be used to track the course of, assess the potential impact of, and aid in the emergency management of natural disasters such as floods or hurricanes. For example, if a hurricane approaches, forecasters can use GIS to track its progress in real time and so provide information about its predicted position of landfall. Based on this data, you can produce maps allowing you to estimate how many people to evacuate (and the time needed to do so), or how much infrastructure might be damaged.
On the ground, you can track the locations of emergency relief workers and they can input data directly from their specific location, such as how many casualties have occurred, or the extent of infrastructure damage. This information can be used to create a map for the rescue teams enabling them to plan their strategy effectively. Displaying information on maps enables decisions to be made quickly.
This sort of “mobile data” can be submitted to the GIS via a form or text message, and often using a dedicated mobile PDA device. This kind of set up is very specialist and for everyday use field workers often input the data into the GIS on a laptop, or wait until they return to the office (which is not real-time, but maps can still be produced).
For emergencies, real-time data transmission and display is vital, but it can also come in handy for field work. For example, one can enter site information with photos and sample locations so that management can make decisions remotely on which samples need sending for further analysis.
Company-wide GIS solutions take the form of a GIS Server where data can be viewed, added to, and altered by some or all of the employees (as defined by the administrator). This form of GIS can be particularly useful when the company is large and has offices and clients in many locations.
An example would be a construction company whose planners can submit building footprints to the GIS Server, which can then be used by the managers to request changes or approve for construction. Furthermore, as the planned construction proceeds, site details can be added to the GIS to track the status of various parts of the build based on progress (e.g. in progress, completed) or on health and safety factors (hard hat area, large plant in use).
This GIS data can be made available online to the construction company’s client or the local council, for example.
GIS in SQL Server 2008
In this section, we’ll take a look at how SQL Server 2008 deals with spatial data and the different data types it supports. I cover the basic concepts relating to spatial reference information and one of the most common pitfalls in dealing with spatial data. Lastly there is some information about GIS systems that you can use to display your SQL 2008 spatial data on a map and sources of free data for you to test your spatial data skills.
Spatial Data Types
SQL Server 2008 introduces two new spatial data types that can be used to store different kinds of geographical elements such as points, lines, and polygons. This allows complex spatial data elements, of up to 2GB in size, to be stored in a single item of data. It also means that this GIS data can be held in normal databases and, with the help of spatial indexes, can be queried as if it were normal data.
The two new data types are:
- GEOMETRY : Used to store planar (flat-earth) data. It is generally used to store XY coordinates that represent points, lines, and polygons in a two-dimensional space. For example storing XY coordinates in the Geometry data type can be used to map the exterior of a building.
- GEOGRAPHY: Used to store ellipsoidal (round-earth) data. It is used to store latitude and longitude coordinates that represent points, lines, and polygons on the earth’s surface. For example, GPS data that represents the lay of the land can be stored in the GEOGRAPHY data type. Calculations using this type take into account the curvature of the Earth.
Figure 4, illustrates the two data-types being used for mapping. You will find examples of these in everyday situations: most paper maps use 6-figure grid references (Geometry coordinates), in linear units of metres, and GPS give coordinates in angular units of degrees of latitude and longitude.
The Geometry and Geography data types are compatible with the Geographic Markup Language (GML) format, as well as the Well Known Text (WKT) and the Well Known Binary (WKB) formats for geographic data, as defined by the Open Geospatial Consortium (OGC) Simple Features Specification for SQL.
SQL Server has built-in functions for importing and displaying data in these formats. There are several utilities, most notably GDAL/OGR (a part of FWtools), that can be used to convert from other formats into one of these standard formats. The bulk import of spatial data is best done using third-party utilities such as SAFE’s FME Workbench. FME supports a huge variety of spatial data formats such as the Ordnance Survey Great Britain’s (OS GB) MasterMap. If your data is available in ESRI-Shape files you can use Morten Nielsen’s ‘SQL Server 2008 Spatial Tools’.
The Geometry and Geography data types are implemented as .NET Common Language Runtime (CLR) data types. This means that they can support various properties and methods that can be applied to the data. SQL Server 2008 has a range of built-in T-SQL functions for doing the basic mapping and aggregation tasks that are required for the most common business applications, such as mapping sales revenue by location. In addition, it has a set of CLR functions for manipulating shapes, doing shifts, interpolating points and locating points. The aim is to have a library of all the common conversions, transformations and aggregations. For the simplest mapping applications, however, the work can be done simply using SSMS, using TSQL.
Spatial Reference Information (or: Help! My Data is in the Wrong Country!)
Imagine this situation: you are a consultant that has been asked to do a pan-European project to assess the environmental impact of petrol filling stations. You need to collect GIS data from all the countries of Europe relating to the location of the filling stations, water bodies and nature reserves. When you first plot up the data from Belgium and France, the data for Belgium appears somewhere in Africa (see the blue dots in Figure 5). What is happening here?
The reason for this common pitfall is that each set of data is using a different spatial reference system. The Agency that you received the French data from informed you that this data was in a Lambert Conformal Conic projection on the WGS84 Datum, which is a description of the spatial reference
However the agency providing the Belgian data, did not give you any information about the spatial reference, so you just plotted the two datasets up together, which is implicitly making the assumption that they have the same spatial reference (WGS84 is the most common spatial reference system used).
However, Figure 5 is telling you that the Belgian data has a different spatial reference from the French data. In fact, the Belgian data is in a Lambert Conformal Conic projection on the Belge 1972 Datum.
Basically when you attach spatial reference information to your data, you are providing information on what kind of coordinates your data store. Spatial Reference information is composed of three pieces of information:
- Ellipsoid – this is an approximation to the shape of the Earth, given literally as a major and a minor axis.
- Datum – this describes where the centre of the ellipsoid lies and the orientation of its axes.
- Projection – this is a 2D or flat representation of a 3D surface (e.g. surface of the Earth – see figure 6). This allows you to give your location in grid coordinates of metres.
So, using an ellipsoid and a datum you can describe your location in latitude and longitude (i.e. Geographic data type) and then the projection describes your location in grid coordinates (i.e. Geometry data type).
The reason for this seemingly complicated system, and why every country uses a different combination of ellipsoid-datum-projections, is purely historical. This is the system that surveyors used to simplify their calculations before the advent of computers.
Unfortunately, if you cannot find out the spatial reference information from the supplier then you have to make a guess at what it should be (sometimes straightforward, mostly very difficult). If you are overlaying your own data, you should know what spatial reference you are using!
SQL Server 2008 has a standard list of spatial references built in, as do all good GIS packages. In SQL 2008, this is referred to a Spatial Reference identifier (SRID), and it is a requirement to define an SRID for Geography data types. The default SRID is 4326, which maps to the WGS84 spatial reference system.
Although an SRID is not required for Geometry data types in SQL2008, it is highly recommended to use one when dealing with multiple sources of data.
Once all your data have a spatial reference defined then most GIS will display them in the right location, by doing a spatial transformation on the fly. SQL 2008 does not provide this capability but you could use something like: http://www.codeplex.com/ProjNET or FW Tools to do the transformations and all the common ones are well documented online.
GIS software to Display SQL Spatial Data
Spatial data can only be displayed as lists of coordinates in SQL Server Management Studio, so to view results of your spatial queries in map form, you need a GIS front end.
There are a few free GIS support direct rendering of spatial SQL:
- MapServer ( using OGR/ODBC driver) http://www.codeplex.com/SharpMap
- SharpMap http://www.codeplex.com/SharpMap
- FWTools/OpenEV http://openev.sourceforge.net/index.php
- Grass: https://grass.osgeo.org/
If you already have commercial GIS software such as ESRI ArcGIS, Mapinfo, Autodesk Map 3D 2007, Manifold, etc then the most recent releases support SQL 2008.
Sources of GIS data
There is lots of freely-available GIS datasets to play around with to get you started. I list just a few here:
- The Geography network has quite a few free datasets http://www.geographynetwork.com/ (link deprecated). For example, data on Canada used in this article is here: http://www.geobase.ca/geobase/en/index.html
Bear in mind that the datasets will need converting to the correct format for SQL Server 2008, which you can do using, for example, FWTools, SAFE’s FME Workbench, or Morten Nielsen’s ‘SQL Server 2008 Spatial Tools.
If you have a GIS capable of using Open Geospatial Consortium (OGC) compliant Web Map Service (WMS) then NASA has some great data to explore:
I hope that I have got the message across that GIS is a very powerful tool for managing and representing spatial data. Now you have a little bit of knowledge I hope you feel more confident in getting your teeth around spatial data in SQL 2008. An ideal place to start would be using your favourite GIS to represent your SQL 2008 spatial data on screen…have a play…