Joe Celko is one of the most widely read of all writers about SQL, and was the winner of the DBMS Magazine Reader's Choice Award for four consecutive years. He is an independent consultant living in Austin, TX. He has taught SQL in the US, UK, the Nordic countries, South America and Africa.
He served 10 years on the ANSI/ISO SQL Standards Committee and contributed to the SQL-89 and SQL-92 Standards.
He has written over 800 columns in the computer trade and academic press, mostly dealing with data and databases. He is the author of eight books on SQL for Morgan-Kaufmann, including the best-selling SQL FOR SMARTIES.
Joe is a well-known figure on Newsgroups and Forums, and he is famous for his dry wit. He is also interested in Science Fiction.
How do you record locations in SQL? Most relational database systems support spatial and geographical data, generally using the round-earth system based on the SQL specification of the Open Geospatial Consortium (OGC). However, this is not the only approach, as Joe Celko explains. … Read more
In the real world of business or scientific reporting and analysis, data can prove to be awkward. It can be plain wrong or it can be altogether missing. Sure, we have the NULL to signify unknown, but that doesn't play well with regular business reporting. There are a number of ways of dealing with missing information, and methods of estimating data from existing data have a long and respectable history. Joe Celko gets to grips with a data topic that is often treated with some trepidation. … Read more
How does one get a truly random sample of data of a certain size from a SQL Server database table? Well, there are simple non-portable tricks one can use, such as the NewID() function, but then refining those can be tricky. Take the Rand() function for a start. Can it really provide you with a truly random number? Why doesn't the TABLESAMPLE clause give you a set number of rows? Joe Celko scratches his head a bit, explains some of the issues and invites some suggestions and tricks from readers.… Read more
When we have to deal with and store a lot of data, it makes sense to aggregate it so that we store only the information we actually need. If we get this right, this works well, but the design of the system takes care and thought because the problems can be subtle and various. Joe Celko describes some of the ways that things can go wrong and end up providing incorrect, inaccurate or misleading results.… Read more
If the design of a relational database is wrong, no amount of clever DML SQL will make it work well. Dr. Codd’s Information Principle is that you have, inside the entity tables, the columns that model the attributes of that entity. The columns contain scalar values. Tables that model relationships can have attributes, but they must have references to entities in the schema. You split those attributes at your peril. Joe Celko explains the basics.… Read more
Many practical database problems can be tackled more simply and intuitively by graphs or networks, which in this sense are graphs in which attributes can be associated with the nodes and edges. It is a natural way to study relationships within the data. SQL databases aren't the easiest way of doing it, but it makes sense where the scale permits it. Because of the range of graphs and techniques, some Graph theory is unavoidable before you get stuck into the code, and who better to introduce graph databases than Joe Celko? … Read more
Triggers are generally over-used in SQL Server. They are only rarely necessary, can cause performance issues, and are tricky to maintain. If you use them, it is best to keep them simple, and have only one operation per trigger. Joe Celko describes a feature of SQL that 'gets complicated fast'.… Read more
The ALL, SOME and ANY predicates aren't much used in SQL Server, but they are there. You can use the Exists() predicate instead but the logic is more contorted and difficult to read at a glance. Set-oriented predicates can greatly simplify the answering of many real-life business questions, so it is worth getting familiar with them. Joe Celko explains.… Read more
When you're formatting SQL code, your objective is to make the code as easy to read and understand as possible, in a way that makes errors stand out. The extra time it takes to write code in an accessible way is far less than the time saved by the poor soul in the future, possibly yourself, when maintaining or enhancing the code. There isn't a single 'best practice', but the general principles, such as being consistent, are well-established. Joe Celko gives his take on a controversial topic.… Read more
With the formatting of code, we sometimes do things because they've always been done that way, rather than because they make code easier to understand. Occasionally these habits get in the way of readability. Joe Celko goes deep into his memory banks to explain how these deep-seated traditions started. … Read more
Views in SQL can be difficult. It isn't easy to judge when to use them, and it isn't always obvious how to determine if a view can be indexed or if it is updateable. Joe Celko takes a tricky topic and comes up with some helpful guidelines.… Read more
In theory, it is easy to determine whether data is equal to, greater than or less than a value. Real-world data types can be surprisingly tricky, even for gauging data equivalence. Because SQL cannot remain aloof from any application's data types, you need to be aware of the ways and pitfalls of individual datatypes and how you can test for equality, equivalence, similarity, identity and all that jazz… Read more
It sometimes pays to go back and look at what you think you already know about SQL. Joe Celko gives a quick revision of the GROUP BY and HAVING clauses in SQL that are the bedrock of any sort of analysis of data, and comes up with some nuggets that may not be entirely obvious… Read more
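As a quick reminder of how the two clauses divide the work, here is a minimal sketch that runs a GROUP BY with a HAVING filter from Python's standard `sqlite3` module. The `Sales` table, its columns and its rows are invented for illustration and do not come from the article.

```python
import sqlite3

# Hypothetical table and data, used only to illustrate the clauses.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sales (region TEXT NOT NULL, amount INTEGER NOT NULL)")
conn.executemany(
    "INSERT INTO Sales (region, amount) VALUES (?, ?)",
    [("North", 100), ("North", 250), ("South", 40), ("South", 30), ("East", 500)],
)

# GROUP BY collapses the rows into one row per region;
# HAVING then discards whole groups based on an aggregate.
rows = conn.execute(
    """
    SELECT region, COUNT(*) AS sale_cnt, SUM(amount) AS amount_tot
      FROM Sales
     GROUP BY region
    HAVING SUM(amount) > 100
     ORDER BY region
    """
).fetchall()

print(rows)
```

The South group is dropped because its total does not pass the HAVING test, even though each of its rows would pass a WHERE clause that only looked at individual amounts being positive.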
SQL is unusual in that data is not passively stored. Instead you use declarative SQL to specify the rules that underlie the data and its integrity. When used properly, constraints can avoid having to provide a lot of logic elsewhere. CHECK() and DEFAULT can do a lot to ensure that your data is correct… Read more
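A minimal sketch of the idea, using SQLite through Python's standard `sqlite3` module rather than SQL Server: the hypothetical `Orders` table below lets DEFAULT supply a missing status and lets CHECK() reject a bad quantity, with no application logic involved. The table and column names are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table: the rules live in the declaration itself.
conn.execute(
    """
    CREATE TABLE Orders (
        order_nbr INTEGER NOT NULL PRIMARY KEY,
        order_qty INTEGER NOT NULL CHECK (order_qty > 0),
        order_status TEXT DEFAULT 'open' NOT NULL
            CHECK (order_status IN ('open', 'shipped', 'closed'))
    )
    """
)

# No status is given, so the DEFAULT fills it in.
conn.execute("INSERT INTO Orders (order_nbr, order_qty) VALUES (1, 5)")
status = conn.execute(
    "SELECT order_status FROM Orders WHERE order_nbr = 1"
).fetchone()[0]

# A zero quantity violates the CHECK() and the insert is refused.
try:
    conn.execute("INSERT INTO Orders (order_nbr, order_qty) VALUES (2, 0)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(status, rejected)
```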
There are several ingenious ways of using SQL References to enforce integrity declaratively. Declarative Referential Integrity (DRI) is more effective than using procedural code in triggers, procedures or application layers because it uses the SQL paradigm, thereby making optimisation easier and providing clearer expression of the rules underlying the data.… Read more
In SQL, you can express the logic of what you want to accomplish without spelling out the details of how the database should do it. Nowhere is this more powerful than in constraints. SQL is declarative, and Joe Celko demonstrates, in his introduction to Declarative SQL, how you can write portable code that performs well and executes some complex logic, merely by creating unique constraints.… Read more
Although we like to think that our programming techniques are progressive and in tune with the bleeding edge of software development practices, too often they are directly influenced by restrictions faced in the post-war decades when computers first became mainstream. As these restrictions no longer apply, is it time to relinquish such things as cursors, 'tibbling', storing display formats, using short names for symbols and primary keys?… Read more
A database must be able to maintain and enforce the business rules and relationships in data in order to maintain the data model. It does this through referential constraints. They aren't complex, but are powerful, especially with the means to attach DRI actions to them. Joe Celko explains all, and pines for the ANSI CREATE ASSERTION statement… Read more
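To make the notion of a DRI action concrete, here is a small sketch in SQLite via Python's standard `sqlite3` module. The `Customers` and `Orders` tables are hypothetical, and SQLite only enforces foreign keys once `PRAGMA foreign_keys = ON` has been issued; other products enforce them by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific switch

# Hypothetical schema: each order must reference an existing customer.
conn.execute("CREATE TABLE Customers (cust_id INTEGER NOT NULL PRIMARY KEY)")
conn.execute(
    """
    CREATE TABLE Orders (
        order_nbr INTEGER NOT NULL PRIMARY KEY,
        cust_id INTEGER NOT NULL
            REFERENCES Customers (cust_id) ON DELETE CASCADE
    )
    """
)
conn.execute("INSERT INTO Customers (cust_id) VALUES (1)")
conn.execute("INSERT INTO Orders (order_nbr, cust_id) VALUES (10, 1)")

# An order for a customer that does not exist is refused by the constraint.
try:
    conn.execute("INSERT INTO Orders (order_nbr, cust_id) VALUES (11, 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# The DRI action: deleting the customer removes the dependent orders too.
conn.execute("DELETE FROM Customers WHERE cust_id = 1")
remaining = conn.execute("SELECT COUNT(*) FROM Orders").fetchone()[0]

print(orphan_rejected, remaining)
```

Both the rejection of the orphan row and the cascading delete come from the table declaration alone; no trigger or application code is involved.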
Not all data is discrete; some data types represent a continuum. In SQL, we have to approximate them and live with the special problems of handling continuous data. We need to understand the problems associated with continuous data types, when these will happen, and how they affect constraints and the results of queries. Joe Celko explains.… Read more
As a society, we have an unrealistic respect for data, especially if it has a decimal point somewhere and uses metric units. We who are in the business of data need to cultivate a renewed interest in the sceptical and rigorous science of statistics: it is too important to leave to 'Data Scientists'. If the data is wrong, or the way we analyse or report it is misleading, much of what we do is pointless… Read more