{"id":105368,"date":"2025-04-08T12:56:23","date_gmt":"2025-04-08T12:56:23","guid":{"rendered":"https:\/\/www.red-gate.com\/simple-talk\/?p=105368"},"modified":"2025-02-23T00:27:38","modified_gmt":"2025-02-23T00:27:38","slug":"data-elements-on-a-nominal-scale","status":"publish","type":"post","link":"https:\/\/www.red-gate.com\/simple-talk\/databases\/theory-and-design\/data-elements-on-a-nominal-scale\/","title":{"rendered":"Data Elements on a Nominal Scale"},"content":{"rendered":"\n<p>\u201cOn a scale from 1 to 10, what color is your favorite letter of the alphabet?\u201d<\/p>\n\n\n\n<p>&#8211;Slogan on a T-shirt<\/p>\n\n\n\n<p>This is one of my favorite T-shirt gags because it teaches something about data types, scales and measurements. When you hear it, you automatically start to answer it as if it was a real question. The humor for us nerds comes from the absurdity of mixing different kinds of scales and data types.<\/p>\n\n\n\n<p>We could compute \u201cfurlongs per fortnight\u201d and actually get an answer that has some meaning. It would not be very useful, but at least it makes some kind of sense (distance per unit of time).The T-shirt slogan makes no sense whatsoever.<\/p>\n\n\n\n<p>The simplest possible type of scale is called a \u201c<strong>nominal<\/strong>\u201d scale and all it does is give a name to an entity or data element. It\u2019s really more important for what properties it does<em> not<\/em> have.<\/p>\n\n\n\n<p>In fact, the nominal scale is so weak that originally people did not like to even include as a scale. After all, all you\u2019ve done is name something. Its most important property is that it\u2019s elements are unique. That is, we can use a name to identify an element in the set. But it establishes a logical principle called \u201cThe Law of Identity\u201d.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-the-law-of-identity\">The Law of Identity<\/h2>\n\n\n\n<p>The <a href=\"https:\/\/en.wikipedia.org\/wiki\/Law_of_identity\">Law of Identity<\/a> is the basis of all Western thought, and states that each and every thing is identical to itself. The shorthand version of this law, thanks to Ayn Rand, is that \u201cA is A\u201d. A more complete version, coupled with the concept of existence is that \u201cto be is to be <em>something in particular;<\/em> to be nothing in particular or to be anything in general, Is to be nothing at all.\u201d Or, if you prefer the idea is that every entity in the universe has an identity which makes it unique and associates certain properties (attributes) with it. An acorn can become an oak tree, but it cannot become an elephant and each acorn is a separate entity.<\/p>\n\n\n\n<p>Identity is fundamentally important. There is an old Garfield comic strip in which Jon is talking to a friend about cats illustrates why the Law of Identity is so important.<\/p>\n\n\n\n<p>Friend: \u201cWe had three cats when I was a kid.\u201d<\/p>\n\n\n\n<p>Jon: \u201cWhat were their names?\u201d<\/p>\n\n\n\n<p>Friend: \u201cCat, cat and cat.\u201d<\/p>\n\n\n\n<p>Jon: \u201cThat doesn\u2019t seem very useful. How did you tell them apart?\u201d<\/p>\n\n\n\n<p>Friend: \u201cWho cares? Cats do not come when they\u2019re called!\u201d<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-what-you-cannot-do-with-nominal-scales\">What You Cannot do with Nominal Scales<\/h2>\n\n\n\n<p>To make this clearer, lets look at a couple of concepts.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-1-there-is-no-inherent-single-display-format\">1) There is no Inherent single display format.<\/h3>\n\n\n\n<p>Any unique string of symbols can serve as a name. In theory, if you give every employee his own emoji, instead of an employee id number. Historically, this was actually done during the renaissance by illiterate Italian stone cutters. Each of the sculptors had a glyph which looked like an alchemist symbol. These glyphs were used to label blocks of marble and hold them for various artists. If you like Ray Bradbury science fiction, in the movie \u201cFahrenheit 451\u201d (1966, Oskar Werner, Julie Christie, Cyril Cusack), official personnel records used photographs instead of text.<\/p>\n\n\n\n<p>In fact, the Unicode standard is very much concerned with display. First their concern with the font that is used, then they are concerned with the equivalent encodings which produce the same display character. Do you read this string from right to left, or left to right? And you still must pick a collation for your application.<\/p>\n\n\n\n<p>For example, German has three collation systems that are in use (DIN, federal government and IBM) it was only a few decades ago that Spanish stopped treating \u201c<code>LL<\/code>\u201d and \u201c<code>CH<\/code>\u201d as a individual, separate letters for sorting dictionaries.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-2-nominal-scales-have-no-innate-ordering\">2) Nominal scales have no innate ordering<\/h3>\n\n\n\n<p>The relational model is based on sets, and sets have no concept of an ordering. They are completed wholes. The idea of an ordering is not part of the nominal scale, but it\u2019s a very useful thing to have in the <em>representation <\/em>of the attribute is measured on the scale.<\/p>\n\n\n\n<p>Many decades ago, I have a friend who taught American culture in China. Knowing that she could not read Chinese, the school Romanized the names of the students and printed them out next to the Chinese characters on a roster for her, so she could pronounce them (well <em>try to pronounce<\/em> them).<\/p>\n\n\n\n<p>The problem was that they had no concept of alphabetical order, so each printout of the 150+ names in her class were arranged differently every time. Any ordering on a nominal scale is an ordering of the <em>representation and not of the values<\/em> of that data element.<\/p>\n\n\n\n<p><em>Editor note: check <a href=\"https:\/\/en.wikipedia.org\/wiki\/Chinese_character_orders\">here<\/a> and <a href=\"https:\/\/www.pinyinjoe.com\/faq\/sorting-chinese-in-excel-and-word.htm\">here<\/a> for further information if you are as curious as I was.<\/em><\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-3-nominal-scales-have-no-computations\">3) Nominal scales have no computations.<\/h3>\n\n\n\n<p>Yes, you can play with the <em>representation <\/em>of a data element, but it really has nothing to do with the nature of the data element itself. It would make absolutely no sense to take the squares of the employee id numbers or to average them. To quote a textbook from the 1800s, a number represents a magnitude or quantity. A name for an entity is neither of these things.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-4-digits-are-very-useful-symbols-but-they-have-baggage\">4) Digits are very useful symbols, but they have baggage.<\/h3>\n\n\n\n<p>As long as I\u2019m doing these stories, let me repeat a classic folk story that\u2019s been told and credited to several different sources over the decades. The version I first learned was that one school broke into the schoolhouse of their football rival and turned three pigs (or bulls or other large messy animals, depending on who is telling the tale) loose. Painted on the side of the animals were the digits \u201c1\u201d, \u201c2\u201d, and \u201c4\u201d respectively; The rival school spent quite some time looking for the animal with a \u201c3\u201d painted on it.<\/p>\n\n\n\n<p>The baggage with using a string of digits to represent a nominal scale is that they are often seen as ordinal numbers, or as part of a sequence. They might well be for some side purpose (like as a numeric sequence useful mind tool), but it\u2019s not required. This is why for a few centuries, we have had ordinal markers really didn\u2019t order things.<\/p>\n\n\n\n<p>You know that \u201cChanel no. 5\u201d, \u201cWD-40\u201d, \u201c#2 Pencil\u201d or \u201cGate #9\u201d are names because they include markers like \u201c-\u201c, \u201cno.\u201d, \u201cnbr\u201d or \u201c#\u201d (Computer people call it a hash mark, and Telephone people call it an octothorpe. It used to be a pound sign and several other things).<\/p>\n\n\n\n<p>In fact, there were punctuation and display conventions that tell you the digits are part of a nominal scale this without special symbols. When you see the string \u201c5\u201d you are not really sure what it means by itself. But when you see \u201c005\u201d you know it has to be a nominal scale value. It may have some other properties; such as when you see\u201d007\u201d you immediately think of James Bond novels; if you\u2019re a real fan you know there\u2019s a \u201c008\u201d and immediately think of the whole fictional British spy apparatus.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-integer-is-not-a-validated-data-type-for-a-nominal-scale\">INTEGER Is Not A Validated Data Type For A Nominal Scale<\/h2>\n\n\n\n<p>The TV show \u201cMarried With Children\u201d briefly had a kid named <a href=\"https:\/\/marriedwithchildren.fandom.com\/wiki\/Seven\">7<\/a> staying with the Bundys. One of the Borg was named \u201c<a href=\"https:\/\/memory-alpha.fandom.com\/wiki\/Seven_of_Nine\">7 of 9<\/a>\u201d on \u201cStar Trek: The Next Generation.\u201d In the Peanuts comic strip there was a child named <a href=\"https:\/\/peanuts.fandom.com\/wiki\/555_95472\">555 95472<\/a>. There seems to be a history of using numbers for names on in pop culture when you want to make something funny or unusual.<\/p>\n\n\n\n<p>Why do people think it\u2019s funny? Because instinctively you know that an entity is a totally different sort of thing from a magnitude, quantity, or ordering. Nominal scales are for entities.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-designing-an-encoding\">Designing an Encoding<\/h3>\n\n\n\n<p>The Unicode people have done us a big favor. Every alphabet or symbol system you can find in Unicode must also include a set of symbols made up of Latin letters, digits, and a few punctuation marks. They exist so that all of the standard <a href=\"https:\/\/en.wikipedia.org\/wiki\/International_System_of_Units\">SI<\/a> abbreviations four units can be represented in any text on earth. That means this character set is where you should limit your choice of symbols.<\/p>\n\n\n\n<p>The string you pick can be a mix of alphanumeric symbols, and it can be fixed or varying length. Worst example of a confusing encoding is the alphanumeric strings allowed by the British postal system. They were constructed from the names of postal offices, most of which no longer exist. Compare this to the United States ZIP Code which can be validated with the simple regular expression <code>\u201czip_code LIKE [0-9]0-9]0-9]0-9]0-9]\u2019<\/code>.<\/p>\n\n\n\n<p>Note: Please remember there\u2019s a difference between validation and verification. Some ZIP code is valid when it is a string of five digits. But you cannot be sure that this code is actually in use; you have to verify it with the Postal Service.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-identity-is-the-worst-choice\">IDENTITY Is the Worst Choice<\/h3>\n\n\n\n<p>The absolute worst choice you can make in designing a table is to use the old Sybase <code>IDENTITY<\/code> table property column. It is a non-relational vendor extension that exist only because decades ago we built SQL on top of existing file systems. Just as a deck of punch cards or sequential tape file can have only one sordid order, tables can have only one <code>IDENTITY<\/code> column.<\/p>\n\n\n\n<p>First of all, this is a <em>table property<\/em> and not a relational column. Let\u2019s drop back two absolutely basic definitions and relational theory. A column is an attribute of the entity modeled by the row in the table. but the integers in an <code>IDENTITY<\/code> column have nothing to do with the entity at all. By way of analogy, imagine you have a parking garage. The parking spaces in this garage are sequentially numbered and as cars come in one at a time, they are assigned to a parking space. Think of punch cards and magnetic tape.<\/p>\n\n\n\n<p>The sequential entry has another problem. SQL and the relational model are supposed to be set oriented. That means when I insert a set of rows into a table, it is supposed to occur all at once, and not in a sequence as with <code>IDENTITY<\/code>, I can take the same set of inputs, shuffle them a little bit and then carry that data over to a second table or the second <code>IDENTITY<\/code> column will not have the same values assigned to the new rows.<\/p>\n\n\n\n<p>An actual correctly designed, key, such as the VIN (Vehicle Identification Number) would be a property of the automobile and not the parking space, so it would remain the same no matter where that vehicle shows up. Notice that includes automobile that appears outside of the parking garage completely. The VIN will be the same on your insurance papers, auto tags, police reports and so forth \u2013 it is a proper attribute and not some local table property.<\/p>\n\n\n\n<p>Ideally, we would like a nominal scale to be as universal as possible. <code>IDENTITY<\/code> is as local as possible. We would also like to be able to validate and verify the values of something for which we\u2019ve bothered using nominal scales.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-conclusion\">Conclusion<\/h2>\n\n\n\n<p>If you look at the overwhelming majority of designs for industry-standard encodings, you will see that they use the limited Unicode character set I just discussed, are fixed length strings (so that we can design simple computer files, paper forms and display screens easily) and over half of them have some kind of check digit, regular expression or other built-in validation.<\/p>\n\n\n\n<p>Yes, doing your job correctly has some overhead and requires professional mind tools. Just because using integers for nominal scales is quick and easy, do not become \u201cLarry the Cable Guy\u201d and decide that you just need to \u201cgitt\u2019er done!!\u201d when you\u2019re programming.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>\u201cOn a scale from 1 to 10, what color is your favorite letter of the alphabet?\u201d &#8211;Slogan on a T-shirt This is one of my favorite T-shirt gags because it teaches something about data types, scales and measurements. When you hear it, you automatically start to answer it as if it was a real question&#8230;.&hellip;<\/p>\n","protected":false},"author":96214,"featured_media":105370,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[53,143539],"tags":[159072],"coauthors":[6781],"class_list":["post-105368","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-featured","category-theory-and-design","tag-database-theory"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/105368","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/users\/96214"}],"replies":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/comments?post=105368"}],"version-history":[{"count":2,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/105368\/revisions"}],"predecessor-version":[{"id":105373,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/105368\/revisions\/105373"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/media\/105370"}],"wp:attachment":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/media?parent=105368"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/categories?post=105368"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/tags?post=105368"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/coauthors?post=105368"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}