The Computer that Swore

Database Developers occasionally get crazy ideas into their heads. Phil Factor should know; He confesses that he's had a few completely loopy notions that he pursued to destruction before realising his error.

Database Developers occasionally get crazy ideas into their heads. I should know; I’ve had a few completely loopy thoughts that I’ve pursued to destruction. The one that still makes me wince is the time when I caused a production server to swear at the customers and insult them.

To understand what happened, I’ll have to explain my thinking. Ordinary people have great difficulty in remembering numbers, even in the short-term.. Computer people find this hard to appreciate, and often insist that end-users must remember them so as to identify themselves, their bank accounts or their invoices. These numbers often end up being written around the edge of the monitor, or on bits of paper around the desk, which rather defeats the object. Errors abound. On websites, there was a time when we tried giving out numbers that represented the primary key for a customer, but we’ve all now abandoned that idea as being hopeless, and now use the email address to provide a unique ID for visitors.

People can remember words far more easily than numbers and it occurred to me that, whenever we wanted customers to remember an ID, we ought to translate it into a memorable nonsense-word, and give them that instead.

My idea was to create a program that would translate numbers into words, and vice versa. It meant that, instead of a number, you could give someone a nonsense word, rather like the name of an alien space-traveller in a Science-Fiction film, which they found they could remember. When it was fed back through the routine, it produced the original number. Genuis!

As it happens, it wasn’t such a good idea after all, but we’ll come to that in due course.

A base-245 numbering system

All you need to do is to create a table with the main consonant/vowel combinations of most languages:

CREATE TABLE [dbo].[Syllables] (
   [TheIndex] [int] IDENTITY (1, 1) NOT NULL ,
   [Syllable] [varchar] (4) COLLATE SQL_Latin1_General_CP1_CI_AS NULL
) ON [PRIMARY]
GO

Then, a simple stored procedure can be written that stocks it:

CREATE PROCEDURE CreateSyllableTable
AS

DECLARE @Possibilities VARCHAR(255)
SELECT @Possibilities=’B C D F G H J K L M N P R S T V W Z SCSKKNSNSPSTBLCLFLGLPLSLBRCRDRFRGRPRTRVRSHSMGHCHPHRHWHBWCWSWTW’
DECLARE @ii INT
DECLARE @jj INT
DECLARE @iiMax INT
DECLARE @Consonants VARCHAR(2)
DECLARE @Vowel VARCHAR(2)
SELECT @ii=1, @iiMax=DATALENGTH(@possibilities)
WHILE @ii<=@iiMax
   BEGIN
   SELECT @Consonants=RTRIM(SUBSTRING(@Possibilities,@ii,2))
   SELECT @jj=1
   WHILE @jj<=5
       BEGIN
       SELECT @Vowel=SUBSTRING(‘AEIOU’,@jj,1)
       INSERT INTO Syllables(Syllable)
           SELECT @consonants+@Vowel
       SELECT @jj=@jj+1
       END
   SELECT @ii=@ii+2
   END

Execute this stored procedure so as to fill the Syllable table. The result is a list of around 245 vowel/consonant combinations:

BA, BE, BI, BO, BU, CA, CE, CI, CO, CU, DA, DE, DI, DO, DU, FA, FE, FI, FO, FU, GA, GE, GI, GO, GU, HA, HE, HI, HO, HU, JA, JE, JI, JO, JU, KA, KE, KI, KO, KU, LA, LE, LI, LO, LU, MA, ME, MI, MO, MU, NA, NE, NI, NO, NU, PA, PE, PI, PO, PU, RA, RE, RI, RO, RU, SA, SE, SI, SO, SU, TA, TE, TI, TO, TU, VA, VE, VI, VO, VU, WA, WE, WI, WO, WU, ZA, ZE, ZI, ZO, ZU, SCA, SCE, SCI, SCO, SCU, SKA, SKE, SKI, SKO, SKU, KNA, KNE, KNI, KNO, KNU, SNA, SNE, SNI, SNO, SNU, SPA, SPE, SPI, SPO, SPU, STA, STE, STI, STO, STU, BLA, BLE, BLI, BLO, BLU, CLA, CLE, CLI, CLO, CLU, FLA, FLE, FLI, FLO, FLU, GLA, GLE, GLI, GLO, GLU, PLA, PLE, PLI, PLO, PLU, SLA, SLE, SLI, SLO, SLU, BRA, BRE, BRI, BRO, BRU, CRA, CRE, CRI, CRO, CRU, DRA, DRE, DRI, DRO, DRU, FRA, FRE, FRI, FRO, FRU, GRA, GRE, GRI, GRO, GRU, PRA, PRE, PRI, PRO, PRU, TRA, TRE, TRI, TRO, TRU, VRA, VRE, VRI, VRO, VRU, SHA, SHE, SHI, SHO, SHU, SMA, SME, SMI, SMO, SMU, GHA, GHE, GHI, GHO, GHU, CHA, CHE, CHI, CHO, CHU, PHA, PHE, PHI, PHO, PHU, RHA, RHE, RHI, RHO, RHU, WHA, WHE, WHI, WHO, WHU, BWA, BWE, BWI, BWO, BWU, CWA, CWE, CWI, CWO, CWU, SWA, SWE, SWI, SWO, SWU, TWA, TWE, TWI, TWO and TWU

This then gives to a base-245 numbering system where each syllable represents a digit.

Encoding and decoding the numbers

Encoding a number is then easy:

CREATE PROCEDURE encode
/*
   Don’t forget to execure CreateSyllableTable before
   use
eg
Declare @TheCode varchar(100)
Execute Encode 694852357567584,@TheCode output
Select @TheCode
*/
@TheNumber DECIMAL(18,0),
@TheCode VARCHAR(100) output
AS
DECLARE @Dividend INT
DECLARE @BigDividend DECIMAL(18,0)
DECLARE @BigNumberSoFar DECIMAL(18,0)
DECLARE @Code VARCHAR( 255)
DECLARE @Mod INT
IF @TheNumber<2147483647
   BEGIN
   SELECT @Dividend = @TheNumber
   SELECT @Code=”
   WHILE @dividend >0
       BEGIN
       SELECT @Mod=@Dividend % 245
       SELECT @Dividend = @Dividend / 245  
       SELECT @Code=Syllable+@Code FROM Syllables WHERE TheIndex=@Mod+1
       END
   END
ELSE
   BEGIN
   SELECT @BigDividend = @TheNumber
   SELECT @Code=”
   WHILE @Bigdividend >0
       BEGIN
       SELECT @BigNumberSoFar=@BigDividend
       SELECT @BigDividend = FLOOR( @BigDividend / 245)  
       SELECT @Mod= @BigNumberSoFar-(@BigDividend* 245)
       SELECT @Code=Syllable+@Code FROM Syllables WHERE TheIndex=@Mod+1
       END
   END
SELECT @TheCode=@Code

GO

The results are hardly longer than the original numbers, and generally much easier to remember. They can then be quickly decoded.

CREATE PROCEDURE Decode
/*
   Don’t forget to execure CreateSyllableTable before
   use
eg
Declare @TheNumber int
Execute Decode ‘BA HEBO DRACRO’,@TheNumber output
Select @TheNumber
*/
@TheCode VARCHAR(100),
@TheNumber INT output
AS
DECLARE @Num INT
DECLARE @ii INT
DECLARE @iiMax INT
DECLARE @TheSyllable VARCHAR(4)
DECLARE @TheMultiplicand INT
SELECT @Num = 0
SELECT @TheCode=REPLACE (@TheCode,’ ‘,”)
SELECT @TheCode=REPLACE (@TheCode,””,”)
SELECT @TheCode=REPLACE (@TheCode,’-‘,”)
SELECT @TheCode=REPLACE (@TheCode,’,’,”)
SELECT @ii=1,@iiMax=DATALENGTH(@TheCode)
WHILE @ii<@iiMax
   BEGIN
   SELECT @TheMultiplicand=TheIndex-1,
          @THeSyllable=Syllable 
      FROM Syllables
      WHERE SUBSTRING(@TheCode,@ii,100) LIKE Syllable + ‘%’
   IF @TheMultiplicand IS NULL
       BEGIN
       PRINT ‘Your code is corrupted and cannot be decoded’
       RETURN 1
       END
   SELECT @Num=(@Num*245) + @TheMultiplicand
   SELECT @ii=@ii+DATALENGTH(@TheSyllable)
   END
SELECT @TheNumber=@Num
RETURN 0

GO

So far, so good: you can then give your customers nonsense words, and these can either be used as the foreign key (they will be guaranteed to be unique) or translated into the original Identity field. In situations where every object in your database, whether it be customers, products, purchases or whatever, has a unique name, then it is even better; you can just key in the nonsense word, and the system knows what you are referring to and can navigate straight there.

Testing the App

I then tested it out on an application. It worked perfectly. The users of the system found that they could get the correct invoice onscreen just by typing ‘Bupris Lona’ (actually 69485235), or a purchase order called ‘Bifu Glofro’ (actually 30586703). They could actually remember these names too, and the requirement for sticky notes and pencils plummeted. The users were delighted and decided that I was ‘on their side’ against the alien force of geeks in IT. In conversation. I’d hear people from accounts refer affectionately to a product or salesman as ‘Gaci Skofo’ or ‘Bub Wivro’. They were able to remember these names and write them accurately into the application.

Excellent, I decided. I’ve really discovered something here. I then put a similar routine into a commercial website I was writing. At first, the business was puzzled by the system because they expect impossible numbers from computer systems just as they used to expect flashing lights from computers. Then, when I’d demonstrated what I meant, they took to the idea with enthusiasm. As I’d tested the routine at length, I felt completely confident this time.

Swearing at Customers

My undoing was this: instead of seeding the tables from a large number, over four million, as with the previous application, this time they insisted that I seeded the customer table from 1 because the CustomerIDs started from 1.

Generally, I always choose a good healthy number to start public-facing identity fields because I dislike letting anyone know how many customers we actually have, or the number of purchase there have been; but at the time, I was too engrossed with other things to think through the potential consequences.

When I am nursing a new website, I check things every night before going to bed, just to peer around and see what is going on, and make sure that all is well. What I do is to maintain a copy of the website on a local server. I make a comparison using SQL Compare and use the application to see what has changed or been added during the day. Then I re-synchronise. When a website is just starting out, one can see almost at a glance what has happened during the day and one can pick up all sorts of problems that way, before they become a crisis.

All seemed well this particular night. We’d done good business and had got to customer no 4660. I sleepily looked at the email message queue to see the message go out for this customer’s purchase:

Dear Mrs xxxxx

Thank you for purchasing from the Kamakaze Laxative Company (Not the real name I hasten to add). Your customer ID is Fuca and your password…. (blah blah)’

Suddenly, I was jolted awake. Was I seeing this correctly, or had I fallen asleep at the keyboard and was suffering a ghastly dream. The computer had suddenly been possessed with ‘Gilles de la Tourette syndrome’ and was sending obscene words to the customer!

No! In a flash, I realised that it was my fault for insisting on memorable Ids – coupled with the fact that because I’d had to start at 1, the names were very short. I managed to stop the message going out and sat back in my chair in relief. Then I thought to myself ‘I wonder what else we’ve sent out?’

The result was not a pretty sight. The English language has a number of four-letter words that are entirely innocent but look awful. For some reason, the eye seems to convert them to the nearest politically inappropriate word. Customer 1001 had been sent an Email informing him that his user ID was ‘Buger’. Customer 4415 had been sent an email assigning the name ‘Foca’.

If I hadn’t spotted the problem when I did, there were a range of offensive words that might have gone out. Heaven only knows what would have happened when we got to N. I did not sleep well that night.

In the morning, I went to see the Boss, with the proverbial wet newspaper down the back of the trousers. (Ed: aged Boarding-School joke). I told him we’d assigned the userID ‘Foca’ to one of his customers. He pulled a face and got on the phone to the customer. A lady answered. She was very considerate but said that she had been rather surprised to receive the ID. ‘It is all right you people swearing like that in the office, but we really don’t appreciate it in the home’ she laughed. My boss was most charming to her; it was a revelation, as it was a side to his character I hadn’t seen.

I hurriedly reset the identity fields to a higher number, and made the routines more complex so that they couldn’t come up with the commonest offensive syllables. I didn’t know many of them myself, despite having gone to a boarding school, but a half hour in Dispatch gave me an encycopaedic knowledge, which I then used to cleanse the routine.

The system soldiers on, cured of its propensities for bad language, and still much appreciated by the users. The staff of Dispatch still talk of the strange guy from IT who rushed around with a notebook, excitedly recording all their foullest language. The incident has made me even more obsessional about ‘nursing’ my websites when they are newly launched because computer systems can fail in unexpected ways.

Most of all, however, I am far more cautious of my own big ideas and less eager to bully them through to implementation.