Right now we in the States are being bombarded with Republican Presidential candidates debating on television, immediately followed by public opinion polls. You do not known how many candidates are in the game from week to week, nor who is the front runner unless you follow the debates every day.
This kind of voting appears in a lot of business situations, even though we do not usually think of it as a ballot. Any purchase choice among several products is a vote, pushing the buttons on your television remote is a vote for one program out of the gazillion channels on your satellite dish and so many other decisions.
The voting problem has been around for centuries, but the modern answer was given by Economist Kenneth Arrow. He won the Nobel Memorial Prize in Economics and National Medal in Science (2004), so we know he is smart. His theorem is known as “Arrow’s Impossibility Theorem” in the literature. He started with some simple assumptions about rules we want in a “fair election system” and then he shows that things don’t work like you think they should. Given at least two voters and at least three candidates, it is impossible to aggregate individual preferences without violating some of the desirable conditions.
The desired conditions are:
- Every voter has a strictly monotonic ordered preference function. In symbols, P(c1, c2) returns the best-liked candidate of the pair. Again, in symbols, P(c1, c2) ∧ P(c1, c2) ⇒ P(c1, c3). Or in English there is no “scissors, paper, stone” circular rankings or ties.
- Every voter votes his conscience and locks in his preference function. That means given candidates X and Y, and a preference of X over Y, then the voter cast his vote this way in every ballot. There is no political game playing allowed.
- There is no dictator. No one voter holds enough votes to control the outcome by himself. You will also see this expressed as “one man, one vote”, but this condition can be weaker than that. For example, in a stock holder’s meeting, one shareholder can have more votes (shares) than another.
- Majority rules. If every voter prefers candidate X over candidate Y, then the electorate prefers X over Y.
This seems to be a reasonable voting system. We can see that if there are two candidates, then we will get either a tie or a winner; let’s ignore ties for the rest of this article. The problems start with three candidates and three voters. Here is a look-up table of the preference functions.
|
c1 |
c2 |
c3 |
Voter A |
1 |
2 |
3 |
Voter B |
2 |
3 |
1 |
Voter C |
3 |
1 |
2 |
If you look at the table for a bit, you will see a “scissors, paper stone” relationship in this. There is no clear mandate from the electorate. This is called the Condorcet Paradox, named after the Marquis de Condorcet (1743 – 1794) who first wrote about it. It is easier to see with an election made up of pair-wise ballots.
If we hold an election between candidates c1 and c2, the winner is c1 by two votes from voters A and C. We now face off with c1 and c3 on a second ballot. The winner is c3.
Now let’s have another election. But this time the first ballot pairs c1 and c3, with an come for c3. The second ballot pairs the winner, c3 against c2. The winner is c2.
Finally, let’s hold a third election. The first ballot pairs c2 and c3 to get a victory for c3. The second ballot pairs the winner, c3 against c1. The winner is c1.
As a practical matter, a voter could have falsely cast a ballot for a weaker candidate in the first ballot so that his favorite would win in the final ballot. If that is not clear, look at the first election that starts with c1 and c2. Voter B falsely cast his first vote for c2, who wins. The second ballot of the election is now between c2 and c3 so the winner is c2, not the original winner c3. This is why we made rule #2 about voting your conscience.
Voting Systems
Let’s consider a few different voting systems. We will give examples from a nine member electorate and a four candidate slate.
Plurality Voting
The winner is a simple Plurality Voting system is the candidate that got the most votes. Not a majority, mind you. So if one candidate has a small but loyal voter bloc, the most unpopular candidate will win.
Hare Voting
This is a “last man standing” system in which least popular candidate is removed from the next ballot until there is a winner. The two places this is used in the real world are television reality show where we vote a contestant “off the island” each week and the ballot for the Science Fiction Hugo awards.
Borda Counts
John Claude de Borda was another French mathematician who was interested in voting system. His approach was to have voters assign a number of votes to each candidate and total them. Even if there was no clear winner, the candidate with the best total of votes would be the winner. The weights assigned start at zero for the least liked candidate and increment to (n-1) votes for your favorite.
The names of the voters are not important; we really care about the number of votes that each preference function has in a given election.
Examples
We are looking at blocs of votes. Now look at a four way race with a total of nine votes in play:
|
First |
Second |
Third |
Fourth |
3 votes |
c1 |
c4 |
c2 |
c3 |
1 vote |
c1 |
c2 |
c3 |
c4 |
1 vote |
c2 |
c3 |
c4 |
c1 |
1 vote |
c2 |
c3 |
c1 |
c4 |
1 vote |
c3 |
c2 |
c4 |
c1 |
1 vote |
c3 |
c4 |
c2 |
c1 |
1 vote |
c4 |
c3 |
c2 |
c1 |
If you use a plurality vote, sum up the First place column. The winner is Candidate c1 who does not have a majority, but holds a large bloc vote.
|
Vote count |
c1 |
4 |
c2 |
2 |
c3 |
2 |
c4 |
1 |
The Hare system gives us Candidate c3. Here is how each ballot works:
|
Ballot #1 |
Ballot #2 |
Ballot #3 |
Ballot #4 |
c1 |
4 |
4 |
4 |
Loser |
c2 |
2 |
2 |
Loser |
Loser |
c3 |
2 |
3 |
5 |
9 |
c4 |
1 |
Loser |
Loser |
Loser |
Now let’s do a Borda vote on the same electorate. The winner is Candidate c2, 15 points. It is easier to arrange the array a little differently. Here is the vote totals.
|
Bloc #1 |
Bloc #2 |
Bloc #3 |
Bloc #4 |
Bloc #5 |
Bloc #6 |
Bloc #7 |
Total |
c1 |
3*3 |
3 |
0 |
1 |
0 |
0 |
0 |
13 |
c2 |
3*1 |
2 |
3 |
3 |
2 |
1 |
1 |
15 |
c3 |
3*0 |
1 |
2 |
2 |
3 |
3 |
2 |
13 |
c4 |
3*2 |
0 |
1 |
0 |
1 |
2 |
3 |
13 |
If you arrange the pair-wise votes in the sequence c1, c2, c3, c4 the winner is c4
c1 |
4 |
c2 |
5 |
c2 |
6 |
c3 |
3 |
c2 |
4 |
c4 |
5 |
SQL and Voting Experiments
Now that I have destroyed any hope of having a provably fair voting system, we get to write code. Let’s start with tables for voters and for candidates, but not worry too much about the details since their existence is all we need for holding an election.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 |
CREATE TABLE Voters (voter_id CHAR(3) NOT NULL PRIMARY KEY CHECK(voter_id LIKE 'v[0-9][0-9]), ..); INSERT INTO Voters ('v00'), ('v01'), ('v02'), .. CREATE TABLE Candidates (candidate_id CHAR(3) NOT NULL PRIMARY KEY CHECK(candidate_id LIKE 'c[0-9][0-9]), ..); INSERT INTO Candidates ---at least three of them, please ('c00'), ('c01'), ('c02'), .. |
The preference function is a relationship, so it gets its own table.
1 2 3 4 5 6 7 8 9 10 |
CREATE TABLE Voter_Preferences (voter_id CHAR(3) NOT NULL REFERENCES Voters(voter_id) ON DELETE CASCADE, candidate_id CHAR(3) NOT NULL REFERENCES Candidates(candidate_id) ON DELETE CASCADE, PRIMARY KEY (voter_id, candidate_id), pref_rank SMALLINT NOT NULL CHECK(pref_rank > 0)); |
Now we need queries and views for the various voting methods. It would also be nice to have some data with more than two candidates. Let’s try a simple plurality vote to get things started
1 2 3 4 5 6 7 8 9 10 11 12 13 |
WITH Ballotbox (candidate_id, vote_cnt) AS (SELECT candidate_id, COUNT(*) FROM Voter_Preferences WHERE pref_rank = 1 GROUP BY candidate_id) SELECT candidate_id AS winner_candidate_id FROM Ballotbox AS B1 WHERE B1.vote_cnt =(SELECT MAX (B2.vote_cnt) FROM Ballotbox AS B2); |
This will give us ties. How do we want to handle them? Skipping that question, let’s write a simple function to do pair-wise ballots.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 |
CREATE PROCEDURE Pairwise_Ballot (@c1 CHAR(3), @c2 CHAR(3)) AS ;WITH Single_Voter_Prefs (voter_id, candidate_id, pref_rank) AS (SELECT voter_id, candidate_id, ROW_NUMBER() OVER (PARTITION BY voter_id ORDER BY pref_rank ASC) AS pref_rank FROM Voter_Preferences WHERE candidate_id IN (@c1, @c2)), Single_Voter_Winners (voter_id, candidate_id) AS (SELECT voter_id, candidate_id FROM Single_Voter_Prefs WHERE pref_rank = 1) SELECT candidate_id, COUNT(*) AS vote_cnt FROM Single_Voter_Winners GROUP BY candidate_id; |
There are other voting systems and you might want to research them. I happen to like the Russian system where you vote against candidates, not for them. Yes, it is possible for any or all candidate to go negative.
I used to favor adding “None of the above”, but then I realized that the government printing office would put this option on the first line of the ballots.
Homework
Now it is time for homework assignments:
- Write a Borda voting procedure.
- Write a Hare voting procedure.
The best answer gets a copy of the Fourth edition of SQL FOR SMARTIES and a $100 Amazon Voucher. Start your SQL engines, Gentlemen!
The best answer will be given a prize of a $100 Amazon voucher. A month after the challenge is sent out, the judge’s decision and comments will be sent out in the newsletter, and published on the site.
Joe Celko will judge the answers to this puzzle. Your answer should :
1) Solve the problem — Duh!
2) Avoid proprietary features in SQL Server that will not port or be good across
all releases, present and future.
3) Be clever but not obscure.
4) Explain how you got your answer.
Load comments