Sir Tony Hoare: Geek of the Week

After inventing the Quicksort algorithm and implementing one of the first ALGOL 60 compilers, Tony Hoare went on to apply axiomatic semantics to programming, and his work and writings have since had a great influence on software engineering and the way we specify, design, implement, and maintain computer programs. Now, at 75, he is working at Microsoft Research on projects that will filter through to the .NET languages.

Sir Charles Antony Richard (Tony) Hoare’s contribution to the development of software over the last four decades has been immense. He developed the efficient and widely used Quicksort algorithm, and his career has included many pivotal points, such as the creation of the formal language Communicating Sequential Processes (CSP), which in turn inspired the Occam programming language.

His grand narrative has always been about realising what a user ‘really’ wants, and recognising how difficult it is for the technically inexperienced to visualise the impact of a technology that exists only as a description. His great intellect has also given us Hoare logic, a formal system for reasoning about programs.


One of the elite band of 50-odd computing researchers to have won the Association for Computing Machinery’s Turing Award (computer science’s equivalent of the Nobel Prize), Tony Hoare was born in Colombo, Ceylon (now Sri Lanka), in 1934 and studied philosophy (together with Latin and Greek) at Oxford University, where he went on to study graduate-level statistics. Following his national service in the Royal Navy, where he learned to speak Russian, he studied machine translation of languages at Moscow State University. In 1960, he took up a position as a programmer at Elliott Brothers in London, where he implemented ALGOL 60 and began developing algorithms. Promoted to the rank of Chief Engineer, he then led a larger team on a disastrous project to implement an operating system. After managing a recovery from the failure, he moved as Chief Scientist to the computing research division, where he worked on the hardware and software architecture for future machines.

These machines were cancelled when the company merged with its rivals, and in 1968 Tony took the chance to apply for the Professorship of Computing Science at Queen’s University Belfast. His research goal was to understand why operating systems were so much more difficult than compilers, and to see if advances in programming theory and languages could help with the problems of concurrency.

In 1972 he co-authored ‘Structured Programming’ with Ole-Johan Dahl and Edsger Dijkstra, the thinnish black tome that became known as a call to arms, and the best academic book on software in that decade – an altogether rare example of a successful European collaboration between a Norwegian, a Dutchman and an Englishman. Hoare went on to propose the Verifying Compiler as a computer science ‘Grand Challenge’ in the style of the mathematician David Hilbert: a compiler that would check, as part of the compilation process, that the Boolean assertions in the code being compiled are always true.
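To make the Grand Challenge concrete, here is a rough sketch, in modern notation rather than anything Hoare published, of the kind of assertion involved. The toy assert helper and the absolute function below are invented for illustration; they check at run time what a verifying compiler would aim to prove once, statically, for every possible execution.

```typescript
// A toy assertion helper; this sketch tests at run time what a
// verifying compiler would aim to prove once, at compile time.
function assert(cond: boolean, msg: string): void {
  if (!cond) throw new Error(`Assertion failed: ${msg}`);
}

// For every non-NaN input, 'result >= 0' is a theorem about this code,
// not merely a test: exactly the kind of fact a verifying compiler
// would establish during compilation.
function absolute(x: number): number {
  const result = x < 0 ? -x : x;
  assert(result >= 0, "absolute value is non-negative");
  return result;
}
```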

In 1977 he moved to Oxford University and undertook to build up the Programming Research Group, founded by Christopher Strachey. With the aid of external funding from government initiatives, industrial collaborations, and charitable donations, Oxford now teaches a range of degree courses in Computer Science, including an external Master’s degree for software engineers from industry. Under Hoare’s leadership the laboratory was involved in a number of European projects, including work with Jean-Raymond Abrial on the Z notation, which came to be described as the Oxford ‘house style’ for the use of predicate logic and set theory in the specification of discrete digital software (and sometimes hardware) systems. During the next decade the laboratory’s projects included ‘ProCoS’, on provably correct systems, which evolved into a working group with many partners, including the University of Oslo.

“In my experience,
recursive processes
too often end in
vertical descent,
rather than take-off.”

In the 1980s Tony Hoare wrote a paper called ‘Programming: Sorcery or Science?’ in which he delineated the “progress” from craft to mathematical and engineering methods – from pre-Enlightenment to the Enlightenment.

In October 1999, after a distinguished Oxford career and just weeks before he gained a knighthood in the New Year Honours list, he joined the formidable Roger Needham’s team at the Microsoft Research lab in Cambridge, England. It is a world far from the oak panelling, sherry, and footsteps echoing across the quad that people expect. The lab’s remit is to develop technologies that may turn into products in five to ten years’ time, but also to help out with more immediate projects when it can.

Throughout more than thirty years as an academic, Tony Hoare has maintained strong contacts with industry through consultancy, teaching, and collaborative research projects. He is now a Principal Researcher at Microsoft Research in Cambridge and an Emeritus Professor at Oxford University.

RM:
In 1965, while designing the first comprehensive type system for references in ALGOL W, you introduced the null reference, which you have since called your ‘billion-dollar mistake’. Why did you put it in there in the first place, and were there non-null alternatives you rejected at the time?
TH:
In ALGOL W there was already a way of declaring a class containing ‘variant records’, where each variant could have a different set of attributes. So if you wanted a null reference for a class, you could include a variant that had no attributes at all. Every access to an attribute of a record (object) of a class with variant records has to be protected by a conditional test or case-clause, checking that the reference points to a record which includes the relevant field. Thus the syntax of the language (checked at compile time) would force every access to an attribute to be accompanied by a test of validity and an action for the null case. Problem solved! But only by passing it on to the user.

There are pragmatic arguments against this kind of solution. The tests would often be unnecessary, and this would have an adverse effect on efficiency. Furthermore, what if the tested reference variable is changed between the test and the access? Finally, a scan-mark garbage collector requires that every reference has an initial value, and null is the right choice for that; there is no easy alternative in the case that cyclic data structures are required.

I was not able to predict the enormous cumulative cost of null reference errors in the decades to come. My only other excuse was that if I had not invented the null reference, somebody else would have.

My friend Edsger Dijkstra had a very principled objection to null references, because of possible misinterpretation of the model of reality: For example, a bachelor is often encoded by setting the wife field to null. This could give the impression that all bachelors are polyandrously married to the same wife.
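Hoare’s variant-record alternative maps directly onto the discriminated unions of modern languages. Here is a minimal sketch in TypeScript (not ALGOL W; the names Person, MaritalStatus and wifeName are invented for illustration), in which the ‘null’ case is an explicit variant with no attributes, so the compiler forces the validity test he describes – and which also spares Dijkstra’s bachelors their shared null wife.

```typescript
type Person = { name: string };

// The variant-record idea: one variant carries the 'wife' attribute,
// the other carries no attributes at all.
type MaritalStatus =
  | { kind: "married"; wife: Person }
  | { kind: "bachelor" };

function wifeName(status: MaritalStatus): string {
  // The compiler rejects 'status.wife' until we discriminate on 'kind',
  // so every attribute access is guarded by a test of validity.
  if (status.kind === "married") {
    return status.wife.name;
  }
  return "(bachelor)"; // the null case must be handled explicitly
}

// Contrast with a nullable field, where nothing forces the check and a
// forgotten test becomes a run-time null reference error:
type NullableSpouse = { name: string; wife: Person | null };
```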

RM:
While we are on the subject of ALGOL W, why was it never adopted by industry?
TH:
Industry in general never adopted any of the languages designed in the ALGOL tradition until PL/I. One reason was (and still is) that industry needs to continue to use the languages in which its existing application programs are written. It is equally important to use a language for which compatible implementations will be fully supported into the indefinite future, and only IBM could inspire the necessary confidence.
RM:
The development of software is changing swiftly. What do you see as major developments in, say, the next three years?
TH:
For the reasons explained above, most of the practices of software engineers will remain unchanged for a long time. But I hope to see a gradual increase in the power of compilers and other programming tools to reduce the frequency and consequences of programming error. Increased use of standard design patterns may help in this.
RM:
Do you think Spec# will become part of C# eventually and therefore part of the everyday programmer’s toolkit?
TH:
Spec# is a ‘research vehicle’, used for trying out ideas that may eventually be included in C#. For this reason, Spec# must remain less constrained by legacy than C#; and, like any prototype, to serve its primary function it should remain several steps in advance of the existing status of C#.

It may be some time before the typical ‘everyday programmer’ uses C#. A much earlier and more relevant development for most programmers will be the inclusion of ‘contracts’ (generalised assertions) in a whole range of other more widely used languages, including Visual Basic.
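To make the idea concrete, here is a rough sketch of what ‘contracts’ (generalised assertions) look like when written as executable checks. The requires and ensures helpers are hypothetical stand-ins, not the actual Spec# notation or any shipped library API; a tool in the Spec# tradition would try to prove such conditions statically, whereas this sketch simply fails fast at run time.

```typescript
// Hypothetical contract helpers: run-time-checked pre- and postconditions.
function requires(cond: boolean, msg: string): void {
  if (!cond) throw new Error(`Precondition failed: ${msg}`);
}
function ensures(cond: boolean, msg: string): void {
  if (!cond) throw new Error(`Postcondition failed: ${msg}`);
}

// A contract-annotated routine in the Spec# spirit: integer square root.
function integerSqrt(n: number): number {
  requires(Number.isInteger(n) && n >= 0, "n must be a non-negative integer");
  const r = Math.floor(Math.sqrt(n));
  ensures(r * r <= n && n < (r + 1) * (r + 1), "r*r <= n < (r+1)*(r+1)");
  return r;
}
```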

RM:
Would you agree that with C#, it feels like we’re getting back to having grown-up design languages, rather than languages just emerging from short-term necessity? Why did it take so long for this to happen, do you think?
TH:
I have the same feeling, that C# and Java are languages designed to help programmers to design and write correct programs, and to know that they have done so. Of course, it is possible for a language and compiler to go a lot further in this direction; and this continues to be the topic of much useful and interesting research.

In general, I don’t think programmers are any more resistant to change than other professionals have been. Think of surgeons adopting antiseptic procedures in operating theatres. Or drivers adopting seatbelts. Or food manufacturers adopting sell-by dates, and full labelling of contents. The factor that will promote change in programming practice is the easy availability of more sophisticated design automation tools for programming.

RM:
When Microsoft began the Singularity research project it started with the question: ‘what would a software platform look like if it was designed from scratch with the primary goal of dependability?’ How is the company setting out to answer this question?
TH:
The ideal of ‘correctness by construction’ is one that has inspired long-term scientific research (including my own) for many years. In addition to internal projects like Singularity, Microsoft continues to promote and collaborate with continuing academic research towards this ideal. The benefits of success in the research will stretch far beyond Microsoft to all programmers and users of computers.

In parallel with this idealistic and long-term research, Microsoft has at least ten years’ experience of the development and practical use of tools to detect and remove vulnerabilities from existing legacy code. The scientific theories and methods that are relevant to this shorter-term goal are often the same as those relevant to ‘correctness by construction.’ That gives me grounds for hope of fruitful interaction between these two complementary approaches to correctness research, the one based on academic ideals and the other on immediate industrial need.

RM:
Is the recruitment by Microsoft of leading academic computer scientists a symptom of the inadequacy of universities as environments for cutting-edge research?
TH:
Not in my case! I did not make my move to Microsoft until I was approaching the University’s age for compulsory retirement, after which I had no choice.

In general, the attractions (and frustrations) of academic and industrial research are complementary. The teaching role of universities is highly attractive, providing an essential stimulus for the constant simplification of theory and for the comprehensive summarization of practice. Well-educated graduates also give an essential route to application for the results of advanced research.

Microsoft and other companies offer a more direct and more immediate route to the technological marketplace, which proves highly attractive to many researchers. And for research in software development itself, Microsoft provides an enormous fund of experience and experimental material, as well as a direct route to beneficial application. These are the factors that attracted me to Microsoft, and they continue to keep me fully engaged.

In my answer to your previous question on correctness by construction, I have already suggested complementary roles for researchers who are more attracted towards academic ideals, and those who are seeking shorter-term industrial application.

RM:
Most innovations in software have emerged from experience and experiment, without any appeal to an underlying theory. When John Backus invented Fortran, he had no theory to work by. What is stopping the same thing happening now?
TH:
In all branches of engineering science, the engineering starts before the science; indeed, without the early products of engineering, there would be nothing for the scientist to study! So cathedrals and bridges were built (and still remain standing!) long before the theories of civil engineering explained why they fall down; and similarly aeroplanes flew before aerodynamics, ships sailed before hydrodynamics, etc. But when the theory eventually became available, it was exploited to make all these products safer, more efficient, and much more cost-effective.

I would go even further, and suggest that computing software has surprisingly often been based on prior theory. Taking your own example, FORTRAN was firmly based on the long-familiar concept of the mathematical formula, and this was surely the secret of its success. Relational databases were based on the theory of relations. Spreadsheets were based on accountancy theory and practice. Perhaps a basis in sound theory is just what enables the quality and structure of software to survive the long process of evolution which follows its first delivery.

RM:
Do you still think that computing needs a ‘grand challenge’ to inspire it much in the same way that the lunar challenge in the 1960s sparked a decade of collaborative innovation and development in engineering and space technology?
TH:
Yes, we have much to learn from other branches of Science, which have made occasional ‘breakthrough’ progress as a result of externally or internally promoted Grand Challenges.

At the present time there is a risk that research in Computer Science is too fragmented to make any real impact on the current scale of the problems of software construction. In the early days of any new branch of science, research is conducted by individuals or by small teams. But as the subject matures, researchers should embrace opportunities to co-operate more widely, in larger teams, and on longer-term goals, like ‘correctness by construction.’ They also need to consider competing on a grander scale, with more objective criteria of success.

Worldwide collaboration is now the norm in physics and astronomy; and more recently in genetics and molecular biology. It was the international Grand Challenge of the Human Genome project that triggered the necessary shift in the research ethos of biologists towards the larger scale and the longer term accumulation of knowledge. After twenty years, its results are beginning to trickle through to medical practice. A similar breakthrough is now in prospect for the verification of software.

RM:
On this topic, do you think we’ll have to wait much longer before machine intelligence surpasses that of humans and takes off near-vertically into recursive self-improvement?
TH:
I doubt that this is in immediate prospect. In my experience, recursive processes too often end in vertical descent, rather than take-off. More seriously, I think we should never trust the intelligence of a computer unless some responsible human understands at some level what is going on. Computers cannot take responsibility for what they do (how would you punish them?).
RM:
Do you agree with Ray Kurzweil that one day we will develop the ability to reprogram our biology through nanotechnology using nanobots – blood-cell sized devices in our bloodstream that will keep us living for however long we wish to live? Or is that thinking just too outlandish?
TH:
I’m afraid it is too outlandish, at least for me to think about.
RM:
A little while ago I interviewed Bjarne Stroustrup and he said Kierkegaard was an influence on his conception of C with Classes and, later, the C++ language. Has any philosopher influenced your work?
TH:
I have always been a follower of William of Occam, whose razor has been an excellent guide in the design of programming languages and other software.

My earliest hero as a philosopher was Bertrand Russell. His logical atomism was based on the view that external reality is a mathematical construction, built on the basis of more primitive data, for example data given by the senses. This may not be a realistic model of the real world; but it certainly applies to what goes on inside a computer! Everything that programmers deal with is a logical construction from just two numbers, zero and one – you could hardly get more primitive data than that.

Other philosophers and logicians whom I have quoted in my published articles include Aristotle, Plato, Leibniz, Boole, Popper, and Quine. It is a pleasure to acknowledge their inspiration again.