Brian Kernighan: Geek of the Week

When anyone mentions 'Kernighan and Ritchie', we all know what they are referring to: that brief book that introduced the C language to programmers, and set a high standard for all subsequent books on computer languages. Now, over thirty years later, it is still in print, has been translated into over 20 languages, and remains required reading for undergraduates. We sent Richard Morris to interview Professor Brian Kernighan.

Brian Kernighan’s name became widely known through his co-authorship, with Dennis Ritchie, of the first book on the C programming language. He maintains that C still has the best balance he’s ever seen between power and expressiveness in a computer language.

‘You can do almost anything you want to do by programming fairly straightforwardly and you will have a very good mental model of what’s going to happen on the machine; you can predict reasonably well how quickly it’s going to run, you understand what’s going on and it gives you complete freedom to do whatever you want. C doesn’t put constraints in your way, it doesn’t force you into using a particular programming style; on the other hand, it doesn’t provide lots and lots of facilities, it doesn’t have an enormous library, but in terms of getting something done with not too much effort, I haven’t seen anything to this day that I like better. There are other languages that are nice for certain kinds of applications, but if I were stuck on a desert island with only one compiler I’d want a C compiler.’

He began working with computers in the mid to late 1960s, and it was entirely by accident. He saw his first computer, an old IBM 650, in 1963 but didn’t do any serious programming until the following year.

‘When I went to graduate school there was a Computer Science program in the Electrical Engineering Department at Princeton. This was fairly typical of a lot of places: computer science was not a separate academic field; it was just part of some department that might have a computer or people interested in computation, so I just backed into it, entirely by accident. This has been a lucky accident, because obviously the field has had a lot of interesting things happen.’

Kernighan is also the coauthor of the AWK and AMPL programming languages (the ‘K’ in AWK stands for ‘Kernighan’), and in collaboration with Shen Lin he devised well-known heuristics for two NP-complete optimization problems: graph partitioning and the travelling salesman problem. He coined the term Unix in the 1970s, as well as the expression ‘What You See Is All You Get’ (WYSIAYG), a sarcastic variant of the original ‘What You See Is What You Get’ (WYSIWYG). Kernighan’s term is used to indicate that WYSIWYG systems might throw away information in a document that could be useful in other contexts.

His ‘Software Tools’ series spread the essence of ‘C/Unix thinking’ with makeovers for BASIC, FORTRAN, and PASCAL; most notably, his Ratfor (rational FORTRAN) was put in the public domain.

He now works as a professor in the Computer Science Department of Princeton University. Princeton, chartered in 1746, is the fourth-oldest college in the United States.

RM:
What were the fundamental flaws in other languages that, you believe, drove the development of C?
BK:
You really ought to ask Dennis Ritchie about this, since C is entirely his work.  My sense is that C was not really a reaction to the flaws of other languages but more just a natural extension of B, mainly by adding types.  Other languages around at the time surely provided inspiration, whether positive or negative – for example, PL/1, used in Multics, showed the value of a high-level language, though PL/1 itself was large and complicated.  BCPL was quite simple and spare by contrast, and B was even more so; both managed to be expressive with minimal syntax.  BLISS was also in use at that time, again showing the benefits of high-level languages for system programming, though BLISS, like BCPL and B, was word-oriented and didn’t really match machines with different sizes of data types.
RM:
Do you remember how soon after your book on C you saw the language begin to take over the industry?
BK:
I have no memory of this, though it must have been quite a few years before C ‘took over’ anything.  The book was first published in 1978.  C was used in various workstation and minicomputer systems, and it got a second wave of acceptance when the IBM PC and clones arrived in the early 1980s.  But it was a long time before it became significant.
RM:
Do you find that writing and programming are similar intellectual exercises?
BK:
In one sense, yes: they both involve writing and rewriting to find a better way to express something.  When I write prose, I rework a lot; the same is often true of my code, though perhaps less so because once something ‘works’, there’s less reason to massage it further.  On the other hand, I don’t usually have to debug my prose, but my programs always need to be fixed, so in that respect, programming is different.
RM:
What gets you started writing? Is it an idea, an image, a situation or event, a phrase, something else? Do you have a particular reader in mind when you sit down and begin to write?
BK:
It depends.  I write a monthly opinion column for the local student newspaper, and that is very often inspired by some random observation about campus life for which I can see a natural title, and then I just fill in the details until it’s long enough.  A big project like a book takes a long time to figure out, puzzling over organization and content and examples, before writing much text.  But the target reader is usually clear; it’s likely to be a working programmer since the books tend to be descriptions of some kind of programming language or environment.
RM:
AWK and AMPL are two languages you have been involved in developing. How do you design your code and how do you structure it?
BK:
I would distinguish between designing the language and designing the code.  For the former, there’s no substitute for trying to write things in the proposed language, both personally and by dragooning others, writing as many things as possible, until the language starts to feel comfortable.  Of course this is easier if there’s a real implementation, which is why compiler-compilers such as Yacc are so valuable – one can make syntax changes very easily.  As to design of code itself, I don’t do nearly as much design as I probably should.  When I don’t know where I’m going, it’s usually easier to write something that gets me started, and then revise it repeatedly as I better understand the problem and the solution.  Unfortunately this often leads to pretty awkward code, which, as I suggested above, may not get as much cleanup as it needs.  On the other hand, it’s realistic for doing a totally new language.
RM:
What languages influenced the design of AWK and AMPL?
BK:
AWK was inspired by a combination of languages, including the shell, RPG for its data manipulation features, and of course C for the surface syntax.  In many ways, AMPL looks like a transcription of standard mathematical notation for sets and iterators into a more or less Algol-like language.  It was in part a strong reaction against matrix generators, and was to some extent influenced by GAMS, though more in a negative way than positive.  GAMS shows its Fortran heritage very strongly, and we wanted to get well away from that.
RM:
Do you think programming, and therefore the kind of people who can succeed as programmers, has changed? Can you be a great programmer working at a certain level without ever learning assembly or C?
BK:
Much programming today seems to involve gluing together existing components, using fairly high level languages with weak type systems; a big part of the programming task is finding the right method in the right enormous library, rather than creating the detailed logic of what some piece of code must do.  I personally find this library-search coding not as much fun as trying to write more or less standalone code, without the huge libraries.  But the new world brings many advantages – a programmer isn’t always worrying about memory management or indeed much about resources at all, and the giant libraries mean that we can avoid writing a lot of tedious code – it’s been done for us, so a few lines, mostly library calls, can accomplish a great deal.

As to whether one needs to know C and assembly language, it’s hard to say.  It’s quite possible to write a lot of good code without knowing either, but at the same time, it’s valuable to understand what’s really going on underneath.  For that, assembly language (a surrogate for how computers actually compute) and C (a high-level surrogate for assembly language) are helpful, and I think that serious programmers ought to understand them.  I’m sure that great programmers do.

RM:
Let’s talk about the nitty-gritty of one language – C++. It was designed to cater to everybody’s perceived needs and its implementations have become rather complex and bulky. Do you think it will remain in use because the more complex an object, the larger the investment in learning to use it, and the greater the resistance to abandon it?
BK:
C++ is a large and complicated language; my guess is that all but the most expert C++ programmers understand and use perhaps 20 percent of the language, and my 20 percent might well be different from yours.  But there’s no fat in it – everything that’s there is there for a good reason.  It does not cater to everybody’s needs; if I want to manipulate strings with regular expressions, I’m going to use Awk or Python, not C++.  But if I need to write large-scale infrastructure code that has to run fast, I’m going to use C++ because I need the combination of efficiency and ability to control access to information.  C doesn’t scale to really big programs and no other language is going to be as efficient, so C++ is likely to remain in use for a long time.  Indeed, few “major” languages have died; there’s a large investment in working code that can’t just be rewritten in something else.
RM:
That leads me to another topic, maintaining software. How do you tackle understanding a piece of code that you didn’t write? Where do you start? Page one and read linearly?
BK:
I don’t have a single strategy.  As you suggest, starting at the beginning and just reading is not a bad way to get rolling, though I’m more likely to start with main or the equivalent.  If I have a good idea of what the program is supposed to be doing, figuring out how it does core operations is useful.  If there’s a specific task to be performed, grepping around for function names that might be relevant is a good start, as is searching for text found in program output.  All of this works reasonably well for a program that isn’t very big – say a few thousand lines – but it fails for big programs.  I don’t know how I would try to understand something really big like Linux or Firefox; realistically, I would probably just say “Forget it.”
RM:
Is there a key skill that programmers should have, curiosity for instance?
BK:
It’s obviously a big help to be able to think clearly about breaking tasks up into steps, and certainly one has to be detail-oriented and meticulous.  But I think the main skill is an ability to move very quickly from a big picture to a very narrow focus and back again – there seems to be no other way to keep track of what’s going on in a big program while still making sure that the individual pieces work correctly.
RM:
You’ve been around the development of some of the formative influences on the Internet such as UNIX.  What do you see as the driving influences of contemporary computing and the way the world connects?
BK:
As the price of hardware has come down, computers have become pervasive to a degree that few would have predicted – there’s hardly a piece of electronic equipment now that doesn’t have some kind of computer control.  That’s below the surface and not often even known about, let alone studied carefully.  At the same time, there’s a great deal of highly visible computing, ranging from web services like Google and Amazon and Facebook through to gadgetry like the iPhone.  There is so much of this and it’s so intertwined with our day to day lives that it’s hard to remember that most of this is barely a decade old.  I expect that there will be even more computing in our lives, both visibly and invisibly, and we’ll be even more reliant on it, even as it becomes less and less visible.
RM:
The technology industry has developed out of all recognition since you began your career. What do you think are the good things about this sea-change, and what are the bad?
BK:
Technology is a two-edged sword, bringing good and bad.  The good outweighs the bad, especially over the long haul, as it makes life better for more people.  For instance, the world is much more connected than it was even a few years ago, which enables people to maintain relationships over great distances.  But there are definitely big downsides as well; for example, the same technology makes it possible for the bad guys to maintain their relationships as well.  Giant databases make it possible to easily access information in a way that was inconceivable even a few years back; we could not live without Google.  But that same technology makes it easy to create giant databases about our personal lives, which is definitely worrisome.  And so it goes, for pretty much any aspect of technology.
RM:
Are there things in your technological life that you would have done differently? And what are you most proud of?
BK:
I think the biggest technical error I made was to work on Software Tools in Pascal.  Pascal had merits and at the time was a popular teaching language, but it’s disappeared entirely.  If Bill Plauger and I had written Software Tools in C, it would have been a lot more useful and might well still be in print.  We just plain misjudged where the field would go.  On the flip side, I’ve been pretty happy with several specialized languages that I did myself, like Ratfor and Pic, and some where I had great collaborators, like EQN (Lorinda Cherry), AWK (Al Aho and Peter Weinberger), and AMPL (Bob Fourer and Dave Gay).  Little languages have always been fun to work on, and if one finds a good combination of syntax and semantics, the language can be very productive.  And I’m happy with my books; I’ve been very lucky indeed to have such wonderful co-authors.