On Spaghetti

Most programming languages have flow control features of some kind. Yeah, I know there are some languages that lack them, for example early programmable shader languages, some macro languages, and I think some programmable calculators just run a program straight through from beginning to end. But by and large, programming languages provide ways to jump around within the code and write decision-making logic.

Fairly early on, people realised that the only things you really need for flow control are a way to make a comparison, and a way to conditionally jump to another point in the program based on the result of a comparison. On top of these primitives, you can build flow structures that are as complex as you like. If you look at the native machine code that computers run, you can see that this has really been taken to heart: most CPUs provide a way to store the result of a comparison and one or more conditional jump instructions. Early programming languages like BASIC and Fortran had flow control based entirely on these primitives, too. If you learned to program on an 8-bit personal computer, you’ll no doubt remember writing statements like “IF condition THEN GOTO line” all the time.
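
As a rough sketch of how little is needed, here is a counting loop written with nothing but a comparison and jumps. The function name `sum_to` is made up for illustration, and the unconditional `goto` at the bottom could just as well be a conditional jump whose test is always true.

```cpp
// Sums the integers 1..n using only a comparison and jumps.
// sum_to is a hypothetical name, not something from a real library.
int sum_to(int n) {
    int total = 0;
    int i = 1;
loop_top:
    if (i > n) goto loop_done;   // compare, then conditionally jump out
    total += i;
    ++i;
    goto loop_top;               // jump back to the test
loop_done:
    return total;
}
```

This is the same shape as the old “IF condition THEN GOTO line” idiom, just with labels instead of line numbers.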

But in 1968, this form of flow control was about to get a major setback (at least in high-level languages), because Edsger Dijkstra had written what was to become a highly influential letter entitled “A Case Against the Goto Statement”. You probably don’t know it by this name, though, because it was published in CACM under the title “Go To Statement Considered Harmful” (Niklaus Wirth, a CACM editor at the time, changed the title for publication). This letter criticised the goto statement and the form of flow control associated with it, instead advocating structured programming.

Most modern high-level programming languages are designed with structured programming in mind: simple statements can be grouped into compound statements (with begin and end delimiters in Pascal, or with curly braces in C or Java), and flow control is based around these compound statements. For example, an if statement can be used to conditionally skip over a compound statement. Essentially, compound statements are the basic building block from which you build your structured code. Poor goto has survived to varying degrees: it’s present but rarely used in C and Pascal, and is a reserved word with no function in Java, for example.

The crusade against the goto statement has continued unabated. Most programming lecturers will advise against its use, or neglect to mention its existence. Programs that make use of it are referred to as “spaghetti code” because flow control can conceivably jump from any given point to any other given point. This can make code difficult to understand, debug or modify. However, in spite of this, I think Dijkstra’s message is being largely ignored.

You see, the beauty of structured code, and part of what makes it easy to understand (and consequently easy to debug and modify), is that code blocks only have a single entry point and a single exit point – program flow enters the block at the beginning, continues through it linearly, and leaves it at the end. All flow control statements operate on entire blocks – an if statement skips an entire block if the condition is not met. The goto statement obviously violates this principle, as program flow can be made to jump to an arbitrary point. And that’s why Dijkstra criticised it: because it causes program flow to deviate from the program’s structure. However, despite the ongoing crusade against goto, several other flow control structures that effectively do the same thing are being encouraged. These include loop control statements (continue and break), C-style return statements and exceptions. Let’s have a look at each of them.

First up, let’s think about loops. A loop will have some kind of condition that must be maintained in order for it to run, and a block of code that runs while the condition is maintained. Now this block, like any other block of code in a structured program, will have a natural entry point at the top and an exit point at the bottom. The loop itself has an exit point after the loop condition is evaluated. But when you add a continue statement, you’re adding another exit point to the loop body. You can no longer say that the loop body will be entered at the top and left at the bottom. In fact, continue may as well just be shorthand for doing a goto that jumps to the end of the loop body. A break statement is slightly worse: it’s like a goto that jumps to a point just outside the loop – it’s not only adding an additional exit point to the loop body, but adding an exit point to the loop itself!
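
To make the equivalence concrete, here is a minimal sketch with two hypothetical functions that should behave identically: one uses break and continue, the other spells out the jumps with goto. The function and label names are invented for the example.

```cpp
#include <vector>

// Sums the non-negative values that appear before the first zero.
int sum_with_loop_control(const std::vector<int>& values) {
    int total = 0;
    for (int v : values) {
        if (v == 0) break;      // leave the loop entirely
        if (v < 0) continue;    // skip the rest of this iteration
        total += v;
    }
    return total;
}

// The same logic with the implied destinations written out.
int sum_with_goto(const std::vector<int>& values) {
    int total = 0;
    for (int v : values) {
        if (v == 0) goto after_loop;    // what break does
        if (v < 0) goto end_of_body;    // what continue does
        total += v;
    end_of_body:
        ;                               // empty statement for the label to attach to
    }
after_loop:
    return total;
}
```

In both versions the loop body has an extra exit in the middle; the first one just doesn’t name the place it jumps to.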

C-style return statements are similar: they add exit points to functions (effectively a goto that jumps to the end of the function body). I do realise that there isn’t really much you can do about them, though – there isn’t any other way to return a value from a function. The best you can really do is to only ever place one return statement in a function, and to place it at the end of the body. (Pascal lets you return a value by assigning to the name of the function, so there’s no excuse there.)
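
As an illustration, here are two hypothetical versions of a linear search, one that returns from the middle of the loop and one that keeps a single return at the end of the body. The names are invented, and this is only a sketch of the two styles, not a recommendation of either.

```cpp
#include <cstddef>
#include <vector>

// Early return: the function can be left from inside the loop.
int index_of_early(const std::vector<int>& values, int target) {
    for (std::size_t i = 0; i < values.size(); ++i) {
        if (values[i] == target) return static_cast<int>(i);
    }
    return -1;
}

// Single exit: the result is carried in a variable to one return at the end.
int index_of_single_exit(const std::vector<int>& values, int target) {
    int result = -1;   // -1 means "not found"
    for (std::size_t i = 0; result == -1 && i < values.size(); ++i) {
        if (values[i] == target) result = static_cast<int>(i);
    }
    return result;
}
```

The single-exit version pays for its tidy structure with an extra variable and a more complicated loop condition.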

Now neither of these is really any better or worse than goto statements. They’re just like shorthand goto statements where the destination is implied. If you use them, fair enough – just be aware of the consequences, and think twice before you criticise goto again, because your code is starting to look like spaghetti, too.

But exceptions are the worst of all. Exceptions are like a goto where you don’t know the destination! Think about it: throwing an exception could jump to somewhere in the same function, or somewhere up the call stack. You just don’t know where it will land. (The only thing they can’t do that a goto can is to jump backwards within a code block.) Use of exceptions means you don’t know whether a function call will return, or jump somewhere else. Worst of all, in C++ simply unwinding an object could cause an exception to be thrown, which will most likely lead to a memory leak.
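
Here is a small sketch of the “unknown destination” problem. All of the function names are hypothetical. The same throw inside `parse_digit` lands in a different place depending on who happened to call it, and nothing in `parse_digit` itself tells you where.

```cpp
#include <stdexcept>
#include <string>

// Throws on bad input; where the exception lands depends on the caller.
int parse_digit(char c) {
    if (c < '0' || c > '9') throw std::invalid_argument("not a digit");
    return c - '0';
}

// This caller handles the exception itself.
std::string caller_that_catches(char c) {
    try {
        parse_digit(c);
        return "parsed";
    } catch (const std::invalid_argument&) {
        return "caught in caller_that_catches";
    }
}

// This caller has no handler, so the exception passes straight through it.
int caller_that_does_not(char c) {
    return parse_digit(c) * 2;
}

// The exception from parse_digit ends up here, two frames up the stack.
std::string outer(char c) {
    try {
        caller_that_does_not(c);
        return "parsed";
    } catch (const std::invalid_argument&) {
        return "caught in outer";
    }
}
```

Reading `caller_that_does_not` on its own, there is no hint that the call might never return to it.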

To deal with exceptions properly, you need to write ugly code. Languages like Java help you out a bit with finally blocks. But you still have to remember to wrap anything that needs cleaning up in a try block and to place the clean-up code in the corresponding finally block – the language can’t make you code properly. So now your code is littered with try/finally constructs.

Of course, the C++ way has to be the ugliest: RAII. If you need to clean something up, you need to make a small class with the thing that needs to be cleaned up in the constructor, and the clean-up code in the destructor, and remember to be very careful to ensure you won’t throw an exception from the destructor, because then you’re really screwed. Now create an instance of this class and make sure it’s unwound at the point where you need the clean-up to occur. Now your code is littered with these small classes that are only really there in case an exception is thrown. You also can’t see the program flow properly, because it will be jumping to all these little destructors. And you want to hope you don’t need to try stepping through it in a debugger, because that’s an absolute nightmare.
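
For anyone who hasn’t seen one, this is roughly what such a small class looks like. It’s a minimal sketch: `HandleGuard` and the `open_handles` counter are made up, with the counter standing in for a real resource such as a file or a lock.

```cpp
#include <stdexcept>

// Hypothetical stand-in for a real resource: counts how many are held.
int open_handles = 0;

// Acquires in the constructor, releases in the destructor.
class HandleGuard {
public:
    HandleGuard() { ++open_handles; }     // acquire
    ~HandleGuard() { --open_handles; }    // release; must not throw
    HandleGuard(const HandleGuard&) = delete;
    HandleGuard& operator=(const HandleGuard&) = delete;
};

void work_that_fails() {
    HandleGuard guard;
    throw std::runtime_error("something went wrong");
    // No explicit clean-up here: ~HandleGuard runs as the stack unwinds.
}

bool released_after_failure() {
    try {
        work_that_fails();
    } catch (const std::runtime_error&) {
        // By the time this handler runs, the guard has been destroyed.
    }
    return open_handles == 0;
}
```

Notice that the release happens in a destructor call that appears nowhere in `work_that_fails`, which is exactly the invisible jump being complained about.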

There’s one place that exceptions are even more evil, if that’s possible: Objective-C++. There is no proper way to deal with exceptions in Objective-C++ because C++ frames will not be properly unwound when an Objective-C exception is thrown. (Objective-C exceptions are generally only thrown in truly exceptional circumstances, so it isn’t such a big problem in practice, but that’s beside the point.)

So why don’t we bring back goto? We’re doing all the things that make it harmful – we’re just kidding ourselves by refusing to call it what it is.

This entry was posted on Sunday, 27 July, 2008 at 11:42 pm and is filed under C, Development, Technology.

7 responses to “On Spaghetti”

Torsten Curdt says:

Funny – I have had the same thought lately when dealing with some of the uif2iso4mac code. What are really the options if you exit on error conditions? Nesting to death or exceptions. Releasing resources really becomes a pain without “goto”s. I guess again it really just is a question of how you use them. Source bloat just because you try to avoid them I would consider harmful as well. Ideally you could break the code into smaller pieces and that will help. But sometimes that means passing around context information and delegating resource releasing responsibilities. So … let’s just use “goto” properly and we should be good. I would still avoid it – if possible. But I don’t think it’s a big ugly no-no anymore.

Rob says:

I would have to agree completely with Torsten. Exceptions are great when used properly. For example, suppose you go over an array index, due to a typo. What should happen?
C++ answers this by saying “undefined behaviour”. You might get the thing called “segmentation fault” which is the same as saying “goto end of program”, or you might get other data overwritten which leads to nasty bugs.
Java answers it by throwing an exception. Therefore you know exactly where and why your program is crashing and can fix your typo.

Sure, you’re right in saying that exceptions introduce a “goto” where you don’t know the destination, but how else do you propose we compensate for the fact that we’re human and don’t write perfect code every time?

vastheman says:

IBM has it right with System Z: going over an array boundary should cause your program to die – that teaches you to code properly. An exception is better than undefined behaviour in this case. I don’t really have a problem with using exceptions to signal truly exceptional conditions, I just don’t think they’re really the best way to deal with normal error conditions.

Rob says:

I’ll agree with you on that one.

Personally I prefer the C way of doing things, where you return error codes depending on what happens. I’ll disagree with your original article where you say that the only exit point should be at the end of the function. As Torsten said, you either use return statements in the middle of the function, or have to nest to death. What’s worse? I’d rather have a nice simple return statement instead of a mess of indentations and curly brackets.

vastheman says:

That’s not really the point of the article. It was more a criticism of the hypocrisy that Dijkstra’s letter has bred: people wage war on the goto statement itself while encouraging the use of other constructs that are effectively the same thing.

And just to clear things up, when I code C++ (which is mostly at work – I do trading systems on Solaris) my usual way is to return status information, and throw exceptions under truly abnormal circumstances. I do use return statements throughout functions, and I do think loop control statements are appropriate at times (like using continue to skip an iteration of a loop when you encounter a value that should be ignored, for example).

Rob says:

Ah, I understand you now. Yes, I agree with you that it is a bit hypocritical, although I think a lot of the constructs are much more limited versions of goto and so therefore not quite so bad.

Carl says:

I never liked exceptions… probably because I’m lazy and exceptions mean I have to write catches for them all… isn’t writing code that throws an exception an example of avoiding having to write some code that deals with a failure condition… I guess if you want to do all your error handling in a particular layer, and have all the catches there to do logging or something… but otherwise, just a different kind of pasta dish… fettucini perhaps…

I can’t remember the last time I wrote code that went out of bounds of an array… seriously…
