Why sequential is good.


Why class oriented programming is bad.

Start by examining these 2 graphs:


Which graph is most complex?

A is obviously the most complex, since it has longer and more randomly pointing purple pointers.

However, A and B have exactly the same structure. The only difference between them is the placement of the black circles. They have the same random pointer structure, but in B the circles are placed to make the purple pointers as sequential as possible.

And the point is...?

Structures can be made simpler by making them sequential, as in B.

In random bigraphs, as in the figures above, the complexity is approximately halved when put in a sequence.

This general principle works on most elements of programming languages, such as the order of execution, the definition of class hierarchies, call structures, etc. It also works on data structures and accesses, such as lists, trees, garbage collection, object structures, etc. All these have in common that their structures have multiple references, such as jumps, if-else constructs, general program flow constructs, gotos, function calls, pointers, object references, classes with multiple references and inheritances.

For all these things have the property of the figures:
They can be made complex or simple, depending on how they are laid out, because they all have referencing and branching.

Why does complexity matter?

Because it makes things more difficult. It increases the probability of errors since there are more places for them to be. It also decreases the chance of understanding, since it may become too difficult to understand. Complexity also wastes time, since it takes shorter time to understand something simpler.

This is reflected in principles like Ockhams razor, the K.I.S.S. principle: "Keep It Simple, Stupid!", in Einsteins principle:"It should be as simple as possible, but not simpler", etc.

Simplicity is better for people, but it is also better for theories, math, computer programs, and similar.

Sequentialness is better for programs?

Yes. Both programs and data structures get faster and smaller when they are sequential and simpler. The DDRAM memory of today's computers are much faster (20-400 times) at sequential accesses than random, and the caches of the processors also work better for sequential accesses, especially prefetching. And then there are stuff like jump tables, which guess at how the program flow jumps, which also gets faster and better if things are laid out more sequentially.

But you only gave one particular example of this, not a proof.

The pictures represent very general structures, namely bigraphs, which are the same as is used in LISP, Scheme, and other languages. They can represent any structure, and the example in the pictures is therefore a general example.

The graphs could justs as well have represented program flow, or class hierarchies, or call back structures, or gotos, or state machines, etc. All these structures have in common that there are many ways of representing them, and the simplest ways are quite sequential.

As for LISP, the green pointers can be the CAR, while the purple can be the CDR.

Sequential is half as complex, you say?

Approximately. It depends somewhat on the particulars of the structure, and of how random it is. So the worst case is typically a little bit above half, as in the example, which is random. Typical real life improvements can be much better than a halving.

What about the "class oriented" stuff in the title?

I am thinking here of one particular kind of this unnecessary increase of non-sequential references.

I am referring to a popular programming paradigm which is often confused with object oriented programming: Instead of making programs into a limited set of useful well thought out objects, functions, and structures, the programmers make big nets of inter dependent classes with lots of unnecessary functions.

Splitting up sequential programs into separate methods which are called only once, is an often used method for making the code less sequential.

As you can understand now, this is wrong, because it is unnecessarily complex, which introduces errors, incomprehensibility, obfuscation, and wastes time.

A concept which is somewhat related to this is coupling, which can be low or strong, but this refers more to the number of references than to how random or sequential they are. The examples above have 2 references for each node since this is general, but more references is of course more complexity per node.

(c) Kim yhus 2007-06-03