Framework as an anti-pattern

March 14, 2009 by Lev

Framework is certainly a pattern, but it often becomes an anti-pattern. I’ve heard someone say that this anti-pattern is inevitable in large systems, but I believe that avoiding it is a matter of keeping two simple rules in mind all the time:

Make interfaces as unrestricted as possible.

Create utilities for working in a certain way rather than requiring the user to work in that way.

As an example, consider an application like Photoshop. How do you design filters, keeping in mind that an image doesn’t necessarily fit in memory? An obvious solution would be this:

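class IFilter
{
public:
  virtual ~IFilter() {}
  // Pure virtual: each concrete filter transforms one buffer of the image.
  virtual ImageBuffer Transform(const ImageBuffer& buf) = 0;
};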

(Actually, the output buffer is smaller than the input buffer, and you have to express that somehow, but that’s not the point.) So now you have either a class or a function that uses IFilter to convert a complete image, and all is well.

This works for some time, and then you want to create a geometric morphing filter, and the IFilter design doesn’t suit it. So you create IGeometricTransformation, which converts a Point2d to another Point2d. But then you need to create some new filter that doesn’t work in either of these two ways, and you start twisting the design to accommodate it. Gradually you get a mess.

The corrected design would be this:

class IFilter
{
public:
  virtual ~IFilter() {}
  virtual ImageDocument* Transform(const ImageDocument& doc) = 0;
};

class LocalizedFilter : public IFilter
{
public:
  // Implemented once here, in terms of the per-buffer hook below.
  virtual ImageDocument* Transform(const ImageDocument& doc);
private:
  virtual ImageBuffer Transform(const ImageBuffer& buf) = 0;
};

class MorphingFilter : public IFilter
{
public:
  // Implemented once here, in terms of the per-point hook below.
  virtual ImageDocument* Transform(const ImageDocument& doc);
private:
  virtual Point2d Transform(const Point2d& pt) = 0;
};

That is, specific partial filter implementations are given as utilities rather than rigid rules, and IFilter is the least restrictive possible interface for a filter.
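To make the "utility, not rule" idea concrete, here is a minimal sketch under simplifying assumptions: ImageBuffer is just one row of pixels, ImageDocument is a list of rows held in memory, the per-buffer hook is named TransformBuffer, and the result is returned as a std::unique_ptr rather than a raw pointer. InvertFilter and VerticalFlipFilter are hypothetical filters invented for the illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical stand-ins for the post's types: a buffer is one row of pixels.
typedef std::vector<int> ImageBuffer;
struct ImageDocument
{
  std::vector<ImageBuffer> rows;
};

// The least restrictive interface: a whole document in, a whole document out.
class IFilter
{
public:
  virtual ~IFilter() {}
  virtual std::unique_ptr<ImageDocument> Transform(const ImageDocument& doc) const = 0;
};

// Utility base class: walks the document buffer by buffer, so a derived
// filter only has to say what happens to a single buffer.
class LocalizedFilter : public IFilter
{
public:
  std::unique_ptr<ImageDocument> Transform(const ImageDocument& doc) const override
  {
    auto out = std::make_unique<ImageDocument>();
    out->rows.reserve(doc.rows.size());
    for (const ImageBuffer& row : doc.rows)
      out->rows.push_back(TransformBuffer(row));
    return out;
  }
private:
  virtual ImageBuffer TransformBuffer(const ImageBuffer& buf) const = 0;
};

// A filter that fits the utility: it inverts 8-bit pixel values.
class InvertFilter : public LocalizedFilter
{
private:
  ImageBuffer TransformBuffer(const ImageBuffer& buf) const override
  {
    ImageBuffer out(buf.size());
    std::transform(buf.begin(), buf.end(), out.begin(), [](int p) {
      assert(p >= 0 && p <= 255); // this sketch assumes 8-bit pixels
      return 255 - p;
    });
    return out;
  }
};

// A filter that fits neither utility: it simply implements IFilter directly.
class VerticalFlipFilter : public IFilter
{
public:
  std::unique_ptr<ImageDocument> Transform(const ImageDocument& doc) const override
  {
    auto out = std::make_unique<ImageDocument>(doc);
    std::reverse(out->rows.begin(), out->rows.end());
    return out;
  }
};
```

Code that applies filters sees only IFilter, so adding VerticalFlipFilter required no change to LocalizedFilter or to any caller. A real implementation would stream buffers rather than hold the whole document in memory; the sketch ignores that.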

In the next post I’ll take this simplification one step further.

Leave the campsite cleaner than you found it

March 12, 2009 by Lev

In Clean Code: A Handbook of Agile Software Craftsmanship I’ve read this very useful rule:

If we all checked-in our code a little cleaner than when we checked it out, the code simply could not rot. The cleanup doesn’t have to be something big. Change one variable name for the better, break up one function that’s a little too large, eliminate one small bit of duplication, clean up one composite if statement.

Or delete some unused or commented-out code.

Don’t modify your language privately

February 28, 2009 by Lev

When I was 16, I worked at a company that very soon went bankrupt (but that wasn’t my fault). We used PowerBuilder, an environment similar to Delphi, but with a language of its own.

In that language we were allowed to use minus in identifiers, which meant that we had to surround the subtraction operator with spaces (a-b would be considered a single identifier). At that time I wasn’t in the habit of surrounding operators with spaces. In the options I found a checkbox that disabled minuses in variable names, checked it, and continued coding happily the way I was used to. Until at the next weekly code merge (which deserves its own post) my code didn’t compile.

How NOT to apply the Single Responsibility Principle to functions

February 27, 2009 by Lev

As I said earlier, I was rather surprised when I heard Uncle Bob’s version of the Single Responsibility Principle. His way of splitting code into functions, explained right after that, also made me think “WTF?” The idea is to have only one set of curlies per function. Only one loop, only one “if”, and so on.

Not making a big ball of mud out of your class is surely important, but this approach is extremist programming. It seems rather strange to explain obvious things, but since Uncle Bob has 40 years of experience (and I have much less), I’ll do it.

As Scott said on the same show, it’s important how much one sees on the screen simultaneously. And this is a reason not to split code into too many small functions. Each function adds 3 lines (closing curly, empty line, opening curly), so if the functions are very short, you see considerably less on the same screen. But what’s worse, you don’t see your code in the order it executes, and that’s confusing. Code split into many small functions, whose relationships are not clear, is confusing. I’m not theorising here, this is my experience.

The same consideration, that seeing all of “something” on the screen together helps understanding, is also a reason to split long functions. But this splitting should follow some logic, such as the SRP, not the arbitrary rule “Don’t use curlies”.

What Uncle Bob is doing here is using functional programming instead of structured programming. In the previous post I discussed his use of object-oriented programming instead of functional programming. This seems like a pattern, though I don’t quite get its significance.

Single Responsibility is not Functional Decomposition

February 20, 2009 by Lev

There is only so much to be said on the principles vs. advice topic, and having said that, it’s time to go and learn from the advice. Except that, in some cases, the advice is bad. Specifically, when Uncle Bob was “demystifying” the Single Responsibility Principle at Hanselminutes, he (unintentionally) gave some examples of harmful coding practices. One of these leads back to functional programming, and the other to unreadable and therefore unmaintainable code. I will discuss the former here and the latter in the next post.

Bob Martin gives an example of an Employee class that has methods like Payroll(), WriteToDB(), GenerateReport(), etc. He suggests splitting it into EmployeePayrollCalculator, EmployeeReportGenerator, etc., because the method of calculating salary can change independently of persistence and report generation. The benefit of this separation somehow escapes me. Sure, Uncle Bob explains that otherwise any change causes a major recompilation in C++, but really it’s only a relink. The harm, on the other hand, is obvious: It’s called the Functional Decomposition Antipattern.

A class represents a concept. That is why a class should have an obvious name. Design is about enabling oneself to program in the same terms one thinks in. Classes correspond to terms that are nouns, functions to terms that are verbs. A noun can have several verbs associated with it, that don’t necessarily all correspond to the same responsibility. EmployeePayrollCalculator is a bad class because it is a very artificial “noun”; it doesn’t represent a concept.

In too many cases, responsibilities can’t be separated for various reasons. Sometimes joining responsibilities in a class is dictated by an interface that joins them. You will say that this means the interface has been badly designed. But an interface corresponds to the needs of the code that uses it. For example, the article on SRP on objectmentor.com (where Uncle Bob is the president) gives the example of a modem class that has dialing / hanging-up functionality as well as sending / receiving functionality, and suggests splitting those into separate interfaces:

However, there are two responsibilities being shown here. The first responsibility is connection management. The second is data communication. The dial and hangup functions manage the connection of the modem, while the send and recv functions communicate data.
Should these two responsibilities be separated? Almost certainly they should. The two sets of functions have almost nothing in common. They’ll certainly change for different reasons. Moreover, they will be called from completely different parts of the applications that use them. Those different parts will change for different reasons as well.

I can envision a function that dials up, sends data and hangs up. Why not? And if it does, a single interface would be better. Actually, the separation suggested in the article will usually be the case, but for a completely different reason: because send and recv will comprise a more generic stream interface. This is because of abstraction levels, not responsibilities.
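As a rough sketch of such a function, assume a single IModem interface whose method names follow the article's dial, hangup, send and recv. The signatures, the DialAndSend helper and the RecordingModem fake are all assumptions made for the illustration, not code from the article.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical single interface covering connection management and data transfer.
class IModem
{
public:
  virtual ~IModem() {}
  virtual bool Dial(const std::string& number) = 0;
  virtual void Hangup() = 0;
  virtual void Send(char c) = 0;
  virtual char Recv() = 0;
};

// One caller that needs both "responsibilities", through one parameter.
bool DialAndSend(IModem& modem, const std::string& number, const std::string& text)
{
  if (!modem.Dial(number))
    return false;
  for (char c : text)
    modem.Send(c);
  modem.Hangup();
  return true;
}

// A fake modem that records the calls it receives instead of touching hardware.
class RecordingModem : public IModem
{
public:
  std::vector<std::string> calls;

  RecordingModem() : connected_(false) {}

  bool Dial(const std::string& number) override
  {
    calls.push_back("dial " + number);
    connected_ = !number.empty(); // an empty number counts as a failed dial
    return connected_;
  }
  void Hangup() override
  {
    calls.push_back("hangup");
    connected_ = false;
  }
  void Send(char c) override
  {
    assert(connected_); // sending only makes sense on an open connection
    calls.push_back(std::string("send ") + c);
  }
  char Recv() override { return '\0'; } // nothing to receive in this sketch
private:
  bool connected_; // state shared by both groups of methods
};
```

With the interface split in two, DialAndSend would need either two parameters that happen to refer to the same object or a third interface that combines the others. Note also that connected_ is used by both groups of methods, which is the kind of shared state discussed below.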

The other reason not to separate classes in such cases is that, contrary to the quotation above, both kinds of functionality can have a lot in common, like maybe a private function for low-level hardware access. The article later concedes that the separation may have to be between the interfaces only, not the classes, for “reasons having to do with the details of the hardware or OS”. Similar reasons will exist in most cases.

Moreover, some data will be common, too. If we replace the modem by a socket, we won’t be able to split the class in two, because the socket handle (or port number) will be common.

Another example given in the article is a rectangle class, which consists of geometric and GUI functionality. According to Object Mentor, we should split it into GeometricRectangle, which has no GUI functionality, and GUIRectangle, which owns a GeometricRectangle. This is undoubtedly correct, but why do we need the SRP to understand it? I could have given any number of reasons for this separation before I had heard of the SRP: That geometry and GUI are very different levels of abstraction, that geometry usually resides in a separate library, that some rectangles are never drawn, etc.
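For reference, the split itself might look like the following sketch. Only the two class names come from the article; the members are assumptions, and Draw() returns rows of '#' characters as a stand-in for real drawing code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Pure geometry: no drawing code, usable by programs that never display anything.
class GeometricRectangle
{
public:
  GeometricRectangle(int width, int height) : width_(width), height_(height)
  {
    assert(width >= 0 && height >= 0);
  }
  int Width() const { return width_; }
  int Height() const { return height_; }
  int Area() const { return width_ * height_; }
private:
  int width_;
  int height_;
};

// Presentation layer: owns a GeometricRectangle and adds drawing on top of it.
class GUIRectangle
{
public:
  explicit GUIRectangle(const GeometricRectangle& shape) : shape_(shape) {}
  const GeometricRectangle& Shape() const { return shape_; }

  // Stand-in for real drawing: one text line of '#' per unit of height.
  std::string Draw() const
  {
    const std::string row(static_cast<std::size_t>(shape_.Width()), '#');
    std::string picture;
    for (int y = 0; y < shape_.Height(); ++y)
      picture += row + "\n";
    return picture;
  }
private:
  GeometricRectangle shape_;
};
```

GeometricRectangle can live in a geometry library with no GUI dependency, and a rectangle that is never drawn never needs a GUIRectangle at all.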

So is the SRP totally useless? No, but it should be applied differently. The similarity between the Employee class and Functional Decomposition is not accidental. It is because the Single Responsibility Principle is a good rule of thumb for splitting code into functions. Each function should have only one reason to change.

However, even when applying the SRP to functions, Uncle Bob takes it to the extreme. But that will be the subject of the next post.

On programming principles

February 13, 2009 by Lev

I’ve decided to add my 2¢ to the Joel and Jeff vs. Scott Hanselman and Uncle Bob argument on code quality, SOLID principles, TDD and principles in general.

First, here is where I think each side is wrong:

People that say things like this have just never written a heck of a lot of code.

This sentence, said by Joel, is a personal attack. No wonder Uncle Bob got all heated up. And then he allowed himself to fight back by calling Joel a “business wonk”. Apart from being offensive, this attack went partly amiss: while it is true that one can’t dedicate a lot of effort to running a business and improve one’s programming skills at the same time, running the business gives an important perspective: Sometimes business considerations are more important than writing clean code. For example, deadlines should be met.

Another mistake on Uncle Bob’s part is the use of the word “principle”. Many commenters on Jeff’s post have pointed out that the principles are not intended to be used all the time, but are more like tools. As tools they are useful, but if this was indeed Uncle Bob’s intent, then he shouldn’t use the word “principle”. “Principle” is defined, e.g., as “a basic truth or law or assumption”. So maybe they should have been called “design patterns”.

In every text that makes several statements, people tend to select the statement or idea they consider the most important. My digest of Joel’s post is this: Don’t overengineer. Don’t blindly use the SOLID or other principles just because they are there.

The real principle is this: Don’t let design get in the way of project requirements or common sense. I’ve learned it from experience after participating in building a set of libraries, which had several problems:

  • Interface complexity: The interface was very generic and made complicated things possible. Simple things were a special case of complicated ones, so doing something simple was no simpler than doing something complex.
  • An inner platform
  • Unneeded frameworks

… and more.

I consider this a very important experience and I’m glad I had it right at the start of my career.

I admit I don’t know what “agile programming” or “extreme programming” actually mean, but the project leader was an extreme programming geek, and what I understood is that “extreme” in XP has exactly the same meaning as in “extreme left” and “extreme right”. So maybe the correct term is Extremist Programming.

A good general rule is that most “principles” of programming are useful in some cases, but useless in others. So while knowing them is beneficial, one should beware of dogmatists who will break functionality to fix compiler warnings. (No, that was not a hypothetical example. And if you say “with unit tests, that wouldn’t have happened”, you have a point, but that’s no excuse for breaking functionality.)

In general, as one of the epigraphs in Stroustrup’s book says, “keep things as simple as possible, but not simpler”. For example, if your whole project is 20 lines long, you needn’t even bother to split it into functions. Of course, you should do so as soon as it becomes longer. After all, refactoring is agile, isn’t it? When the code grows to 100 lines long, it will contain several functions, maybe 2 or 3 classes, but most probably no design patterns yet. And so on.

And a final word. While experience certainly does matter, don’t forget that we are not living in a world where old people are the most revered because of their experience. Instead, this is a world of start-ups, where everyone is judged according to their success. And anyone who doesn’t question authority should switch to literate programming.

Statements about genetics I just don’t understand, part 2

November 29, 2008 by Lev

(Part 1)

We can work against our genes (e.g., by using contraception)

It’s like saying “We can violate the law of gravity, e.g., when we put an apple on a table, it doesn’t fall down.”

The weird part is that Dawkins “himself” states this, after explaining that child-bearing is not always evolutionarily optimal. This statement is part of the “free will” problem: on one hand, we think one should act according to some rules called “ethics”, on the other hand, the world is more or less deterministic, and everyone always acts according to the laws of physics, so “should” is meaningless.

“It hasn’t evolved yet” is an invalid argument

This one is really quite reasonable: Sometimes, when a researcher is confronted with a phenomenon that he can’t explain in evolutionary terms, he says “the evolutionarily better behavior/form/whatever hasn’t evolved yet.” Clearly, this is not an explanation, as it can explain anything.

On the other hand, in some cases the conditions have changed only recently, and a species hasn’t adjusted itself yet. I think in this case the author carries the burden of proof that the change is recent.

Statements about genetics I just don’t understand, part 1

November 28, 2008 by Lev

I just can’t understand how anybody can believe in one of the following statements (unless they believe in genetics rather than accept it as a scientific theory).

Natural selection doesn’t apply to humans

This is some sort of snobbish belief in the modern human’s being the last of creation. There is no reason to think we are special in this regard. It is true that (in developed countries) natural selection doesn’t push us toward physical strength and against diseases, as it used to, but still people with certain traits have a higher chance to survive than others, and some people are more sexually attractive than others.

The interesting thing, though, is that the qualities favored by the evolution of modern man change much faster than evolution itself works. For example, immunity to pneumonia stopped being important only about three generations ago. So evolution keeps changing directions all the time. However, this started only quite recently, so it’s hard to say whether and when this will end.

Variant: Natural selection has been replaced by meme selection

Memes are important, no objection to that, and evolve much faster than genes. But that doesn’t mean genes have lost their significance. Memes are on a higher level than genes, just as biology is on a higher level than physics. But nobody says that, since animals are governed by biology and genetics, they don’t obey the laws of physics!

There is no kin selection (rejecting the gene-centric view)

For example, why do worker bees attend to their sisters rather than having children of their own? According to the idea of kin selection, because they have common genes with their sisters. Zahavi, however, rejects kin selection. Although he agrees that animals “should” rear their young, he doesn’t extend this to siblings. I suppose this opinion must be based on a very specific formulation of the idea of evolution and natural selection, that doesn’t contain the word “gene”.

(Part 2)

Serializing non-default-constructible objects with boost

November 18, 2008 by Lev

Boost serialization is not trivial for an object that has no default constructor. Consider this:

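#include <fstream>
#include <boost/archive/binary_iarchive.hpp>

class Foo
{
public:
	int v1;
	int v2;

	// No default constructor. v2 is set to 0 so that it never holds garbage.
	Foo(int v1_) : v1(v1_), v2(0) {}

	template <typename Archive>
	void serialize(Archive& ar, const unsigned int /*version*/)
	{
		ar & v1;
		ar & v2;
	}
};

//...
// The archive reads from a stream; "foo.bin" is a placeholder file name.
std::ifstream ifs("foo.bin", std::ios::binary);
boost::archive::binary_iarchive ar(ifs);
Foo foo(0); // dummy value, overwritten by the load below
ar >> foo;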

We only suffer a minor inelegance because of redundantly initializing foo. But what if we need to deserialize a pointer?
Hint: it won’t compile

Maintaining version numbers

November 14, 2008 by Lev

In our company, we have the following policy about version numbers:

  • Each release has an ID in the form major.minor.release.revision
  • major, minor and release are numbers, arbitrarily invented for marketing purposes. They are product-specific.
  • revision is the Subversion revision number the release is built from, returned by SubWCRev.
  • Each executable or DLL in the same release of a product has the same 4 numbers in it, visible in its Properties dialog in Explorer in the “Version” tab. This tab takes the numbers out of a resource.

I set out to fulfill the requirements as non-redundantly as possible.