Thursday, January 31, 2008

Evolving the Object Oriented Paradigm Part II: Concepts

In the last article, I talked about the problems of properly mapping conceptual types (in that case, Shape, Rectangle, Rhombus, and Square) to actual types in a typical object-oriented language. I wrote about the rigidity of inheritance in such a language and the difficulty of evolving your code or using third-party code.
In this article, I'll propose two simple changes to the existing OOP model that fix all the problems I addressed. In future articles, I'll be proposing more new features but for now I'll be focusing on just two.

The "concept"

In the last article, I used the words "concept" and "conceptually" a lot. It is my opinion that the code should be as close as possible to the conceptual idea/model (which is a tenet of something called "concept programming", which takes this idea to a whole new level that I will not).

For this first proposal, I have thrown out classes and interfaces as the base units of the object-oriented paradigm. Now you're asking, "Thrown out? How can you have OOP without classes and interfaces?" Calm down. They're not gone. But they're no longer separate units. Instead, classes and interfaces have been merged into a new unit called the "concept" (this may be a temporary name, and it is not to be confused with concepts from concept-oriented programming or concept programming, which are also not the same thing as each other).

A concept therefore consists of two pieces: the "interface" and the "implementation" (Not the same as the Objective-C equivalents of these words either. It's hard to come up with original names).

The "interface"

Let's take a look at a Shape concept and explain by example:

   abstract concept Shape
   {
       interface
       {
           int numSides();
           float getArea();
       }
   }


An interface is a series of public methods that must be implemented by all concepts that inherit Shape. It is similar to interfaces in languages like Java, except these are not a separate type (so you can't name the interface on its own). A concept's interface is enclosed in a block as seen above.

An "abstract" concept either has no implementation or an incomplete one that can't be instantiated. More on implementations in a bit. Like the last article, Shape is abstract and will have no implementation so this is all we need to specify for Shape.

Concept inheritance

As with any OO language, there has to be some type of inheritance. Concept inheritance describes an "is a" relationship.

We need to declare our Rectangle concept which inherits Shape so let's declare just the interface part for now:

   concept Rectangle : Shape
   {
       interface
       {
           float getHeight();
           float getWidth();
       }

       // implementation to be filled in later
   }


(Concept inheritance is represented by ":" in this example but that's irrelevant. Used for simplicity.)

A Rectangle "is a" Shape. But concept inheritance is "interface inheritance" not class/implementation inheritance. Rectangle inheriting Shape means Rectangle is agreeing to implement Shape's interface (in addition to Rectangle's own interface).

Now some of you are asking, "No more implementation inheritance?" Don't worry. It's here. We'll get to it later.

The "implementation"

Now, let's fill in all of Rectangle:

   concept Rectangle : Shape
   {
       interface
       {
           float getHeight();
           float getWidth();
       }

       implementation
       {
           protected float height, width;

           public Rectangle(float h, float w)
           {
               height = h;
               width = w;
           }

           // Methods from Shape's interface
           public int numSides() { return 4; }
           public float getArea() { return height * width; }

           // Methods from Rectangle's own interface
           public float getHeight() { return height; }
           public float getWidth() { return width; }
       }
   }


An "implementation" is the default implementation of a concept's interface within that concept. It's like the typical definition of a "class" except it's bound to the concept like the interface and must implement all of its concept's interface (unless it's abstract). Rectangle fulfills its inheritance requirement because it implements the two methods from Shape and the two methods in Rectangle's interface.

Going in the same order as the last article, we'll add Square next. Remember in the last article that we had to create an IRectangle interface and modify Rectangle to implement the new interface?

   concept Square : Rectangle
   {
       implementation
       {
           protected float length;

           public Square(float l)
           {
               length = l;
           }

           // Methods to implement the combined interface from Shape and Rectangle
           public int numSides() { return 4; }
           public float getArea() { return length * length; }
           public float getHeight() { return length; }
           public float getWidth() { return length; }

           // We'll talk about this in a moment
           public float getPerimeter() { return 4 * length; }
       }
   }


There are a couple of things to note here. First, Square does not specify an interface block because it has nothing of its own to add to the Rectangle interface. Second, we didn't have to create any new types. A concept always has an interface and therefore can always act like one.

Some of you are asking, "Well, what if a coder creates a new concept which doesn't inherit anything and doesn't specify any interface but specifies an implementation?" The answer is that only the methods specified in the interface are publicly callable on a concept variable. For example, in my new Square concept, there is a public getPerimeter() method in the implementation but not in its inherited interface, so it's not visible on the Square type. So the second line:

   Square s = new Square(6.0);
   float p = s.getPerimeter();


will produce a compile-time error saying getPerimeter() is not a method of Square's interface.

But sometimes you'll want to call getPerimeter() publicly. Sometimes you want to know if a variable is part of the concept just by the interface ("conceptually a Square") or whether it uses (or inherits) the concept's implementation ("physically a Square"). I'll use the type modifier "actual" to denote "physical" relationships.

   Rectangle r1 = new Rectangle(3.0, 4.0); // Legal
   actual Rectangle r2 = new Rectangle(3.0, 4.0); // Legal

   Rectangle r3 = new Square(6.0); // Legal
   actual Rectangle r4 = new Square(6.0); // Illegal. Square does not inherit Rectangle's implementation.

   Square s1 = new Square(6.0); // Legal
   float p1 = s1.getPerimeter(); // Illegal. getPerimeter() not in Square interface

   actual Square s2 = new Square(6.0); // Legal
   float p2 = s2.getPerimeter(); // Legal

So this "actual" keyword lets the programmer distinguish between which of the two inheritance types the object really is if needed. More on implementation inheritance right after this bold header:

Implementation inheritance

Implementation inheritance is a wonderful thing. Sometimes. When it's not abused. Everyone's seen it. Blindly inheriting file streams, collections, etc. and not changing any of their behavior rather than using them normally. People overriding one method with no idea about the base class internals and what effects this will have on the object and on other methods. It happens.

With this new concept design, I'm hoping it will encourage implementing the interface as opposed to automatically inheriting. But implementation inheritance is still necessary, so of course, it's included in this little design.

But first, let's define a Rhombus concept like the last article and show the Square implementing Rhombus's interface first.

   concept Rhombus : Shape
   {
       interface
       {
           float getSideLength();
       }

       implementation
       {
           protected float length;

           public Rhombus(float l)
           {
               length = l;
           }

           public int numSides() { return 4; }
           public float getArea() { return length * length; }
           public float getSideLength() { return length; }
       }
   }

   concept Square : Rectangle, Rhombus
   {
       implementation
       {
           // rest of the implementation is the same as previous Square definition

           public float getSideLength() { return length; }
       }
   }


Notice again that we didn't have to define an IRhombus type. We have 4 actual types representing 4 conceptual types.

Also note that Square can inherit multiple concepts easily because we're just inheriting the interfaces and combining them.

Now we'll change Square to inherit the implementation of Rhombus in order to show off implementation inheritance. Implementation inheritance is much like class inheritance in other languages. To change over, all we need to do is write in the inheritance syntax and remove all fields and methods from Square that Rhombus already has (which leaves only the Rectangle-specific methods).

   concept Square : Rectangle, Rhombus
   {
       implementation : Rhombus // says inherit Rhombus' implementation
       {
           public Square(float l)
           {
               super(l);
           }

           public float getHeight() { return length; }
           public float getWidth() { return length; }
       }
   }


We just switched from implementing the Square interface manually to inheriting from Rhombus to do most of the implementation for us very easily without creating any new types or making old interface types useless.

Behold the flexibility of a concept.

Factories

Up until now, I have solved most of yesterday's problems except the use of the factory pattern to ensure that:

   IRectangle rect = new Rectangle(6.0, 6.0);


really became a Square. So we settled for this factory method:

   IRectangle rect = Rectangle.create(6.0, 6.0);


But why must we settle? Why can't I do this with concepts:

   Rectangle r = new Rectangle(6.0, 6.0);
   Square s = (Square) r;


Well you can do this by using the factory pattern built into this little language here. First we'll add a private constructor to the Rectangle implementation to make our next step legal:

   private Rectangle() { }


Simple. Now we'll introduce the "factory constructor". This is a special type of constructor that returns an object. That object must be a non-null object of its declaring concept (in this case Rectangle).

   public factory Rectangle(float h, float w)
   {
       if(h == w)
       {
           return new Square(h);
       }
       else
       {
           actual Rectangle r = new Rectangle();
           r.height = h;
           r.width = w;
           return r;
       }
   }


The factory constructor uses the keyword "factory" to distinguish itself from a typical constructor. Unlike the typical constructor, the factory constructor allocates the memory it needs during the constructor, not before. At the end of the factory constructor there is a built-in check to make sure the returned value is not null. The private Rectangle constructor is still a typical constructor.

So the factory constructor looks like a normal constructor and is very simple to use:

   Rectangle r1 = new Rectangle(3.0, 4.0); // Rectangle
   Square s1 = (Square) r1; // Type conversion error. Not a Square

   Rectangle r2 = new Rectangle(6.0, 6.0); // Square
   Square s2 = (Square) r2; // Works fine.


See how much that simplifies things? Usually one does not plan on the factory pattern until it's needed, and then has to go back and change the code to use factory methods instead of constructors.

And you know what else it's great for? Object pooling/caching. For example, did you know that in Java 5, autoboxing an "int" to an "Integer" calls the static Integer.valueOf(int) method and not Integer's constructor (the other Number wrapper classes do the same thing)? Why does it do this? Because the Integer class has a static array of Integer objects (Byte, Short, and Long do this too) for values -128 to 127, and if the passed-in value is in that range, you get the corresponding object from that array; otherwise it allocates a new one. So Integer.valueOf(0) always returns the same object (for that class loader). And if they had factory constructors, they could move that logic into the constructor syntax and save some memory for people who don't know to use Integer.valueOf().
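
The caching behavior is easy to observe from plain Java. The sketch below is my own illustration (the class name IntegerCacheDemo and the helper sameObject are made up); it relies only on the documented guarantee that Integer.valueOf caches at least the values -128 to 127. Values outside that range may or may not be cached, so the example only checks value equality for them.

```java
class IntegerCacheDemo {
    // True when two valueOf calls for the same value hand back the same object.
    static boolean sameObject(int value) {
        return Integer.valueOf(value) == Integer.valueOf(value);
    }

    public static void main(String[] args) {
        System.out.println(sameObject(0));    // true: inside the guaranteed cache range
        System.out.println(sameObject(127));  // true: upper end of the guaranteed range
        System.out.println(sameObject(-128)); // true: lower end of the guaranteed range

        // Outside the guaranteed range, identity is not promised either way,
        // but value equality via equals() always holds.
        System.out.println(Integer.valueOf(1000).equals(Integer.valueOf(1000))); // true
    }
}
```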

Review

I did not introduce a single revolutionary thought about the object-oriented paradigm in this article. I glued two existing standalone constructs together into the new standalone construct called a concept. I added a special syntax and a tiny bit of new constructor behavior to make factory constructors. That's it. And look at all the things from the last article I solved. These two little ideas provide a great deal more flexibility than the old model.

So why has the object-oriented paradigm stopped evolving? Why are other languages not looking for things like this? Why are they just carbon copying everything?

In upcoming articles, I'll be adding more features to this new model. Some of them new. Some of them reused and modified to fit this model. But all of them about simplifying and adding flexibility to the OO model.

-Kaja

Wednesday, January 30, 2008

Evolving the Object Oriented Paradigm Part I

Object oriented programming is a great paradigm. Its biggest problem is it's dead. "What?" you ask. "Object oriented programming is the majority of the industry and continues to grow with new OO languages with new features all the time!" This is true. But the OO paradigm itself has stopped evolving. The cornerstones of OOP like classes, interfaces, inheritance, etc. are virtually all carbon copies of each other in different languages. Not many languages work towards improving these cornerstones. What was the last language to really improve upon the idea of a "class"? We hide the missing features and rigidity of OOP with design patterns, mixins, delegates, functional programming concepts, etc. which are all good to have but we shouldn't need to rely on them.

Proving by example

In this article, I'm going to walk through piece by piece a supposedly simple relationship of just 3 types: Shape, Rectangle, and Square. And for the sake of this article we'll be talking mostly about statically-typed languages like Java, C#, D, etc. and all three types are immutable (meaning you can't change the shape's internal data after it is instantiated). Sounds easy, right? Well, let's see, and at the end I'll describe how I think OOP could be improved to easily accommodate the problems.


So let's start with the basic type: Shape. A Shape in this example has only two methods: numSides() which returns its number of sides and getArea() which returns its area. The Shape class has no data itself at the moment and can't be instantiated so the designer has to choose between an abstract Shape class or a Shape interface (to some this is an easy decision but at this point it's more about personal preference).


For this example, I'll assume Shape will never have any data or method implementations so I'll make an interface:


   interface Shape
   {
       int numSides();
       float getArea();
   }

Okay. One type done. Easy right? Keep reading.

Now it's Rectangle's turn. I want rectangles to be a concrete (or non-abstract) type so I make a class:

   class Rectangle implements Shape
   {
       protected float height, width;

       public Rectangle(float h, float w)
       {
           height = h;
           width = w;
       }

       public int numSides() { return 4; }
       public float getHeight() { return height; }
       public float getWidth() { return width; }
       public float getArea() { return height * width; }
   }

NOTE: I'm using Java syntax just to be uniform but it doesn't matter.

Now you're saying, "Two types done and I fail to see a problem." Keep reading.


Now it's time for Square. Square represents a fun quirk of OOP. Conceptually, a square is a rectangle where all of its sides are equal. When represented in a programming language, it needs less internal data than its parent class. Squares really only need the side length. Now you could just have a Square constructor that takes in a length and passes it on to Rectangle's constructor as both the height and width. However, keeping both the height and the width is redundant. Now, one extra float per Square is trivial on a modern machine. But imagine you need tens of thousands of them. And imagine this example is about two other classes in the same situation, only the subclass needs 1 kilobyte less data than its parent class (maybe the parent uses a 1 kB array for part of an algorithm that the subclass doesn't use). Now you are saving megabytes of memory in your program. And that can be significant. Especially if your project has a memory usage limit placed on it by your customer (those are always fun).


Since this isn't a third-party supplied class, we can make an IRectangle interface (I hate .NET for this naming convention) that Rectangle and Square will share and we'll place getHeight() and getWidth() in there:


   interface IRectangle extends Shape
   {
       float getHeight();
       float getWidth();
   }

   class Rectangle implements IRectangle
   {
       // Implementation is the same as previous example.
   }

   class Square implements IRectangle
   {
       protected float length;

       public Square(float l)
       {
           length = l;
       }

       public int numSides() { return 4; }
       public float getArea() { return length * length; }
       public float getHeight() { return length; }
       public float getWidth() { return length; }
   }

Under Java naming conventions, you may keep the interface called "Rectangle" and call the class "RegularRectangle" or "BaseRectangle" or some other fun name. No matter the naming convention, you just introduced an unnecessary type. We have 4 actual types to represent 3 conceptual types. And if you had any code before Square was created that used the Rectangle class, you have to go back and change it.

And if Rectangle was from a third-party library that we couldn't change, the best non-inheritance method would be to create an adapter/wrapper class that took in a Rectangle and implemented our IRectangle interface:


   class RectangleWrapper implements IRectangle
   {
       protected Rectangle rect;

       public RectangleWrapper(Rectangle r)
       {
           rect = r;
       }

       public int numSides() { return rect.numSides(); }
       public float getArea() { return rect.getArea(); }
       public float getHeight() { return rect.getHeight(); }
       public float getWidth() { return rect.getWidth(); }
   }

Hooray for unnecessary classes. Don't get me wrong. The Adapter pattern is great but it's ridiculous sometimes that you have to produce things like what's above. And we introduced yet another type. It took 5 actual types to represent 3 conceptual types.


But wait. There's more.

Okay, going back to our previous 4 type solution. Think that's all of the problems? Thinking it's only one extra type, what's the big deal? Read on.

Okay, let's imagine we have our 4 types: Shape, IRectangle, Rectangle, and Square. And someone (maybe the customer) decides this little library needs a Rhombus type. Piece of cake right?

Conceptually, a rhombus is a 4 sided polygon with all sides of an equal length. By definition, a square is a rhombus. But a square is also still a rectangle. From the current setup you have two options.

The first option is to declare an IRhombus interface and a Rhombus class, and make Square also implement IRhombus. This can be seen here:

   interface IRhombus extends Shape
   {
       float getSideLength();
   }

   class Rhombus implements IRhombus
   {
       protected float length;

       public Rhombus(float l)
       {
           length = l;
       }

       public int numSides() { return 4; }
       public float getArea() { return length * length; } // must match Shape's getArea()
       public float getSideLength() { return length; }
   }

   class Square implements IRectangle, IRhombus
   {
       // rest of the implementation is the same

       public float getSideLength() { return length; }
   }

This solution works. It's flexible. But we had to add 2 types. We're up to 6 real types representing 4 conceptual types.


The second option is to not have an IRhombus interface and just have Square inherit Rhombus because they have the same internal data. But conceptually, is a Square a Rhombus acting like a Rectangle? Or is it a Rectangle acting like a Rhombus? It's neither, because a Square IS both at the same time. And since most statically typed OOP languages now disallow multiple inheritance, one must sacrifice the true definition for the "next best" definition. In this very simple example, it's clear which to pick but that's not always the case. If these Shape classes were more in depth (maybe so they could be drawn onto a screen), a Rhombus would need extra data like the internal angles of the shape, which Square would not need.

And all we did was add a Rhombus. We completely ignored plenty of 4 sided shapes: trapezoids, parallelograms, rhomboids, kites, and quadrilaterals that don't fit into any other category. Then we need round shapes, triangles, pentagons, hexagons, etc.

It gets worse.

Imagine someone using these classes wrote the following code:

   Rectangle rect = new Rectangle(6.0, 6.0);


This is perfectly legal. It's still a valid rectangle. But it's also a square. However any type checks of this object against a Square will fail because someone declared it with Rectangle's constructor. This can happen a lot if the passed in height and width are unknown at compile time and stored in variables.
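
A stripped-down Java sketch makes the failed type check concrete. It reuses the IRectangle, Rectangle, and Square names from above but leaves out Shape and the area methods for brevity; the TypeCheckDemo class and its isSquareType helper are made up for this illustration.

```java
interface IRectangle {
    float getHeight();
    float getWidth();
}

class Rectangle implements IRectangle {
    protected float height, width;

    public Rectangle(float h, float w) { height = h; width = w; }

    public float getHeight() { return height; }
    public float getWidth() { return width; }
}

class Square implements IRectangle {
    protected float length;

    public Square(float l) { length = l; }

    public float getHeight() { return length; }
    public float getWidth() { return length; }
}

class TypeCheckDemo {
    // The type check only sees which constructor was used, not the dimensions.
    static boolean isSquareType(IRectangle r) { return r instanceof Square; }

    public static void main(String[] args) {
        float side = 6.0f; // stands in for a value only known at run time
        IRectangle rect = new Rectangle(side, side);

        System.out.println(rect.getHeight() == rect.getWidth()); // true: conceptually a square
        System.out.println(isSquareType(rect));                  // false: physically a Rectangle
        System.out.println(isSquareType(new Square(side)));      // true
    }
}
```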

Typical solution to this? Factory pattern. The factory pattern is either a method or a class of methods that takes in input and decides which class to allocate and return. We'll put it in the Rectangle class (cause we can't put it in IRectangle where it belongs) as a static method to avoid creating a factory class and for another reason you'll see in a moment.

   // The following is part of the Rectangle class

   static public IRectangle create(float height, float width)
   {
       if(height == width)
           return new Square(height);
       else
           return new Rectangle(height, width);
   }

   static void test()
   {
       IRectangle r1 = new Rectangle(3.0, 4.0); // Rectangle
       IRectangle r2 = new Rectangle(6.0, 6.0); // Rectangle

       IRectangle r3 = Rectangle.create(3.0, 4.0); // Rectangle
       IRectangle r4 = Rectangle.create(6.0, 6.0); // Square
   }



And to truly prevent the "public" and inheriting classes from using Rectangle's constructor to make squares like r2 above, we need to make the constructor private so they only use the factory method. If we had used a separate factory class, we would have to keep Rectangle's constructor public, which takes away the added safety the static method has. So now that we have the factory method, any old code using "new Rectangle" (and preferably "new Square" too for uniformity) needs to be changed to use our factory method.

And we need to do the same thing to Rhombus.
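
Put together, the locked-down version might look something like the following Java sketch. It again drops Shape and the area methods to stay short, and two details are my own assumptions rather than anything required by the pattern: the fields are made final, and Square's constructor is package-private so that callers outside the package also have to go through the factory. FactoryDemo is a made-up name.

```java
interface IRectangle {
    float getHeight();
    float getWidth();
}

class Square implements IRectangle {
    private final float length;

    Square(float l) { length = l; } // package-private: outside callers use the factory

    public float getHeight() { return length; }
    public float getWidth() { return length; }
}

class Rectangle implements IRectangle {
    private final float height, width;

    // Private, so "new Rectangle(6, 6)" is impossible outside this class.
    private Rectangle(float h, float w) { height = h; width = w; }

    // The only way in: decides which class to allocate.
    public static IRectangle create(float height, float width) {
        if (height == width)
            return new Square(height);
        return new Rectangle(height, width);
    }

    public float getHeight() { return height; }
    public float getWidth() { return width; }
}

class FactoryDemo {
    public static void main(String[] args) {
        System.out.println(Rectangle.create(3.0f, 4.0f) instanceof Rectangle); // true
        System.out.println(Rectangle.create(6.0f, 6.0f) instanceof Square);    // true
    }
}
```

The cost is exactly the one described above: every existing "new Rectangle(...)" call site stops compiling and has to be rewritten as "Rectangle.create(...)".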

The Point

Why is good object-oriented design this complicated to change? Unless you design everything perfectly before you write any code (good luck with that in the real world), your code will be constantly evolving. New features and new/different/misunderstood customer requirements evolve the software past the initial design. Changes made today may be taken back out in a month.

The object-oriented paradigm needs a language that can adapt to these things. In the next article, I'll present a proposal for a simpler OOP that deals with all these problems. And it represents Shape, Rectangle, Rhombus, and Square without redundant data (and regardless of third-party or not) in exactly 4 types.

See you tomorrow for Part II.

-Kaja

Sunday, January 27, 2008

Numbers in an Object-Oriented World, Part II

Yes, I have a blog again. I need a place to rant as I design PX. Not spending money on getting my domain name back though when I don't know if this will last.

Over a year ago (on my old site), I wrote a post on representing numbers in an object oriented language. It talked about how most languages ignore the fact that every integer is a rational. Every rational is a real. Every real is complex. And all these are numbers. You could invent a hierarchy, but it would have to consist of interfaces or data-less abstract classes, because an integer requires far less data than a complex number, and having to inherit the complex number's two floating point fields (or whatever they are) isn't what people want to hear.

At least Java made a Number base class. A tiny step in the right direction. C# ignores number hierarchies completely (and don't even get me started on what a copout boxing and unboxing is). But Java misses out on so much more that can be put in there. Java has five integer-data classes: Byte, Short, Integer, Long, and BigInteger. What if I just wanted to know that my Number object was an integer and didn't care which precision it has? Do they share a common Integer interface? Of course not.
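
To make the complaint concrete, here is a minimal Java sketch of what "is this Number an integer?" looks like without a shared interface. The IntegralCheck class and its isIntegral helper are hypothetical names of mine; the five classes it tests are the ones listed above.

```java
import java.math.BigInteger;

class IntegralCheck {
    // Hypothetical helper: with no common "integer" supertype under Number,
    // each integer-valued class has to be listed by hand.
    static boolean isIntegral(Number n) {
        return n instanceof Byte
            || n instanceof Short
            || n instanceof Integer
            || n instanceof Long
            || n instanceof BigInteger;
    }

    public static void main(String[] args) {
        System.out.println(isIntegral(42));             // true  (boxed to Integer)
        System.out.println(isIntegral(42L));            // true  (boxed to Long)
        System.out.println(isIntegral(BigInteger.TEN)); // true
        System.out.println(isIntegral(3.5));            // false (boxed to Double)
    }
}
```

Any integer-like class added later, whether by the library or by a third party, is invisible to this check until someone edits the list.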

Operator overloading exists only to justify mathematical operators in an OO language. But really I only like operator overloading for numeric types. C++'s operator abuse is the prime reason. Operator overloading also presents a fun challenge for number classes. The seemingly simplistic type promotion for operations that we take for granted (such as converting an int to a float because it's being added to another float) makes truly representing them difficult when a number's true type isn't known until run time. Also you need to account for everything. For example:

   Number x = someFunc();
   Number y = x + 32.0;

If "x" is an Integer, the typical rules say it gets promoted to Float, and then added. Should the Integer operators know this? Because if so, Float's operators must know to promote Integers too if the expression were reversed to "32.0 + x". Should Integer delegate to Float's operators and trust they will know what to do with Integers? Suppose x is an object from some custom number class; then Float needs to know to delegate the operation to the custom number class. And in turn, if the custom number class doesn't know how to add to floats, it may delegate back to Float, and it repeats in a never-ending cycle. So many dilemmas from so simple a mathematical problem.
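
One way to see the dilemma in code is a double-dispatch sketch in Java. Everything here is hypothetical: Num, IntNum, and FloatNum are made-up types, not part of Java's library or of PX, and double dispatch is just one possible technique I'm using to illustrate the problem, not a proposed solution.

```java
// Hypothetical mini-hierarchy; none of these types exist in Java's library.
interface Num {
    Num plus(Num other);          // entry point for "this + other"
    Num plusInt(IntNum left);     // called when the left operand was an IntNum
    Num plusFloat(FloatNum left); // called when the left operand was a FloatNum
}

final class IntNum implements Num {
    final long value;
    IntNum(long value) { this.value = value; }

    public Num plus(Num other) { return other.plusInt(this); }
    public Num plusInt(IntNum left) { return new IntNum(left.value + value); }
    // Int on the right of a float: this int is promoted to floating point.
    public Num plusFloat(FloatNum left) { return new FloatNum(left.value + value); }
}

final class FloatNum implements Num {
    final double value;
    FloatNum(double value) { this.value = value; }

    public Num plus(Num other) { return other.plusFloat(this); }
    // Float on the right of an int: the int is promoted to floating point.
    public Num plusInt(IntNum left) { return new FloatNum(left.value + value); }
    public Num plusFloat(FloatNum left) { return new FloatNum(left.value + value); }
}

class PromotionDemo {
    public static void main(String[] args) {
        Num x = new IntNum(10);
        Num y = x.plus(new FloatNum(32.0)); // x + 32.0
        Num z = new FloatNum(32.0).plus(x); // 32.0 + x

        System.out.println(y instanceof FloatNum); // true
        System.out.println(z instanceof FloatNum); // true
    }
}
```

Both orderings work, but only because IntNum and FloatNum each know about the other. A third-party number class would need a new plusXxx method added to Num and to every existing implementation, which is the "account for everything" problem in miniature.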

Another unfortunate side effect of numbers in OOP is "static classes" (meaning classes with all static data and methods, no constructors, and can't be inherited). It's like the language designers thought they were done and then wondered "what the hell do we do with the math functions like pow and sin?" Good OO practice would say throw them in a normal class that can be instantiated and have a Math interface or something for usability. But the Math object wouldn't need a state, so nobody wanted to make a stateless Math class where you needed to make a Math object before using a math function. So they threw them in a "static class" which serves no purpose other than being a namespace. I mean come on. A Java book I own defines a class as "a group of objects that share common state and behavior. A class is an abstraction or description of an object." Well, the Math class can have no objects. It fails this definition on so many levels. Is it really that important that every method be in a class? Couldn't they settle for modules?

So why am I ranting? Why is this important to me? Because in my PX design decisions, this is what I'm thinking about. How can I make a (statically-typed) language flexible enough to support true object oriented numbers?

In the future, I'll probably be showing off various PX ideas like replacing the typical Java/C# "class" and "interface" with a more flexible model.

-Kaja