How to create a habit of reuse

In OO languages like Java, people tend to make new classes more than they reuse existing ones. In Clojure and other FP languages, we lean more toward reuse. How do we develop that habit?

Transcript

Eric Normand: How do you create a habit of reuse instead of making new stuff? Hi, my name is Eric Normand and these are my thoughts on functional programming.

In functional programming, at least the way I'm used to seeing it, we tend to reuse pieces a lot more than in typical object-oriented programming. Why is that? I don't think it has anything to do with the paradigm or the language. I think it has a lot more to do with the habits, habits that the programmers have in those various communities.

I'll give an example. Actually, Rich Hickey gives a really good talk where he's talking about how, in the Java world, there's a thing called servlets. The servlet defines how to process an HTTP request. There's a class called HttpRequest, something like that. It defines all the stuff that you would expect in an HTTP request.

He puts the slide up and the slide has all the methods in the class, just their signatures, and it's pretty clear what they do. They're well named, but he asks, "Where are the hashmaps? Where are all the hashmaps?" It's kind of a weird question. You're not used to asking that, but sometimes things are pretty clearly hashmaps.

There was a method called getHeader. It took a string and it returned a string. It's kind of like getting a value out of a hashmap. It's got a key and a value. Then there was also getProperty, and you would pass in a string and get a string back. You could also call getPropertyNames, and it would give you a list of all of the property names that were defined on that request.
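To make the resemblance concrete, here is a minimal sketch in Java. The class CustomRequest and its setters are hypothetical and only loosely follow the methods described in the talk; this is not the real servlet API. The point is just that each accessor is a one-line delegation to a Map operation.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical class for illustration only; not the real servlet API.
class CustomRequest {
    private final Map<String, String> headers = new LinkedHashMap<>();
    private final Map<String, String> properties = new LinkedHashMap<>();

    void setHeader(String name, String value) { headers.put(name, value); }
    void setProperty(String name, String value) { properties.put(name, value); }

    // Each "custom" accessor is a Map operation under another name.
    String getHeader(String name) { return headers.get(name); }       // Map.get
    String getProperty(String name) { return properties.get(name); }  // Map.get
    Set<String> getPropertyNames() { return properties.keySet(); }    // Map.keySet
}
```

Everything a caller can do through these methods, they could already do with the two maps directly.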

You could see, yeah, these are hashmaps. Why did they define their own custom little protocol for that? Then he's like, "But look, there's actually deeper hashmaps that you don't think about." All these other things that are defined as methods on the class are just getters, stuff like getPort, getHost, and getIP.

Why not just make those keys? Those are just the keys: the IP, the host, the port. The whole thing should be a hashmap and these are the keys. What he was getting at was that in Clojure, we have this thing called Ring that defines a hashmap format for representing an HTTP request.

That's really cool because it means that you don't need a new type. You just need this new spec for what goes into the hashmap. Everyone already knows how to use a hashmap.
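Here's a rough sketch of what that could look like if you did it in Java. Ring itself uses Clojure keyword keys such as :server-port, :remote-addr, and :headers; the string keys, the class name RequestAsMap, and the helper withEntry below are assumptions made up for this illustration, not any official format.

```java
import java.util.HashMap;
import java.util.Map;

class RequestAsMap {
    // Builds a request as plain nested maps. The key names are
    // illustrative strings, not an official format.
    static Map<String, Object> exampleRequest() {
        Map<String, String> headers = new HashMap<>();
        headers.put("host", "example.com");
        headers.put("accept", "text/html");

        Map<String, Object> request = new HashMap<>();
        request.put("server-port", 8080);
        request.put("server-name", "example.com");
        request.put("remote-addr", "127.0.0.1");
        request.put("uri", "/index.html");
        request.put("headers", headers);
        return request;
    }

    // Generic map code works on the request with no request-specific API:
    // a shallow copy with one entry changed, leaving the original untouched.
    static Map<String, Object> withEntry(Map<String, Object> m, String key, Object value) {
        Map<String, Object> copy = new HashMap<>(m);
        copy.put(key, value);
        return copy;
    }
}
```

Nothing in withEntry knows it is handling an HTTP request, and that's the reuse: the same helper works for any map-shaped data.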

I mean, if you're a Clojure programmer, you're already using hashmaps. There are all sorts of functions for dealing with them already, and they can be serialized. Basically, we're getting all this reuse out of hashmaps. Java has hashmaps, too. They could have done that.

Instead, they chose to define a new class that required more documentation and required re-implementing a lot of this functionality. Maybe it even uses a hashmap internally to represent those headers. It might. The whole point was that in Clojure, this is what we do. We reuse. If it looks like a hashmap, just use a hashmap. Why would you create a new type just for that?

In Java, they don't do that. To me, it's just a habit. Now, how do you develop that habit? That was the question I started with. The habit is really about understanding the two parts. You have to understand, number one, what you've got already. What are the things you've got that you could reuse? You've got to understand your data types.

You got to know you got vectors, you got hashmaps, you got sets, you got your sequence abstractions. You got all sorts of things that you got to learn and have indexed. You have to have these data types indexed by their access patterns.

If you look at a thing and it says it's a getProperty, and it takes a string and it returns a string, now you got to be thinking, "Well, that's kind of a key value thing. That's probably a hashmap." You have to have that automatically.

The second thing you have to be doing is constantly accessing that index. Instead of thinking, "What new thing do I need to make?" you should be thinking, "What thing can I be reusing?" Then you start to look at the thing in terms of these access patterns.

The access patterns are going to depend on the language. Clojure defines a number of them. They're standard interfaces that come with the language. They include stuff like the sequence abstraction, which is how you access items one at a time. Whether you're going to be remembering duplicates, that's an access pattern too.

There's accessing a value, given the key. There's adding stuff to a collection. Where are you going to add it? You can add to the front or to the back, if it maintains order. These are all the types of things you have to be thinking about with your data structures.
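As a small sketch of picking a structure by its access pattern, here is how a few of those questions map onto standard Java collections. The class and method names are made up for this example; the collections themselves (ArrayList, LinkedHashSet, ArrayDeque) are the standard java.util ones.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class AccessPatterns {
    // Need to remember duplicates, in order? A List does that.
    static List<String> keepDuplicates(String... items) {
        List<String> out = new ArrayList<>();
        for (String s : items) out.add(s); // add appends at the back
        return out;
    }

    // Need to forget duplicates? A Set does that.
    static Set<String> dropDuplicates(String... items) {
        Set<String> out = new LinkedHashSet<>();
        for (String s : items) out.add(s);
        return out;
    }

    // Need to add at the front and at the back? A Deque does that.
    static Deque<String> frontAndBack() {
        Deque<String> d = new ArrayDeque<>();
        d.addLast("b");
        d.addFirst("a");
        d.addLast("c");
        return d; // holds a, b, c in that order
    }
}
```

In each case the question being answered is "how will I access this?", and the answer points at something that already exists.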

You can't just think of a list as an ordered collection. Unfortunately, I was taught that way, but it's not quite right. To think of it that way is missing a little bit of the important information. I was taught Java, and you have a List interface and it has these methods on it.

One of the methods is get. You give it an index, an integer, and it'll give you the item at that index, which sounds reasonable. The problem is that the algorithmic complexity of get can change depending on which class implements the interface.

If you have an ArrayList, which is one implementation in Java, it can quickly give you the value based on the index because it's implemented with a big array. Arrays can do that random access.

If you use a LinkedList, which is another class in Java, then you can't do that. You have to actually iterate through the list, counting how many items you've seen, and then return the one at that index, so it's linear as the list gets bigger.

To say that they are both implementing the same interface is kind of a lie, because your algorithm could go from constant time to linear, or worse, it could go from linear to quadratic. It's accidentally quadratic just because you didn't realize the list was going to be accessed that way.
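The accidentally quadratic case can be shown with a small sketch. This is a deliberately simplified singly linked list written for this example, not java.util.LinkedList (which can walk from either end, though its get is still linear). The hops field is an assumption added only to count how many links get follows.

```java
// A simplified singly linked list that counts the work done by get.
class CountingLinkedList {
    private static final class Node {
        final int value;
        Node next;
        Node(int value) { this.value = value; }
    }

    private Node head;
    private Node tail;
    private int size;
    long hops = 0; // number of "next" links followed by get

    void add(int value) {
        Node n = new Node(value);
        if (head == null) { head = n; } else { tail.next = n; }
        tail = n;
        size++;
    }

    int size() { return size; }

    // Linear: walks from the head, following `index` links.
    int get(int index) {
        if (index < 0 || index >= size) throw new IndexOutOfBoundsException();
        Node cur = head;
        for (int i = 0; i < index; i++) {
            cur = cur.next;
            hops++;
        }
        return cur.value;
    }

    // An innocent-looking indexed loop: n calls to get, each a walk from the head.
    static long hopsForIndexedLoop(int n) {
        CountingLinkedList list = new CountingLinkedList();
        for (int i = 0; i < n; i++) list.add(i);
        long sum = 0;
        for (int i = 0; i < list.size(); i++) sum += list.get(i);
        return list.hops; // 0 + 1 + ... + (n - 1) = n(n - 1) / 2
    }
}
```

With 10 elements the loop follows 45 links, and with 1,000 elements it follows 499,500. The loop looks linear, but the total work grows with the square of the list size.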

What Clojure has done, to a large extent, is use a different function from get for that. It's not perfect at this, because you can still get an element at an index. If you have a linked list you can do nth, but nth is known to be potentially linear, whereas get on a list doesn't work. get on a vector does work, because it's constant time.

For these operations, part of the contract is that they maintain their algorithmic complexity. When you implement those interfaces, part of the contract is that you shouldn't be implementing one if you can't do it in constant time.
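Java does have one place where a performance promise shows up in the type: the marker interface java.util.RandomAccess, which ArrayList implements and LinkedList does not. Here is a minimal sketch of code that respects that promise. The class and method names are made up for this example.

```java
import java.util.List;
import java.util.RandomAccess;

class ComplexityContract {
    // True when the list's type promises fast indexed access.
    static boolean promisesFastGet(List<?> list) {
        return list instanceof RandomAccess;
    }

    // Uses get(i) only when it is promised to be cheap,
    // and falls back to sequential iteration otherwise.
    static long sum(List<Integer> list) {
        long total = 0;
        if (list instanceof RandomAccess) {
            for (int i = 0; i < list.size(); i++) total += list.get(i);
        } else {
            for (int x : list) total += x;
        }
        return total;
    }
}
```

The check is opt-in, though. Nothing stops a caller from using get in a loop on a LinkedList, which is the gap being described here.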

Like I said, to a large extent, it's like 90 percent, this is true. There are some exceptions. They're unfortunate but to a large extent, I believe that Clojure does it right. These are the things you need to be thinking about.

If you're a programmer, you're doing anything at any kind of reasonable scale, you have to start thinking about the algorithmic complexity, and it should be part of the interface. Those are the things that need to be the first choice you make. How am I going to access this stuff? Then, of course, that leads to this kind of reuse.

Why re-implement the hashmap or even wrap it up if I'm going to have to define all these new methods? Why not just use what's already there?

My name is Eric Normand. You can reach me on Twitter. I'm @ericnormand with a D. You can also email me eric@lispcast.com. I hope to hear from you because I love getting into discussions with people. Awesome. Rock on.