Why is the associative property important?

We look at several examples where the associative property gives us expressive power.

Transcript

[00:00:00] Why is the associative property important?

[00:00:06] Hello, my name is Eric Normand, and this is another episode of my podcast. Welcome.

[00:00:12] So I know several programmers who have an aversion to abstract mathematical ideas and the associative property and by extension, monoids, terms like that, kind of scare them. When I pick at them about it , try to pull on the thread like what's going on, what they tell me is that they just can't see how it's important.

[00:00:40] Like why, why should they learn it? They're good programmers and you know, this kind of language just sounds complicated and doesn't help them in their day to day. And they're smart programmers, like they're doing really hard stuff like machine learning.

[00:01:02] They're doing complicated math. So why this aversion? And while this, this episode is for them, for that kind of person, I've been there before too. I've wondered why this complicated math, category theory, algebra, why will it help me in my day-to-day programming life?

[00:01:28] The associative property is one of those things that you probably already know and you do all the time, and this is just gonna give you a name for it. That name lets you communicate better and it lets you have a better handle in your own mind of what you're actually doing and why it works and et cetera. And maybe you can master that instead of it just being some intuitive thing that sometimes you do and sometimes you don't. You can master it by having a name for it, being able to plug it in with what other people have come up with around the same topic.

[00:02:05] Let's do a brief intro to the associative property. The associative property is cool because it lets you break down problem into sub problems, and then you can solve those sub problems more easily. Very often it's easier to solve a little piece of the problem and then take those solutions from all the different pieces you've solved and compose them back together. It is like a reductionist strategy: divide the thing up, conquer each of the subparts, and then you get to put them back together.

[00:02:45] The associative property is what lets you combine them back together. You can do it if you have an associative operation. So that's why it's important because it lets you solve bigger problems than you could solve if you couldn't break them down.

[00:03:07] What is the associate property? The associative property means that the order of operations doesn't matter. When you're in math class and you had to learn we're gonna multiply first and then we're gonna add, or you could look just at the parentheses, right?

[00:03:32] So the parentheses take precedence and in Lisp and a lot of other languages, the parentheses indicate the order in which the operations are going to happen. So you do the inner parenthesis first. And then you pass those in their argument positions to the outer parentheses and you keep going out and out and out.

[00:03:58] What the associate property is saying is you can parenthesize it any way you want, doesn't matter which way you do it. So let's look at an example. You have an addition problem. Let's just make a really easy one. One plus two plus three. Addition is associative, which means that you can do one plus two first and then add three to it. You can do the two plus three first and then add one at the beginning, right? So then that, that two plus three gives you five, and then you do one plus five, and you get six. You'll get the same answer.

[00:04:38] You can do one plus two, which is three, and then three plus three six, or you can do the two plus three, and then it gives you five, and then one plus five gives you six. That's the associated property. You can put the parentheses around the first two or around the second two, and you'll get the same answer.

[00:04:59] That's all it means. Let's say you're merging hash maps, you have three hash maps you wanna merge, it doesn't matter what order you merge them in, in the sense of order of operations. The argument order does matter, but the order of operations where you put your parentheses does not matter.

[00:05:20] And you can check that. You can check it at a REPL you can write a little program to to test whether merge is associative. And that's an important property of it because it means that you can always break down whatever problem you want into sub maps that you're gonna merge back.

[00:05:42] These things can be done in parallel. You can break it down almost arbitrarily. So it doesn't matter. Let's give another example. You're going to be merging two sorted lists into a larger sorted list. You're gonna pick the lowest one on each list. That's what Merge Sort is, right? Merge sort is: You take a unsorted list, you cut it in half, and then you recur onto that smaller list until you have a trivial case, which is an empty list or a list with one element. Both of those are sorted by default. And then you're gonna start merging them back together in the same way you broke them down.

[00:06:35] What the associated property says is that you can break it up pretty much arbitrarily. Instead of cutting the list in half, you can break it up just by cutting the first element off of the list and then recursing down the rest of the list. You could do it in some like three quarters or one fifth or however you want to break it down. That is the magic of the associated property. It gives you a kind of flexibility: I don't care how I break it down.

[00:07:12] Another example is you're tracking many hours are you working every day? You can break down that big data set. Let's say you have a whole year of data and you can say, well, I'm going to aggregate these by week and then I'm gonna sum up all the weeks. You're breaking it up by week. Sum it up. You can say how many you had in the whole year.

[00:07:38] Or you could break it down by month and add it up, right? You're gonna get the same yearly number of hours worked, doesn't matter how you break down the problem. This gives us tremendous amounts of flexibility and all of that is because plus is associative.

[00:07:56] Merge to sorted lists is associative or merge to hash maps is associative. You can break it up in arbitrary ways. And we do this all the time, and we probably don't even think of it as a named property.

[00:08:15] What I contend is that it's something that you can design. You can ensure from the beginning, just by setting the goal of having your operations be associative because you know that you're going to want to compose them up from smaller subparts, so you're gonna wanna break up a problem into smaller problems, just like merge sort. When you're designing your domain model, you can make your operations associative .

[00:08:53] It's very similar to the closure property, because the closure property, what it says is you're going to return the same type as your arguments, which lets you nest. But if you have this additional thing, not just the same types, but also that the order of operations doesn't matter. It gives you a lot more flexibility, a lot more guarantees about how you're able to operate on large data sets and break them up.

[00:09:26] In the book , I'm gonna give an example of a shopping list. You have a restaurant and you need to source all of these ingredients that you're gonna use at your restaurant to cook your food. How do you build up that shopping list? Basically like a inventory, a plan for where you're gonna buy everything, how much you need. How do you construct that?

[00:10:00] One thing you could do is assume that this week is gonna be very much like last week. I'm gonna need the same amount of ingredients this week as I had last week. I'm not sure if that's a good business decision, but let's just assume that that's how we're gonna do it. So what can you do? You can take all the pizzas that you sold last week, and figure out the ingredients list for each one. So that's how you're breaking it up. You're turning each one of those pizzas into an ingredients list.

[00:10:36] Each pizza uses so much flour, so much oil, so much salt, so much tomato sauce, so much cheese. Each one gets converted into this list. And now you can sum up all of these lists into a single list. And that will give you your weekly needs. And maybe you can even do something else like, add a 10% margin to make sure we don't run out because weeks aren't exactly the same. We're gonna multiply each of the needs by 1.1, and that'll give us a better estimate of what we might wanna buy.

[00:11:15] You can do this in a way that is very composable. Convert each pizza into an ingredient list and you sum them up. Okay? So we're gonna create a little algebra of ingredient lists. It is my preference, but it's much better than doing a for loop over all of the pizzas and manually summing into the different ingredients in the list. You just make this operation called add ingredient lists and it'll take two ingredient lists and add 'em together. No for loop involved, right? It's just merge, but merge with plus on the values when the keys collide.

[00:12:01] It's associative, so it doesn't matter what order of operations you do it in, doesn't matter how you break it up. You wanna break it up by week. You wanna break it up by month, by day by, by type of pizza. However you wanna break it up, you can because it's all associative.

[00:12:21] My name is Eric Normand. This has been another episode of my podcast. Thanks for listening, and as always, rock on!