Timelines are a system I developed for modeling time in a distributed system. You will find timelines whenever you have multiple machines, multiple processes, multiple threads, or multiple async callback chains. Since virtually all software is distributed these days, modeling time is more important than ever. Functional programming may not have all the answers, but it is asking the right questions. In this episode, we go over what timelines are and how you can start to use them to model time in your software.
Eric Normand: What is a timeline and what does it have to do with functional programming? By the end of this episode, I hope to give you a new perspective on distributed systems and at least one thing that makes them complex and also how functional programming can help reduce or manage that complexity.
My name is Eric Normand and I help people thrive with functional programming. This topic of timelines and distributed systems is more vital than it has ever been. Right now, basically every piece of software we write, virtually every one, is a distributed system. It’s distributed programming. For a program on the Web, we have a client and a server. Just automatically, by definition, it’s a distributed system.
Even the server, itself, is going to be talking to a database process. It could be on a different machine, but even on the same machine it’s a different process. We’ve got a third-party API that you’re talking to. Your mobile app is, probably, communicating back with the server, it’s using some cloud syncing, something like that.
We’re talking about distributed systems all the time. I hope that’s really clear and it’s awesome.
Distributed systems are cool, but the problem is that we’re still using a sequential, one step at a time, paradigm. Our programming paradigm, all the models of programming that we use…OK, I don’t want to say all, but the most common, most popular ones we use are all sequential. One thing at a time and anything that is dealing with the distributed system is bolted on.
Functional programming has been dealing with distributed systems for a long time and has…I don’t know if I would say it has all the answers, but it has certainly been asking the questions for a long time. It knows the questions to ask.
That’s why I feel like functional programming is going to be helpful and that’s why functional programming is becoming more popular today.
Let’s talk about timelines. Timelines is my concept. I believe it is what experienced functional programmers who deal with distributed systems eventually get to. It’s just a kind of a more formalized process and model.
I believe that, when you’re dealing with a distributed system, you have to have an explicit model of time. You have to be modeling time. Time becomes the most difficult thing because this computer’s time, not the clock, but this computer’s time, meaning the order that things happen in, and that computer’s time are different. They’re different timelines.
My stuff executes in order, your stuff executes in order, but when you run them at the same time, we don’t know which ones are going to happen first between the two computers. You have to actually model that.
Your thing might happen before my thing happens, or my thing could happen before your thing happens. Are we going to get the same answer? What if they happen at the same time? Is that going to be a problem? We have to start thinking about this and modeling that these things could happen in different orders.
Our normal imperative programming does have a model of time, but it’s implicit. You might not notice it until you realize that there could be multiple models of time. The model of time for imperative programming is that one thing happens at a time. Everything happens after the thing that came before it. There’s not really an idea of other things happening at the same time.
I have the CPU. I’m going to do some stuff. Now, you have the CPU. You’re going to do some stuff. I can work in a sequential manner. It’s a very nice model of time. The problem is, it doesn’t work at all in a distributed system. What we have to do is manipulate this time. We need a model that lets us change and control how things happen, what order things happen in.
Timelines are my answer to that. They’re a tool for visualizing what happens on multiple threads, multiple machines, modeling, how they can interleave with each other, and then manipulating it. Once you figure out what are all the things happening, how can we make it so that it’s safer, it’s going to give us the right answer.
Let’s go into the model. A timeline is a sequential series of actions. That means each timeline has the semantics of an imperative program, a single-threaded imperative program.
Each thing happens and finishes before the next thing happens. Then you can have multiple of these, all at the same time.
Where can you find them? You find timelines…Let’s call it in the obvious places. If you have two machines running and communicating, you have two timelines. Each machine has its own timeline.
You could have multiple processes on those machines talking to each other, then you have those timelines, each process has its own timeline. You could have multiple threads in each process so each one of those would be a timeline.
Imagine I make an Ajax request and, in that callback, I make another Ajax request. Let’s call them A and B. From the callback of B, I make another Ajax request for C. In the callback for that, I print out the answer.
I’m doing A and then B and then C and then print it out. Because it’s a sequential series of actions, that also is a timeline. I could start two Ajax requests at the same time. That is now two timelines that are running because I can’t determine which answer is going to come back first, which Web response is going to come back first.
I need to model that indeterminacy of the ordering of things because the order might matter.
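A minimal sketch of those two cases, written in Python rather than JavaScript, with asyncio tasks standing in for Ajax requests. The names fake_request, chain, and race, and the latency numbers, are made up for illustration, and asyncio.sleep only simulates the network.

```python
import asyncio

async def fake_request(name, latency, arrivals):
    """Hypothetical stand-in for an Ajax call: wait, then record the response."""
    await asyncio.sleep(latency)   # simulated network delay
    arrivals.append(name)          # the "callback" runs when the response lands

async def chain():
    """One timeline: B starts only after A returns, C only after B."""
    arrivals = []
    await fake_request("A", 0.03, arrivals)
    await fake_request("B", 0.01, arrivals)
    await fake_request("C", 0.02, arrivals)
    return arrivals

async def race(latency_a, latency_b):
    """Two timelines: both requests are in flight at the same time."""
    arrivals = []
    await asyncio.gather(
        fake_request("A", latency_a, arrivals),
        fake_request("B", latency_b, arrivals),
    )
    return arrivals

chain_order = asyncio.run(chain())      # order is fixed by the chain itself
a_fast = asyncio.run(race(0.01, 0.05))  # A happens to answer first
b_fast = asyncio.run(race(0.05, 0.01))  # same code, B answers first
print(chain_order, a_fast, b_fast)
```

In the chain, the latencies do not change the order. In the race, the order is decided by the network, not by the code, which is the indeterminacy that has to be modeled.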
Timelines, if you were going to draw them, what would you draw on them? You would draw the actions. You don’t have to draw the calculations or the data. Calculations, by definition, “are outside of time.”
They do not depend on anything like when they run, how many times they run, what else is running at the same time. They are timeless. Because they are timeless, they don’t have to show up on the timeline, which is a very valuable thing. It means that all you have to deal with on the timelines are the actions.
If you do more stuff, you push your code into calculations by removing what functional programmers often call “side effects,” you avoid having to deal with the messiness of actions.
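As a rough illustration of pushing code into calculations, here is a small Python sketch. The receipt example and the names receipt_body, send_receipt, and outbox are assumptions made up for this sketch, and the list stands in for a real mail service.

```python
def receipt_body(customer_name, total_cents):
    """Calculation: same inputs, same output, no matter when or how often it runs."""
    return f"Thanks {customer_name}, you paid ${total_cents / 100:.2f}"

def send_receipt(send, customer_name, total_cents):
    """Action: calling `send` touches the outside world, so it belongs on a timeline."""
    send(receipt_body(customer_name, total_cents))

outbox = []                                # stand-in for a real mail service
send_receipt(outbox.append, "Ada", 1250)   # the only step you would draw as a box
print(outbox)
```

Only the call to send would show up as a box on a timeline. Building the text of the receipt is a calculation, so it can be tested and reasoned about without thinking about time at all.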
I draw it like a line with boxes on it going down the paper. Each box is an action, the lines run between the actions, and down is the direction of the flow of time.
When you have two timelines next to each other, they can interleave. The actions of each timeline can interleave, meaning all those lines between the boxes are allowed to stretch. If I make an Ajax request, I can’t control how long it’s going to take for the response to come back. It could be fast. It could be slow. The network could be really slow at that moment, or it could be really fast.
The action of sending and the action of receiving could be separated by a lot of time. Those lines in between the boxes, they can stretch out, or they can compress. When they stretch out, that means there’s room for other actions to come in. They don’t come into that timeline, but they come in between its actions in wall-clock time.
If you could have some kind of master clock of when things are happening, another timeline’s actions can slip in there. You really have to deal with, at any moment, all the possible interleavings. And there are a lot. The formula for how many possible interleavings there are has factorials in it. That’s a lot.
Just to give a quick example I know off the top of my head, if you have two timelines with 12 actions each, 12 steps in each timeline, there are already more than two million different ways that they could interleave. The question is, can you figure out if they’re always going to give you the same answer?
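For two timelines with m and n actions, the number of interleavings is (m + n)! / (m! · n!). The Python sketch below computes that count and, for a tiny case, enumerates every interleaving to check whether they all give the same answer. The shared-counter scenario and the names interleavings and run are assumptions made up for illustration.

```python
from itertools import combinations
from math import comb

def count_interleavings(m, n):
    """(m + n)! / (m! * n!): choose which of the m + n slots go to the first timeline."""
    return comb(m + n, m)

def interleavings(a, b):
    """Yield every merge of a and b that keeps each timeline's own order."""
    total = len(a) + len(b)
    for a_slots in combinations(range(total), len(a)):
        chosen = set(a_slots)
        next_a, next_b = iter(a), iter(b)
        yield [next(next_a) if i in chosen else next(next_b) for i in range(total)]

def run(schedule):
    """Replay one interleaving of read/write steps against a shared counter."""
    shared = 0
    local = {}                       # each timeline's private copy of what it read
    for timeline, step in schedule:
        if step == "read":
            local[timeline] = shared
        else:                        # "write"
            shared = local[timeline] + 1
    return shared

timeline_a = [("A", "read"), ("A", "write")]
timeline_b = [("B", "read"), ("B", "write")]
schedules = list(interleavings(timeline_a, timeline_b))
outcomes = {run(s) for s in schedules}

print(count_interleavings(12, 12))   # 2704156
print(len(schedules), outcomes)
```

Two timelines of only two steps each already have six interleavings, and they do not all agree: some end with the counter at 2, others lose an update and end at 1. Brute-force enumeration like this only works for small timelines, which is why the techniques that follow matter.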
I’ve developed some concepts to help you figure that out. First of all, if they don’t share anything, if they don’t use the same resources, if they don’t communicate, then they’re always going to get the same answer relative to each other.
Like thread A and thread B, if they don’t share anything, the interleavings don’t matter. The thread A might interact with thread C. Then the interleavings matter. The thread A and B relative to each other, they don’t matter.
One way of reducing the complexity, the amount that you have to think about in your head, is to isolate two timelines, meaning get rid of any shared resources. Say thread A and thread B both use the same global variable. Make it so that they don’t use a global variable. Make it so that they don’t have to do that.
The more you can do that, the less you have to worry about the interleavings between the two. The ideal would be that they have no shared resources. You can just totally forget about the interleavings, but maybe you have one or two that are left.
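One possible shape for isolated timelines, sketched in Python with the standard concurrent.futures module. The word-counting task and the names count_words, chunk_a, and chunk_b are made up for illustration. The assumption is that each thread can do its work using only its own arguments and local variables.

```python
from concurrent.futures import ThreadPoolExecutor

def count_words(lines):
    """Runs on its own thread; touches only its arguments and local variables."""
    total = 0                     # local, so no other timeline can see it
    for line in lines:
        total += len(line.split())
    return total

chunk_a = ["the quick brown fox", "jumps over"]
chunk_b = ["the lazy dog"]

with ThreadPoolExecutor(max_workers=2) as pool:
    future_a = pool.submit(count_words, chunk_a)
    future_b = pool.submit(count_words, chunk_b)
    # The only shared point is here, where the two results are combined.
    word_total = future_a.result() + future_b.result()

print(word_total)
```

Because neither thread reads or writes anything the other one uses, no interleaving of their steps can change the result. Compare that with a version where both threads add to one global total.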
Once you’ve gotten rid of all the ones you can, the timelines still might not be isolated. Maybe there are some left, because you’re always going to have actions. You need actions in your program, or it won’t have any effect on the world.
Once you have that, what do you do with the actions that are left? The first thing is you can start coordinating. You use concurrency primitives to coordinate. One thing you could do is to use a queue. The order still matters. Thread A might get into the queue before thread B does, but at least they’re not happening at the same time.
It lets you do multiple actions. It’s my turn. I can do multiple actions and get a transactionality to it. I can do four things, and then the turn is released to the next thread. Then it can do its three or four actions, and then release it.
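A small sketch of that idea in Python, using the standard queue and threading modules. The inventory scenario and the names worker, take, and log are assumptions for illustration. A single worker thread owns the shared resource and runs each queued job as one uninterrupted turn.

```python
import queue
import threading

jobs = queue.Queue()
inventory = {"widgets": 10}   # shared resource, touched only by the worker
log = []

def worker():
    """Single consumer: runs one job at a time, in the order they were queued."""
    while True:
        job = jobs.get()
        if job is None:       # sentinel: no more work
            break
        job()

def take(who, amount):
    """A multi-step job: check, then update. It runs as one uninterrupted turn."""
    def job():
        if inventory["widgets"] >= amount:
            inventory["widgets"] -= amount
            log.append(who)
    return job

consumer = threading.Thread(target=worker)
consumer.start()

# Two timelines both want 6 widgets; whoever gets into the queue first wins.
producers = [
    threading.Thread(target=jobs.put, args=(take(name, 6),))
    for name in ("thread A", "thread B")
]
for p in producers:
    p.start()
for p in producers:
    p.join()

jobs.put(None)
consumer.join()
print(inventory, log)
```

Which thread gets its widgets still depends on who reached the queue first, so the order matters. What the queue guarantees is that the check and the update of one job are never interleaved with another job, so the count cannot go negative.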
Putting stuff in a queue is one way to coordinate different threads. Another thing to do is when you’ve got your timelines, you can start to shorten them. I call it a cut. It’s where you have like a checkpoint between the two threads.
I like to use simple examples. Imagine you’re counting a big pile of books. It’s too many books for yourself. You know you want to get to lunch, so you invite a friend to come over and help you count the books. You have some books in one room and some books in another.
You’re not coordinating between. You don’t know who’s going to finish first, so you make a little agreement before you start counting. You say, “Hey, look, when you’re done, come into the living room. Whoever’s done first, we’ll wait for the other. Then we’ll go to lunch together.”
You make a little coordination between you. You don’t know who’s going to finish first. You don’t want to spend your whole time asking, “Hey, are you done yet? Hey, are you done yet?” You just want to get the work done and then talk, but you don’t know who’s going to finish first.
This is a good way to do it, because each person is going to wait for the other. That’s a way to coordinate. What it lets you do is divide the timelines by what happens before the coordination point and what happens after the coordination point. You’re actually able to analyze them completely separately.
Before the cut, the books have not been counted, or they’re partially counted. After the cut, they’re totally counted. You can just take the two answers and add them up, and you have the answer.
In JavaScript, you have Promise.all. It will wait for all the promises before continuing forward. That’s a way of cutting. It says that for the stuff that happens before, there might be coordination I have to do. There might be other stuff I have to do, but I can deal with half the timeline at a time instead of the whole thing.
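Here is the book-counting cut sketched in Python, where asyncio.gather plays roughly the role that Promise.all plays in JavaScript. The shelf numbers and the names count_books and count_all are made up for illustration.

```python
import asyncio

async def count_books(shelves):
    """One person counting one room, shelf by shelf."""
    total = 0
    for books_on_shelf in shelves:
        await asyncio.sleep(0)    # yield, so the two counters can interleave
        total += books_on_shelf
    return total

async def count_all():
    # The cut: nothing below this line runs until both counts are finished.
    mine, friends = await asyncio.gather(
        count_books([12, 7, 30]),
        count_books([5, 5]),
    )
    # After the cut, both totals are final, so adding them is safe.
    return mine + friends

total_books = asyncio.run(count_all())
print(total_books)  # 59
```

Everything before the gather can interleave in any order. Everything after it sees two finished counts, so the two halves can be analyzed separately.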
Let’s say you do put stuff in a queue. You’re still going to have ordering problems. The time you get into the queue is going to determine the effect of your actions. One other thing you can do is to make the actions not depend on order. This is a thing that we call commutativity.
We want to make it so that these two actions can happen, A first then B, or B first then A. We guarantee we get the same result. These things are very situation-dependent, but there are a bunch of data structures that have commutative properties. Here’s the example, it’s kind of a silly example.
Let’s say that we want to count people who are all…they’re like in a conference. They’re moving around in different rooms and I’m afraid that if I start counting people, someone’s going to leave and go to the other room and you’re going to count the same person and what do we do?
If we scan badges…every person has a badge and that badge has an ID. I can scan a badge and that person can leave the room and then you scan the same badge in the other room, but we’re not going to count it twice. We’re just going to make a big set of all the IDs we scanned.
Then once we think we’re done, once we’ve counted everybody, we are going to count how many unique IDs we scanned. Now notice, if I scan them first, or you scan them first, it doesn’t matter. It’s all going into the same big bucket. That’s a way of making the order not matter.
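The badge idea can be checked directly with a set in Python. The badge IDs and the name scan are made up for illustration. The sketch replays the same six scans in every possible order and collects the distinct results.

```python
from itertools import permutations

def scan(seen, badge_id):
    """Record one badge scan. Scanning the same badge again changes nothing."""
    return seen | {badge_id}

scans = ["b17", "b02", "b33", "b02", "b40", "b17"]  # made-up badge IDs

results = set()
for order in permutations(scans):     # all 720 orderings of the six scans
    seen = frozenset()
    for badge_id in order:
        seen = scan(seen, badge_id)
    results.add(seen)

head_count = len(next(iter(results)))
print(len(results), head_count)
```

Every ordering ends with the same set, so there is only one possible result and the head count is the same no matter who scanned whom first. A plain counter that adds one per scan would not have this property, because duplicate scans would be counted twice.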
Finally, there are things that you only want to happen once. You want a guarantee that it happens exactly one time. An example of this is sending an email. If I want to send an email to one of my customers, let’s say they buy something. I want to send them an email, with the confirmation, with their receipt. I only want that to happen once.
I don’t want it to…like, what happens in a distributed system is the network might go down. My server sends a message to the email server saying, “Hey, please send this person an email.” I never hear a response back. I don’t know positive or negative.
“What happened?” Probably something in the network broke or the program on the other side crashed, or something. I don’t know. I just didn’t hear back. What do I do? If I send it again, the same message again, what if they already sent it the first time? What if the program crashed after it sent the email? I don’t know. Maybe it crashed before. I have no idea because I didn’t get a confirmation.
If I send that again, it might send a second email. If I don’t send it, what if it had sent it zero times? From my perspective, the server, I’m in a bad place. What you’d want to do is introduce idempotency. Meaning, I can send the same request twice. I can have the same action, I execute it twice, but it only has the effect once.
I can send it a hundred times, if I’m really paranoid that you’re not going to send it, or that the network is so bad, I need to keep retrying, retrying, and retrying. I’m sure that it’ll eventually get through.
This is idempotence. It’s a way of guaranteeing that something won’t happen more than once. Because retrying is safe, it also lets you avoid the zero case. If the email server does come back up, I can keep sending the same request over and over, and it only happens once. This is another thing that we do as functional programmers.
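One common way to get this behavior is to attach a unique key to each request and have the receiver remember which keys it has already handled. The Python sketch below shows the idea under stated assumptions: the EmailServer class, the send_receipt method, and the key format are all hypothetical, and the in-memory set stands in for durable storage that a real server would need.

```python
class EmailServer:
    """Hypothetical receiver that deduplicates requests by an idempotency key."""

    def __init__(self):
        self.sent = []             # emails actually delivered
        self.seen_keys = set()     # keys of requests already handled (in memory only)

    def send_receipt(self, idempotency_key, to, body):
        if idempotency_key in self.seen_keys:
            return "duplicate ignored"      # same request again: no second email
        self.seen_keys.add(idempotency_key)
        self.sent.append((to, body))
        return "sent"

server = EmailServer()
# The client never heard back, so it retries the same request with the same key.
replies = [
    server.send_receipt("order-1001-receipt", "customer@example.com", "Your receipt")
    for _ in range(100)
]
print(len(server.sent), replies[0], replies[-1])
```

The client can retry as often as it likes, and the customer still gets one email. In a real system the check and the record of the key would have to be atomic and survive a crash, which this sketch does not attempt.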
I’ve gone on quite long with this timeline idea. I’m developing this as part of the book that I’m writing. There will be a much better introduction to this in that book. It’s actually the chapters I’m working on right now.
Let me recap. We work in distributed systems now. All software is going to have multiple pieces to it that are communicating. We just can’t deal with this sequential model of time anymore. We need a way of visualizing and developing new models of time that are bespoke for the systems that we’re dealing with now.
The timelines are my attempt at making that. Timelines come about when you have multiple machines, multiple processes, multiple threads, or async callback chains. In all those cases, you’re dealing with timelines and you want to be able to manipulate those.
We can make timelines easier to work with by isolating them, meaning remove shared resources. If they still have shared resources, which many will, then you can start coordinating between them. Shorter timelines are easier to deal with than long timelines.
You want some way of being able to cut these timelines into pieces and deal with them at a smaller level. Actions that don’t depend on order are much easier to work with and reason about. Idempotence will allow you to deal with failures in a graceful way, without sending someone a thousand emails by accident.
I want to give you a win. I want to give you some takeaway in this. Go check out some bug that happens very rarely. I want you to map out all the pieces that you believe are involved in this bug. Draw out timelines. Draw boxes for every action in each of those pieces. See if you can figure out an order that generates that bug.
Just try it. If it’s too complicated, make up a scenario that’s simpler, that’s related. All right. Now do me a favor, please. If you found this useful, please like, plus one, thumbs up, up vote, share, heart, favorite, whatever you can do. All of it really helps. Please comment and, certainly, subscribe.
If you subscribe, you will get updates as they come out. If you found this one useful, I make more just like this. Also, if you want to ask any questions…I love answering questions in the future episodes. You can get in touch with me by email, eric@LispCast.com; on Twitter, @ericnormand, with a D; or find me on LinkedIn and ask me a question there.
Awesome. This has been pretty long, so I’ll cut it off right here. Thank you.