Why does time move forward?

Last week, we talked about how humans measure and experience time. We determined that, although time is ultimately defined by the cyclical events we use to mark its passage, it has an undeniable forward quality as we move through our lives. We called this quality the “arrow of time,” and rooted it in a simple observation: of the large set of events that are technically allowed under the laws of physics, only certain events ever seem to occur.

Today, I’d like to discuss why time’s arrow only moves forward – why some things happen and others don’t. Why, for example, does a drop of food coloring always spread out to mix with a glass of clear water? Why doesn’t it ever condense back into a single drop? The answer, as I alluded to last week, boils down to a word you’ve probably come across before: entropy.

To understand entropy, we need to leave the human realm and take a trip to the quantum realm. Recall that what distinguishes this realm is its small size scale, on the order of individual atoms and molecules. Ultimately, the keys to understanding entropy, and indeed time’s arrow itself, lie here at the bottom. Before we talk about entropy, though, let’s define two useful terms that come into play in this realm: microstates and macrostates.

What are microstates and macrostates?

Imagine you and your friend are playing a game*. The game is one that she invented. “It’s called Penny Encoder,” she tells you, “and it goes like this:”

I’m going to leave the room. Take these pennies and place them heads up or tails up in a row on the table. Take a picture of your arrangement and then put the pennies away. Then, using as few characters as possible, write down a description of your arrangement, so that when I come back in, I can reconstruct the row of pennies just as you had it. We both win if my arrangement is equivalent to your picture. 

Your friend leaves, and you lay out a row of five pennies in the following arrangement:

You take the above picture, put the pennies back in their special box, and think about how to encode your configuration. After a minute, and thinking yourself to be quite clever, you write “HTTHH” on a piece of paper. Of course, when your friend comes back, she is easily able to replicate the pattern of pennies, and you both win the game. 

Now it’s your turn to leave the room. When you return, your friend hands you a piece of paper that just reads “H1.” Somewhat perplexed, you make the following guess about how the pennies should be ordered:

Your friend says, “correct!” and shows you the picture of her original configuration:

“Hang on,” you say, “That’s not the same as what I guessed.”

“Sure it is,” she says. “My arrangement has one heads and four tails, and so does yours. They’re equivalent. And I was able to describe it in two characters, where you needed five.”

You retort, “Sure, but I thought the point of the game was to correctly reconstruct the position and orientation of each individual penny!”

Who’s correct here? Well, the real problem is that the rules of Penny Encoder do not specify whether the microstate or the macrostate needs to be reconstructed to constitute a correct answer. Okay, let’s deal with those terms in detail.

A microstate is what you described with “HTTHH.” It completely encodes the state (heads or tails) of every penny in the group. A macrostate, on the other hand, is what your friend described with “H1.” It describes the pennies as a single system (“one is heads-up; the rest are not”) without regard for the state of any individual coin. Every possible configuration of the pennies is associated with both a microstate and a macrostate, but you’ll notice right away that the relationship between them isn’t one to one. What do I mean by that?

Well, clearly every microstate has exactly one associated macrostate. For example, “HTTHH” could be rewritten as “H3,” but it could not be rewritten as “H2,” because the number of heads-up coins in “HTTHH” is 3, not 2. However, as we’ve seen, a macrostate like “H1” maps to multiple possible microstates – five, in fact: “HTTTT,” “THTTT,” “TTHTT,” “TTTHT,” and “TTTTH.” 

Some macrostates have more associated microstates than others. For example, “H5” has only one:

(If you’re snarky, you might argue that we could swap around the order of these pennies, and that would constitute a new microstate. My answer to that is that we’re assuming that each penny is indistinguishable from any other and so it is completely described by its heads/tails orientation and its relative position in the line. So swapping any two heads-up pennies does not change the microstate, but swapping a heads-up penny with a tails-up one does. This argument will be more convincing later when we talk about molecules.)
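If you’d rather see this one-to-many mapping spelled out than take my word for it, here is a quick sketch in Python (my own illustration, not part of the game) that lists every possible microstate of the five pennies and groups them by macrostate:

from itertools import product
from collections import defaultdict

# Enumerate every microstate of five pennies: each penny is either 'H' or 'T'.
microstates_by_macrostate = defaultdict(list)
for microstate in product("HT", repeat=5):
    label = "H" + str(microstate.count("H"))          # the macrostate, e.g. "H1"
    microstates_by_macrostate[label].append("".join(microstate))

# Every microstate lands in exactly one macrostate, but a macrostate
# like "H1" collects several microstates.
print(microstates_by_macrostate["H1"])
# prints: ['HTTTT', 'THTTT', 'TTHTT', 'TTTHT', 'TTTTH']

Running it confirms the one-to-many relationship: thirty-two microstates, but only six macrostates.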

What is entropy?

We’ve established that every microstate of a system corresponds to exactly one macrostate, and every macrostate corresponds to at least one microstate. Well, entropy is a characteristic of a macrostate. Essentially, entropy is the number of microstates that a given macrostate could map to**. Let’s return to Penny Encoder and consider all of the game’s macrostates. Since each coin has only two possible states, this is easy: the macrostates run from H0 to H5. What is the entropy of each one?
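Macrostate     Entropy (number of microstates)
H0             1
H1             5
H2             10
H3             10
H4             5
H5             1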

That’s it! That’s really all entropy means. You’ll no doubt notice the pattern in the above table: at least in this example, the macrostates “in the middle” (that is, those for which the numbers of heads-up and tails-up coins are nearly equal) have higher entropy. For more complicated systems, this pattern becomes an (extremely pointy) bell curve, peaking sharply at those middle states.

In fact, this peak is important to our intuitive understanding. Entropy is maximized when a system is more “mixed.” Put another way, in a macrostate with high entropy, it is difficult to predict the state of any constituent part of the system. With “H4,” for example, we can guess with 80% accuracy that a given coin is heads up, because ⅘ of them are. With “H3,” that accuracy drops to 60%. Clearly, the most difficult predictions would be in a larger system split evenly between heads-up and tails-up coins, where we could not predict the state of individual coins any better than by random guessing.
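To see the link between entropy and predictability all at once, here’s another tiny Python snippet (again, my own illustration): for each macrostate of the five pennies, it computes your best possible accuracy when guessing the orientation of a randomly chosen coin.

# For a macrostate Hh, the best strategy is to guess whichever orientation
# is more common, so the accuracy is max(h, 5 - h) / 5.
for h in range(6):
    accuracy = max(h, 5 - h) / 5
    print(f"H{h}: {accuracy:.0%}")
# H0 and H5 give 100%, H1 and H4 give 80%, H2 and H3 give 60%:
# the highest-entropy macrostates are exactly the hardest to predict.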

Why does entropy matter? Let’s think back to the game we played last week with the projectionist, where we tried to guess whether films were playing normally or in reverse. Suppose we watch a video of a single coin being flipped over and over, except we don’t see the mechanism that actually flips the coin. We just see the string of outcomes: some pattern of tails, tails, heads, tails, heads, tails… We would have no way of determining the time-direction of this movie! The random sequence of a single coin’s orientations makes no irreversible progress.

Suppose, now, that we watch a second movie, just like the previous one, except there are 100 coins in a 10-by-10 grid. They have no discernible pattern of orientation; heads-up and tails-up coins are scattered throughout the grid. Several times a second, a random coin in the grid is flipped. After a few minutes, we see that all of the coins are tails up, and then the movie stops. Of course, we cannot know with absolute certainty whether this movie is playing forward or backward. But common sense tells us that we’re far more likely to see the coins start in a state of 100T and move to a more mixed macrostate than the other way around. In other words, this movie is almost certainly playing in reverse.

Why is that? Essentially, since the coins are flipped at random, over a sufficiently long time period every microstate is equally likely. But the macrostate 100T has only one microstate associated with it, whereas a macrostate like 50T has about 10²⁹ of them! In all likelihood, enough random changes in the system will yield a many-microstate (or high-entropy) macrostate.
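You don’t have to take the coin movie on faith, either. Here’s a small Python sketch of it (the one-flip-per-step rule is just my stand-in for “several times a second”) that starts the 100 coins at 100T, flips one coin at random per step, and tracks how many are heads up:

import random

coins = ["T"] * 100                              # start in the one-microstate macrostate: 100T
heads_over_time = []

for step in range(5000):
    i = random.randrange(100)                    # pick one coin at random...
    coins[i] = "H" if coins[i] == "T" else "T"   # ...and flip it
    heads_over_time.append(coins.count("H"))

# The number of heads-up coins drifts toward ~50 and then hovers there,
# because the overwhelming majority of microstates belong to "mixed" macrostates.
print(heads_over_time[::500])

Run it a few times and you’ll see the same drift every time: away from 100T and toward the mixed middle, because enough random changes always push the system toward high-entropy macrostates.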

Put another way, the entropy of an isolated system over some significant time period will never decrease. Physicists call this the second law of thermodynamics, and I will call it 2LTD for short. Notice that this does not preclude random fluctuations in a system’s microstate that happen to decrease its entropy temporarily. Such events are certainly allowed, but they will only be noticeable in extremely small systems. 

This is why we had to visit the quantum realm to have this discussion: time essentially has no direction at this size scale. From the perspective of a single atom or molecule, every event is a random fluctuation with no detectable direction. Zoom out any further, though, and the universe can be described as a single system with untold microstates slowly converging toward the most probable macrostate. Even in the human realm, we do not observe such small changes; all we see is the steady increase of entropy.

How is entropy actually related to the arrow of time?

Hang on, you might be thinking. We just made a pretty huge leap from a statement about a weird coin movie to one about the entire universe. Surely the configuration of, say, a group of molecules is more complicated than the binary heads-or-tails states of pennies. That’s true, so at this point, it’s worth stepping back into the human realm and asking what entropy has to do with the conclusion we drew last week, that there are certain events we know to be physically possible but never observe.

Think about some of the clearest examples of time’s arrow in the human realm. Many of them involve some kind of diffusion, or spreading out, of either energy (in the form of heat) or matter. Butter melts in a pan, dye spreads out in water, smoke from a bonfire mixes with the surrounding air. We can think of all of these diffusions as a transition to a macrostate with greater entropy. 

Take the butter in the pan, for example***. Initially, the butter molecules are colder (bluer) than the pan’s (hotter, redder) molecules, rendered here in beautiful 2-D Art™:

The molecules exchange heat via random collisions with one another – let’s say that each molecule only collides with the one directly across from it. Because the total amount of energy must be conserved, every collision between two molecules makes one warmer and the other colder. Thus, it is technically possible for the butter to give heat to the pan and become colder…

…but that would be a more “polarized” macrostate, with fewer possible configurations (because we are approaching the macrostate where all of the butter is “maximally” cold, and the pan “maximally” hot, which only has one microstate). It is much more likely, then, that the pan’s hot molecules will grant heat to the butter, cooling the pan slightly and warming the butter, causing it to melt. 

When the two objects are the same temperature (equilibrium), they have reached a system-wide state of maximum entropy. Put another way, as we said above: given the temperature of a random molecule from the whole system, it gets more and more difficult to guess whether it is a pan molecule or a butter molecule.
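For the skeptical, here is one last Python sketch of the pan and the butter. The collision rule is my own simplification (each collision just reshuffles the colliding pair’s combined energy at random, which conserves the total), but it captures the spirit of the cartoon above:

import random

pan    = [10.0] * 50   # "hot" pan molecules, in arbitrary energy units
butter = [ 2.0] * 50   # "cold" butter molecules, each across from one pan molecule

for step in range(20000):
    i = random.randrange(50)
    total = pan[i] + butter[i]           # energy is conserved within the colliding pair...
    pan[i] = random.uniform(0, total)    # ...but gets redistributed at random
    butter[i] = total - pan[i]

print(sum(pan) / 50, sum(butter) / 50)   # both averages end up near 6.0: equilibrium

The two averages meet in the middle, and once they do, an energy reading from a random molecule no longer tells you which side it came from.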

Diffusion of matter through space works the same way. The starting configuration of a drop of dye in a glass of water is one of minimum entropy. Given the position of a random molecule, we can know right away whether it is a dye molecule or a water molecule. But after enough random collisions, the system will approach the maximally entropic macrostate (that of the dye molecules being evenly spread throughout the water) because there are so many corresponding microstates. Not coincidentally, this kind of diffusion would be easy to spot as going forward in time, just as the melting butter would.

So we now see that the processes we described last week (that is, the ones we only see happen in one direction) are inextricably linked to an increase in entropy. Can we say that changes in entropy allow us to observe events moving forward in time? In fact, the truth is even more stark: the increase of entropy is the only physical reason that time moves forward.  

Note the use of “only” here. Why are we so sure that some other phenomenon isn’t contributing to time’s directionality? Because entropy is the only quantity in nature bound by a one-way law like 2LTD. The other laws of nature, such as the conservation of certain quantities, are symmetric: any fluctuation they allow is just as allowed in reverse, and fundamental constants do not change. But 2LTD uniquely mandates a steady, asymmetric increase in the universe’s entropy. Because the universe is an isolated system (and the only one we’re aware of, which is why we call it a “universe”), we should think of the statistical laws that force overall entropy to increase**** as the sole cause, not merely an observable effect, of time’s arrow.


Phew. Once again, thanks for sticking with me through what turned into a really long post. As before, I’ll acknowledge that you may have heard some of this before, but my goal is not to highlight what you didn’t already know. In fact, my goal is the opposite. It’s to highlight the importance of what sits intuitively in your mind.

I spoke in my first post about the heavy use of analogy in physics and my view that, at the end of the day, we must accept that the analogy is not merely describing the reality of an external world; in fact the analogy is the reality of a world we can only interact with by sensing and reasoning with our metaphor-hungry human minds. So when we say that entropy’s increase causes time to move forward, I really do mean it. I’m sure there are physicists out there who would take issue with my use of the word “cause” (and I’m eager to hear from them!), but the way I see it, if increasing entropy is the only way for us to conceptualize time’s arrow, it truly is the only cause.

Next week, we’ll take our most exotic journey yet, to the cosmic realm. Up there, everything we thought we knew about time will be called into question.


*I have been –and will continue to be– referring to thought experiments in the form of “games.” Generally speaking, these games are no fun and I do not endorse playing them. That said, I have often proven to be a poor judge of what other people will enjoy, so please let me know if you ever spend your Saturday night playing Penny Encoder.

**As with many things I’ll informally define on this blog, the scientific definition of entropy is slightly more complicated, but only slightly. Essentially, entropy is formally defined as the natural log of the number of microstates, multiplied by a number called the Boltzmann constant. This fact doesn’t change any of the subsequent analysis we will do.
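In symbols, that’s S = k · ln(W), where W is the number of microstates associated with the macrostate and k is the Boltzmann constant, roughly 1.38 × 10⁻²³ joules per kelvin.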

***One caveat here is that in an actual cooking situation, the pan would be receiving heat from the stove, which means the pan-butter system is no longer “isolated,” as 2LTD requires. However, the analogy works just as well if the stove is turned off after the pan has been heated but before the butter is added.

****You may notice that 2LTD says that entropy cannot decrease, but when I describe time’s arrow, I talk about entropy increasing. Can entropy stay the same over time? The answer is yes, but only for a completely isolated system, which doesn’t really exist except at the scale of the entire universe. Eventually, the universe may reach total thermodynamic equilibrium, in which case time would stop.