Coffee cools, eggs scramble, the past never comes back. Yet the laws underneath run just as well in reverse. Here is the simplest machine I know that holds both truths at once, and shows where the missing order really goes.
Coffee cools on a table and never gathers heat from the room until it is steaming again. An egg breaks into a pan, its yolk and white stirred into streaks, and the streaks never climb back into a smooth shell. A deck is shuffled, and although no card has vanished, the crisp order that once ran from ace to king does not reassemble itself in one miraculous cut. A smell released in the corner of a room spreads outward and does not later collect itself into the bottle. The past leaves traces everywhere, but the past itself does not return.
This is the everyday arrow of time. It is not an abstract philosophical ornament placed on physics after the fact. It is the asymmetry we live inside: melting rather than unmelting, mixing rather than unmixing, forgetting rather than spontaneously remembering. And then comes the old difficulty, still sharp each time I meet it. The microscopic laws underneath these events do not seem to contain such an arrow. The equations for molecules, fields, and particles run backward as well as forward. Reverse all the velocities in an ideal movie of the molecules in a cooling cup, and the equations do not object. Nothing in the microscopic rule says that one temporal direction is the proper one.
I can write that down. I can prove the small version of it. I can make the machine run backward in front of my eyes. And something in me still refuses to accept what the proof is saying without flinching. The refusal is not mathematical. The mathematics is clean. The trouble is that the world made by the mathematics looks as if it has forgotten itself, and then the mathematics calmly says no, nothing was forgotten. That calmness is part of what I find maddening.
So why does time point one way? Or, stated with less grandeur and more danger, why do reversible laws so often give us irreversible-looking lives? The question is easy to ruin by making it too vague. I want a toy world where every microscopic detail is visible, every update can be checked, and the whole motion can be reversed on command. A line of black and white cells is the cleanest toy model I know for exactly this puzzle. It is a stripped-down stand-in for reversible microscopic physics: a world whose law has no preferred direction, yet whose visible history looks like an approach to equilibrium.
I also want one object to follow, otherwise the whole business becomes too smooth. Near the top of the picture there is a sparse, almost-white row, a little scatter of black cells against the white. It is not important because it is large. It is important because it is legible. You can point at it. You can say, there, that piece. Then the rule starts, and the question becomes sharp enough to hurt: after the scatter has been swallowed by the woven noise below, could it ever climb back out?
I want to begin with the picture, because the picture is doing much of the thinking for us before any words have a chance to get in the way, and because there is a small danger, in a subject like this, that the formal language will make the phenomenon seem more remote than it is. The phenomenon is right there: a few marks at the top, a spreading tangle underneath, and a strange feeling that something irreversible has happened on a page made by a rule that will turn out to be reversible.
Take a line of cells, each cell black or white. Start with a row that is almost all white, with a few black cells scattered through it. Now apply one fixed rule again and again, drawing each new row underneath the previous one, so that time becomes vertical and the future accumulates downward on the page. This is one of the pleasures of cellular automata: time, which in life is something one inhabits, becomes a visible direction, and the history of a system becomes a texture one can inspect with the eye.
The change starts at once. There is no long quiet prelude in which the black cells wait politely for the rest of the row to notice them. Each black cell opens into a widening diagonal wedge, as if it has been given a small local alphabet with which to write across time. The wedges cross, interfere, and tangle; after only a few dozen rows the lower part has become a dense woven fabric, a coarse tartan of diagonals that the eye reads as static. If I hid the top and showed you one row from near the bottom, you would call it random, and the row itself would give you no easy way to recover your mistake.
There is a detail in the picture that is easy to miss because the whole lower region has the air of undifferentiated noise. The pattern is not a smooth grey fog. It is woven. It contains slanting strands, collisions, small rhombi, repeated fragments, then fragments of fragments. The eye can follow a line for a while, then loses it where other lines pass through. The little scatter does not fade like ink in water; its identity is pulled into strands, copied into relations, made hard to name. That loss of track is already close to the central issue. Order may remain present in a form that no short description, no glance, and no local sample will catch.
This is the ordinary story of mixing, at least as it first presents itself. Ink spreads through water. A shuffled deck ceases to display the arrangement it had. A warm room opened to a cold night drifts toward a common temperature. Special-looking beginnings become generic-looking middles, and the return trip never appears in experience unless someone has done an implausible amount of arranging beforehand. Physics gives the generic condition a name, equilibrium, and the second law of thermodynamics gives the drift toward it the status of a law. After seeing enough examples, the law begins to feel less like a statement about heat than like a statement about the direction of time.
Yet even in those familiar examples there is a hidden qualification. When I say the ink spreads, I am describing a density field, a coarse colour seen by an eye with limited resolution, rather than all molecular positions and velocities. When I say the deck is shuffled, I am describing its failure to match some simple human pattern, rather than a literal loss of the order of the cards. The words we use for disorder are often words for what certain observers can no longer exploit. That thought is easy to say, and harder to keep in view when the picture looks so convincingly like a one-way process.
But I chose the rule behind the picture for a reason. It has a property that makes this comfortable account start to creak the moment one takes it seriously. The rule is designed so that, if one had enough patience and enough precision, the entire tangle could be unmade. The visible relaxation is produced by a dynamics that conserves the microscopic state perfectly. In this little world, the microscopic law has no arrow, while the visible history seems to have one. That is the whole thermodynamic puzzle in miniature, with nowhere for the discomfort to hide.
The rule is plain enough to say in one breath. The cells move on a checkerboard schedule. On one step, only the even-positioned cells are allowed to change while the odd-positioned cells stay fixed; on the next step the roles swap. When a cell gets its turn, it looks at its two neighbours. If they differ, the cell flips. If they match, it stays as it is.
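The rule can be written down almost as briefly as it can be said. What follows is a minimal sketch, not the code behind the picture: I am assuming black is 1 and white is 0, that the line is closed into a ring of even length (the text above does not say what happens at the edges), and that even-numbered half-steps move the even cells. The name `step` is my own.

```python
def step(row, s):
    """One half-step: cells whose index has the parity of s get their turn."""
    n = len(row)
    assert n % 2 == 0, "assumed: a ring of even length, so the checkerboard closes"
    new = list(row)
    for i in range(s % 2, n, 2):
        left, right = row[i - 1], row[(i + 1) % n]  # these neighbours sit still
        if left != right:                            # neighbours differ: flip
            new[i] ^= 1
    return new


# Assumed toy start: a single black cell on a ring of 8.
row = [0, 0, 0, 1, 0, 0, 0, 0]
history = [row]
for s in range(2):
    row = step(row, s)
    history.append(row)

for r in history:
    print("".join("#" if c else "." for c in r))
```

Printed one row under another, the single black cell opens into a wedge that is one cell wider on each side after every half-step.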
The checkerboard detail matters. If every cell tried to update at once, the neighbours being consulted would themselves be changing, and repeating the move would no longer undo it. By letting only half the cells move at a time, the moving cell sees neighbours that are held fixed during that instant. The update becomes a small reversible gate placed repeatedly along the line. One may think of the even cells as taking a breath while the odd cells stand still, then the odd cells doing the same while the even cells stand still.
Now the small miracle. Suppose a cell has flipped. During that move its two neighbours did not change, so the question that caused the flip ("are my neighbours different?") still has the same answer one instant later. Give the cell the same turn again and it flips back. The local move is its own inverse. In symbols, if a cell with neighbours $a$ and $b$ updates by $c \mapsto c \oplus (a \oplus b)$, then doing it twice gives $\big(c \oplus (a\oplus b)\big) \oplus (a \oplus b) = c$, because anything XORed with itself cancels.
The notation is compact, so it is worth unpacking it for a moment. The symbol $\oplus$ is addition mod 2: white and black can be treated as 0 and 1, and adding two equal bits gives 0 while adding two different bits gives 1. The expression $a \oplus b$ is therefore the statement that the neighbours differ. If they do, it contributes 1 and flips $c$; if they match, it contributes 0 and leaves $c$ alone. Applying the same contribution twice adds it twice, and in mod 2 arithmetic twice any bit is 0. The algebra says in one line what the local picture says in a small motion: the move has a handle by which it can be pulled back.
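The algebra is small enough to check by brute force. As a sketch, with `local_move` a name of my own choosing, the eight possible combinations of a cell and its two neighbours can simply be listed:

```python
from itertools import product


def local_move(c, a, b):
    """The update c -> c XOR (a XOR b), with the neighbours a and b held fixed."""
    return c ^ (a ^ b)


for c, a, b in product((0, 1), repeat=3):
    once = local_move(c, a, b)
    twice = local_move(once, a, b)
    print(f"c={c} neighbours=({a},{b})  once -> {once}  twice -> {twice}")
```

In every one of the eight rows the value after two applications is the value the cell started with.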
Once each local move can be undone, the whole row can be undone, step by step, all the way back to the beginning. That changes the nature of the picture. Irreversibility usually enters when different pasts are folded into the same present, so that the present no longer contains enough information to choose among them. This rule never performs such a folding. Every row has one successor and one predecessor. The past has not been discarded; it is encoded in the present, and the decoding procedure is the same rule run in the opposite temporal order.
This is where my intuition starts to complain. The bottom of the picture looks like something into which many beginnings could have flowed, the way many pours of ink become the same smoky blur. But the rule says that no two beginnings ever arrive at the same row. If I know the entire row at the bottom, cell by cell, with no blurred pixels and no rounded densities, I know enough to reconstruct the top. There is no many-to-one compression in the dynamics. The visual many-to-one-ness is being supplied by me, by my way of grouping rows that look the same while differing in microscopic detail. Dry little aside: this is an awkward thing to discover about one's own eyes.
A tiny example makes the point less mystical. Suppose a moving cell has neighbours 0 and 1. Since they differ, the cell flips: 0 becomes 1 or 1 becomes 0. If the neighbours are still 0 and 1 and I apply the move again, it flips back. If the neighbours are 1 and 1, the cell does not move, and doing the move again leaves it where it was. In both cases, the second application erases the first. The rule does not need a stored memory of what happened. The present neighbours contain the instruction for the inverse.
I find it difficult to hold that fact in mind while looking at the woven static near the bottom, so it is better to let the machine demonstrate it. Run the rule forward and the sparse start will dissolve in the familiar way. Then click a row deep in the apparent noise and run backward from there. The counter records how many cells differ from the original starting row, and the graph draws that number against depth. Forward, the count rises and levels out. Backward, it retraces the same path and reaches zero, cell for cell.
The last row of that backward run is the original row, with no averaging and no tolerance. The side curve is symmetric in time, which gives the measured disorder an unsettling character: it rises in one direction and falls in the other under the same deterministic law. So the same picture carries two true descriptions. The row approaches what looks like equilibrium. The row has thrown away no information and can be returned to its special beginning. Nothing was lost, everything was scrambled, and I still don't know how to feel about that being true. The interesting work is to make those statements occupy the same world without making either one weaker.
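The same experiment can be imitated in a few lines. This is a sketch under assumptions of my own, not the machine itself: a ring of 200 cells, 60 half-steps, a fixed random seed, and a start that is black with probability 0.08. Because each half-step is its own inverse, running backward means applying the same half-steps in the opposite order.

```python
import random


def step(row, s):
    """One half-step on a ring of even length (assumed boundary condition)."""
    n = len(row)
    new = list(row)
    for i in range(s % 2, n, 2):
        new[i] = row[i] ^ row[i - 1] ^ row[(i + 1) % n]
    return new


def hamming(x, y):
    """Number of cells in which two rows differ."""
    return sum(a != b for a, b in zip(x, y))


random.seed(0)
N, T, P = 200, 60, 0.08
start = [1 if random.random() < P else 0 for _ in range(N)]

row = start
forward = [0]
for s in range(T):                 # forward: half-steps 0, 1, ..., T-1
    row = step(row, s)
    forward.append(hamming(row, start))

backward = [forward[-1]]
for s in reversed(range(T)):       # backward: the same half-steps, last first
    row = step(row, s)
    backward.append(hamming(row, start))

print("forward :", forward[::10])
print("backward:", backward[::10])
```

The backward list is the forward list read in reverse, and its last entry is zero: the final row equals the starting row cell for cell.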
One could try to escape by saying that the apparent equilibrium is an illusion, but that evades too much. The bottom row is harder to predict locally than the top. Its single-cell bias has changed. Its short summaries have changed. If I sample a handful of cells, I get statistics that are closer to those of a fair coin. The appearance is not a hallucination. The reversible microscopic law and the coarse equilibration are both facts about the same computation. The arrow of time, if it is going to survive here, has to be an arrow in what can be seen and used, not an arrow in the microscopic rule itself.
I want to keep the tension visible for a little longer, because the answer becomes clearer when the contradiction is made sharp. Reversibility is a statement with no mercy in it: the entire state at one time determines the entire state at any other time. Equilibration, as we first meet it, sounds like a loss of distinction: many special beginnings become one generic-looking condition. The paradox is born because both statements seem to be speaking about the same object, the row of cells, while measuring different things about it.
If the rule loses nothing, in what sense is the bottom of the picture more equilibrated than the top? Something has changed. The top row is easy to characterize: mostly white, with rare black cells. The little scatter has a name because it has a place. The bottom row resists that kind of summary. Reversibility does not erase the visual asymmetry. It says only that the asymmetry cannot come from information being destroyed. The arrow, if there is one, must lie in the relation between the row and the descriptions available to us.
There is a useful distinction between a state and a feature of a state. The exact row is a state. The fraction of black cells is a feature. The list of all neighbouring pairs is another feature. The question of whether a cell is more likely black than white is another. A reversible rule may preserve the exact state information while changing which features are easy to see. It can take a simple feature of the beginning, the low density of black cells, and convert it into a complicated feature of the later row, such as a pattern of parity relations extending across long distances.
A second clue comes from changing the beginning. Below are a dozen random starts, all evolved by the same rule. Their microscopic histories differ, and yet after a short time they become hard to distinguish by eye. Equilibrium, in this sense, is a shared appearance reached by many distinct states. That is already telling us that it belongs to a coarse description, a way of looking that ignores most of the microscopic distinctions the rule continues to preserve.
Look at the twelve strips as if they were twelve gases in twelve boxes. Their exact microstates are different. They have to be, because the rule never merges them. Yet after a while the eye classifies them together. The classification is coarse: dense, mixed, no large blank region, no obvious bias, no memorable shape. Many exact rows fall under that description, and a sparse row is special because it falls under a smaller and more restrictive description.
So the question becomes more precise. What part of the order can be seen locally, and what part has become invisible without having ceased to exist? If I want a satisfying answer to the arrow of time, I need to watch one piece of information travel, rather than speak of the whole row as though it were a uniform cloud. I need to watch the scatter stop being an object and become a constraint.
The word that needs care is lost. When a shuffled deck has lost its order, the cards have not evaporated into vagueness. They still form one definite sequence. What has gone is a certain usable simplicity: the ability to predict, compress, or see a pattern without inspecting the whole arrangement in detail.
In the cellular automaton, the analogous simplicity is visible in the first row. Since black cells are rare, I can predict a cell rather well by saying white before I look. I can describe the row economically by giving the locations of the exceptions. If there are $N$ cells and only a small fraction are black, a list of black positions is shorter than a full black-white transcript. The top row has a bias, and bias is a form of order because it gives a successful shortcut.
Information theory makes this into a number. The information in a thing is the amount one could not have guessed in advance. A row that is almost all white is easy to predict: point to a cell, guess white, and the guess usually succeeds. One way to measure the remaining uncertainty is to ask how many yes-or-no questions, on average, are needed to specify a single cell. That average is the entropy, and for a row that is black with probability $p$ it comes out to
$$ s \;=\; p\,\log_2\frac{1}{p} \;+\; (1-p)\,\log_2\frac{1}{1-p}\quad\text{bits per cell}. $$
Put the starting row into it. The picture above began about 8% black, so $p = 0.08$, and
$$ s \;=\; 0.08\,\log_2\frac{1}{0.08} + 0.92\,\log_2\frac{1}{0.92} \;\approx\; 0.40\ \text{bits}. $$
Less than half a bit per cell. A row with black and white equally likely, with no bias to exploit, reaches the maximum of one bit. The starting row is cheap to describe in this single-cell sense, and easy to predict with a crude guess.
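The arithmetic is easy to reproduce. A minimal sketch of the single-cell entropy, with `binary_entropy` as my own name for it:

```python
from math import log2


def binary_entropy(p):
    """Entropy in bits of a cell that is black with probability p."""
    if p in (0.0, 1.0):
        return 0.0          # a certain outcome carries no uncertainty
    return p * log2(1 / p) + (1 - p) * log2(1 / (1 - p))


for p in (0.08, 0.25, 0.5):
    print(p, round(binary_entropy(p), 3))
```

At $p = 0.08$ the value rounds to the 0.40 bits quoted above, and at $p = 0.5$ it is exactly one bit.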
The number $0.40$ is modest, but the meaning is concrete. If I ask for a cell from the starting row and know only the density, I have much less uncertainty than I would for a fair coin. The best first guess is white. Errors occur, but not often. If instead the cell were black half the time, no bias would remain and one yes-or-no question would be needed on average. Entropy per cell is therefore measuring the value of the bias as a predictive resource.
Now the pressure appears. Run the rule and look at one cell by itself. As time passes, that cell depends on more and more of the original row, and the calculation below shows that its bias is driven rapidly toward fifty-fifty. Taken alone, the cell becomes a fair coin, with entropy one bit. Yet the whole row cannot have gained true entropy, because the map from past to present is reversible. Whatever information was needed to specify the whole initial row is still enough, in principle, to specify the whole later row. A single cell has become maximally uncertain while the complete configuration has not become more expensive to describe. The missing order must have moved into dependencies among cells.
This is the first place where the word entropy splits into two uses that must be kept apart. The entropy of one cell after we ignore the rest can rise. The entropy of the entire row under an invertible transformation cannot rise, because an invertible relabelling of states preserves the number of possible exact states. In a finite system, if every possible row is mapped one-to-one onto another possible row, the map is a permutation. A permutation can rearrange uncertainty, but it cannot create more exact alternatives than were present before.
If $X$ is the whole initial row and $Y$ is the whole later row, the reversible rule gives a one-to-one correspondence between them, so their full entropies agree:
$$ H(Y) = H(X). $$
But if $Y_i$ is one cell of the later row, the entropy $H(Y_i)$ can be larger than the entropy of one initial cell, because taking one coordinate discards the rest of the row. The operation "look at this cell and forget the others" is many-to-one, even when the physical evolution before it was not. In practice, that forgetting is where the apparent increase enters. The arrow begins where a complete state is replaced by a marginal view.
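For a row small enough to enumerate, both entropies can be computed exactly rather than argued about. The sketch below rests on assumptions of my own: a ring of 12 cells, three half-steps, and initial cells that are independently black with probability 0.08. It lists all $2^{12}$ starting rows with their probabilities, pushes each through the rule, and compares the entropy of the whole row with the entropy of cell 0 alone.

```python
from itertools import product
from math import log2


def step(row, s):
    """One half-step on a ring of even length (assumed boundary condition)."""
    n = len(row)
    new = list(row)
    for i in range(s % 2, n, 2):
        new[i] = row[i] ^ row[i - 1] ^ row[(i + 1) % n]
    return new


def entropy(probs):
    """Shannon entropy in bits of a list of probabilities."""
    return sum(-q * log2(q) for q in probs if q > 0)


N, T, P = 12, 3, 0.08
before, after = {}, {}
black = 0.0                                   # Pr[cell 0 is black] after T steps
for bits in product((0, 1), repeat=N):
    prob = 1.0
    for c in bits:
        prob *= P if c else 1 - P
    row = list(bits)
    for s in range(T):
        row = step(row, s)
    before[bits] = prob
    after[tuple(row)] = after.get(tuple(row), 0.0) + prob
    black += prob * row[0]

h_x, h_y = entropy(before.values()), entropy(after.values())
h_cell_start, h_cell_late = entropy([P, 1 - P]), entropy([black, 1 - black])
print("distinct later rows:", len(after), "of", 2 ** N)
print("H(X) =", round(h_x, 6), " H(Y) =", round(h_y, 6))
print("one cell: start", round(h_cell_start, 3), " later", round(h_cell_late, 3))
```

No two starting rows land on the same later row, the whole-row entropies agree up to rounding, and the one-cell entropy has already climbed well above its starting value after three half-steps.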
The bookkeeping is worth doing, because it turns the qualitative story into something one can watch being forced by the algebra. Instead of asking vaguely where the order went, I can pick one cell at time $t$ and ask which cells at time $0$ are responsible for it. The answer is a light cone, a region of possible influence widening at one cell per step on each side.
Each update XORs a cell with its two neighbours. Unwind that across $t$ steps and a cell that has just had its turn is nothing more than the running XOR, the parity, of a whole block of the original cells, a block that widens by one site on each side at every step:
$$ c_i(t) \;=\; \bigoplus_{\,|j-i|\,\le\, t} c_j(0) \pmod 2. $$
A late cell is therefore asking a question about a wide piece of the starting row: was the number of black cells in this block odd or even? Parity has a peculiarity that makes it a powerful mixer. If you take $n$ independent cells, each black with probability $p$, the chance that an odd number of them are black works out to
$$ \Pr[\text{odd}] \;=\; \tfrac{1}{2}\big(1 - (1-2p)^{n}\big). $$
Since $|1-2p| < 1$ for any $p$ strictly between $0$ and $1$, the term $(1-2p)^n$ collapses toward zero as the block grows, and the probability races to exactly $\tfrac12$. The bias is not being shaved away by a vague visual process; each widening light cone converts a biased collection of starting cells into an almost fair parity bit. The single-cell entropy is driven to one full bit.
I like to derive the parity formula because it contains the mixing mechanism in miniature. For one cell, the probability of even parity minus the probability of odd parity is $1-2p$: with probability $1-p$ the contribution is even, and with probability $p$ it is odd, so the imbalance is $(1-p)-p$. For two independent cells the imbalances multiply: the imbalance is the average of $(-1)^{c}$, and for independent cells the average of a product is the product of the averages. After $n$ cells the imbalance is therefore $(1-2p)^n$. Since the even and odd probabilities add to 1, and their difference is $(1-2p)^n$, solving those two equations gives the odd probability above.
A numerical check is useful. With $p=0.08$, one cell is black only 8% of the time. A block of 5 cells has odd parity with probability
$$ \tfrac{1}{2}\big(1 - 0.84^5\big) \approx 0.291. $$
A block of 25 cells has odd parity with probability
$$ \tfrac{1}{2}\big(1 - 0.84^{25}\big) \approx 0.494. $$
So a sparse beginning does not need a long time to lose its single-cell bias. Once a later cell depends on a few dozen initial cells, the parity question has nearly forgotten the original density. It has forgotten it only in the sense that this one output bit no longer reveals it.
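The closed form can be checked against a slower computation that adds the cells one at a time. Both functions below are my own illustrative names; the second tracks the chance of odd parity as each new independent cell either preserves or flips it.

```python
def odd_parity_prob(p, n):
    """Closed form: chance that an odd number of n independent cells are black."""
    return 0.5 * (1 - (1 - 2 * p) ** n)


def odd_parity_prob_stepwise(p, n):
    """Same quantity built up cell by cell, without using the closed form."""
    odd = 0.0                                  # zero cells: parity is surely even
    for _ in range(n):
        # stay odd if the new cell is white, become odd if it is black
        odd = odd * (1 - p) + (1 - odd) * p
    return odd


for n in (1, 5, 25, 101):
    print(n, round(odd_parity_prob(0.08, n), 3),
          round(odd_parity_prob_stepwise(0.08, n), 3))
```

The two columns agree, and the rows for 5 and 25 cells reproduce the values worked out by hand.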
One might worry that parity is too special, that it is a clever algebraic trick rather than a model of mixing. But the specialness is the point of this small example. The rule is simple enough that we can see the transfer of order exactly. In more complicated reversible systems the same qualitative phenomenon is harder to write as one line of XOR algebra, but the structure is familiar: information that was once accessible in small observables becomes encoded in many-body correlations.
Now return to the little scatter. At the start it occupies neighbouring cells, so it has a boundary, a location, a visible identity. A few steps later, no single late cell is the scatter. Each late cell is a parity question about a widening block that may include part of it, all of it, or none of it. Its identity has been shredded across a widening block of equations. This is not poetic language for a loss; it is the literal algebra of the light cone. The scatter has become something one must reconstruct from overlaps.
You can inspect this dependence below. Click a cell and the picture marks the block of starting cells whose parity determines it, along with the corresponding odds. Then toggle one of those starting cells. The cells whose determining blocks include it flip, and the others remain unchanged. One initial bit no longer occupies one visible location. It has been distributed through a widening cone, which means it now appears as correlations among later cells.
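Both halves of that experiment, the determining block and the toggle, can be sketched in code. The assumptions are mine: a ring of 64 cells, six half-steps, a random start, and cell 20 as the one toggled. In this sketch the block has radius $t$ for the cells that moved on the last half-step and radius $t-1$ for the cells that sat it out.

```python
import random
from functools import reduce
from operator import xor


def step(row, s):
    """One half-step on a ring of even length (assumed boundary condition)."""
    n = len(row)
    new = list(row)
    for i in range(s % 2, n, 2):
        new[i] = row[i] ^ row[i - 1] ^ row[(i + 1) % n]
    return new


def evolve(row, steps):
    for s in range(steps):
        row = step(row, s)
    return row


random.seed(1)
N, T = 64, 6
start = [random.randint(0, 1) for _ in range(N)]
late = evolve(start, T)


def block_parity(i, radius):
    """Parity of the starting cells within `radius` sites of cell i."""
    return reduce(xor, (start[(i + d) % N] for d in range(-radius, radius + 1)))


moved_last = (T - 1) % 2          # parity of the cells that took the last turn
cone_ok = all(
    late[i] == block_parity(i, T if i % 2 == moved_last else T - 1)
    for i in range(N)
)

j = 20                            # toggle one starting cell and rerun
toggled = list(start)
toggled[j] ^= 1
late_toggled = evolve(toggled, T)
changed = [i for i in range(N) if late[i] != late_toggled[i]]
print("every late cell is a block parity:", cone_ok)
print("late cells flipped by toggling cell", j, ":", changed)
```

Toggling one starting bit flips exactly the late cells whose blocks contain it, here a contiguous run centred on cell 20, and leaves every other late cell alone.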
This gives a concrete answer to where the order went. The total informational cost remains the cost of the beginning. What has changed is its address. The order has left single cells, where a glance can detect it as a bias toward white, and has entered the relationships between cells. Two distant cells may each look like fair coins in isolation while still sharing pieces of their ancestry, so that learning one changes the probabilities for the other. The amount of shared information is the mutual information,
$$ I(c_i; c_j) \;=\; H(c_i) + H(c_j) - H(c_i, c_j), $$
and a measurement that samples cells one at a time is blind to it. The diagonal threads in the opening picture are this hidden order made visible for a moment: long relations stretching across space and time, while individual cells along the threads have become locally unreadable.
The phrase "relationships between cells" can sound soft, so let me put it in a harsher form. If I know one late cell, I have learned the parity of one block of initial cells. If I know a neighbouring late cell, I have learned the parity of a nearby overlapping block. The overlap is information shared between the two questions. A full row of late cells is a large system of parity equations about the original row. Taken together, those equations determine the start. Taken one at a time, each equation looks almost fair and uninformative. The direction we call future is the direction in which simple local facts become distributed facts.
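The overlap can be given a number. The sketch below is an illustration under assumptions of my own: the two late cells both moved on the last half-step, they sit $d$ sites apart with $0 < d \le 2t$, each is the parity of a block of radius $t$, the blocks do not wrap around the ring, and the starting cells are independently black with probability $p$. The cells seen by only one block and the cells seen by both are then three independent parity bits, and the joint distribution of the two late cells follows by enumeration.

```python
from itertools import product
from math import log2


def odd_prob(p, n):
    """Chance that an odd number of n independent cells are black."""
    return 0.5 * (1 - (1 - 2 * p) ** n)


def entropy(probs):
    return sum(-q * log2(q) for q in probs if q > 0)


def shared_information(p, t, d):
    """Return H(c_i), H(c_j) and I(c_i; c_j) for two late cells d sites apart."""
    q_only = odd_prob(p, d)                  # cells seen by just one block
    q_both = odd_prob(p, 2 * t + 1 - d)      # cells in the overlap
    joint = {}
    for u, v, w in product((0, 1), repeat=3):
        prob = 1.0
        for bit, q in ((u, q_only), (v, q_both), (w, q_only)):
            prob *= q if bit else 1 - q
        key = (u ^ v, v ^ w)                 # the pair (c_i, c_j)
        joint[key] = joint.get(key, 0.0) + prob
    p_i = sum(q for (ci, _), q in joint.items() if ci)
    p_j = sum(q for (_, cj), q in joint.items() if cj)
    h_i, h_j = entropy([p_i, 1 - p_i]), entropy([p_j, 1 - p_j])
    return h_i, h_j, h_i + h_j - entropy(joint.values())


for d in (2, 10, 40):
    h_i, h_j, mi = shared_information(0.08, 20, d)
    print(d, round(h_i, 4), round(h_j, 4), round(mi, 4))
```

Each cell taken alone is indistinguishable from a fair coin to several decimal places, yet the pair shares a measurable amount of information, and how much depends on how heavily the two blocks overlap.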
There is a natural temptation to imagine an observer as something outside the system, making philosophical trouble. But in the present problem an observer is only a choice of measurements. If the measurement records every cell, the evolution is reversible and no information is missing. If the measurement records a density, or a few local samples, or a blurred image, then many exact rows give the same record. The coarse measurement introduces the many-to-one step that the cellular automaton itself avoided.
Suppose I divide the row into blocks of 10 cells and keep only the number of black cells in each block. The exact positions inside each block disappear from the record. Two rows that differ by swapping black cells within a block become the same coarse state. If I use wider blocks, more distinctions vanish. If I keep only the overall density, nearly all microscopic distinctions vanish. Equilibrium is stable under such descriptions because the late-time rows all fall into a huge class of records with nearly balanced local statistics.
A small calculation captures the size of this collapse. A block of $k$ cells has $2^k$ exact patterns. If the coarse lens keeps only the number of black cells, it has only $k+1$ possible records for that block. For $k=10$, that is 1024 exact patterns reduced to 11 records. For $k=20$, it is 1,048,576 exact patterns reduced to 21 records. Coarse-graining is not a mild blur in state space. It is a vast identification of microstates under one visible label.
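The block-count lens and the size of the collapse are both easy to sketch. The two rows below are made-up examples of mine, chosen so that they differ only in where the black cells sit inside each block of ten; `coarse` is an illustrative name.

```python
from math import comb


def coarse(row, k):
    """Keep only the number of black cells in each block of k cells."""
    return [sum(row[i:i + k]) for i in range(0, len(row), k)]


# Two different exact rows (assumed examples) with the same block counts.
a = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0,   0, 0, 1, 0, 0, 0, 0, 0, 0, 1]
b = [0, 0, 0, 0, 1, 0, 0, 0, 0, 0,   1, 0, 0, 0, 0, 0, 1, 0, 0, 0]
print(a != b, coarse(a, 10), coarse(b, 10))

for k in (10, 20):
    # exact patterns, coarse records, and patterns hiding under the balanced record
    print(k, 2 ** k, k + 1, comb(k, k // 2))
```

The last column counts how many exact patterns share the single record "half black": the balanced label is the most crowded one, which is why so many different rows end up carrying it.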
The reversible rule can move a row from a rare coarse label, sparse and simple, into a common coarse label, balanced and tangled, while still moving one exact microstate to one exact microstate. This is the microscopic and macroscopic story placed side by side. The microstate never branches or merges. The coarse label can become generic because many different microstates carry that same label. The one-wayness is in the migration from rare visible descriptions to common visible descriptions, not in a loss of the exact row.
The resolution is delicate, and I do not want to make it sound more soothing than it is. The row approaches equilibrium in the measurements I can make, and it retains all of its microscopic information under the reversible rule. These statements refer to different levels of description. Reversibility concerns the full configuration. Equilibrium concerns the small set of features that an observer can measure, store, and use.
That limitation is part of the physics of seeing. I cannot hold a thousand cells together with all their correlations and compute the exact inverse history from them. Real measurements give coarse summaries: a fraction of black cells, an average density, a block value, a few local statistics. Such summaries look at restricted features and discard the web of long-range relations where the initial order now lives. The rule drains the visible bias into those relations. To an observer built out of coarse instruments and finite memory, the system runs to equilibrium and then appears to stay there. Below, the separation between the keyhole view and the full view opens as the rule evolves.
The rising curve is not contradicting the flat line. It is showing the cost of looking through a keyhole. A one-cell observer asks, "What is the probability this cell is black?" and receives an answer approaching one half. The whole-row observer asks, "Which exact configuration is this?" and receives a state that is still linked one-to-one with its past. The same physical row can therefore have increasing entropy under a restricted marginal distribution while keeping constant entropy under the full distribution.
There is a useful analogy with a sentence written in a cipher. If I inspect one letter at a time after encryption, the letters may look evenly distributed. The message has not ceased to exist. It has become a relation among many letters and the key. Here the reversible rule supplies the cipher, and the key is not external: it is the entire configuration plus the known inverse dynamics. Without enough of the configuration, the local symbols look equilibrated.
The same idea becomes less abstract when one applies a coarse lens to the picture itself. Below is one run shown twice. The left side shows the exact cells, the microscopic state the reversible rule carries without loss. The right side shows the same run after neighbouring cells have been merged into blocks, each block replaced by a single shade. Widen the lens and the right-hand entropy rises, while the left-hand microscopic entropy stays fixed.
When the lens is narrow, some of the microscopic weave remains visible. As the lens widens, local distinctions are averaged into block shades, and the image acquires the familiar monotone look of equilibrium. This is a mechanical version of what a thermometer does to a gas. The thermometer does not report molecular trajectories. It reports a coarse variable, and in that variable many microscopic states are indistinguishable.
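A crude version of the lens fits in a few lines. This is a sketch with assumed parameters (2000 cells on a ring, 40 half-steps, blocks of 50, a fixed seed), not the rendering used for the picture: each block is replaced by its fraction of black cells, once for the starting row and once for a late row.

```python
import random


def step(row, s):
    """One half-step on a ring of even length (assumed boundary condition)."""
    n = len(row)
    new = list(row)
    for i in range(s % 2, n, 2):
        new[i] = row[i] ^ row[i - 1] ^ row[(i + 1) % n]
    return new


def shades(row, k):
    """Replace each block of k cells by its fraction of black cells."""
    return [sum(row[i:i + k]) / k for i in range(0, len(row), k)]


random.seed(2)
N, T, P, K = 2000, 40, 0.08, 50
row = [1 if random.random() < P else 0 for _ in range(N)]
top = shades(row, K)
for s in range(T):
    row = step(row, s)
bottom = shades(row, K)

mean_top = sum(top) / len(top)
mean_bottom = sum(bottom) / len(bottom)
print("top    shades:", [round(x, 2) for x in top[:8]], "mean", round(mean_top, 3))
print("bottom shades:", [round(x, 2) for x in bottom[:8]], "mean", round(mean_bottom, 3))
```

Through this lens the top row reads as nearly white and the late row as a mid grey, even though the late row still determines the top row exactly.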
The order has not gone beyond recovery in principle. One could take a mixed-looking row, possess every cell and every correlation without error, and run the reversible dynamics back to the sparse start, as the earlier demonstration did. The difficulty is operational rather than logical: the required microscopic knowledge is the whole state, at full resolution. Even if the row is left alone, relaxation is not eternal in the strict mathematical sense. A finite reversible system must be recurrent; because it has only finitely many states and never merges two of them, after enough steps it returns exactly to where it began. The scale of "enough" matters. For a row of $N$ cells the recurrence time is of order $2^{N}$, and for a few hundred cells this is a number like $10^{90}$ steps. The mixing time is small, often only on the scale of $\log N$ steps, while the return time is so large that ordinary observation lives entirely between the two.
That separation of times is part of why the second-law picture feels so robust. The rule may permit a return, but the return is hidden in a space of possibilities whose size doubles with every added cell. For $N=300$, the number $2^N$ is already around $10^{90}$. For larger rows the recurrence time leaves any laboratory scale behind. The system spends accessible time looking equilibrated under coarse measurements, while the rare exact arrangements that would visibly unmix it occupy an almost invisible fraction of its state space.
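The size of that space is quick to put in powers of ten. The helper below, whose name is my own, only counts exact rows; it does not measure how long any particular row takes to return.

```python
from math import log10


def log10_states(n):
    """Base-10 logarithm of the number of exact rows of n two-colour cells."""
    return n * log10(2)


for n in (100, 300, 1000):
    print(n, "cells: about 10^%.1f exact rows" % log10_states(n))
```

Each added cell doubles the count, so every ten extra cells multiply it by roughly a thousand.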
So why does time seem to run one way? In this toy world, the answer is not that the microscopic rule prefers the future. The rule itself accepts either temporal direction. A hypothetical observer able to grasp the entire correlated configuration would see one exact state changing into another, with the inverse path always available. My measurements see local biases, block averages, and limited summaries. Once the initial order has spread into long-range correlations, those summaries read it as equilibrium, even though the full row still contains the information needed to reconstruct the start.
Now the uncomfortable part is that this description has been about you all along. You are not outside this story, looking down on a toy universe with clean hands. You are the bounded, coarse-grained observer. Your arrow of time is the arrow of your attention: the direction in which you keep the summaries that matter and let the rest of the microscopic detail pass out of reach. You remember a cooled cup, a broken egg, a shuffled deck, because memory itself is a coarse record. It stores a trace, a digest, a few stable features. It does not store the full molecular configuration with the correlations needed to run the world backward.
This is why the past feels fixed. You have measured it, or been measured by it, and what remains in you is a record. But the record is not the whole microstate. It is the block count, the blurred lens, the one-cell marginal, the keyhole view. The microscopic detail that still contains the reversibility has slipped outside what you can use. The future feels open for the complementary reason: you have not yet made the measurements that will become your records. Nothing mystical is being smuggled in here. In the model, a complete row fixes both directions. But a creature that only keeps partial records will experience one side as stored summary and the other as unmeasured possibility. If that makes you a little dizzy, good. I do not see how it could be otherwise.
I end up with a picture less like destruction than like hiding in plain sight. At the top, the order is placed where a one-cell statistic can find it. At the bottom, the order is spread through parity constraints and overlaps among many cells. The reversible law has conserved the state, while the observer has lost the convenient coordinates in which the state was simple. The arrow of time is the direction in which accessible order becomes inaccessible order, while the exact past remains present inside the weave.
So take the machine away from me for a moment. Put your own scratches into the ordered row. Watch them fan out until the page denies you any obvious memory of what you did. Then reverse it. The rule will pull the little scatter, and every mark you added, back out of the noise exactly. You cannot do that by looking. The rule can. That gap is the arrow.