### Lattices and mobile qubits

Here, we describe the microscopic details and dynamics of the system. We describe the lattice and how the gauge fixing progresses. We lastly discuss the protocol over its entire duration to estimate its resource cost.

*Lattices*. In (*15*), the authors describe three surface codes on different three-dimensional lattices. We give simple representations of the lattices here that help understand the steps of gauge fixing. The first of the three copies is well represented with the standard convention that we described in the main text where qubits lie on the edges of a cubic lattice. We refer to this as the standard surface code lattice. The other two lattices are represented with qubits on the vertices of rhombic dodecahedra in (*15*). We call this the alternative surface code. We offer an alternative description of this lattice in this section.

All the qubits of the alternative surface code are unified with the qubits of the standard surface code on the cubic lattice. We therefore find a straightforward way of representing the stabilizers of the alternative code with qubits on the edges of a cubic lattice. We show the stabilizers in Fig. 4 on a cubic lattice. To represent this model, we bicolor the cubes, as they support different stabilizers depending on their color (see Fig. 4A). The white primal cubes support Pauli-*X* “star” operators, and the gray dual cubes support the Pauli-*Z* “plaquette” operators. We express their support with the following equations

(1)where *∂c* is the set of edges on the boundary of cube *c* and again, *∂e* is the set of vertices *v* at the boundaries of edge *e*, i.e., its end points. The operators *A _{c}* and

*B*

_{c,v}are, respectively, defined on primal and dual cubes only. We also note that each vertex touches four dual cubes; hence, there are four

*B*

_{c,v}at each vertex. Further, there are eight vertices on a cube, there are therefore eight

*B*

_{c,v}stabilizers for each dual cube

*c*. There is only one

*A*operator for each primal cube. We also show the stabilizers added at the smooth and rough boundaries in Fig. 4 (D and E, respectively). See (

_{c}*15*) for a more detailed discussion on the boundaries.

Last, we count the number of qubits in a single-unit cell (see Fig. 4A) as these will make up a site in the threshold theorem given in the “Error correction with just-in-time gauge fixing” section. As a function of volume in the bulk of the lattice, the standard and the alternative surface code both have three qubits per cube lying on the edges of the lattice, so over a unit cell of eight cubes, we have 24 qubits.

We also include ancilla qubits to measure the plaquette operators of each model. In the standard surface code, we make one plaquette measurement for each face of the lattice. There are three faces per cube of the lattice; we therefore have 24 ancilla qubits per unit cell to measure the faces of the cubic lattice model. For the alternative surface code, we make eight measurements per dual cube of the unit cell. We have four dual cubes per unit cell; we therefore arrive at 32 ancilla qubits for each unit cell of the alternative surface code shown in Fig. 4.

The discussion above concludes that we have 48 qubits in total per unit cell of the standard surface code and 56 qubits per unit cell of the alternative surface code. We lastly consider a unit cell of the total system with three overlapping lattices. Each unit cell includes one copy of the cubic lattice model and two copies of the alternative model. We therefore find that we have 160 qubits per unit cell in total. The unit cells at the boundary of the system can be regarded as bulk cells with some of the qubits removed. Hence, when we account for the boundary, we can take this value as an upper bound. Last, we note that each of these unit cells contributes two units of distance to the system.

*Gauge fixing*. Having specified the lattices, we now discuss how to perform the gauge-fixing process. Gauge fixing moves three two-dimensional surface codes through a three-dimensional spacetime volume to reproduce three overlapping three-dimensional surface codes over time. This motion proceeds by repeatedly producing a thin layer of three-dimensional surface code and then measuring some of its qubits in a product basis to collapse the system onto a two-dimensional surface code that has been displaced through spacetime. Gauge fixing and transversal controlled-controlled-phase gates are applied at the intermediate step where the system is in the state of a thin slice of three-dimensional surface code. We show one period of the process for two lattices in Fig. 5. Each panel of the figure shows the region in which the transversal controlled-controlled-phase gate is conducted within the black cube. The top figures show the progression of a lattice moving from left to right through the region over time, and the lower figures show a lattice moving upward through the region. Time progresses from left to right through the panels. The columns of the diagram are synchronized.

We now describe the microscopic details of a single period of the gauge-fixing process. We perform similar processes on all three surface codes involved in the gate in unison. The three surface codes only differ in the direction they move through the spacetime volume and the lattice we use to realize the surface code. Hence, we will only focus on a single surface code, say that shown in Fig. 5A.

A period of the gauge-fixing process begins with a two-dimensional surface code supported on the qubits shown at time *t* to the left of Fig. 5A, and it ends at time *t* + 1 with a displaced surface code, shown in the right column of the figure. It is helpful to label the subsets of qubits of the spacetime volume that support a surface code at time *t*(*t* + 1) with the label Q* _{t}* (Q

_{t+1}). The thin three-dimensional surface code that we produce at the intermediate step is shown in the central column of Fig. 5A at time

*t*+ 1/2. We denote the qubits that support the three-dimensional surface code at this time by Q

_{t+1/2}. The subsets of qubits we have defined are such that

and the intersection of Q* _{t}* and Q

_{t+1}is nonempty.

We map the surface code at time *t* onto the three-dimensional surface code shown at time *t* + 1/2 by measurement. We initialize the qubits in the subset

in the ∣+〉 state. We then measure all the plaquettes supported on

${\mathrm{Q}}_{t+1/2}$ that have not been measured previously. Plaquettes supported entirely on Q* _{t}* have already been measured at an earlier period. It is therefore unnecessary to measure these stabilizers again.

The plaquette measurements will return random outcomes and may include errors. We must fix the gauge of the plaquettes of the active layer of the surface code to their +1 eigenstate. This is described in more detail in the “Error correction with just-in-time gauge fixing” section. For now, we assume that it is possible to accomplish this. Once we make the gauge-fixing correction, we apply the controlled-controlled-phase gate between the qubits of subset

${\mathrm{Q}}_{t+1/2}{\mathrm{Q}}_{t+1}$of each of the three systems involved in the gate.

We lastly recover a two-dimensional surface code on the subset of qubits Q_{t+1} by measuring the qubits of the subset

in the Pauli-*X* basis. We use the outcomes of the destructive single-qubit Pauli-*X* measurements to infer the values of the star operators of the three-dimensional surface code. As measurement errors that occur when we make single-qubit measurements are indistinguishable from physical errors, the readout of the star operators of the three-dimensional surface code is fault tolerant.

In a sense, we can consider this as a dimension jump (*30*) where a two-dimensional model is incorporated into a three-dimensional model to leverage some property of the higher-dimensional system. In this case, we prepare a very thin slice of the three-dimensional surface code model where, once all the physical operations have been performed, we can collapse the three-dimensional model back onto a two-dimensional model again. The latter dimensional jump where we go from the three-dimensional surface code to its two-dimensional counterpart has been demonstrated by Raussendorf, Bravyi, and Harrington (*25*), where they fault-tolerantly prepare a Bell pair between two surface codes using the topological cluster state.

It is worth remarking that the method we have discussed here enables us to produce other three-dimensional structures that go beyond foliation (*38*). Much research has sought to map quantum error–correcting codes into measurement-based schemes (*29*, *39*) through a system called “foliation” to access favorable properties of exotic quantum error–correcting codes. Conversely, some fault-tolerant measurement-based schemes have been developed that are not expected to have a description in terms of a quantum error–correcting code. Really though, we should expect that we can implement any fault-tolerant protocol independent of the architecture that we choose to realize our qubits. The scheme presented here gives us a way to realize these models that are beyond foliation with a two-dimensional array of static qubits. Given their promising thresholds (*38*), it may be worth exploring the practicality of some of these higher-dimensional models on two-dimensional architectures.

In a similar vein, we point out that the two-dimensional surface code that is propagated by the code deformations of the alternative lattice is described naturally on the hexagonal lattice. This lattice has been largely dismissed because of its weight-six hexagonal stabilizer terms. However, we measure its stabilizers using only weight-three measurements, and the higher-weight stabilizers are inferred from single-qubit measurements. Hence, it may be worth revisiting this model as the scheme presented here offers a method of stabilizer extraction that does not require measurements of weight greater than three. Further, as no qubit supports more than four plaquette stabilizers, the topological cluster state that realizes this surface code has vertices that are no more than four valent. We may therefore expect this model to have a high threshold with respect to the gate error model.

*Implementing the non-Clifford gate*. We lastly describe the entire protocol which is summarized in Fig. 6 and discuss its spacetime resource cost as a function of the code distance of the system, *d*. Each panel of the figure shows three arrays, each of which supports a code. It may be possible to embed the qubits of all three codes on one common array, but for visualization purposes, we imagine three stacked arrays that can perform local controlled-controlled-phase gates between nearby qubits on separate arrays. Parity measurements are performed locally on each array.

The code on the lower array will move from left to right along the page as we undergo code deformations. For a strictly local system, we consider an extended array that we refer to as the long array. However, as we discuss toward the end of this section, we can reduce the size of this array by simulating a system with periodic boundary conditions. We proceed with the discussion where the process is strictly local. To evaluate the resource cost, we refer to a single unit of time as a cycle. The resource cost is measured in units of qubit cycles.

Before the gate begins, we must copy the encoded information onto the arrays where the gate is performed. We might accomplish this with lattice surgery (*10*, *40*). Figure 6A shows three surface codes that have been moved close to the edges of the arrays where the gate will be performed. One logical qubit is copied to the far left of the long array. Initializing the system will take time that scales like the code distance, ∼*d* cycles.

We might also consider using the system offline to prepare high-fidelity magic states. With this setup, we apply the gate to three surface codes initialized fault-tolerantly in an eigenstate of the Pauli-*X* operator. While this will mean that we do not need to copy information onto the three arrays, it will still be necessary to fix the gauge of the system such that all the plaquette operators of the initial face are in their +1 eigenvalue eigenstate. To the best of our knowledge, this will still take

time to prepare the system such that its global charge is vacuum.

We remark that using the protocol offline to produce magic states may offer some advantages. For instance, as we discussed in the main text, we can postselect high-quality output states by comparing the result of the just-in-time decoder with a high-performance decoding algorithm. Moreover, the required connectivity of the gate with the rest of the system will be reduced. This is because we need only copy the magic states out of the system, and we do not need to input arbitrary states into the system that may require additional routing.

Once the system is initialized, we begin performing the code deformations as discussed in the previous section. The code deformations move the code on the long array underneath the other two codes (see Fig. 6B) and out the other side (see Fig. 6C). Assuming that one step, as shown in Fig. 5, takes one cycle, moving the lower code all the way under the other two and out the other side will take 2*d* units of time. The final state of the protocol is shown in Fig. 6D.

The above discussion explains that the three arrays will be occupied for 3*d* cycles. Each array will support a code that will consist of ∼*d* × *d* unit cubes that collectively can produce a thin slice of the three-dimensional surface code. Arrays of unit cubes are shown in Fig. 5 at time *t* + 1/2. The long array must be able to support unit cubes in 3*d* × *d* locations. We include the idle qubits of the long array in the resource cost over the entire protocol. We count the qubits of each unit cube we need to realize each of the three-dimensional surface codes, including an ancilla qubit for each plaquette measurement we make on a given unit cube. We note that we have chosen the term “unit cube” here, as distinct from the “unit cell” that was defined in the “Lattices” section. The unit cell is a single element of a translationally invariant lattice that we use in the “Error correction with just-in-time gauge fixing” section. A unit cube, as defined here, contributes one unit of distance to the system in both the spatial and temporal directions.

We consider two different lattices that have been discussed in the “Lattices” section: the standard surface code and the surface code on the alternative lattice that we show in Fig. 4. Both lattices include qubits lying on the edges of a standard cubic lattice. There are 12 edges on the boundary of each unit cube, but as we see in Fig. 5, the unit cubes are such that there are ∼*d* × *d* edges that are shared between two cubes, as well as ∼*d* × *d* faces, each consisting of four edges, that are shared between pairs of cubes. We therefore find seven qubits per unit cube lying on the edges of the cubic lattice.

We also assume that there is a single qubit for each plaquette measurement needed to produce the lattices shown in Fig. 5 at time *t* + 1/2. For the standard lattice surface code, there are six plaquette measurements associate to each unit cube, one for each of its faces. However, as shown in Fig. 5 at time *t*, two of the faces have already been measured during an earlier cycle. Further, two-face measurements of each unit cube are shared with other unit cubes; we therefore count three measurement ancilla qubits per unit cube for the standard surface code. In total, including the qubits on the edges of the lattice, we find 10 qubits per unit cell of the standard lattice surface code. A similar analysis finds that we need to perform four plaquette measurements per unit cube to produce a slice of the alternative surface code at time *t* + 1/2. The alternative surface code thus includes 11 qubits per unit cell.

To conserve resources, we assume that the two stationary qubit arrays support the two alternative lattice surface codes. Each of these arrays therefore requires 11*d*^{2} qubits to produce *d* × *d* unit cells. Similarly, the resource cost of 3*d*^{2} unit cells of the conventional cubic lattice surface code on the long array uses 10 · 3*d*^{2} qubits. In total, all three arrays support ∼[30 + 2 · 11]*d*^{2} = 52*d*^{2} qubits. Assuming that the full protocol is completed in 3*d* cycles, we arrive at a total resource cost of 156*d*^{3} physical qubit cycles for a single implementation of the gate.

The conservative estimate given above assumes that 10 · 2*d* × *d* qubits are idle for 3*d* units of time. We would obtain a resource saving of 60*d*^{3} qubit cycles by making use of these idle qubits or altering the protocol such that they are not needed. An easy way to achieve this is by simulating periodic boundary conditions on the long array. We can achieve the same protocol by replacing the long array with a *d* × *d* array with cylindrical boundary conditions such that all three arrays have a size ∼*d* × *d* unit cells.

Periodic boundary conditions are easily achieved given a distributed architecture (*41*), where we are not constrained to strictly local interactions. One could also imagine approximating periodic boundary conditions with a strictly local system using a line of *L* gates that share one very long array. The very long array has size (*L* + 2)*d* × *d* and supports *L* disjoint *d* × *d* surface codes. All *L* gates proceed in parallel where all *L* codes move synchronously along the very long array. In both cases, in the latter where *L* diverges, we arrive at a resource cost of ∼96*d*^{3} qubit cycles per controlled-controlled-phase operation. Over the course of the gate, we must perform ~3*d*^{3} controlled-controlled phase gates.

At this stage, one might be willing to make speculations on how the resource cost of the gate proposed here compares with well-studied magic-state distillation protocols. Let us take a recent example (*12*) where a magic-state distillation protocol is proposed that occupies 12*d*^{′} × 6*d*^{′} qubits over 5.5*d*^{′} cycles, giving a total resource cost approaching ∼400*d*^{′3} qubit cycles. We deliberately choose to quantify the qubit cycles of this example with units of *d*^{′3} instead of *d*^{3}. This is because, without numerical simulations, we cannot accurately calculate how the failure rate of the gate presented in this work decays with *d* as compared with *d*^{′}.

Optimistically, we might hope that the logical failure rates of both protocols decay comparably in distance. In which case, we might compare resources whereby *d* ∼ *d*^{′}, and we find that the gate presented here can outperform magic-state distillation using a small fraction of the resources. In practice, gauge fixing will introduce additional errors while the controlled-controlled-phase gate proceeds. In contrast, a magic-state distillation procotol that uses only logical Clifford operations will not experience gauge-fixing errors. Hence, we should expect that *d* > *d*^{′} to obtain comparable logical failure rates. Presently, little work has been done to calculate the logical failure rate of gates that make use of gauge fixing. The extent of this problem will be very sensitive to the error rate of the plaquette measurements. In principle, errors introduced by gauge fixing are of a different nature to errors introduced by the environment. As we have discussed in the main text, an appropriately chosen decoder might be able to mitigate the errors introduced by gauge fixing.

Another reason one should anticipate that we should choose *d* > *d*^{′} is that the application of noisy controlled-controlled-phase gates on the physical qubits will introduce additional errors to the system. Of course, the noise introduced by these entangling gates depends on the implementation of these gates. For the discussion here, it is simpler to remain agnostic about the physical implementation of the logical gate. Further work needs to be done to determine the magnitude of these sources of noise.

### Error correction with just-in-time gauge fixing

Here, we prove that the non-Clifford operation will perform arbitrarily well as we scale the size of the system, provided that the physical error rate on the qubits is suitably low. We outline an error correction procedure as we undergo the controlled-controlled-phase operation. The argument requires two main components. We require a just-in-time decoder that controls the spread of an error during the gauge fixing. We then show that the spread errors are sufficiently small that we can correct them at a later stage. We first show that we can decode a spread error model globally during postprocessing using a renormalization group decoder before arguing that the error model is justified by the just-in-time decoder.

*Notation and terminology*. We suppose a local error model acting on the qubits of the spacetime of the non-Clifford process. For suitably low error rate, we can characterize the errors as occurring in small, local, well-separated regions (*32*). The just-in-time gauge-fixing decoder will spread this error. Given that the spread is controlled, we can show that a global renormalization group decoder will correct the errors that remain after the gauge-fixing process. Our argument follows a similar approach to that presented in (*32*). Hence, we will adopt several definitions and results presented in (*32*). We will also keep our notation consistent with this work where possible.

We divide the system into sites: small local groups of qubits specified on a cubic lattice. We consider an independent and identically distributed error model where a Pauli error occurs on a site with probability *p*_{0}. We say that a site has experienced an error if one or more of the qubits has experienced an error. Given a constant number of qubits per site, *N*, then, the likelihood a site experiences an error *p*_{0} = 1 − (1 − ε)* ^{N}* is constant where each qubit of the system experiences an error with constant probability ε. We consider a Pauli error

*E*drawn from the probability distribution described by the noise model. We will frequently abuse notation by using

*E*to denote both a Pauli operator and the set of sites that support

*E*.

The syndrome of an error *E* is denoted as σ(*E*). It denotes the set of defects caused by *E*. We say that a subset of defects of a syndrome can be neutralized if a Pauli operator can be applied such that all the defects are neutralized without adding any new defects. We may also say that any such subset of the syndrome is neutral.

Defects lie at locations, or sites, *u* = (*u _{x}*,

*u*,

_{y}*u*) in 2 + 1–dimensional spacetime. The separation between two sites is measured using the 𝓁

_{t}_{∞}metric where the distance between sites

*u*and

*v*, denoted as ∣

*u*−

*v*∣, is such that ∣

*u*−

*v*∣= max (∣

*u*−

_{x}*v*∣, ∣

_{x}*u*−

_{y}*v*∣, ∣

_{y}*u*−

_{t}*v*∣). We will be interested in regions of spacetime that contain a collection of points

_{t}*M*. The diameter of

*M*is equal to

. We say that a subset of points *M* is *r*-connected if and only if *M* cannot be separated into two disjoint proper subsets separated by a distance more than *r*. The δ-neighborhood is the subset of sites that lie up to a distance δ from a region ρ together with the sites enclosed within region ρ itself. Given that we have a local model in spacetime, defects appear on sites within the one neighborhood of the sites of the error *E*. The following argument relies heavily on the notion of a chunk at a given length scale *Q*.

**Definition 1** (Chunk). Let *E* be a fixed error. A level-0 chunk is an error at a single site *u* ∈ *E*. A nonempty subset of *E* is called a level-*n* chunk if it is the disjoint union of two level–(*n* − 1) chunks with diameter ≤*Q ^{n}*/2.

We express errors in terms of their chunk decomposition. We define *E _{n}* as the subset of sites that are members of a level-

*n*chunk such that

(2)where *m* is the smallest integer such that *E*_{m+1} = ∅. We then define subsets *F _{j}* =

*E*

_{j}*E*

_{j+1}such that we can obtain the chunk decomposition of

*E*, namely

(3)

A level-*m* error is defined by the smallest value of *m* such that *E*_{m+1} = ∅.

Expressing an error in terms of its chunk decomposition enables Bravyi and Haah (*32*) to prove that a renormalization group decoder will decode any level-*m* error with a sufficiently large system. The proof relies on the following lemma.

**Lemma 1.** *Let Q* ≥ 6 *and M be any Q ^{n}-connected component of F_{n}. Then, M has a diameter at most Q^{n} and is separated from other errors E_{n}*

*M by a distance greater than Q*

^{n+1}/3.

The proof is given in (*32*) (see proposition 7). We note also that all the defects created by a *Q ^{n}*-connected component of

*F*lying in the one neighborhood of the connected component are neutral. With this result, it is then possible to show that a renormalization group decoder that finds and neutralizes neutral 2

_{n}*-connected components at sequentially increasing length scales*

^{p}*p*will successfully correct an error, provided that

*Q*is much smaller than the size of the system. A threshold is then obtained using that for a sufficiently low error rate, the likelihood that a level-

^{m}*m*+ 1 chunk will occur is vanishingly small. The renormalization group decoder is defined as follows.

**Definition 2** (Renormalization-group decoder). The renormalization group decoder takes a syndrome σ(*E*) as input and sequentially calls the level-*p* error correction subroutine ERROR CORRECT(*p*) and applies the Pauli operator returned from the subroutine for *p* = 0,1, …, *m* with *m* ∼ log *L*.

The subroutine ERROR CORRECT(*p*) returns correction operators for neutral 2* ^{p}*-connected subsets of the syndrome. If the syndrome has not been neutralized after ERROR CORRECT(

*m*) has been called, then the decoder reports failure.

*A threshold theorem with a spread error*. In the following section, we will show that the just-in-time gauge-fixing process will spread each disjoint *Q ^{j}*-connected component of

*F*such that the linear size of the area it occupies will not increase by more than a constant factor

_{j}*s*≥ 1. Once the error is spread during the gauge-fixing process, we must show that the error remains correctable. Here, we show that the spread error model can be corrected globally with the renormalization group decoder. We first define a level-

*m*spread error.

**Definition 3** (Spread errors). Take a level-*m* error *E* drawn from an independent and identically distributed noise model with a chunk decomposition as in Eq. 3. The spread error takes every *Q ^{j}*-connected component

*F*

_{j,α}⊆

*F*for all

_{j}*j*and spreads it such that this component of the error, together with the defects it produces, are supported within a container

*C*

_{j,α}centered at

*F*

_{j,α}with diameter at most

*sQ*.

^{j}We use the term “container” so that we do not confuse them with boxes used in the following section, although containers and boxes both perform similar tasks in the proof.

In the proof given in (*32*) the authors make use of Lemma 1 to show that the renormalization group decoder will not introduce a logical failure. This is assured given that all the errors are small and well separated in a way that is made precise by Lemma 1. With the errors of the spread error model now supported in containers as much as a factor *s* larger than the initial connected components of the error, the connected components are now much closer together and, in some cases, overlap with one another. We have to check that the noise will not introduce a logical failure, given sufficiently low-noise parameters. We will argue that we can still find a threshold error rate, provided that (*s* + 2)*Q ^{m}* is suitably small compared with the system size. The following definition will be helpful.

**Definition 4** (Tethered). Consider errors supported within spread containers *C*_{j,α} and *C*_{k,β} with *j* ≤ *k*. We say that the error in container *C*_{j,α} is tethered to the error in a different container *C*_{k,β} if the two containers are separated by a distance no greater than Δ* _{j}* where Δ

*= [*

_{j}*r*(

*s*+ 2) + 2]

*Q*. We say that

^{j}*C*

_{j,α}is untethered if it is not tethered to any containers

*C*

_{k,β}for

*k*≥

*j*.

We include an *r* term to parameterize the separation we wish to maintain between untethered containers compared to the diameter of the containers. This should be of the order of the factor by which renormalization group decoder increases its search at each level. We defined the renormalization group decoder to search for 2* ^{p}*-connected components at level

*p*, so we can take

*r*≥ 2.

**Fact 1.** Let *Q* ≥ 3[*r*(*s* + 2) + *s* + 1]. Two distinct containers of the same size, *C*_{j,α} and *C*_{j,β}, are not tethered.

*Proof.* Errors *F*_{j,α}, *F*_{j,β} ⊆ *F _{j}* at the center of spread errors contained in containers

*C*

_{j,α}and

*C*

_{j,β}are separated by more than

*Q*

^{j+1}/3 (Lemma 1). After expansion, the boundaries of

*C*

_{j,α}and

*C*

_{j,β}are separated by a distance greater than

*Q*

^{j+1}/3 − (

*s*− 1)

*Q*. We have Δ

^{j}*≤*

_{j}*Q*

^{j + 1}/3 − (

*s*− 1)

*Q*for

^{j}*Q*≥ 3[

*r*(

*s*+ 2) +

*s*+ 1]. Therefore, two boxes of the same size are not tethered for

*Q*≥ 3[

*r*(

*s*+ 2) +

*s*+ 1].

The constant expansion of the diameter of the errors means that some large errors expand such that smaller errors are not locally corrected. Instead, they become tethered to the larger errors that may cause the renormalization group decoder to become confused. We will show that the small errors that are tethered to larger ones are dealt with at larger length scales as tethering remains close to the boundary of the larger containers with respect to the length scale of the larger container. We illustrate this idea in Fig. 7.

We will say that a decoder is successful if it returns a correction operator that is equivalent to the error operator up to an element of the stabilizer group. Given that the logical operators of the model of interest are supported on containers with diameter no smaller than *L*, we say that a decoder is successful if an error and its correction is supported on a collection of well-separated containers where each container is smaller than *L*/3. It will be helpful to define fattened containers

that enclose the *Q ^{j}*-neighborhood of

*C*

_{j,α}. The fattened containers have diameter

*D*≤ (

_{j}*s*+ 2)

*Q*. We also define the correction operator

^{j}*R*(

*p*), which is the product of the correction operators returned by ERROR CORRECT(

*p*) for all levels up to level

*p*. We are now ready to proceed with the proof.

**Lemma 2.** *Take Q* ≥ 3[*r*(*s* + 2) + *s* + 1]*. The renormalization group decoder will successfully decode a level-m error with constant spread factor s* ≥ 1 *provided D _{m}* <

*L*/3.

*Proof.* We follow the progression of the renormalization group decoder inductively to show that the correction is supported on the union of containers

. We will prove that the renormalization group decoder satisfies the following conditions at each level *p*.

1) The correction operator *R*(*p*) returned at level *p* is supported on the union of fattened containers

.

2) For the smallest integer *l* ≥ 0 such that *Q ^{l}* > 2

*, modulo stabilizers, the error*

^{p}*R*(

*p*)

*E*is supported within a

*Q*-neighborhood of an error contained in a container

^{l}*C*

_{k,α}for any

*k*such that its diameter is at least

*sQ*.

^{l}3) The restriction of *E* and the level-*p* correction operator *R*(*p*) is the same up to stabilizers on fattened containers

of diameter *D _{j}* ≤ 2

*for untethered containers*

^{p}*C*

_{j,α}.

We prove the case for *p* = 0. By definition, errors are supported on containers *C*_{j,α}; therefore, *1*-connected components of the syndrome contained within *C*_{j,α} are supported on

. This verifies condition 1. Condition 2 holds by definition as follows. Since *Q*^{1} > 1, tethered containers *C*_{0,α} of size no greater than *s* are separated from at least one container *C*_{j,β} for *j* ≥ 1 by a distance no more than Δ_{0}; otherwise, it is untethered. This verifies that all tethered containers *C*_{0,α} lie entirely within the *Q*-neighborhood of some container *C*_{j,β} since *s* + Δ_{0} ≤ *Q*. The containers *C*_{j,β} that tether the errors in containers *C*_{0,α} are necessarily such that *j* > 0 by Fact 1. This verifies Condition 2 as we have shown that containers *C*_{0,α} are only tethered to containers with diameter at least *sQ*. Condition 3 is trivial for *p* = 0 since all containers have diameter larger than 1.

We now suppose that the above conditions are true for *p* to show that the conditions hold at *p* + 1. We consider ERROR CORRECT(*p* + 1). We are interested in containers *C*_{j,α} such that the diameter of its fattened counterpart is such that 2* ^{p}* <

*D*≤ 2

_{j}^{p+1}. We first find the smallest integer l such that

*Q*> 2

^{l}^{p+1}. Since

*D*= (

_{j}*s*+ 2)

*Q*≤ 2

^{j}^{p+1}, we have

*l*≥

*j*+ 1. There are two possible outcomes depending on whether

*C*

_{j, α}is tethered or not. We deal with each case separately.

If *C*_{j,α} is tethered, then it lies at most Δ* _{j}* from another container

*C*

_{k,β}of diameter

*sQ*with

^{k}*k*>

*j*by Fact 1. Given that

has a diameter no greater than *D _{j}*, we find that the error supported on

is supported entirely within the (*D _{j}* + Δ

*)-neighborhood of*

_{j}*C*

_{k,β}. Expanding this expression, we have that

*D*+ Δ

_{j}*≤*

_{j}*Q*

^{j+1}for

*Q*≥ [(

*s*+ 2) +

*r*(

*s*+ 2) + 2]. This confirms condition 2 for error correction at level

*p*+ 1.

In the case that *C*_{j,α} is untethered, the fattened container

, which is *D _{j}*-connected, is separated from all other containers that support uncorrected errors

with *D _{k}* ≥

*D*by a distance greater than Δ

_{j}*− 2*

_{j}*Q*=

^{j}*r*(

*s*+ 2)

*Q*by the definition of an untethered container. Given that

^{j}*D*> 2

_{j}*, we have that*

^{p}*r*(

*s*+ 2)

*Q*> 2

^{j}^{p+1}for

*r*= 2 at the level-(

*p*+ 1) error correction subroutine. Therefore, ERROR CORRECT(

*p*+ 1) will not find any components of

*E*outside of the container

. Hence, a correction will be returned entirely on

${\stackrel{\sim}{C}}_{l,\mathrm{\alpha}}$, verifying condition 3.

We lastly consider the support of the correction operator. If the error is tethered, then the correction returned for *C*_{j,α} lies on some container

with *k* > *j* to which it is tethered. In the case of untethered errors, the correction for each connected component supported on *C*_{j,α}, and the correction for the smaller components tethered to it, is supported on its respective container

. This verifies condition 1.

The argument given above says that all errors are corrected on well-separated containers that are much smaller than the size of the system provided *D _{m}* <

*L*/3. Given that there are no level-

*m*+ 1 errors, all the errors supported on containers of size

*D*will be untethered and therefore corrected at the largest length scale. Therefore, we bound the failure probability by predicting the probability that an error of size

_{m}*Q*

^{m+1}occurs. Bravyi and Haah (

*32*) gives a formula stating that the likelihood that a level-

*m*chunk occurs on an

*L*×

*L*×

*L*lattice is

(4)

Demanding that (*s* + 2)*Q ^{m}* <

*L*/3, we find

*m*= [log(

*L*/3) − log(

*s*+ 2)]/log

*Q*≈ log

*L*/log

*Q*; we find the logical failure rate decays exponentially in

*L*provided (3

*Q*)

^{6}

*p*

_{0}< 1. This demonstrates a threshold for

*p*

_{0}< (3

*Q*)

^{−6}. Taking

*Q*= 87 using

*s*= 8 and

*r*= 2, and we have that the number of qubits per site is

*N*= 160 from the “Lattices” section, we obtain a lower bound on the threshold error rate of ε ∼ 10

^{−17}.

*Just-in-time gauge fixing*. We use a just-in-time decoder (*16*) to fix the gauge of each topological cluster state onto a copy of the surface code. We can deal with each of the three codes separately since the three codes are yet to interact. We suppose that we draw an error from the independent and identically distributed noise model that acts on the spacetime that is represented by the sites of the topological cluster state (see the “Lattices and mobile qubits” section for the definition of a site of the models of interest). Note that more than one defect can lie at a given site since each site supports several stabilizers. We also assume that the state of the two-dimensional surface code on the initial face is such that the plaquette operators are in their +1 eigenstate, although small errors may have been introduced to the qubits on the primal qubits of the initial face of the system. We defined the initial face in the main text (see Fig. 1B). We justify this assumption by showing how we fix the gauge of the two-dimensional input system in the “Gauge prefixing” section.

We briefly review the gauge fixing problem that we already summarized in the main text. Face measurements that we obtain by measuring the dual qubits of the topological cluster state return random outcomes. However, because of the constraints among the stabilizers, these random outcomes are constrained to form loops if the system does not experience noise. To fix the gauge of the system, we need only find a Pauli operator that restores the plaquettes to their +1 eigenstate. This correction can be obtained trivially by finding a Pauli operator that will move the loops to any smooth boundary that is far away from the initial face. Because the plaquettes at this boundary are initialized in the +1 eigenstate, we cannot terminate loops here. However, any other boundary is suitable. With the two-dimensional setup we have, it is perhaps a natural choice to move the loops toward the terminal face. The correction will fill the interior of the loop. Ensuring that the initial face is fixed means that the correction for the gauge-fixing process is unique. Otherwise, there can be two topologically distinct corrections from the gauge-fixing process that can lead to a logical fault.

In the case that errors occur when we measure the dual qubits, strings will appear in incorrect locations. Given that in the noiseless case the loops should be continuous, we can identify errors by finding the locations where strings terminate. We refer to the endpoint of a broken string as a defect. Defects appear in pairs at the two endpoints of a given string. Alternatively, single defects can be created at a smooth boundary. We attempt to fix the gauge where the errors occur by pairing local defects to close the loops, or we move single defects to smooth boundaries to correct them. We then correct the gauge according to the corrected loop. However, given that the correction may not be in the location of the error that caused the defects, the operator we apply to fix the gauge will introduce bit-flip errors to the surface code. Up to stabilizers, the error we apply during the gauge-fixing procedure will be equivalent to an error that fills the interior of the closed loop created by the measurement error and the correction. These errors are problematic after the transversal non-Clifford gate is applied. However, provided that these errors are sufficiently small, we can correct them at a later stage of the error correction process.

Correcting broken loops becomes more difficult still when we only maintain a two-dimensional layer of the three-dimensional system as it will frequently be the case that a single defect will appear that should be paired to another that appears later in the spacetime but has not yet been realized. Hence, we will propagate defects over time before we make a decision on how to pair it. This deferral will cause the loop to extend over the time direction of the system, and this, in turn, will cause gauge-fixing errors to spread like the distance the defects are deferred. However, if we can make the decision to pair defects suitably quickly, we find that the errors we introduce during gauge fixing are not unmanageable. Here, we propose a just-in-time decoder that we can prove will not uncontrollably extend the size of an error. We assume that the error model will respect the chunk decomposition described above (see Eq. 3). We find that the just-in-time decoder will spread each error chunk by a constant factor of its initial size. We give some more notation to describe the error model before defining the just-in-time decoder and justifying that it will give rise to small errors at a suitably low error rate.

We remember that the chunk decomposition of the error *E* = *F*_{1} ∪ *F*_{2} ∪ … ∪ *F _{m}* is such that a

*Q*-connected component of

^{j}*F*has a diameter no greater than

_{j}*Q*and is separated from all other errors in

^{j}*E*(see Eq. 2) by more than

_{j}*Q*

^{j+1}/3. We define the syndrome of the error σ(

*E*), i.e., the defects that appear because of error

*E*. We also have that the error supported on

*F*

_{j,α}, together with its syndrome, is contained in a box

*B*

_{j,α}of diameter at most

*Q*+ 2 to include syndromes that lie at the boundary of a given error where

^{j}*F*

_{j,α}is a

*Q*-connected component of

_{j}*F*.

_{j}We denote defects, i.e., elements of σ(*E*) with coordinates *u* according to their site. A given defect at *u* has a time coordinate *u _{t}*. We denote the separation between two defects

*u*and

*v*in spacetime by ∣

*u*−

*v*∣ according to the 𝓁

_{∞}metric. At a given time

*t*, which progresses as we prepare more of the topological cluster state, we are only aware of all defects

*u*that have already been realized such that

*u*≤

_{t}*t*. We neutralize the defects of the syndrome once we arrive at a time where it becomes permissible to pair them; otherwise, we defer their pairing to a later time. Deferral means leaving a defect in the current time slice of the spacetime by extending the string onto the current time without changing the spatial coordinate of the defect. When we decide to pair two defects, we join them by completing a loop along a direct path on the available live qubits. In both cases, we fix the gauge according to the strings we have proposed with the correction or deferral. We are now ready to define the just-in-time decoder that will accurately correct pairs of defects given only knowledge about defects

*u*where

*u*≤

_{t}*t*.

**Definition 5** (Just-in-time decoder). The just-in-time decoder, JUST IN TIME(*t*), is applied at each time interval. It will neutralize pairs of defects *u* and *v* if and only if both defects have been deferred for a time δ*t* ≥ ∣*u* − *v*∣. It will pair a single defect *u* to a smooth boundary only if *u* has been deferred for a time equal to its separation from the boundary.

The definition we give captures a broad range of just-in-time decoders that could be implemented a number of ways. We could, for instance, consider clustering decoders (*32*) or possibly more sophisticated decoders based on minimum-weight perfect matching (*3*) to implement the decoder. A greedy decoder would suffice. Here, we only need a simple rule to demonstrate a threshold within the coarse-grained picture of the chunk decomposition. We also remark that we might be able to find better decoders that do not satisfy the conditions of the just-in-time decoder proposed here. We make no attempt to optimize this; the goal here is only to prove the existence of a threshold using the simplest possible terms.

Before we show that the just-in-time decoder will introduce a spread error with a constant spread factor *s*, we first consider how the decoder performs if we consider only a single *Q ^{j}*-connected component of the error

*F*

_{j,α}⊆

*F*. We first consider the

_{j}*Q*-connected component of the error well isolated in the bulk of the lattice, and then we consider how it is corrected close to the boundary.

^{j}**Fact 2.** The correction of an isolated *Q ^{j}*-connected component of the error,

*F*

_{j,α}, that lies more than 2(

*Q*+ 2) from the boundary is supported on the (

^{j}*Q*+ 1)-neighborhood of

^{j}*B*

_{j,α}. No defect will exist for a time longer than δ

*t*∼ 2(

*Q*+ 1).

^{j}*Proof.* Consider two defects *u*, *v* contained in *B*_{j,α} at extremal points. These defects have separation at most *Q ^{j}* + 2. Let us say that ∣

*u*−

_{t}*v*∣ =

_{t}*Q*+ 2 with

^{j}*u*>

_{t}*v*. The defect

_{t}*v*will be deferred for a time 2(

*Q*+ 2) before it is paired a distance

^{j}*Q*+ 1 from

^{j}*B*

_{j,α}in the temporal direction. This correction is supported on the (

*Q*+ 1)-neighborhood of

^{j}*B*

_{j,α}. All defects of this component of the error will be paired before it is permissible to pair them to the boundary.

By this consideration, we obtain a constant spread parameter ∼3 for boxes in the bulk of the model. We next consider the correction close to a smooth boundary. We find that this will have a larger spread parameter.

**Fact 3.** The correction of an isolated *Q ^{j}*-connected component of the error,

*F*

_{j,α}, produced by the just-in-time decoder is supported on the 3(

*Q*+ 2)-neighborhood of

^{j}*B*

_{j,α}, if

*B*

_{j,α}lies within 2(

*Q*+ 2) of a smooth boundary. All defects will be neutralized after a time at most 3(

^{j}*Q*+ 2).

^{j}*Proof.* A defect *u* lies at most 3(*Q ^{j}* + 2) from the boundary. In the worst case, all defects will be paired to the boundary after a time at most 3(

*Q*+ 2). Considering a defect at an extremal location, then, the just-in-time decoder may defer the correction of a defect beyond

^{j}*B*

_{j,α}at most 3(

*Q*+ 2) in the temporal direction.

^{j}The above fact allows us to upper bound the spread factor to *s* = 8. So far, we have only considered how the just-in-time decoder deals with well-isolated *Q ^{j}*-connected components of the error. We find that, for sufficiently large

*Q*, all errors are well isolated in a more precise sense. This is captured by the following lemma. We find, given that any defect supported on a box

*B*

_{j,α}will be paired with another defect in the same box or to a nearby smooth boundary after a time at most 3(

*Q*+ 2), it will never be permissible to pair defects contained in different boxes before they are terminated. In effect, all boxes are transparent to one another. This justifies the spread error model used in the previous section.

^{j}**Lemma 3.** *Take a chunk decomposition with Q* ≥ 33*. The just-in-time decoder will pair all defects supported on B*_{j,α}*within the* 3(*Q ^{j}* + 2)

*-neighborhood of B*

_{j,α}

*to either another defect in B*

_{j,α}

*or to the boundary.*

*Proof.* By Facts 2 and 3, we have that all the defects of isolated boxes *B*_{j,α} are paired to another defect in *B*_{j,α} or to a nearby smooth boundary at most 2(*Q ^{j}* + 2) from

*B*

_{j,α}after a time no more than 3(

*Q*+ 2).

^{j}We may worry that the just-in-time decoder may pair defects within disjoint boxes if they are too close together. We consider the permissibility of pairing *u* contained within *B*_{j,α} to *v* contained in *B*_{k,β}. For *Q* ≥ 33, we find that such a pairing will never be permissible before all defects are paired locally within their isolated boxes. We suppose that, without loss of generality, the diameter of *B*_{j,α} is less than or equal to the diameter of *B*_{k,β}. Given that *B*_{j,α} is separated from *B*_{k,β} by a distance greater than *Q*^{j+1}/3 − 2, it will not be permissible to pair *u* with *v* within the lifetime of *u* before it is paired to a boundary or another defect in *B*_{j,α} provided 3(*Q ^{j}* + 2) ≤

*Q*

^{j+1}/3 − 2. This is satisfied for all

*j*≥ 0 for

*Q*≥ 33.

This Lemma therefore justifies our spread factor *s* = 8 used in the previous section.

*Gauge prefixing*. Last, we assumed that we can reliably prepare the plaquette operators on the initial face of the two-dimensional surface code in their +1 eigenstate. We can tolerate small errors on the edges of the initialized surface code, but a single measurement error made on a plaquette can cause a critical error with the just-in-time decoder as it may never be paired with another defect. This will lead to a large error occurring during gauge fixing (see Fig. 8A). It is therefore important to identify any measurement errors on the face measurements of the initial face before the gauge fixing begins. We achieve this by prefixing the plaquettes of the initial face of the topological cluster state before the controlled-controlled-phase gate begins. We run the system over a time that scales with the code distance before we commence the controlled-controlled-phase gate procedure. In doing so, we can identify measurement errors that may occur on the dual qubits on the initial face of the topological cluster state using measurement data collected before we conduct the non-Clifford operation. Figure 8B shows the idea; the figure shows that measurement errors can be determined by looking at syndrome data on both sides of a plaquette on the initial face. We need only look at one side, namely, the side of the initial face before just-in-time gauge fixing takes place.

Since we need only determine which face operators have experienced measurement errors, and we do not need to actively correct the random gauge, gauge prefixing is accomplished globally using a renormalization group decoder on the three-dimensional syndrome data of the spacetime before the controlled-controlled-phase gate is performed. A threshold can be proved by adapting the threshold theorem for topological codes given in (*32*). Measurement errors close to the initial face before the controlled-controlled-phase gate takes place can then be identified easily by the decoder. We determine which plaquettes of the initial face have experienced errors by finding defects that should be paired to the initial face in the gauge-prefixing operation. Small errors in the global gauge-prefixing procedure can be contained within the boxes that contain the error syndrome. Hence, the errors that remain after the gauge-prefixing procedure are confined within small boxes, which respect the distribution we used to prove the threshold using the just-in-time decoder. Hence, we justify our error model used to bound the spread factor using just-in-time gauge fixing, even in the presence of initialization errors caused by gauge prefixing. We show an error together with its syndrome in Fig. 8C. The goal is only to estimate the plaquettes that have experienced measurement errors on the gray face at the top of the figure. This fixes the plaquettes of the initial face as we have assumed throughout our analysis.

We remark that the proposal given in (*16*) avoids the use of gauge prefixing by orienting boundaries such that the boundary that is analogous to the initial face of the color code is created over a long time. This orientation allows for single defects created at the initial face to be corrected by moving them back to the initial face at a later time, or onto some other suitable boundary. In contrast, here, we have imagined that an initial face is produced at a single instant of time. Further work may show that we can apply the idea of Bombín to the surface code implementation of a controlled-controlled-phase gate presented here by reorienting the gate in spacetime. Such an adaptation will also require a modification of the just-in-time decoder to ensure that defects created at the initial face are paired to an appropriate boundary in a timely manner.

Conversely, gauge prefixing can be adapted for the proposal in (*16*). In this work, color codes are entangled with a transversal controlled-phase gate. The transversal gate is applied to a two-dimensional support on boundaries of the two color codes undergoing this operation. Let us call this boundary the entangling boundary, where the initial face of the second code lies on the entangling boundary. Let us briefly summarize how we can prefix the gauge of the initial face of the second of the two color codes by error-correcting the first.

We note that the entangling operation allows us to use the eigenvalues of the error detection measurements at the boundary of the first code to infer the values of the face operators at the initial face of the second code. Small errors may cause us to incorrectly read the eigenvalues of the cells of the first code. This will lead us to infer the wrong eigenvalues of the face operators of the initial face of the second code. However, error correction on the first code ensures that its entangling boundary is charge neutral, i.e., it has an even parity of string-like errors terminate at this boundary. If the first code is charge neutral at its entangling boundary, then errors in the eigenvalues of the face operators of the initial face of the second color code are necessarily created in locally correctable configurations. This means that they can be corrected without pairing any defects onto the initial face. This observation circumvents the need to orient the color code in a special configuration in spacetime. Relaxing this constraint may be of practical benefit. Moreover, the observation may allow us to remove certain rules that the decoder must otherwise respect to ensure defects are paired to the initial face as required. This may lead to improvements in the performance of the decoder.