Twist-Routing Algorithm for Faulty Network-on-Chips

This paper introduces Twist-routing, a new routing algorithm for faulty on-chip networks. It improves on Maze-routing, a face-routing-based algorithm that uses deflections in routing, and achieves full fault coverage and fast packet delivery. To build Twist-routing, we use bounding circles, an idea borrowed from the GOAFR+ routing algorithm for ad-hoc wireless networks. Unlike Maze-routing, whose path length is unbounded even when the optimal path length is fixed, Twist-routing bounds the path length by the cube of the optimal path length. Our evaluations show that Twist-routing delivers packets up to 35% faster than Maze-routing under uniform traffic and an Erdős-Rényi failure model, as the failure rate and the injection rate vary.


Introduction
As transistor technology scales in microprocessors, more and more power-efficient cores are integrated on a single chip. The communication between these on-chip cores should be efficient. Therefore, networks-on-chip (NoCs), instead of simple buses, are becoming a promising choice for on-chip interconnects because of their better scalability [1]-[6]. Unfortunately, the reliability of on-chip components decreases as critical dimensions shrink, and a NoC can become a single point of failure [7]. As the silicon ages, error rates become quite high [8] because of oxide breakdown, electromigration, and thermal cycling [7]. Hence, it is critical that some failures in the network do not cause the entire chip to fail.
There are some NoC reliability solutions based on architectural protection against faults in the router logic [9] [10] [11]. However, not all faults can be tolerated this way [12].
In recent works, faults are modeled by disabling the affected links, and a complete router loss is modeled by marking all the links connected to the affected router as faulty. The goal is to route packets around the faults so that they finally reach their destinations. Recent route-reconfiguration solutions that bypass faulty links or routers can be broadly divided into two kinds: buffered solutions and deflection solutions. Buffered solutions include Ariadne [13], uDirec [12], and Hermes [14], all of which use traditional wormhole routing [15] and routing tables. These algorithms typically take some time to update the routing tables when a new fault is detected, and thus incur reconfiguration overhead. Deflection solutions for non-faulty chips were introduced by the BLESS algorithm [16] to overcome the significant energy consumption and design complexity caused by buffers. CHIPPER [17] and minBD [18] then developed the idea of deflection routing further. For faulty chips, Maze-routing is a deflection routing algorithm, and the first routing algorithm to provide guaranteed delivery in a fully distributed manner at low cost and with low reconfiguration overhead [19].
Maze-routing is the state-of-the-art deflection routing solution for faulty chips. However, the length of the path found by Maze-routing is unbounded even when the optimal path length is fixed. We propose an improved algorithm named Twist-routing, taking inspiration from the GOAFR+ routing algorithm, which was originally proposed for ad-hoc wireless networks [20] [21] [22]. With our algorithm, the path length is bounded by the cube of the optimal path length. Our algorithm inherits the properties of Maze-routing, and provides guaranteed delivery at low cost and with the same low reconfiguration overhead. The experiments show that our algorithm is 35% faster than Maze-routing when the failure rate is 0.3 and the injection rate is 0.003, and that it remains fast as the injection rate increases.

Twist-Routing Algorithm
Twist-routing is a practical routing algorithm for faulty NoCs. It is based on Maze-routing for faulty NoCs and on the GOAFR+ routing algorithm for ad-hoc wireless networks. The fault model is described in Section 2.1. We briefly review the Maze-routing algorithm in Section 2.2. In Maze-routing, a packet alternates between greedy mode and face-routing [23] mode. In Twist-routing, these two modes remain, but we use bounding circles to limit the search range of a face-routing step, as described in Section 2.3. This enables us to prove a theoretical bound for Twist-routing in Section 2.4. The interactions between Twist-routing and deflection are described in Section 2.5.

The Model
The model of faulty on-chip routing is a mesh, where a router is placed at each grid point and links are available between adjacent routers. Each router can be good or bad, and each link can be healthy or faulty; a fault affects both directions of a link. A bad router is modeled by disabling all four of its links. In modern chips, packets are split into flits, which are routed from the source node to the destination. In the routing algorithm, each router accepts input flits from all adjacent healthy links, permutes them according to some rules, and sends them back out on the adjacent healthy links. Because links are bidirectional, there are as many output links as input links, so every flit can go somewhere after routing.
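The mesh model above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the port names, the coordinate convention, and the helper names `make_link` and `healthy_ports` are hypothetical and do not come from the paper.

```python
from typing import Dict, FrozenSet, Set, Tuple

Node = Tuple[int, int]
Link = FrozenSet[Node]

# Hypothetical port names and their (dx, dy) offsets on the grid.
PORTS: Dict[str, Tuple[int, int]] = {
    "N": (0, 1), "E": (1, 0), "S": (0, -1), "W": (-1, 0),
}


def make_link(a: Node, b: Node) -> Link:
    """An undirected link, so a fault always affects both directions."""
    return frozenset((a, b))


def healthy_ports(node: Node, size: int, faulty_links: Set[Link],
                  bad_routers: Set[Node]) -> Dict[str, Node]:
    """Map each usable port of `node` to the neighbor it reaches."""
    if node in bad_routers:
        return {}  # a bad router has all four of its links disabled
    ports: Dict[str, Node] = {}
    for name, (dx, dy) in PORTS.items():
        nxt = (node[0] + dx, node[1] + dy)
        inside = 0 <= nxt[0] < size and 0 <= nxt[1] < size
        if (inside and nxt not in bad_routers
                and make_link(node, nxt) not in faulty_links):
            ports[name] = nxt
    return ports
```

Because a link is stored as an unordered pair, a router that loses an output port also loses the matching input port, which is what keeps the number of inputs and outputs equal.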

The Maze-Routing Algorithm
Maze-routing adds a header to each flit, containing metadata for that flit. The fields are: src, the source; dst, the destination; best_md, the closest Manhattan distance to dst that the packet has reached so far, assuming a fault-free mesh; mode, one of greedy, clockwise face-routing, or counter-clockwise face-routing; and trav_n and trav_dir, the node and direction that, if visited again, indicate that the destination is unreachable.
In Maze-routing, each flit is routed to a productive and healthy output if possible. This is called the greedy mode. If there is no such output, the flit switches to face-routing mode, with the direction (clockwise or counter-clockwise) chosen at random. In clockwise face-routing mode, the flit takes the first healthy output on the left of the ray from cur to dst, and then proceeds clockwise. In counter-clockwise face-routing mode, the flit takes the first healthy output on the right of the ray, and then proceeds counter-clockwise. Effectively, the flit traverses the face underlying the ray from cur to dst. The flit changes back to greedy mode when it reaches a router that can forward it closer to its destination than the node where it entered face-routing mode, i.e., when the best_md in the header can be reduced over a neighboring link. If best_md cannot be decreased before the flit has traversed the whole face, which is detected by revisiting trav_n in the direction trav_dir, then there is no path between src and dst. We can then drop the flit and, if needed, report the failure to src using the same algorithm.
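The header and the two mode-switching tests described above can be sketched as follows. This is a minimal sketch, not the Maze-routing implementation: the class name `FlitHeader`, the string encoding of the modes, and the helper functions are assumptions made for illustration, and "productive" is taken to mean "reduces the Manhattan distance to dst".

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

Node = Tuple[int, int]


def manhattan(a: Node, b: Node) -> int:
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


@dataclass
class FlitHeader:
    src: Node
    dst: Node
    best_md: int                      # closest Manhattan distance reached so far
    mode: str = "greedy"              # hypothetical encoding: "greedy", "cw", "ccw"
    trav_n: Optional[Node] = None     # node where the face traversal started
    trav_dir: Optional[str] = None    # direction taken when leaving trav_n


def productive_ports(cur: Node, dst: Node, healthy: Dict[str, Node]) -> List[str]:
    """Healthy ports whose neighbor is strictly closer to dst than cur is."""
    here = manhattan(cur, dst)
    return sorted(p for p, nxt in healthy.items() if manhattan(nxt, dst) < here)


def can_return_to_greedy(header: FlitHeader, healthy: Dict[str, Node]) -> bool:
    """True if some healthy neighbor would reduce best_md (face-routing exit test)."""
    return any(manhattan(nxt, header.dst) < header.best_md
               for nxt in healthy.values())
```

A flit stays in greedy mode while `productive_ports` is non-empty. Once it is in face-routing mode, `can_return_to_greedy` is the test that lets it leave the face.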

The Use of Bounding Circles
Twist-routing is based on Maze-routing, with the additional use of bounding circles. The bounding circle is always centered at the destination of the flit, and its radius c is recorded in the header. Notice that in Maze-routing, once face-routing mode is chosen, the direction is fixed until the flit changes back to greedy mode. In Twist-routing, we draw a bounding circle with [...] We use these values in our experiments.

Proofs of Being Faster
Maze-routing can perform very badly in some cases (see Figure 2 for an example). Assume the big tree contains n edges. Maze-routing randomly chooses between two directions when entering face-routing mode. If it chooses the good direction, the flit reaches the destination in 4 hops. If it chooses the bad direction, the flit has to enter the big tree and come all the way back, taking $2n + 10$ hops in total to reach the destination. On average, Maze-routing takes $n + 7$ hops, which is $\Omega(n)$. In this example, Twist-routing also chooses between two directions. One direction leads to 4 hops. If it takes the other direction, the flit turns back without entering the tree because of the bounding circle, and takes 8 hops to reach its destination. On average, it takes only 6 hops.
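The two averages in this example can be written out explicitly. The only assumption in this worked calculation is that each of the two directions is chosen with probability 1/2.

```latex
\begin{align*}
\mathbb{E}[\text{hops}_{\text{Maze}}]
  &= \tfrac{1}{2}\cdot 4 + \tfrac{1}{2}\,(2n + 10) = n + 7, \\
\mathbb{E}[\text{hops}_{\text{Twist}}]
  &= \tfrac{1}{2}\cdot 4 + \tfrac{1}{2}\cdot 8 = 6.
\end{align*}
```

The first value grows linearly with the size n of the tree, while the second is independent of n.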
In the previous example, the length m of the optimal path is a constant, but Maze-routing needs $\Omega(n)$ hops. So the path length of Maze-routing cannot be bounded by any function of m. However, Twist-routing runs in $O(m^3)$ hops, which is asymptotically better than Maze-routing. We now prove this bound with two theorems.
Theorem 1. If the destination of a flit is reachable from the source, and m is the length of the optimal path of this flit, then the radius of the largest bounding circle used by Twist-routing without deflection is no more than $\max(c_0, \alpha^2 m)$, where $c_0$ is the initial radius of the bounding circle.
Proof. If we never enlarge the bounding circle, the largest circle is the initial one, with radius $c_0$. Otherwise, we enlarge a bounding circle C with radius k to $C_1$ with radius $\alpha k$ only if we meet a boundary of C. Only if we later meet the other boundary of C may we meet the boundary of $C_1$ and enlarge the bounding circle again. So if we find an edge that leads closer to the destination within the bounding circle C with radius k, we will not meet the other boundary of C, and the radius of the bounding circle never exceeds $\alpha k$. Assume that we use a bounding circle whose radius c satisfies $c \le m < \alpha c$. We want to prove that the radius of the largest bounding circle never exceeds $\alpha^2 m$, and it is enough to show that it never exceeds $\alpha^2 c$. For this, it is enough to show that within the bounding circle of radius $\alpha c$, face routing can always find an edge that goes closer to the destination. Suppose not, and let p be the path followed in the face-routing step. The path p splits the bounding circle of radius $\alpha c$ into two parts, and exactly one of them is reachable from the source within the bounding circle of radius $\alpha c$. In other words, the destination is unreachable from the source within the bounding circle of radius $\alpha c$.
But since the length of the optimal path from the source to the destination is m, the optimal path lies completely within the bounding circle of radius $\alpha c$, i.e., the destination is reachable from the source within the bounding circle. That is a contradiction. □

[...] and each time we enlarge the bounding circle exponentially, the total hops of one face-routing step are $O(m^2)$. Now consider that $0 \le \mathit{best\_md} \le m$, and each reduction of best_md takes at most $O(m^2)$ hops, so all we need is $O(m^3)$ hops in total to transport this flit using Twist-routing. □
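The counting argument behind the cubic bound can be sketched as follows. This is an illustrative outline rather than the authors' full proof. It assumes that $\alpha > 1$ and $c_0$ are constants, that the successive radii are $r_i = \alpha^i c_0$, and that a face-routing step confined to a circle of radius $r$ costs a number of hops proportional to the number of mesh links inside that circle.

```latex
\begin{align*}
\text{links inside a circle of radius } r &= O(r^2), \\
\text{hops of one face-routing step}
  &\le \sum_{i=0}^{j} O\!\left(r_i^2\right)
   = \sum_{i=0}^{j} O\!\left(\alpha^{2i} c_0^2\right)
   = O\!\left(r_j^2\right), \\
r_j \le \max(c_0, \alpha^2 m)
  &\;\Longrightarrow\; O\!\left(r_j^2\right) = O(m^2), \\
\text{total hops}
  &\le \underbrace{m}_{\text{reductions of } \mathit{best\_md}}
     \cdot\, O(m^2) = O(m^3).
\end{align*}
```

The geometric growth of the radius is what keeps the sum over all attempted circles within a constant factor of the cost of the largest one.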

Deflection Implications
At one router, there are at most 4 input flits. Some flits have to be buffered or deflected [...], in which next is the next router of this flit, and its mode is reset to greedy. This makes the header and the state of this flit consistent.
To avoid deadlocks and livelocks, our algorithm needs to work with a deflection-based mechanism proposed in the literature. We mostly use minBD because of its high performance. The original method to avoid livelocks in minBD is to make one flit golden, in rotation, for a long period L. However, in faulty chips, L needs to be at least as large as the longest path in the graph, which can be as large as $O(n^2)$ for an n-by-n chip. This makes the golden-flit method inefficient for avoiding livelocks. Instead of making one flit golden, we globally prioritize older flits over newer ones to avoid livelocks. We also disable buffer redirection in minBD because it is not compatible with our oldest-flit-based livelock-avoidance method (see footnote 2).
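The oldest-first priority rule can be illustrated with a small port arbiter. This is a generic sketch, not the minBD pipeline: the tuple layout, the function name `arbitrate`, and the tie-breaking by flit id are assumptions made for the example. It relies on the model's property that a router has at least as many healthy outputs as input flits.

```python
from typing import Dict, List, Tuple

# (inject_cycle, flit_id, preferred ports in order of preference)
Flit = Tuple[int, int, List[str]]


def arbitrate(flits: List[Flit], ports: List[str]) -> Dict[int, str]:
    """Assign one output port per flit, serving the oldest flit first."""
    assert len(flits) <= len(ports), "need at least one output per input flit"
    free = list(ports)
    assignment: Dict[int, str] = {}
    # A smaller inject_cycle means an older flit; ties are broken by flit_id.
    for _inject_cycle, flit_id, preferred in sorted(flits, key=lambda f: (f[0], f[1])):
        choice = next((p for p in preferred if p in free), None)
        if choice is None:
            choice = free[0]  # deflection: take any remaining healthy port
        free.remove(choice)
        assignment[flit_id] = choice
    return assignment
```

For example, if two flits both prefer port "E", the one injected earlier keeps it and the younger one is deflected to another free port.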

Simulations
We compared Twist-routing with the original Maze-routing using an ad-hoc simulator. Note that in Maze-routing, flits are independent of each other, and multiple flits are reassembled into the original packet when received. For simplicity, we assume there is only one flit per packet in our simulator. We implemented Maze-routing with the minBD deflection method, using a buffer size of 4, in our simulator. We implemented Twist-routing with minBD as well. In both algorithms, we use the oldest-flit-based livelock-avoidance method without buffer redirection.
To compare the performance of the two algorithms, we computed the average flit latency in the network under different injection rates using uniform traffic. We use $32 \times 32$ networks for evaluation. We use the Erdős-Rényi model to generate faulty links, where the failure rate of any edge is 0.1 or 0.3. We generate 5 faulty chips for each case, and average the results across them. For each case, we run the simulation for 1000 cycles.
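The fault-generation step can be sketched as follows: every link of the mesh fails independently with the given failure rate. This is an assumption-laden illustration, not the simulator used in the paper. The function names and the use of a seeded `random.Random` are hypothetical.

```python
import random
from typing import FrozenSet, List, Set, Tuple

Node = Tuple[int, int]
Link = FrozenSet[Node]


def mesh_links(size: int) -> List[Link]:
    """All undirected links of a size-by-size mesh."""
    links: List[Link] = []
    for x in range(size):
        for y in range(size):
            if x + 1 < size:
                links.append(frozenset(((x, y), (x + 1, y))))
            if y + 1 < size:
                links.append(frozenset(((x, y), (x, y + 1))))
    return links


def sample_faulty_links(size: int, failure_rate: float, seed: int) -> Set[Link]:
    """Mark each link faulty independently with probability failure_rate."""
    rng = random.Random(seed)  # seeded so a given faulty chip can be regenerated
    return {link for link in mesh_links(size) if rng.random() < failure_rate}
```

Under these assumptions, five faulty chips for one case could be generated with `[sample_faulty_links(32, 0.3, seed) for seed in range(5)]`.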
In a typical setting, the distances traveled when deflecting clockwise and when deflecting counter-clockwise can differ greatly. By backtracking and trying the other direction when the flit meets the bounding circle, our algorithm should provide better performance than the original Maze-routing algorithm. The simulation results support this conclusion. According to our measurements, when the failure rate is 0.3 and the injection rate is 0.003, Twist-routing is 35% faster than Maze-routing. When the injection rate increases, Twist-routing remains fast (see Figure 3 for details of all results).

Footnote 2: When buffer redirection is enabled, we cannot avoid redirecting the oldest flit into the buffer, because local information is not enough to determine whether a flit is globally the oldest. If the oldest flit enters the buffer, the delivery guarantee is broken.

Figure 2. A setting where Twist-routing performs much better than Maze-routing.

Theorem 2. If the destination of a flit is reachable from the source, and m is the length of the optimal path of this flit, Twist-routing can find a path of length $O(m^3)$ for this flit without deflection.
Proof. Twist-routing consists of face-routing steps and greedy routing steps. A greedy step reduces the [...]

[...] if the out_dir of them are taken by other flits. If such a case happens, the flit may take a non-productive output, exit the face it is traversing, or be buffered and reappear at another input port later. These behaviors result in inconsistency of the [...]

Footnote 1: Actually, best_md decreases in the next greedy step rather than in the face-routing step, but since each face-routing step is always followed by a greedy step, we may regard the next greedy step as part of the face-routing step, and say that the face-routing step reduces best_md by one.