Simple Mitigation Strategy for a Systematic Gate Error in IBMQ

We report the observation and characterisation of a systematic error in the implementation of $U_3$ gates in the IBM quantum computers. The error appears as a consistent shift in one of the angles of the gate, whose magnitude does not correlate with IBM's cited errors calculated using Clifford randomized benchmarking. We propose a simple mitigation procedure, leading to an improvement in the observed value for the CHSH inequality, highlighting the utility of simple mitigation strategies for short-depth quantum circuits.


I. INTRODUCTION
Quantum error correction is essential in the development of fully functional quantum computers. Existing hardware does not meet the requirements to implement fault-tolerant quantum error correction, outside of small preliminary studies [1][2][3][4]. The accuracy of observables produced by current hardware is therefore limited, but many candidate applications require greater precision to outperform classical methods. For this reason, it is widely regarded that error mitigation will be essential in demonstrating near-term quantum advantage [5].
Error mitigation aims to reduce the effect of noise rather than remove it completely. There are many distinct approaches towards this goal, with two common methods being: optimizing quantum circuits through compilation and machine learning [6][7][8] and classical post processing. One of the most promising post processing techniques is zero noise extrapolation [9] which combines observables evaluated at several controlled noise levels [10,11], enabling extrapolation to the zero-noise limit. Recently several new mitigation methods have been developed that make use of learning from data sets constructed using quantum circuit data [12,13] demonstrating the rapid progress in this field.
Errors occur due to a multitude of factors in both the qubits themselves and the control hardware. Qubits are not completely isolated from their environment, leading to thermal relaxation and the decoherence of their state. Gate errors result from miscalibration or imperfections in the control hardware and their interactions with the qubits. Furthermore, the readout procedure can misidentify or alter the final qubit state such that the measured value does not accurately reflect the collapsed state [14].
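A single-qubit pure state can be represented as
$$|\psi\rangle = \cos\frac{\theta}{2}\,|0\rangle + e^{i\phi}\sin\frac{\theta}{2}\,|1\rangle, \tag{1}$$
which can be visualized as a point on the Bloch sphere at polar angle $\theta$ and azimuthal angle $\phi$.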
During computation a given number of one- and two-qubit gates are performed on a set of qubits. In the zero-noise limit this has the effect of changing the state by some unitary operation $U$. Any unitary is decomposed into the physical gate set of the device, $S$. In the IBMQ quantum computers this set is $S = \{U_1(\omega), R_x(\pm\pi/2), \mathrm{CX}\}$, where $\omega$ is some angle. The gate $U_1(\omega)$ is equivalent to $R_z(\omega)$ up to a global phase factor and is implemented virtually within IBMQ. This is achieved by using frame changes with near-perfect execution [15] and does not involve the action of any physical quantum gates. A general single-qubit unitary can be decomposed as follows:
$$U_3(\theta,\phi,\lambda) = \begin{pmatrix} \cos\frac{\theta}{2} & -e^{i\lambda}\sin\frac{\theta}{2} \\ e^{i\phi}\sin\frac{\theta}{2} & e^{i(\phi+\lambda)}\cos\frac{\theta}{2} \end{pmatrix} \tag{2}$$
$$\phantom{U_3(\theta,\phi,\lambda)} = R_z(\phi)\,R_x(-\pi/2)\,R_z(\theta)\,R_x(\pi/2)\,R_z(\lambda), \tag{3}$$
where the second equality holds up to a global phase, the $R_z$ gates are implemented virtually (VZ), and the $R_x(\pm\pi/2)$ gates by a pulse [16].

Once execution of the required gates is complete, the quantum computer measures the qubits, collapsing the state, and outputs the results. The computation is repeated and a vector of counts $v_{\rm exp}$ of length $2^n$ (where $n$ is the number of qubits) is obtained. Relaxation, imperfect coupling of the readout resonator and signal amplification lead to errors in the measurement process [14]. Although major improvements in this area are likely to come from improved hardware, it is possible to mitigate the measurement error through various techniques [17]. A simple strategy currently implemented within IBM's Qiskit software [18] uses data from calibration circuits to mitigate the error using classical post-processing. This is achieved by directly constructing a calibration matrix, which for one qubit can be written as
$$M_{\rm cal} = \begin{pmatrix} p_0 & 1 - p_1 \\ 1 - p_0 & p_1 \end{pmatrix}, \tag{4}$$
where $p_0$ and $p_1$ are the probabilities that a prepared $|0\rangle$ is measured as $|0\rangle$ and a prepared $|1\rangle$ is measured as $|1\rangle$, respectively. This technique can be extended to multi-qubit states using a tensor product or correlated Markov noise approaches [19]. The calibration matrix can also be calculated using maximum likelihood techniques and quantum detector tomography [20].
The calibration matrix can then be used to mitigate errors associated with the readout either directly by (i) inversion or through (ii) bounded minimization.
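(i) Inversion is done by inverting the calibration matrix:
$$v_{\rm th} = M_{\rm cal}^{-1}\, v_{\rm exp},$$
where $v_{\rm exp}$ and $v_{\rm th}$ are the experimental and ideal vectors of the counts.
(ii) Bounded minimization uses bounded least-squares optimization:
$$v_{\rm th} = \arg\min_{v}\, \left\| v_{\rm exp} - M_{\rm cal}\, v \right\|^2,$$
where bounds ensure that the probabilities calculated from $v_{\rm th}$ are positive and correctly normalised.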
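The two options above can be sketched as follows. This is a minimal illustration rather than Qiskit's implementation: the function names, the readout fidelities and the counts are hypothetical, the calibration matrix is assumed to have the single-qubit form described in the text, and SciPy's SLSQP solver is one possible choice for the bounded problem.

```python
import numpy as np
from scipy.optimize import minimize

def calibration_matrix(p0, p1):
    """Single-qubit calibration matrix: column j holds the outcome
    probabilities when basis state |j> is prepared."""
    return np.array([[p0, 1.0 - p1],
                     [1.0 - p0, p1]])

def mitigate_inversion(m_cal, v_exp):
    """(i) Direct inversion: solve M_cal v_th = v_exp for v_th."""
    return np.linalg.solve(m_cal, v_exp)

def mitigate_bounded(m_cal, v_exp):
    """(ii) Least squares over probabilities constrained to [0, 1] and sum 1."""
    shots = float(np.sum(v_exp))
    p_exp = np.asarray(v_exp, dtype=float) / shots
    start = np.full(len(p_exp), 1.0 / len(p_exp))
    result = minimize(
        lambda x: np.sum((m_cal @ x - p_exp) ** 2),
        start,
        jac=lambda x: 2.0 * m_cal.T @ (m_cal @ x - p_exp),
        method="SLSQP",
        bounds=[(0.0, 1.0)] * len(p_exp),
        constraints={"type": "eq", "fun": lambda x: np.sum(x) - 1.0},
        options={"ftol": 1e-12},
    )
    return result.x * shots  # back to counts

# Hypothetical readout fidelities and ideal counts (assumptions for illustration).
m_cal = calibration_matrix(p0=0.98, p1=0.94)
v_ideal = np.array([6000.0, 2192.0])
v_exp = m_cal @ v_ideal              # counts distorted by readout error only
v_inv = mitigate_inversion(m_cal, v_exp)
v_bnd = mitigate_bounded(m_cal, v_exp)
```

With noiseless synthetic counts both routes recover the ideal vector. On real data the inversion can return small negative entries, which is what the bounded variant is meant to avoid.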
These techniques share the assumption that the error rate in state preparation is much lower than the readout error. This is not without merit, as the single-gate errors cited for IBM, Google and Rigetti devices are all below 0.5% while their readout errors are around 1–5% [14,16,21]. Yet any error in state preparation, especially a systematic one, can lead to an inaccurate calibration matrix.
In this paper we highlight a systematic error in the execution of the $U_3$ gate in IBM's cloud-based computers, which appears as a shift in the angle $\theta$ when implementing the gate $U_3(\theta, \phi, \lambda)$. We propose to mitigate this error by applying a compensating angular shift to $\theta$ in the $U_3$ gate. We illustrate the functionality of this mitigation method by measuring the CHSH inequality on data from a real device.
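As a sanity check on the gate conventions used in what follows, the short sketch below verifies numerically that the pulse decomposition reproduces $U_3$ up to a global phase, and that $U_3(\theta, -\pi/2, \pi/2)$ is a rotation $R_x(\theta)$. It assumes the standard Qiskit matrix convention for $U_3$ and the decomposition $R_z(\phi) R_x(-\pi/2) R_z(\theta) R_x(\pi/2) R_z(\lambda)$; the helper names are ours.

```python
import numpy as np

def rz(angle):
    """Rotation about z: diag(exp(-i angle/2), exp(+i angle/2))."""
    return np.array([[np.exp(-0.5j * angle), 0.0],
                     [0.0, np.exp(0.5j * angle)]])

def rx(angle):
    """Rotation about x by the given angle."""
    c, s = np.cos(angle / 2), np.sin(angle / 2)
    return np.array([[c, -1j * s], [-1j * s, c]])

def u3(theta, phi, lam):
    """U3 matrix in the Qiskit convention (assumed here)."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -np.exp(1j * lam) * s],
                     [np.exp(1j * phi) * s, np.exp(1j * (phi + lam)) * c]])

def u3_from_pulses(theta, phi, lam):
    """Two physical Rx(+-pi/2) pulses interleaved with virtual Rz gates."""
    return rz(phi) @ rx(-np.pi / 2) @ rz(theta) @ rx(np.pi / 2) @ rz(lam)

def equal_up_to_phase(a, b):
    """Two 2x2 unitaries differ only by a global phase iff |Tr(a^dag b)| = 2."""
    return bool(np.isclose(abs(np.trace(a.conj().T @ b)), 2.0))

angles = (0.7, 1.9, -0.4)  # arbitrary test angles
decomposition_ok = equal_up_to_phase(u3(*angles), u3_from_pulses(*angles))
meridian_ok = np.allclose(u3(0.7, -np.pi / 2, np.pi / 2), rx(0.7))
```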

II. ERROR CHARACTERISATION
A. Sweeping a meridian
To explore the reliability of the $U_3$ gate we applied it to the $|0\rangle$ state with $\lambda = \pi/2$, $\phi = -\pi/2$ and various angles $\theta$ in the interval $[0, \pi]$ (see eq. (2)). This represents a rotation about the $x$ axis, $R_x(\theta)$, on the Bloch sphere that sweeps a whole meridian. The gate is followed by a measurement in the $z$ basis:

$|0\rangle$ → $U_3(\theta, -\pi/2, \pi/2)$ → measurement

IBM's calibration method consists of measuring the states $|0\rangle$ and $|1\rangle = R_x(\pi)|0\rangle$, and extracting the values of $p_0$ and $p_1$ to build the matrix $M_{\rm cal}$ given in (4). The experimental probability of measuring $|0\rangle$ for any given $\theta$, $P_0(\theta)$, ignoring all errors apart from readout, can be described by
$$P_0(\theta) = p_0 \cos^2\frac{\theta}{2} + (1 - p_1)\sin^2\frac{\theta}{2}. \tag{5}$$
We shall refer to this formula as the IBM-fit. Observe that (5) reproduces by construction the experimental data $p_0$ and $1 - p_1$ for $\theta = 0$ and $\pi$ respectively. To test the reliability of this formula we divide $[0, \pi]$ into 30 intervals and measure $P_0(\theta_i)$ for $\theta_i = \pi i/30$ with $i = 0, 1, \ldots, 30$. The results obtained for qubit 9 of the Cambridge quantum computer, with 8,192 shots per angle, are plotted in Fig. 1 together with the curve (5). One can clearly see a significant deviation between the experimental data and the IBM prediction. However, this deviation follows a trend that we characterize with the following ansatz:
$$P_0(\theta) = \tilde{p}_0 \cos^2\frac{\theta + \alpha}{2} + (1 - \tilde{p}_1)\sin^2\frac{\theta + \alpha}{2}. \tag{6}$$
Here, the angle $\theta$ is shifted by a parameter $\alpha$ that takes small values, as we shall see below. The probabilities $p_0$ and $p_1$ appearing in (5) have been replaced by $\tilde{p}_0$ and $\tilde{p}_1$ to allow for a more accurate description of the experimental results in the range $\theta \in [0, \pi]$.
The numerical values of $\alpha$, $\tilde{p}_0$ and $\tilde{p}_1$ are determined using a least-squares fit of the set $\{P_0(\theta_i)\}_{i=0}^{30}$ using (6). We shall denote this approach as the Shift-fit method. Fig. 2 shows that (6) provides a much better fit to the data than (5).
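A minimal sketch of such a fit is given below. It runs on synthetic data, not on the measurements reported here: the "true" parameters, the random seed and the function names are assumptions made for illustration, the ansatz is taken to be a cosine-squared curve with $\theta$ replaced by $\theta + \alpha$, and SciPy's `curve_fit` stands in for whatever least-squares routine is actually used.

```python
import numpy as np
from scipy.optimize import curve_fit

def shift_model(theta, alpha, p0_t, p1_t):
    """Shift ansatz: readout-limited cos^2 curve with theta -> theta + alpha."""
    half = (theta + alpha) / 2
    return p0_t * np.cos(half) ** 2 + (1 - p1_t) * np.sin(half) ** 2

# 31 equally spaced angles on [0, pi], as in the sweep described above.
thetas = np.pi * np.arange(31) / 30
shots = 8192

# Synthetic sweep with hypothetical parameters and binomial shot noise.
rng = np.random.default_rng(seed=1)
true_params = (-0.18, 0.97, 0.93)
counts_0 = rng.binomial(shots, shift_model(thetas, *true_params))
p0_measured = counts_0 / shots

# Least-squares fit, starting from the ideal, unshifted curve.
popt, pcov = curve_fit(shift_model, thetas, p0_measured, p0=(0.0, 1.0, 1.0))
alpha_fit, p0_fit, p1_fit = popt
```

The diagonal of `pcov` gives a rough variance estimate for each fitted parameter, which is one way to judge whether a fitted $\alpha$ is distinguishable from zero at the chosen number of shots.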
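To quantify the performance of the fits we use the coefficient of determination $R^2$, defined as
$$R^2 = 1 - \frac{\sum_n \left[P_0^{\rm exp}(\theta_n) - P_0^{\rm fit}(\theta_n)\right]^2}{\sum_n \left[P_0^{\rm exp}(\theta_n) - \bar{P}_0^{\rm exp}\right]^2},$$
where $P_0^{\rm exp}(\theta_n)$ is the experimental probability of the $|0\rangle$ counts at angle $\theta_n$, $\bar{P}_0^{\rm exp}$ is its average over all angles, and $P_0^{\rm fit}(\theta_n)$ is the value predicted by the fit. The $R^2$ estimator is customarily expressed as a percentage, so that a perfect fit corresponds to $R^2 \times 100 = 100\%$ predictability. The data given in Fig. 2 yield an $R^2$ of 97.6% for the IBM-fit and 99.9% for the Shift-fit.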
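The comparison can be illustrated with a short sketch. The data here are noiseless and synthetic, generated from a shifted curve with hypothetical parameters, so the numbers only indicate how an unmodelled shift lowers $R^2$ for a curve pinned to the end points $\theta = 0$ and $\pi$; they are not the values quoted above.

```python
import numpy as np

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    ss_res = np.sum((observed - predicted) ** 2)
    ss_tot = np.sum((observed - observed.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def p0_curve(theta, p0, p1, alpha=0.0):
    """Probability of |0>; alpha = 0 gives the unshifted, readout-only curve."""
    half = (theta + alpha) / 2
    return p0 * np.cos(half) ** 2 + (1 - p1) * np.sin(half) ** 2

thetas = np.pi * np.arange(31) / 30
data = p0_curve(thetas, 0.97, 0.93, alpha=-0.18)  # hypothetical parameters

# Unshifted curve with p0 and 1 - p1 read off at theta = 0 and theta = pi.
p0_end, p1_end = data[0], 1.0 - data[-1]
r2_unshifted = r_squared(data, p0_curve(thetas, p0_end, p1_end))

# Shifted curve evaluated with the generating parameters.
r2_shifted = r_squared(data, p0_curve(thetas, 0.97, 0.93, alpha=-0.18))
```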

B. Several sweeps: jobs
The results presented in Fig. 1 correspond to a single sweep of equally spaced angles $\theta_n$ along a meridian. To assess the reliability of the Shift-fit method we consider a set of $n_s$ consecutive sweeps that we denote a job. The number of sweeps $n_s$ can depend on the job (see Fig. 2). A given job is run within a time window in which the quantum computer is assumed to remain under approximately the same experimental conditions. The result of each job is a set of parameters $\{\alpha_s, \tilde{p}_{0,s}, \tilde{p}_{1,s}\}_{s=1}^{n_s}$ which, according to the previous assumption, should be similar. Fig. 3 shows the values of $\alpha$ obtained for 15 jobs, amounting to a total of 100 sweeps. We notice that: i) within each job the parameter $\alpha$ takes similar values, ii) the average value of $\alpha$ presents large deviations between jobs, as shown in the histogram. Item i) is in rough agreement with the stability assumption made above, while item ii) can be attributed to different calibrations during the time delay between different jobs.
The distribution has a mean $\alpha$ of $-0.14(7)$, where the number in brackets is the standard deviation on the last digit shown. This mean does not properly reflect how $\alpha$ behaves within a single job; for example, the single run in Fig. 1 has $\alpha = -0.18$. We also find that the overall average $R^2$ values for the Shift-fit and the IBM-fit are 99.9% and 97.0% respectively, leading to the conclusion that including an $\alpha$ shift gives a more accurate description of the raw data in general. Finally, it is worth noting that we found no correlation between the observed shift and the errors quoted by IBM.
(Table I)

We have also explored other meridians with the Shift-fit method and found a negligible dependence on the meridian. Testing the same qubits within the same job on all the computers, with ten equally spaced values of $\phi$ from 0 to $2\pi$, we saw no shifts greater than the standard deviation from the mean, and no trend in $\alpha$ as $\phi$ was varied.

C. Mitigation
As explained above, the parameter $\alpha$ represents a systematic error that affects the rotation angle $\theta$ of the $U_3(\theta, \phi, \lambda)$ gate. A naive way to mitigate it is to replace $\theta$ by $\theta - \alpha$, expecting that this displacement will compensate the error. The corresponding mitigated circuit is

$|0\rangle$ → $U_3(\theta - \alpha, -\pi/2, \pi/2)$ → measurement

To implement the $\alpha$ mitigation, a Python software suite was written to perform these calibrations and apply the shift in subsequent experiments [22]. Fig. 4 shows a selection of results. The values of $\alpha$ obtained with this type of mitigated circuit are much closer to zero than those obtained without the shift. The calibration and the mitigated rotation were each performed with a job of 10 sweeps. The $R^2$ values for the Shift-fit were above 99% in all cases. These results demonstrate the effectiveness of the mitigation method.
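The logic of the compensation can be sketched with a toy model. This is not the software suite of [22]: `device_p0` is a hypothetical stand-in for the hardware that simply assumes the realised rotation angle is the requested angle plus $\alpha$, and the value of $\alpha$ is chosen for illustration.

```python
import numpy as np

def device_p0(theta_requested, alpha, p0=1.0, p1=1.0):
    """Toy device: the realised angle is theta_requested + alpha (assumption)."""
    half = (theta_requested + alpha) / 2
    return p0 * np.cos(half) ** 2 + (1 - p1) * np.sin(half) ** 2

def mitigated_angle(theta, alpha_calibrated):
    """Angle to request so that the realised rotation is theta."""
    return theta - alpha_calibrated

alpha = -0.18                           # hypothetical calibrated shift
thetas = np.pi * np.arange(31) / 30
ideal = np.cos(thetas / 2) ** 2         # error-free probability of |0>

raw = device_p0(thetas, alpha)
mitigated = device_p0(mitigated_angle(thetas, alpha), alpha)

max_err_raw = np.max(np.abs(raw - ideal))
max_err_mitigated = np.max(np.abs(mitigated - ideal))
```

In this idealised setting the compensation is exact. On hardware the residual depends on how stable $\alpha$ is between the calibration job and the mitigated job.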
Fig. 4. Box plot of the shift ($\alpha$) determined before (white) and after (blue) mitigation for a subset of qubits from several computers (Rochester, Johannesburg and Paris). The box and whiskers encompass 50% and 95% of the results respectively; dots represent outliers. Discrepancies between the data for the Paris quantum computer displayed here and that shown in table I are due to the results being from different jobs. Furthermore, results from some qubits which are displayed in table I are not shown here as they exhibited very small $\alpha$ values at the time of execution, highlighting the large variance of the observed shift between runs.

D. Repeated gates and different initial states
We now explore the dependence of the $\alpha$ shift on the number of gates applied in a consecutive sequence. To this end we decompose a rotation $R_x(\theta)$ into $M$ rotations of angle $\theta/M$, as shown in the circuit of Fig. 5. The results for $M = 1, \ldots, 10$ are given in Fig. 6. We find that $|\alpha_M|$ increases with $M$, but not linearly as one would naively expect, that is, $\alpha_M \neq M\alpha_1$. All the tested computers returned different trends, and these changed between jobs even for the same computer. Depending on the job, a negative $\alpha$ could move either closer to or further from zero, and a positive $\alpha$ could either grow or decrease. This suggests that the systematic error expressed by $\alpha$ has a complex origin that probably involves several components of the machine. We have also studied sweeps starting not from $|0\rangle$ but from the states obtained by acting on $|0\rangle$ with $R_x(\pi/4)$, $R_x(\pi/2)$ and $R_x(3\pi/4)$. The results, plotted in Fig. 7, show a rough agreement of the values of $\alpha$. This suggests that the result is not strongly state dependent.
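For reference, the naive expectation mentioned above can be made explicit with a small sketch. It assumes, purely for illustration, that each of the $M$ gates over-rotates about the $x$ axis by the same fixed $\alpha_1$; this is the model the data do not follow, not a description of the hardware.

```python
import numpy as np

def rx(angle):
    """Rotation about x by the given angle."""
    c, s = np.cos(angle / 2), np.sin(angle / 2)
    return np.array([[c, -1j * s], [-1j * s, c]])

def naive_sequence(theta, m, alpha_1):
    """M rotations of angle theta/M, each shifted by alpha_1 (naive model)."""
    step = rx(theta / m + alpha_1)
    return np.linalg.matrix_power(step, m)

theta, alpha_1 = 1.0, -0.02  # hypothetical values
# Rotations about the same axis compose additively, so the total shift is M * alpha_1.
linear_growth = all(
    np.allclose(naive_sequence(theta, m, alpha_1), rx(theta + m * alpha_1))
    for m in range(1, 11)
)
```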

III. ORIGIN OF THE ERROR
In this section we propose an explanation of the shift effect based on a potential error in the implementation of the gates $R_x(\pm\pi/2)$. In the ideal case these gates are realized as $\exp(\mp i t \Omega \sigma_X / 2)$, where $\Omega$ is the pulse amplitude and $\Omega t = \pi/2$. An off-resonance error (ORR) in the $R_x$ gate pulse, characterized by a parameter $\delta$, can be modelled following [15]. Replacing these gates in (3) we obtain a gate $U_3(\theta, -\frac{\pi}{2}, \frac{\pi}{2}, \delta)$ that includes the ORR error. Finally, we apply the calibration matrix $M_{\rm cal}$ to obtain the probability of measuring the $|0\rangle$ state as a function of $\theta$, where we have assumed that $\delta$ is a small parameter. Expanding (6) in powers of $\alpha$ gives an expression that is equivalent to this probability up to $O(\delta^3)$, assuming $\alpha = 2\delta$ and using the same calibration matrix. This means that the VZ gates can indeed be used to correct for this error by replacing the $\theta$ parameter in eq. (3) with $\theta - \alpha$, which is equivalent to altering $\theta$ in the $U_3$ gate. It appears that the observed shift is well described by the appearance of ORR errors in the $R_x$ gates. However, upon repeated action of these gates, one would expect the errors to accumulate, resulting in a shift that grows proportionally with the number of applied gates. As previously demonstrated, this is not observed (see Fig. 6).
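The expansion in powers of $\alpha$ can be written out explicitly. The following is a worked sketch that assumes the shift ansatz has the form $P_0(\theta) = \tilde{p}_0 \cos^2\frac{\theta+\alpha}{2} + (1-\tilde{p}_1)\sin^2\frac{\theta+\alpha}{2}$; it covers only the $\alpha$ side of the comparison and does not reproduce the ORR expression.

$$
\begin{aligned}
P_0(\theta) &= \tilde{p}_0 \cos^2\frac{\theta+\alpha}{2} + (1-\tilde{p}_1)\sin^2\frac{\theta+\alpha}{2} \\
&= \frac{1 + \tilde{p}_0 - \tilde{p}_1}{2} + \frac{\tilde{p}_0 + \tilde{p}_1 - 1}{2}\cos(\theta+\alpha) \\
&= \frac{1 + \tilde{p}_0 - \tilde{p}_1}{2} + \frac{\tilde{p}_0 + \tilde{p}_1 - 1}{2}\left[\left(1 - \frac{\alpha^2}{2}\right)\cos\theta - \alpha\sin\theta\right] + O(\alpha^3).
\end{aligned}
$$

The second line uses $\cos^2 x = (1+\cos 2x)/2$ and $\sin^2 x = (1-\cos 2x)/2$, and the third uses $\cos(\theta+\alpha) = \cos\theta\cos\alpha - \sin\theta\sin\alpha$ with $\cos\alpha \approx 1 - \alpha^2/2$ and $\sin\alpha \approx \alpha$. The leading correction, proportional to $\alpha\sin\theta$, vanishes at $\theta = 0$ and $\pi$ and is largest near $\theta = \pi/2$.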
We shall show that despite the previous complications, the α mitigation improves observed CHSH inequalities, suggesting the simple mitigation strategy we present could be useful in short-depth circuits.

IV. EVALUATING THE CHSH INEQUALITY
The CHSH inequality involves running 4 separate circuits, each consisting of a Bell-state preparation followed by a measurement in one of four appropriately chosen bases (Fig. 8). It is a quintessential experiment in quantum mechanics, demonstrating that quantum correlations cannot be explained classically [23]. The correlation function can be expressed as follows:
$$C = \langle AB\rangle - \langle AB'\rangle + \langle A'B\rangle + \langle A'B'\rangle,$$
where the four observables $A$, $A'$ and $B$, $B'$ represent different measurement bases of the bipartite system comprising $A$ and $B$, and $\langle AB\rangle$ is the correlated expectation value for two of those observables. For a system with a hidden variable or classical correlations, $|C|$ is bounded by 2. For a system with maximal entanglement, this bound is $2\sqrt{2}$ [24]. In general the measured mitigated correlations are closer to the theoretical limit, as shown in table II, with the least improved cases appearing when $\alpha$ is very small in one or both qubits. Therefore, using a simple mitigation strategy can improve measured quantities in a real device.
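As an illustration of how $C$ is assembled from measured counts, the sketch below computes a two-qubit correlation from a dictionary of bitstring counts and checks the ideal value for one choice of bases. The sign convention $C = \langle AB\rangle - \langle AB'\rangle + \langle A'B\rangle + \langle A'B'\rangle$, the choice of observables and the function names are assumptions made for this example; they need not coincide with the circuits of Fig. 8.

```python
import numpy as np

def correlation(counts):
    """<AB> from two-qubit counts keyed by bitstrings such as '00' or '01':
    equal outcomes contribute +1, unequal outcomes -1."""
    total = sum(counts.values())
    signed = sum(n if key[0] == key[1] else -n for key, n in counts.items())
    return signed / total

def chsh(e_ab, e_abp, e_apb, e_apbp):
    """CHSH combination in the sign convention assumed above."""
    return e_ab - e_abp + e_apb + e_apbp

# Ideal check with the Bell state (|00> + |11>)/sqrt(2).
Z = np.array([[1.0, 0.0], [0.0, -1.0]])
X = np.array([[0.0, 1.0], [1.0, 0.0]])
bell = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)

def expectation(op_a, op_b):
    """Exact expectation value of op_a (x) op_b in the Bell state."""
    return float(bell @ np.kron(op_a, op_b) @ bell)

A, A_p = Z, X
B, B_p = (Z + X) / np.sqrt(2), (X - Z) / np.sqrt(2)
c_ideal = chsh(expectation(A, B), expectation(A, B_p),
               expectation(A_p, B), expectation(A_p, B_p))
```

In an experiment each of the four expectation values would instead come from `correlation` applied to the counts of the corresponding circuit, before and after applying the $\alpha$ shift.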
How this improvement scales with the depth and the number of qubits in the circuit is an important consideration. We have shown that the shift does not behave consistently with increasing depth, as seen in Fig. 6. However, when increasing the system size, a set of calibration circuits could be run on each qubit to determine its $\alpha$ shift, whose effect could then be mitigated as outlined above.

V. DISCUSSION AND CONCLUSION
In this paper we have highlighted the existence of a systematic error, which appears as an angular shift ($\alpha$) in the parameter $\theta$ of the $U_3$ gate, and demonstrated that its effects can be mitigated by performing a simple calibration before running a set of jobs. This shift was shown to bear characteristics of an ORR error. Therefore, it is now possible to mitigate this component of the total error irrespective of the readout error and other errors. This leads to an increased performance on our benchmark circuits used to calculate the CHSH inequality. We found that the systematic shifts are consistent over the time span of a few successive jobs, but not over longer stretches of time.
As the ORR error can be corrected through the use of VZ gates, the change in the θ parameter of the U 3 gate does just this [15]. Although using the 'open pulse' capabilities of some IBMQ quantum computers and finely tuning the R x pulses would result in similar improvements, this is a more complicated procedure and may not completely remove the ORR effect.
We have also shown that although these errors can be corrected for single gates, the application of multiple gates to a single qubit does not follow the relation expected from the ORR treatment, which implies a linear growth in the shift with the number of gates. The origin of this behaviour remains an open question and further investigation is left to future work. Despite this, applying this correction still yielded improved results in the CHSH inequalities.
Any simple mitigation strategy can only improve the fidelity of calculations by a small factor. Yet, a modest increase in fidelity for a small upfront computation may be worth the extra time. Although this method could not be applied to deep circuits we envision it could be useful for many qubit, short-depth quantum circuits, especially if combined with other mitigation techniques.

SUPPLEMENTARY MATERIAL
A. Coefficient of determination, $R^2$
The coefficient of determination, $R^2$, is defined as
$$R^2 = 1 - \frac{SS_{\rm res}}{SS_{\rm tot}},$$
where the total sum of squares $SS_{\rm tot}$ and the residual sum of squares $SS_{\rm res}$ are
$$SS_{\rm tot} = \sum_i (y_i - \bar{y})^2, \qquad SS_{\rm res} = \sum_i (y_i - f_i)^2,$$
with $y_i$ being a particular data point, $f_i$ the prediction of $y_i$ and $\bar{y}$ the average of the observed data. If $R^2 = 1$, the fit is an exact match to the experimental data, while anything lower implies a progressively worse fit.
In total, the goodness-of-fit statistics for our proposed shift, compared with the IBM-fit and the ideal curve (setting $p_0 = p_1 = 1$ and $\alpha = 0$), are tabulated below for an aggregate of all of the sweeps over all computers. Furthermore, we ascertained that there was no correlation between the $\alpha$ values and the cited IBM error rates by ordering the errors for a given computer's qubits by magnitude and comparing them to the magnitude of $\alpha$ associated with a given job. No polynomial fit (up to order 4) gave any appreciable $R^2$ value for any computer.

B. Largest observed shift values
The table below shows the fitted data for the 20 qubits with the largest average $\alpha$ after 100 sweeps, with the exception of Rochester, for which 10 sweeps were used due to its large number of qubits. This process was carried out on the Cambridge, London, Rochester, Paris and Johannesburg computers.