1. Introduction
The concept of arbitrage is fundamental in the financial literature and has been used in the classical analysis of market efficiency [1] [2], whereby arbitrage opportunities are quickly exploited by investors. However, pure arbitrage opportunities are unlikely to exist in a real trading environment [3] [4]. An arbitrageur typically engages in a trade that involves some risk. When these risks are statistically assessed, it is appropriate to use the term statistical arbitrage (SA). SA has been broadly investigated in the literature. However, scholars focus either on definitions or on developing and testing investment strategies, and we are not aware of any attempt to reconcile these two areas of research. On the one hand, several studies introduce definitions extending the concept of arbitrage through statistics but with little emphasis on strategies [5] - [11]. On the other hand, research on statistically determined arbitrage strategies focuses on models and investment opportunities [12] [13] with little or no discussion of definitions and theoretical framework. This leads us to our research question: what is SA?
This paper addresses this question with an in-depth investigation of SA. We begin by reviewing existing definitions of arbitrage, which we reduce to a common framework in order to analyze and compare them. We then survey statistically determined arbitrage strategies, analyzing both academic and financial industry research. In total, we review 165 articles on the subject, published between 1995 and 2016. Particular attention is paid to hedge fund techniques, market-neutral investment strategies and algorithmic trading. The strategies are discussed in a standardized way, covering equity, fixed income and, for the first time, commodities. We find that these strategies show significant similarities and common features that define them. The comparison of theoretical definitions and strategies’ key features indicates that no available definition appropriately describes SA strategies. To bridge this gap, we propose a general definition, which more closely reflects investors’ strategies. In addition, we suggest that, instead of searching for a definitive theoretical definition of SA, scholars should agree on a classification system that encompasses the current forms of SA while facilitating the inclusion of new types as they emerge. We propose a simple system for classifying strategies that takes into account the strategies’ risk and return profile. We illustrate the advantages of this approach by demonstrating how it can guide theoretical development and empirical testing. We also provide examples of potential future research directions.
We make several contributions to the existing literature. We identify a general definition, which encompasses all SA strategies and introduce a classification system that facilitates their study. This is achieved through an innovative investigation of SA both in academic and financial industry research. In our review, for the first time, we analyze SA across all asset classes (equity, fixed income and commodity) to identify common features and defining elements. Our analysis brings clarity in SA investing and allows investors to have a common framework to assess different investment opportunities.
The paper is organized as follows. In Section 2, we review existing definitions of SA producing a comprehensive mapping. In Section 3, we report a survey of statistically determined arbitrage strategies. In Section 4, we identify the key features which are common to the various strategies. We combine the findings of the previous sections and propose a general definition and classification system. Section 5 concludes the paper.
2. Review of Definitions
It is commonly accepted that Statistical Arbitrage (SA) started with Nunzio Tartaglia who, in the mid-1980s, assembled a team of quantitative analysts at Morgan Stanley to uncover statistical mispricing in equity markets [14]. However, SA came to the fore as a result of Long-Term Capital Management (LTCM), a hedge fund founded in 1994, where Nobel Prize winners Scholes and Merton both worked. The company developed complex SA strategies for fixed income [15] which were initially extremely successful. However, in 1998, as a result of the financial crises in East Asia and Russia, LTCM’s arbitrage strategies started producing large losses which endangered global markets and forced the Federal Reserve Bank of New York to organize a bailout in order to avoid a wider financial collapse. Nevertheless, SA continued to grow in popularity, with applications progressively expanding to all asset classes. SA has become one of the main investment strategies for investment banks and, above all, for hedge funds [16]. In particular, the term SA is used to denote hedge funds that aim to exploit pricing anomalies in equity markets [17]. Technological developments in computational modelling have also facilitated the use of SA in high frequency trading and with machine learning methods, such as neural networks and genetic algorithms [18] [19] [20] [21]. In more recent years, SA has seen renewed interest in emerging areas such as bitcoin [22] [23], big data [24] [25] [26] and factor investing [27].
The literature on the limits of arbitrage is quite broad and provides some insights on why SA opportunities exist. Mou [28] reports how arbitrageurs have to face three different types of risks: fundamental risk [3] , noise trader risk [29] and synchronization risk [30] . Duffie [31] describes the risks arising from inattentive investors. Finally, behavioral effects can generate additional risk and asset bubbles. On the one hand, these risks create SA opportunities. On the other hand, the same risks can undermine arbitrageurs’ efforts and cause delays in correcting market anomalies.
In this section, we review all definitions of arbitrage available in literature which may be suitable to define SA. Our analysis encompasses both alternative definitions of arbitrage as well as definitions of statistical arbitrage. Before reviewing the various definitions, we briefly recall the four types of definitions that are commonly used: 1) lexical, 2) conceptual, 3) abstract and 4) operational [32] [33] . Lexical definitions use simple terms for a wide audience. Conceptual definitions describe a concept in a way that is compatible with a measurable occurrence. Abstract definitions are used when the meaning cannot be measured empirically. Finally, operational definitions provide a clear and concise meaning of a concept in a way that can be measured. Operational definitions clearly specify the object and criteria of measurement which makes them particularly suitable for scientific investigation. We find that existing definitions can be categorized as lexical, conceptual or operational while there are no abstract definitions.
2.1. Lexical Definitions of SA
Some lexical definitions tend to be vague and lack formalism because traders, for good commercial reasons, tend to be obscure about their investment methods. Pole [13] for example writes that SA uses mathematical models to generate returns from systematic movements in securities prices. According to Avellaneda and Lee [34] , the term statistical arbitrage encompasses a variety of strategies characterized by systematic trading signals, market neutral trades and statistical methods. Montana [35] defines SA as an investment strategy that exploits patterns detected in financial data streams. Burgess [36] defines statistical arbitrage as a framework for identifying, modelling and exploiting small but consistent regularities in asset price dynamics. Other definitions are centered on the concept of mispricing. Thomaidis and Kondakis [37] define SA as an attempt to profit from pricing discrepancies that appear in a group of assets. Do, Faff and Hamza [38] claim that SA is an equity trading strategy that employs time series methods to identify relative mispricings between stocks. Burgess [36] also describes statistical arbitrage as a generalization of a traditional arbitrage where mispricing is statistically determined through replicating strategies. In using derivatives, Zapart [39] describes statistical arbitrage as an investment opportunity when perfect hedging is not possible.
A general definition of SA strategy should describe what SA is and its objectives. We find instead that some definitions focus on specific implementations and techniques. In particular, in a broad range of papers, SA is associated with pairs trading [14] [40] - [46] and cointegration [47] [48] [49] [50] [51] .
2.2. Conceptual Definitions of SA
Another set of definitions can be classified as conceptual as they can be associated with specific measures. In reviewing Hedge Funds (HFs) strategies, Connor and Lasarte [52] use the probability of a loss in defining SA as a zero-cost portfolio where the probability of a negative payoff is very small but not exactly zero. Stefanini [12] uses the expected value in noting that SA seeks to capture imbalances in expected value of financial instruments, while trying to be market neutral. For Saks and Maringer [53] , SA accepts negative payoffs as long as the expected positive payoffs are high enough and the probability of losses is small enough. Focardi, Fabozzi and Mitov [54] focus on uncorrelated returns reporting that SA strategies aim to produce positive, low-volatility returns that are uncorrelated with market returns.
2.3. Operational Definitions of Arbitrage
We next discuss the various extensions of arbitrage available in the literature that are used mainly in asset pricing. All definitions can be classified as operational and are mathematically formulated. Here, we provide a description of the various arbitrages while we refer to the relative papers for a more rigorous formulation.
We first introduce the classical definition of arbitrage, defined as a zero-cost trading strategy with positive expected payoff and no possibility of a loss. The absence of arbitrage is a necessary condition for equilibrium models. However, this condition alone is often too weak to be practically useful for certain applications such as option pricing [10].
A first attempt to provide a new definition of arbitrage is made by Ledoit [5], who defines δ-Arbitrage (δA) using the Sharpe ratio [55] [56]. Ledoit [5] defines δA as an investment strategy having a Sharpe ratio above a constant and strictly positive level δ. In the context of incomplete markets, Cochrane and Saa-Requejo [6] independently apply the same concept as Ledoit to derivatives. They define a strategy as a Good Deal (GD) if its market price lies outside the range of plausible prices as determined by the various discount factors.
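As a rough illustration of the δA criterion, the following Python sketch estimates a Sharpe ratio from a sample of excess returns and compares it with a threshold δ. The function names, the use of a sample estimate in place of the theoretical Sharpe ratio, and the return figures are all assumptions made for illustration; they are not taken from Ledoit [5].

```python
from statistics import mean, stdev


def sharpe_ratio(excess_returns):
    """Sample Sharpe ratio: mean excess return over its sample standard deviation."""
    return mean(excess_returns) / stdev(excess_returns)


def is_delta_arbitrage(excess_returns, delta):
    """Return True if the estimated Sharpe ratio exceeds the threshold delta.

    delta must be strictly positive, as in the definition of delta-arbitrage.
    Using a sample estimate of the Sharpe ratio is an assumption of this sketch.
    """
    if delta <= 0:
        raise ValueError("delta must be strictly positive")
    return sharpe_ratio(excess_returns) > delta


# Hypothetical per-period excess returns, for illustration only.
sample_excess_returns = [0.02, 0.01, 0.03, -0.01, 0.02]
print(round(sharpe_ratio(sample_excess_returns), 4))
print(is_delta_arbitrage(sample_excess_returns, delta=0.5))
```

The threshold δ is a modelling choice: the higher it is set, the fewer strategies qualify.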
Bernardo and Ledoit [7] introduce the Approximate Arbitrage (AA) as they note that the Sharpe ratio is not a good measure of the attractiveness of an investment opportunity. If returns are not normally distributed, strategies can have arbitrarily low Sharpe ratios, hence the introduction of a gain-loss ratio. AA is defined as an investment strategy whose maximum gain-loss ratio is above a predefined constant value greater than one. Instead of using the Sharpe ratio or the gain-loss ratio, Carr, Geman and Madan [9] base their definition of Acceptable Opportunity (AO) on two distinct sets of probability measures (valuation and stress measures). AO is defined as an investment strategy having a non-negative expected value under each valuation measure and losses capped under a set of stress measures. In other words, AO is an investment opportunity acceptable to a wide variety of reasonable individuals as it has expected non-negative payoff with losses capped under probability measures reflecting stressed conditions (stress measures). Bertsimas, Kogan and Lo [8] introduce ε-Arbitrage (εA) referring to replication strategies for derivatives. An εA occurs whenever the price of a derivative significantly differs from the least costly optimal replication strategy.
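The gain-loss criterion behind AA can be sketched in the same spirit. The Python snippet below computes an empirical gain-loss ratio as total gains divided by total losses over a sample of excess payoffs. Weighting all observations equally is an assumption of this sketch (the formulation in [7] may rely on a different probability measure), and the function name and data are hypothetical.

```python
def gain_loss_ratio(excess_payoffs):
    """Empirical gain-loss ratio: sum of positive parts over sum of negative parts.

    Assumption: every observation carries the same weight.
    """
    gains = sum(max(x, 0.0) for x in excess_payoffs)
    losses = sum(max(-x, 0.0) for x in excess_payoffs)
    if losses == 0.0:
        # No losing outcome observed in the sample.
        return float("inf")
    return gains / losses


def is_approximate_arbitrage(excess_payoffs, threshold):
    """Return True if the gain-loss ratio exceeds a threshold greater than one."""
    if threshold <= 1.0:
        raise ValueError("threshold must be greater than one")
    return gain_loss_ratio(excess_payoffs) > threshold


# Hypothetical excess payoffs, for illustration only.
sample_payoffs = [0.02, 0.01, 0.03, -0.01, 0.02]
print(gain_loss_ratio(sample_payoffs))
```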
In the literature, there are two definitions of Statistical Arbitrage (SA) which differ significantly from each other. Bondarenko’s SA [10] is a trading strategy which can have negative payoffs, as long as the average payoff is non-negative for a given augmented information set. Key to the definition is the introduction of the augmented information set, which, in addition to the market information at time t, also includes the knowledge of the final price. Hogan et al. [11] provide an alternative definition of SA which focuses on long horizon trading opportunities. Hogan’s SA is a long horizon trading opportunity that, in the limit, generates a risk-less profit. According to this definition, SA satisfies four conditions: 1) it is a zero-cost, self-financing strategy that, in the limit, has 2) a positive expected discounted payoff, 3) a probability of a loss converging to zero, and 4) a time-averaged variance converging to zero if the probability of a loss does not become zero in finite time. The fourth condition only applies when there always exists a positive probability of losing money.
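For reference, the four conditions of Hogan’s SA can be written compactly as follows. The notation is an assumption of this sketch: v(t) denotes the cumulative discounted trading profit of the strategy at time t and P the statistical probability measure; we refer to [11] for the rigorous formulation.

```latex
\begin{align*}
&\text{(1)}\quad v(0) = 0, \\
&\text{(2)}\quad \lim_{t \to \infty} E^{P}\big[v(t)\big] > 0, \\
&\text{(3)}\quad \lim_{t \to \infty} P\big(v(t) < 0\big) = 0, \\
&\text{(4)}\quad \lim_{t \to \infty} \frac{\operatorname{Var}^{P}\big[v(t)\big]}{t} = 0
  \quad \text{if } P\big(v(t) < 0\big) > 0 \ \ \forall\, t < \infty .
\end{align*}
```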
As a summary, we provide a high-level description of all the reviewed arbitrage definitions in Table 1. Most of them are intended to describe only specific types or aspects of SA and will be discussed and compared to SA strategies in Section 4.2.
3. Literature Review of Strategies
3.1. Literature Review
The existing literature on SA includes a small number of reviews of arbitrage strategies which cover only single asset classes. In fixed income, Duarte, Longstaff and Yu [15] conduct an analysis of the risk and return characteristics of the most widely-used fixed income arbitrage strategies. In equity, Do, Faff and Hamza [38] analyze different approaches to pairs trading: distance approach, cointegration approach, stochastic spread approach and stochastic residual spread approach. Again, focusing on equities, Pole [13] elaborates on pairs trading as well as statistical models for time series analysis. There are no reviews for commodities, where studies primarily focus on modelling spreads and term structures for single commodities [57].
Footnote 1. Ornstein-Uhlenbeck is a model used to describe the multivariate dynamics of financial variables [44].
In our review, for the first time, we look at SA across all asset classes to identify common features and defining elements. We review the existing literature on statistically determined arbitrage strategies and, particularly, on those labelled as SA. We identify 165 articles in literature discussing SA strategies spanning from 1995 to 2016 (see Table 2). The surveyed studies focus on equities (104 studies), followed by bonds (40) while other asset classes appear only in a small number of articles: commodities (9), volatility (9) and FX (1). Just two articles discuss pairs trading across asset classes (mix): investment grade credit default swaps versus equity [58] and gold miners versus gold [59] .
We categorize the various strategies based on the classification proposed by Duarte, Longstaff and Yu [15] who identify five different types of SA strategies in fixed income: 1) swap arbitrage strategies, 2) term structure arbitrage (or yield curve arbitrage), 3) mortgage arbitrage, 4) volatility arbitrage and 5) capital structure arbitrage. We add equity pairs trading to the classification for fixed income of Duarte, Longstaff and Yu [15]. The term SA is used very frequently, in particular in relation to pairs trading (112), which includes pairs trading between indices (13), ETFs (4) and spread trading between commodities (6). Various articles focus on cointegration (21), the Ornstein-Uhlenbeck (see footnote 1) stochastic process (10) and, more recently, high frequency trading (9). Pairs trading is predominantly an equity strategy (103). Capital structure arbitrage is the second most documented strategy (30), which includes primarily convertible arbitrage strategies (19). Term structure strategies are documented only in eight studies, of which four analyze bonds. Swap spread arbitrage and mortgage arbitrage are discussed in three studies each.
Table 2. Studies on arbitrage strategies. The table reports the breakdown by asset class of existing studies on statistically determined arbitrage opportunities.
3.2. Review of Strategies
We next describe the six identified trading strategies. Pairs trading is an SA strategy which is particularly popular in equity [41]. In its simplest formulation, pairs trading aims to identify pairs of stocks whose prices have historically moved together. When the spread between the two components of the pair widens significantly, the strategy sells the best performing security to buy the laggard. If the spread reverts to the mean, the trade will be profitable regardless of market trends. This strategy relies on the assumption of a (long-term) equilibrium in the investigated spreads [60] which can be detected through a variety of statistical methods [14] [34] [38] [40] [41] [42] [61]. Long and short positions can be combined in a ratio which makes the trade market-neutral (with a neutral beta position versus the market) or dollar-neutral. The use of pairs trading is not limited to stocks. There are applications to other areas such as spreads between different commodities [62] - [67], commodity futures contracts [68] and freight markets [69] [70]. Pairs trading can also be used to model the spread between different portfolios [71] [72] [73].
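The simplest formulation of pairs trading described above can be sketched as a z-score rule on the spread. In the Python snippet below, the spread is taken as the plain price difference (a one-to-one hedge ratio), the entry threshold of two standard deviations is arbitrary, and the function name and prices are hypothetical; the surveyed studies use a variety of more refined statistical methods.

```python
from statistics import mean, stdev


def pairs_signal(prices_a, prices_b, entry_z=2.0):
    """Return (action, z) for the latest observation of the spread A - B.

    The z-score compares the latest spread with the mean and sample standard
    deviation of the earlier spreads. Assumption: one-to-one hedge ratio.
    """
    spread = [a - b for a, b in zip(prices_a, prices_b)]
    history, latest = spread[:-1], spread[-1]
    sigma = stdev(history)
    if sigma == 0.0:
        raise ValueError("historical spread has zero variance")
    z = (latest - mean(history)) / sigma
    if z > entry_z:
        # A has outperformed B: sell the best performer, buy the laggard.
        return "short_a_long_b", z
    if z < -entry_z:
        # B has outperformed A.
        return "long_a_short_b", z
    return "flat", z


# Hypothetical price histories, for illustration only.
prices_a = [10.0, 10.5, 10.0, 10.5, 10.0, 13.0]
prices_b = [9.0, 9.0, 9.0, 9.0, 9.0, 9.0]
action, z_score = pairs_signal(prices_a, prices_b)
print(action)
```

The trade is profitable only if the spread subsequently reverts towards its historical mean, which is the equilibrium assumption stated in the text.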
Term structure arbitrage is a common SA strategy which typically involves taking market-neutral long-short positions at different points of a term structure as suggested by a relative value analysis [15] . Positions are held until the trade converges and the mispricing disappears. Term structure arbitrage is particularly common in fixed income (also called yield curve arbitrage) and commodities. In spite of being one of the most common SA strategies, the literature on implementations of yield curve arbitrage is quite limited and mostly focuses on interest rates models [15] [74] . Term structure arbitrage in commodities uses models (similar to the one used in rates) to identify relative value opportunities across the curve [57] . An implementation of term structure arbitrage in commodities is described by Mou [28] who identifies investment opportunities arising from the futures rolling of the main commodity indices. In credit, SA opportunities in the term structure of CDS are studied by Jarrow, Li and Ye [75] .
Volatility arbitrage is a popular and widely used strategy [76] [77] [78] [79] [80]. Its implementations are structured to be pure bets on volatility and should not be influenced by the actual direction of the underlying. Similarly to other types of arbitrage, volatility arbitrage refers to a wide range of different strategies which can be classified into 1) gamma trading, 2) volatility surface arbitrage, 3) cross asset volatility trading and 4) dispersion trading. Gamma trading plays the implied volatility versus the historical volatility on the same asset [9]. If the realized volatility exceeds the volatility implied in the option price, arbitrageurs can profit by buying an option and hedging the delta in the underlying market. The positive income is proportional to $\Gamma S^{2} \times (\text{Realized Variance} - \text{Implied Variance})$, where S is the price of the underlying, and Γ is the gamma of the option [78]. Volatility surface arbitrage is a relative value strategy trading the implied volatilities on the same underlying in different points of the volatility surface. Arbitrageurs identify anomalies in implied volatilities across different strike prices and maturities and profit from buying (selling) options whose implied volatility is excessively low (high) [81]. Cross-asset volatility trading plays the implied volatility of an asset versus the implied volatility of another asset through traditional long-short trades. Finally, dispersion trading (also known as decorrelation trading) trades the volatility of a basket of securities (generally an index) against the volatilities of the components of the same basket [81]. The volatility of an index is a function of the volatilities of the constituents and the correlations between them. Greater correlations translate into less diversification and higher index volatility. Decorrelation is traded by selling index variance swaps and buying single stock variance swaps [82].
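The link between correlation and index volatility that underlies dispersion trading can be illustrated numerically. The Python sketch below uses the standard portfolio variance formula under the simplifying assumption of a single common pairwise correlation; the function name, weights and volatilities are hypothetical.

```python
from math import sqrt


def index_volatility(weights, vols, rho):
    """Volatility of a basket given constituent weights, volatilities and a
    single common pairwise correlation rho (a simplifying assumption)."""
    n = len(weights)
    variance = 0.0
    for i in range(n):
        for j in range(n):
            corr = 1.0 if i == j else rho
            variance += weights[i] * weights[j] * vols[i] * vols[j] * corr
    return sqrt(variance)


# Hypothetical two-stock index, for illustration only.
weights = [0.5, 0.5]
vols = [0.20, 0.20]
low_corr_vol = index_volatility(weights, vols, rho=0.0)
high_corr_vol = index_volatility(weights, vols, rho=1.0)
print(low_corr_vol, high_corr_vol)
```

With the constituent volatilities held fixed, raising the correlation raises the index volatility, which is why a short index variance and long single stock variance position is a bet on low realized correlation.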
Swap spread arbitrage is another popular fixed income strategy which bets on the difference between a fixed and a floating yield [15] [83]. It is structured in two parts. On the one hand, the arbitrageur enters a par interest rate swap paying a fixed coupon rate $SR$ and receiving the floating LIBOR rate $L_t$. On the other hand, the arbitrageur buys a treasury bond, with the same maturity as the swap, with the money borrowed through a repurchase agreement known as repo. Entering this part of the trade the arbitrageur earns the treasury rate $TR$ and pays the repo rate $r_t$. The overall cash flow of the trade is $(L_t - r_t) - (SR - TR)$, where $(SR - TR)$ is the fixed interest rate component (also known as swap spread) and $(L_t - r_t)$ is the floating rate part which needs to be rolled periodically (typically every three months). The strategy generates a positive income as long as the floating yield exceeds the fixed one. Swap spread arbitrage is immune from interest rate risk if both the repo rate and LIBOR (which generally have the same maturity and rolling dates) react similarly to a move in rates.
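The cash flow of the trade described above (pay the fixed swap rate, receive LIBOR, earn the treasury rate, pay the repo rate) can be checked with a few lines of Python. The function and parameter names and the rate levels are hypothetical, and rates are expressed as annualized decimals for a single period.

```python
def swap_spread_cash_flow(swap_rate, treasury_rate, libor, repo):
    """Net carry per unit of notional for one period of the trade.

    Swap leg: receive LIBOR, pay the fixed swap rate.
    Bond leg: earn the treasury rate, pay the repo rate.
    """
    fixed_component = swap_rate - treasury_rate   # swap spread (paid)
    floating_component = libor - repo             # floating spread (received)
    return floating_component - fixed_component


# Hypothetical rate levels, for illustration only.
carry = swap_spread_cash_flow(swap_rate=0.030, treasury_rate=0.027,
                              libor=0.025, repo=0.020)
print(round(carry, 6))
```

In this example the floating spread (0.5%) exceeds the swap spread (0.3%), so the position earns a positive carry; the sign flips if the floating spread falls below the fixed one.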
Mortgage arbitrage consists of buying mortgage-backed securities (MBSs) while hedging their interest rate exposure primarily through derivatives [84] . The strategy provides a positive carry as the yield on MBSs is typically higher than that of comparable treasury bonds. As the spread earned is generally small, arbitrageurs use leverage to enhance returns. Mortgage arbitrage strategies can be classified based on the different types of MBS used. A popular implementation of the strategy is with pass-through MBSs which pass all of the interest and principal cash flows of a pool of mortgages to the pass-through investors [12] .
Capital structure arbitrage involves taking long and short positions in the various instruments of a company’s capital structure [15] [85] [86] [87] [88] . This includes a variety of strategies between equity, debt and credit instruments of a given company. Some of the most popular strategies are credit arbitrage and convertible arbitrage. Credit arbitrage (also known as capital structure arbitrage) usually refers to strategies that aim to exploit mispricing between a company’s credit default swap (CDS) and its equity. Arbitrageurs use the information on the equity price and the capital structure of an obligor to compute its theoretical CDS spread. The theoretical CDS is then compared with the level quoted in the market. If the market spread is higher (lower) than the theoretical spread, then the strategy goes short (long) on the CDS contract while simultaneously hedging the equity with a short (long) position [89] .
Convertible Arbitrage is one of the most popular capital structure strategies and involves buying a portfolio of convertible bonds while selling short the underlying stocks [90] [91] . Intuitively, if the stock increases in price, the bonds will appreciate and if the stock falls the short position will profit. In some versions, the interest rate risk is hedged with treasury futures or interest rate swaps. In addition to credit arbitrage and convertible arbitrage, other capital structure arbitrage strategies focus on the spread between bonds and equities of the same company. In particular Schaefer and Strebulaev [89] show that structural models provide accurate predictions of the sensitivity of corporate bond returns to changes in the value of equity (hedge ratios). Other strategies instead focus on the spread between CDS and corporate bonds or different types of credit default swaps [92] [93] .
This review allows us to identify the defining features of the different strategies across asset classes. They are summarized in Table 3.
4. What Is SA?
In this section, we define SA strategies. We identify those features which are common to the surveyed arbitrage strategies. We compare them with the available definitions and provide a new definition in conjunction with a classification scheme. The new definition incorporates all strategies’ key elements and the classification scheme encompasses the important dimensions of SA while being flexible and easy to use.
4.1. Strategies Key Features
All strategies aim to exploit relative value opportunities through the implementation of long-short positions. Pairs trading invests in the spread between two stocks. Term structure arbitrage models the spread between yields or futures prices. Volatility arbitrage identifies relative value opportunities between volatilities. Swap spread plays a fixed spread versus a floating spread. Mortgage arbitrage models the spread of MBS over treasury. Capital structure arbitrage profits from the spread between various instruments of the same company. Spreads trading involves taking long-short positions in order to profit from spreads or simply to bet on a security while being market-neutral.
Table 3. Arbitrage trading strategies. The table reports the defining features of the surveyed strategies.
However, not all strategies need mean reversion. Pairs trading and term structure arbitrage need spreads to revert to their mean to be profitable. Other strategies instead need a persistent positive spread-carry: between implied and realized volatility (volatility arbitrage), between the fixed and the floating spread (swap spread arbitrage), in the MBS spread over treasury (mortgage arbitrage) and between various instruments of the same company (capital structure arbitrage). If spreads narrow these strategies are less profitable and can turn into a loss. In addition, not all strategies are zero-cost. This is not only due to market frictions or trading costs but it is true by construction. For example, pairs trading (in the market-neutral form) may require a net payment and mortgage arbitrage requires the purchase of MBSs.
It is not possible to clearly define whether SA strategies are market-neutral. All strategies invest in some risk factors while hedging others. For example, term structure arbitrage may hedge only against parallel shifts of the term structure. Volatility arbitrage hedges against movements of the underlying but not of the underlying volatility. Swap spread arbitrage hedges against changes in treasury and swap rates but not against credit risk. Mortgage arbitrage hedges against movements in treasury rates but not mortgage spreads.
Not all strategies guarantee gains but rather offer positive expected excess returns with an acceptably small potential loss. Arbitrageurs require a positive expected excess return over the risk free to compensate for risk. The potential loss must be acceptably small in order to qualify the strategy as arbitrage rather than simple investment. Although not all the academic literature reports it, trades always have take profit and stop loss features. The take profit identifies when a trade no longer offers positive expected excess returns. A take profit is triggered in case there is reversion to the mean (pairs trading, term structure arbitrage, volatility arbitrage and capital structure arbitrage) or when the positive carry disappears (swap spread arbitrage and mortgage arbitrage). The stop loss quantifies when a loss is no longer acceptably small and results from investors’ risk tolerance.
From the previous analysis, it is possible to conclude that three key factors define statistically determined arbitrage opportunities: 1) relative value, 2) positive expected excess returns and 3) acceptably small potential loss. Take profit and stop loss are features which make it possible to operationalize SA strategies (see Table 4).
4.2. Definition of SA Strategy
From the review of strategies and definitions, we find that both in the definitions and strategies, statistics are used to explain securities mispricing. In particular, they focus on the same observable phenomenon but from different perspectives. Definitions focus primarily on strengthening the concept of arbitrage by introducing additional constraints that can make theory more consistent with financial markets. In some cases, they use tools common to practitioners, such as the Sharpe ratio in δA. In other cases, the focus is instead more on the theoretical framework, such as the augmented information set in Bondarenko’s definition [10]. Strategies instead use quantitative models as a tool for uncovering mispricing more efficiently. Starting from the empirical evidence of market inefficiency, investors use different techniques to identify arbitrages with a given statistical confidence. It is evident that both academics and practitioners look at the same issue: academics rule out those investment opportunities which are not compatible with rigorous pricing, while investors try to identify investment opportunities resulting from inaccurate pricing. In both cases statistical methods have been used. Now the question is: do they come to the same conclusions? And more particularly, is there a definition of SA which encompasses the various strategies?
Table 4. Surveyed features of statistically determined arbitrage strategies. For each trading strategy, the table reports whether the listed features are present or not. Where there is no clear assessment (−) is reported.
We aim to create a definition which is measurable. That rules out lexical definitions which focus generically on systematic strategies [13] [35] [36] [94] and relative value [37] [39] [95] . We compare the key features of SA strategies with conceptual and operational definitions (see Table 5).
The available conceptual definitions do not capture all key features: Connor and Lasarte [52] and Saks and Maringer [53] do not mention relative value, while Stefanini [12] and Focardi, Fabozzi and Mitov [54] do not require small potential losses. The analysis of available operational definitions reveals that, taken individually, no definition requires long-short trading or spread modelling. More generally, with the exception of εA, no definition refers to relative value analysis.
Table 5. SA definitions versus strategies’ key features.
Only δA, GD and AA incorporate the feature of positive excess returns while the other definitions generically refer to positive expected returns as there is no initial cost involved. AA embeds the feature of acceptably small potential loss but this is limited to a specific measure (gain-loss ratio). AO limits losses through the use of generic stress measures. Hogan’s SA partially requires acceptably small potential losses as the probability of a loss converges to zero with time. All definitions embed the concept of take profit as long as it is assumed that a strategy is closed at maturity or when the expected returns are no longer positive. AOs can be closed in stop loss if the realized loss is higher than what is acceptable according to the stress measures. Hogan’s SA has the concept of stop loss if it is assumed that a strategy is closed when the constraints on the probability of a loss are no longer satisfied. AA trades are closed in stop loss only if the gain-loss ratio is lower than one. According to the other definitions instead a trade is closed only when the defining criteria are no longer met and this does not necessarily involve a stop loss. In conclusion, there are some differences across definitions. Although some definitions are compatible with various strategies’ common features, nevertheless they fail to incorporate all of them as defining elements.
As no available definition fully captures what is done in practice, we identify a conceptual definition that incorporates all strategies’ key elements. We choose to use a conceptual definition as it clearly defines SA while leaving each analyst to select the most appropriate measure as explained below.
We define an SA strategy as a relative value strategy with a positive expected excess return and an acceptably small potential loss. We note the following in relation to our proposed new definition. First, SA is a relative value strategy. This reflects the fact that all the reviewed strategies trade the spread of one security against another. It should be noted that, while the concept of relative value is universally accepted, its boundaries are not clearly defined. A priori, a total return strategy can be considered a relative value strategy of an investment against the overnight rate (which is close to zero). We follow the common understanding and refer to relative value strategies as strategies that aim to find mispricing using historical relationships. As a relative value strategy, SA requires that the underlying securities are combined in a long-short portfolio. This makes it possible to isolate some sources of risk (expected to deliver positive excess returns) more accurately while hedging others. The underlying securities may or may not belong to the same asset class.
The second element is the positive expected excess return. This part of the definition incorporates two features. The first is that the strategy focuses on the expected return. This differs from the definition of arbitrage, where the strategy admits no negative outcomes. Losses are allowed in our definition of SA. The second is the excess return. This reflects the fact that, whenever an initial investment is required, an arbitrageur embarks on a strategy involving some risk only if the expected return is higher than the risk-free rate.
The last requirement is given by the acceptably small potential loss. This element is fundamental in order to differentiate SA from a simple investment strategy. To be called arbitrage, a strategy needs to have a constrained loss profile. A strategy is closed whenever the defining criteria are no longer satisfied: 1) in stop loss, if the loss is no longer acceptably small or 2) in take profit, if the performance is positive and the expected excess return is no longer positive.
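The closing rule described above can be illustrated with a minimal sketch. The function name, its arguments and the returned labels are hypothetical and are not part of the definition; in particular, the text does not name the case in which the expected excess return is no longer positive while the performance is negative, so the label "close" used for it here is an assumption.

```python
def closing_action(realized_pnl, expected_excess_return, potential_loss, loss_limit):
    """Decide whether a position still satisfies the SA criteria.

    potential_loss and loss_limit are expressed in the same (investor-chosen)
    risk measure, e.g. VaR or CVaR; both are assumptions of this sketch.
    """
    # 1) Stop loss: the potential loss is no longer acceptably small.
    if potential_loss > loss_limit:
        return "stop_loss"
    # 2) The expected excess return is no longer positive.
    if expected_excess_return <= 0:
        # Take profit only if the performance so far is positive;
        # otherwise the trade is simply closed (label assumed here).
        return "take_profit" if realized_pnl > 0 else "close"
    # Both defining criteria still hold: keep the position open.
    return "hold"
```

For example, a position with a positive realized performance whose expected excess return has dropped to zero would be closed in take profit, while one whose potential loss exceeds the investor's limit would be closed in stop loss regardless of its expected return.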
This definition cannot be made operational unless we define how to measure a positive expected excess return and an acceptably small potential loss. Clarity on this issue is critical. However, the complex and dynamic landscape of financial markets suggests that no definitive theoretical or operational definition of SA is likely to be agreed upon. For this reason, we propose to use the definition in conjunction with a classification scheme.
Measuring a positive expected excess return requires defining the risk-free rate and a probability measure. The risk-free rate can be the cost of financing (for unfunded strategies) or the cash rate (for funded strategies). In the case of a zero-cost trading strategy, the risk-free rate is equal to zero. Defining an acceptably small potential loss requires identifying a set of suitable risk measures and criteria to establish what is acceptably small. Examples of risk measures are the probability of a loss, the Value at Risk (VaR) and the Conditional Value at Risk (CVaR); see [96] - [99]. It is left to each investor to define what is acceptably small according to the investor's own utility function.
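As an illustration of how these quantities might be measured in practice, the following sketch computes empirical versions of the expected excess return, the probability of a loss, VaR and CVaR from a sample of strategy returns. It assumes the empirical distribution of the sample as the probability measure and a simple historical (order-statistic) estimator for VaR and CVaR; the function names, the default confidence level and the thresholds are hypothetical choices, not prescriptions of the definition.

```python
from statistics import mean

def risk_profile(returns, risk_free=0.0, alpha=0.95):
    """Empirical return and loss measures for a sample of strategy returns."""
    if not returns:
        raise ValueError("returns must be non-empty")
    excess = [r - risk_free for r in returns]
    # Losses are measured on raw returns (an assumption), largest loss first.
    losses = sorted((-r for r in returns), reverse=True)
    # Number of worst outcomes in the (1 - alpha) tail, at least one.
    k = max(1, int(round(len(losses) * (1 - alpha))))
    tail = losses[:k]
    return {
        "expected_excess_return": mean(excess),
        "prob_loss": sum(1 for r in returns if r < 0) / len(returns),
        "var": tail[-1],     # smallest loss in the tail: historical VaR
        "cvar": mean(tail),  # average loss in the tail: historical CVaR
    }

def is_sa_candidate(profile, max_prob_loss, max_cvar):
    """Check the return and loss criteria against investor-chosen limits.

    The relative value requirement is not checked here; it concerns how
    the portfolio is constructed rather than its return distribution.
    """
    return (profile["expected_excess_return"] > 0
            and profile["prob_loss"] <= max_prob_loss
            and profile["cvar"] <= max_cvar)
```

A call such as `is_sa_candidate(risk_profile(sample, risk_free=0.0, alpha=0.9), max_prob_loss=0.2, max_cvar=0.04)` would then flag whether a given sample meets one investor's limits; a different investor could apply different measures or thresholds to the same strategy.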
This classification scheme aims to be sufficiently detailed to encompass the important dimensions of SA while at the same time being intuitive and easy to use. To be widely accepted, a definition should also appeal to practitioners and other stakeholders by reflecting the world as it is perceived. Our definition, with its annexed classification scheme, satisfies the four canons of a good definition: adequacy, differentiation, impartiality and completeness [32]. It is adequate, as it clarifies a substantial portion of the meaning of SA. It shows differentiation, as it eliminates confusion by including all the terms that distinguish SA from a generic investment strategy. Impartiality is guaranteed, as all key elements receive similar weight, while the necessary completeness is assured. Our definition of SA compares favorably to existing SA definitions. The definition of Bondarenko [10] is not suitable to describe this wider range of strategies. Hogan’s SA definition instead seems to be more focused on investors’ strategies, and this is reflected by its broader use in more recent literature [54] [100]. However, Hogan’s definition does not emphasize the need for a positive excess return or the distinctive role of relative value. Additionally, it is not flexible enough to include SA strategies based on specific ratios; see, for example, the Sharpe ratios used by Bertram [101], Cummins and Bucca [67] and Goncu [100]. Our definition reformulates the definition of Saks and Maringer [53] by adding relative value. This addition is fundamental to rule out investing in short-term government bonds (with a positive expected return and a low probability of a loss) as an SA strategy.
Our definition and classification system could guide future research. For example, the use of a common classification system allows the profitability and riskiness of SA strategies to be investigated across asset classes and over time. This enables the mapping of pricing anomalies and can provide directions on how to improve pricing models. The existence of persistent SA opportunities in selected strategies can be used as an indicator to direct future research to less studied asset classes and instruments. Having a framework brings transparency to the term SA, helping investors make investment decisions. For example, our definition of SA can be used in the hedge fund industry, where there is no agreement on a standardized classification system of strategies [102]. This can help address the lack of uniform definitions in the hedge fund industry, where several classification systems are still in use with significant differences among them [103] [104].
5. Conclusions
In this paper, we investigate the concept of statistical arbitrage (SA). As there is no agreement in the literature on a common definition, we review both the theoretical and empirical works on SA since its introduction. In particular, we look at all definitions that may be suitable to identify this class of strategies. We review all strategies that may be associated with the concept of statistically determined arbitrage opportunities. We identify the common features that define the concept embedded in investors’ thinking. As no definition is suitable to describe this type of strategy, we introduce a general definition and propose a classification system that encompasses the current forms of SA strategies while facilitating the inclusion of new types as they emerge.
Our study makes several contributions to the existing literature. We bridge the gap between the literature on arbitrage definitions and the literature on SA strategies. We perform an innovative investigation of SA in both academic and financial industry research, analyzing, for the first time, SA across all asset classes (equity, fixed income and commodity). We propose a general definition that includes all SA strategies, together with a classification system that measures the strategies’ risk and return profile. This facilitates the inclusion of new strategies and measures as they emerge. Our analysis gives investors a common framework to evaluate investment opportunities and brings clarity to SA investing, guiding theoretical development and empirical testing. We also provide examples of potential future research directions.