Foundations for Wash Sales

Consider an ephemeral sale-and-repurchase of a security resulting in the same position before the sale and after the repurchase. A sale-and-repurchase is a wash sale if these transactions result in a loss within $\pm 30$ calendar days. Since a portfolio is essentially the same after a wash sale, any tax advantage from such a loss is not allowed. That is, after a wash sale a portfolio is unchanged so any loss captured by the wash sale is deemed to be solely for tax advantage and not investment purposes. This paper starts by exploring variations of the birthday problem to model wash sales. The birthday problem is: Determine the number of independent and identically distributed random variables required so there is a probability of at least 1/2 that two or more of these random variables share the same outcome. This paper gives necessary conditions for wash sales based on variations on the birthday problem. This allows us to answer questions such as: What is the likelihood of a wash sale in an unmanaged portfolio where purchases and sales are independent, uniform, and random? This paper ends by exploring the Littlewood-Offord problem as it relates capital gains and losses with wash sales.


Introduction
Wash sales impact a portfolio's tax liabilities. Determining the likelihood of wash sales is also important for understanding investment strategies and for comparing actively and passively managed portfolios. Wash sales apply to investors, but not to market makers.
Taxes play a significant role in economics and finance. Taxes influence behavior, shape the engineering of financial transactions, and sometimes have unintended consequences. Therefore, thoughtful analysis is imperative for taxes. This paper adds firm mathematical foundations to aid the understanding of wash sale taxes.
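Consider shares of a security purchased at price $p_1$ on date $d_1$, sold at a loss at price $p_2 < p_1$ on date $d_2$, and repurchased at price $p_3$ on date $d_3$. If these transactions form a wash sale, then:
1. The loss $p_1 - p_2$ is not permissible for taxes. That is, this loss may not be subtracted from profits or gains and it may not be used to get a lower tax rate.
2. The cost-basis of the shares repurchased on $d_3$ is set to $p_3 + (p_1 - p_2)$. The shares purchased on $d_3$ have the start of their holding period reset to $d_1$.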
Short positions may also be wash sales. For example, consider holding a short position of 100 shares of a security starting on date $d_1$ in a portfolio $\Pi$. Then suppose this short position is closed at a loss by purchasing 100 shares on day $d_2$. Once this position is closed on day $d_2$, $\Pi$ contains no shares of this security. Next re-short another 100 shares of substantially the same security on $d_3$ where $|d_3 - d_2| \le 30$ days. These transactions leave the portfolio the same while getting a tax advantage for the loss. This tax advantage is also disallowed by the wash sale rules.
Consider a wash sale as described by Definition 1, where $(p_1 - p_2) + p_3 > p_1$ or in other words $p_3 > p_2$. Suppose the shares are sold at price $p_4 > (p_1 - p_2) + p_3 > p_1$ at the later date $d_4 \ge d_3$. In the case with the wash sale, there is a capital gain of $p_4 - [(p_1 - p_2) + p_3]$, which is smaller than the capital gain $p_4 - p_1$ if the wash sale had not occurred. Capital gains are taxable. A capital gain $p_4 - p_1$ is from the single purchase of the shares for price $p_1$ on $d_1$ and the single sale of the shares on date $d_4$ for price $p_4$, thus skipping the sale at a loss and repurchase.
This means such a wash sale gives $(p_4 - p_1) - [p_4 - [(p_1 - p_2) + p_3]]$ or $p_3 - p_2$ less taxable income than a single purchase of the security at price $p_1$ on date $d_1$ and a single sale for price $p_4$ on $d_4$. Of course, a wash sale's loss is not allowed.
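The basis adjustment and the resulting difference in taxable income can be checked with a small numeric sketch. The function names and the sample prices below are illustrative assumptions, not part of the paper; prices are per share and transaction costs are ignored.

```python
def adjusted_basis(p1, p2, p3):
    """Cost basis of the repurchased shares: p3 plus the disallowed loss p1 - p2."""
    disallowed_loss = p1 - p2
    return p3 + disallowed_loss


def gain_with_wash_sale(p1, p2, p3, p4):
    """Capital gain on the final sale at p4 when the wash sale occurred."""
    return p4 - adjusted_basis(p1, p2, p3)


def gain_without_wash_sale(p1, p4):
    """Capital gain for a single purchase at p1 and a single sale at p4."""
    return p4 - p1


# Hypothetical prices satisfying p3 > p2 and p4 > (p1 - p2) + p3 > p1.
p1, p2, p3, p4 = 100, 90, 95, 120
with_ws = gain_with_wash_sale(p1, p2, p3, p4)
without_ws = gain_without_wash_sale(p1, p4)

# The difference in taxable gain equals p3 - p2.
print(with_ws, without_ws, without_ws - with_ws)
```

With these sample prices the adjusted basis is 105, so the gain after the wash sale is 15 rather than 20, a difference of $p_3 - p_2 = 5$.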
Wash sales may be avoided by restricting each security in a portfolio to be either purchased or sold only every 31 calendar days. This restriction may not be suitable for many portfolios. In a portfolio containing options, it may be impossible to maintain this restriction.
It has also been suggested, e.g. [9], that wash sales may be avoided by purchasing or selling (moderately) correlated, but not substantially the same, securities. That is, if a security is sold at a loss, then a different but correlated security is purchased within 30 days, maintaining some of a portfolio's characteristics while keeping the tax advantage.
Historically many securities are assumed to only trade on about n = 252 business days per year [10]. However, reflecting on global markets, one may instead assume there are n = 365 trading days.

Background
There has not been much research on wash sales, e.g., [9]. There is important work on taxation and its investment implications. Take, for example, [11,12] and [13].
The birthday problem is classical. According to a blog post by Pat B [14] the birthday problem may have originally been given by Harold Davenport as cited in [15] and later published by [1]. In any case, von Mises gave the first published version to the best of our knowledge.
Bounds on day counts for birthday problems include those of [16], who gives bounds for birthdays of distance d for both linear years as well as cyclic years. In a cyclic year, 1-January is a single day from 31-December of the same year. Bounds for birthdays of distance d for cyclic years are given by [17].
The birthday problem applied to boys and girls (random variables with different labels) is discussed in [18] as well as [19]. That is, how many birthdays are shared by one or more boys and one or more girls? A comprehensive view is provided by [20], including stopping problems with the boy-girl birthday problem. Non-uniform bounds for online boy-girl birthday problems are given by [21] and [22].
Tight bounded Poisson approximations for birthday problems are given by [23]. Poisson approximations to the binomial distribution for the boy-girl birthday problem are given by [19]. A Stein-Chen Poisson approximation is used by [26] to solve variations of the standard birthday problem. Matching and birthday problems are given by [27]. Incidence variables are used to study birthday problems with Pareto-type distributions in [28].
Results on the expectation for getting j different letter k-collisions are given by [34]. Their results are expressed as truncated exponentials or gamma functions.

Structure of this Paper
Section 2 reviews variants of the birthday problem applied here. First the classical birthday problem is discussed. Next this section progresses through the $\pm d$ birthday problem. After the definition and key results are given about the $\pm d$ birthday problem, the boy-girl birthday problem is explored. Finally, the $\pm d$ boy-girl birthday problem is defined and several bounds are derived as they relate to a necessary condition for wash sales. Subsection 2.1 gives an example of wash sales based on boy-girl birthday collisions of a single day.
Section 3 generalizes results of the previous sections. In particular, it shows how to compute $B_d(n, b, g)$, and hence the numbers of boys $b$ and girls $g$ that give a probability of $\frac{1}{2}$ or more that a boy and a girl have birthdays within $d$ days of each other over $n$ days. Subsection 3.1 gives an example of wash sales based on boy-girl birthday collisions over a range of $\pm d = 30$ days.
Finally, Section 4 explores how wash sales impact capital gains and losses. Since wash sales are capital losses, they may offset capital gains. Several results, including the Littlewood-Offord problem, are applied to capital gains and losses as they may be impacted by wash sales.

The Birthday Problem and Wash Sales
The birthday problem is often applied to finding the probability of coincidences. So there is a rich literature on variations of the birthday problem [31,32]. Asset sales are often viewed as carefully selected. However, portfolios using American-style options may exhibit asset sales or purchases beyond the control of the portfolio managers.
Definition 2 (Birthday-Collision) Given two random variables $X_1, X_2$ mapping respectively to $x_1, x_2$ in the same range $[n]$, then a birthday-collision is when $x_1 = x_2$.
To model random wash sales, this paper assumes independent identically distributed random variables. A common statement of the birthday problem is:
Definition 3 (Birthday Problem) Consider $n$ days in a year and $k$ independent identically distributed (iid) uniform random variables whose range is $[n]$ and $n \ge k$. What is the probability $B(n, k)$ of at least one birthday-collision among these $k$ random variables?
A key question is: Over $n$ consecutive days, what is $\arg\min_k \{B(n, k) \ge \frac{1}{2}\}$ for $k$ iid uniform random variables? In other words, given $n$ days, what is the least number $k$ of iid uniform random variables so that $B(n, k) \ge \frac{1}{2}$? Solutions to this basic variation of the birthday problem are well known.
The probability $B(n, k)$ is the complement of the probability of $k$ iid uniform random variables having no birthday-collisions. If there are no birthday-collisions, then $k$ birthdays can be in $\binom{n}{k} k!$ permutations out of all possible $n^k$ mappings of the $k$ random variables onto $[n]$. In other words, the $\binom{n}{k}$ subsets of $k$ distinct elements of $[n]$ is the exact number of subsets the $k$ variables may map to without a collision. These $k$ variables may be ordered in $k!$ permutations. That is,
$$B(n, k) = 1 - \frac{\binom{n}{k} k!}{n^k}$$
for $n \ge k$ and $B(n, k) = 1$ otherwise.
Starting with $n$ and a probability $p = B(n, k)$, computing $k$ is often done using the inequality $1 - x \le e^{-x}$. In particular, the smallest $k$ giving a probability of $\frac{1}{2}$ that there is at least one birthday-collision requires $k$ to be roughly $\sqrt{2(\ln 2)n}$ or about $1.18\sqrt{n}$. See, for example, [1,35,36].
Another classical approach is to look at the random variable $X$ as the sum of all birthday-collisions of $k$ people over $n$ days; see, for example, [19,27,37,38]. A concise exposition is given in [38], which we follow. Presume the birthday of person $i \in [k]$ is given by the random variable $Y_i \in [n]$. Since a potential birthday-collision is a Bernoulli trial, $X$ is binomially distributed. Thus, $X \in \{0, 1, 2, \cdots, \binom{k}{2}\}$ where $\binom{k}{2}$ is the maximum number of potential birthday-collisions. Each of the $\binom{k}{2}$ potential birthday-collisions occurs with probability $\frac{1}{n}$, so the expected number of birthday-collisions is $\frac{1}{n}\binom{k}{2}$. If $n$ is sufficiently larger than $k$, then $X$ is approximately Poisson with mean $\binom{k}{2}/n$.
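The classical quantities in this paragraph are easy to evaluate numerically. The following is a minimal sketch, with hypothetical function names, that computes $B(n, k)$ as a product of the no-collision factors, finds the least $k$ with $B(n, k) \ge \frac{1}{2}$ by direct search, and compares it with the $\sqrt{2(\ln 2)n}$ approximation.

```python
import math


def birthday_collision_probability(n, k):
    """B(n, k): probability of at least one collision among k iid uniform draws on [n]."""
    if k > n:
        return 1.0
    no_collision = 1.0
    for i in range(k):
        # The (i+1)-th variable must avoid the i days already taken.
        no_collision *= (n - i) / n
    return 1.0 - no_collision


def smallest_k_for_half(n):
    """Least k with B(n, k) >= 1/2, found by direct search."""
    k = 1
    while birthday_collision_probability(n, k) < 0.5:
        k += 1
    return k


def expected_collisions(n, k):
    """Expected number of colliding pairs: C(k, 2) / n."""
    return math.comb(k, 2) / n


for n in (252, 365):
    k = smallest_k_for_half(n)
    approx = math.sqrt(2 * math.log(2) * n)
    print(n, k, round(approx, 2), round(expected_collisions(n, k), 3))
```

For $n = 365$ the search returns the familiar value $k = 23$, close to the approximation of about 22.5.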
In the case of the $\pm d$ birthday problem, if two random variables $X_1, X_2$ map within $d$ days of each other, then this is a $\pm d$ birthday-collision [16].
Two birthdays $x_1$ and $x_2$ of distance $|x_1 - x_2|$ demarcate a span of size $1 + |x_1 - x_2|$. For example, |4 July − 3 July| = 1, so these dates are in a $\pm d = \pm 2$ span, but not in a span of $\pm d = \pm 1$.

Definition 4 (±d Birthday Collisions)
Consider $n$ days in a year, spans of less than $\pm d$ days, and $k$ iid uniform random variables with range $[n]$. Then $B_d(n, k)$ is the probability at least two such random variables have a $\pm d$ birthday-collision. That is, these two random variables take values less than $d$ days from each other.
In $n$ days with a $\pm d$ span, $\arg\min_k \{B_d(n, k) \ge \frac{1}{2}\}$ is the least $k$ for which there is a probability of at least $\frac{1}{2}$ that at least two such random variables are fewer than $d$ days from each other.
Definition 5 (Blocks of days) Let $i : k > i > 1$. Suppose birthdays are ordered as $x_1 \le x_2 \le \cdots \le x_k$. There are no birthdays between $x_{i-1}$ and $x_i$ and there are no birthdays between $x_i$ and $x_{i+1}$.
A block of days contains a single birthday on one of its end-points. The birthday $x_i$ is associated with two blocks, one on each side of $x_i$. For example, the days between $x_1$ and $x_2$ form a block of size $|x_1 - x_2|$ since there are no birthdays between $x_1$ and $x_2$. Thus, two nearest birthday pairs contained in a span of $\pm d$ are separated by a block of size $d - 1$.
Take $k$ iid uniform random variables and consider $\pm d$ birthday-collisions over $[n]$ days. Naus [16] gives the next idea: If there are no $\pm d$ birthday-collisions, then there must be blocks of size at least $d - 1$ with no birthdays between each nearest birthday pair. This gives a total of $(k - 1)(d - 1)$ days with no birthdays in $k - 1$ contiguous blocks of at least $d - 1$ days each. Therefore, if there are no $\pm d$ birthday-collisions, then $k$ birthdays can be in $\binom{n - (k-1)(d-1)}{k} k!$ permutations out of all possible $n^k$ mappings of the $k$ random variables. Thus, to get the probability of at least one $\pm d$ birthday collision, take the complement of the probability of having no $\pm d$ birthday-collisions. The next result follows.
Theorem 1 ([16]) For $n - (k-1)(d-1) \ge k$,
$$B_d(n, k) = 1 - \frac{\binom{n - (k-1)(d-1)}{k} k!}{n^k}$$
and $B_d(n, k) = 1$ otherwise.
Using the bound 1 − x ≤ e −x on Naus' result gives k of about 0.83 n d−4 , see [16]. Also [31] approximate k to about 1.2 n 2d+1 for the cyclic version. Note, Theorem 1 with d = 1 gives the solution to the standard birthday problem of Definition 3. That is, a span of d = 1 and blocks of size d − 1 = 0.
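Naus' counting argument can be evaluated directly. The sketch below assumes the count $\binom{n-(k-1)(d-1)}{k} k!$ of collision-free mappings described above, a linear (non-cyclic) year, and the convention that a $\pm d$ collision means two birthdays fewer than $d$ days apart. The function names are hypothetical.

```python
import math


def pm_d_birthday_probability(n, k, d):
    """B_d(n, k): probability that some two of k iid uniform birthdays on [n]
    are fewer than d days apart (linear year)."""
    free_days = n - (k - 1) * (d - 1)
    if free_days < k:
        # Not enough room to keep every pair at least d days apart.
        return 1.0
    # math.perm(m, k) = C(m, k) * k!, the number of collision-free mappings.
    return 1.0 - math.perm(free_days, k) / n**k


def smallest_k_for_half(n, d):
    """Least k with B_d(n, k) >= 1/2, found by direct search."""
    k = 1
    while pm_d_birthday_probability(n, k, d) < 0.5:
        k += 1
    return k


# d = 1 recovers the standard birthday problem; larger d needs fewer variables.
for d in (1, 2, 7, 30):
    print(d, smallest_k_for_half(365, d), smallest_k_for_half(252, d))
```

Setting $d = 1$ makes every block have size $d - 1 = 0$, so the function reduces to the standard birthday probability.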
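The falling factorial is $n^{\underline{k}} = n(n-1)\cdots(n-k+1)$. In these terms, Theorem 1 may be expressed as
$$B_d(n, k) = 1 - \frac{\left(n - (k-1)(d-1)\right)^{\underline{k}}}{n^k}.$$
The next definition is based on [18,20,23].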
Lemma 2 ( [18,25]) Consider n days in a year and two sets of distinctly labeled iid random variables all with range [n]: g random variables are girls and b random variables are boys. Then B(n, b, g) is the probability that at least one girl and at least one boy have a birthday-collision and
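A boy-girl collision probability of this kind can be computed exactly. The sketch below assumes one standard closed form from the literature on the two-group birthday problem: the probability of no boy-girl collision is $\frac{1}{n^{b+g}} \sum_{i=1}^{b} \sum_{j=1}^{g} S(b, i)\, S(g, j)\, n^{\underline{i+j}}$, where $S(\cdot,\cdot)$ is a Stirling number of the second kind counting the ways boys (or girls) share birthdays among themselves. Whether this matches the exact form of the lemma is an assumption, and the function names are hypothetical.

```python
import math
from functools import lru_cache


@lru_cache(maxsize=None)
def stirling2(m, r):
    """Stirling number of the second kind: partitions of m items into r non-empty parts."""
    if m == r:
        return 1
    if r == 0 or r > m:
        return 0
    return r * stirling2(m - 1, r) + stirling2(m - 1, r - 1)


def boy_girl_collision_probability(n, b, g):
    """B(n, b, g): probability that at least one boy and one girl share a birthday."""
    if b == 0 or g == 0:
        return 0.0
    no_collision = 0
    for i in range(1, b + 1):
        for j in range(1, g + 1):
            # i distinct boy days and j distinct girl days, all i + j days different.
            no_collision += stirling2(b, i) * stirling2(g, j) * math.perm(n, i + j)
    return 1.0 - no_collision / n ** (b + g)


for n in (252, 365):
    for h in (10, 15, 16, 20):
        print(n, h, round(boy_girl_collision_probability(n, h, h), 4))
```

Here `math.perm(n, i + j)` is the falling factorial $n^{\underline{i+j}}$ and is zero when $i + j > n$.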

Wash sale Example 1: Same Day Purchase and Sale
Consider a portfolio $\Pi = \{a_1, \cdots, a_k\}$ where $a_i$, for $k \ge i \ge 1$, is asset (security) $i$ held in $\Pi$. At the end of business on day $\ell$, consider portfolio $\Pi_\ell = \{a_{1,\ell}, \cdots, a_{k,\ell}\}$, where the market value of asset $i$ in $\Pi_\ell$ is $|a_{i,\ell}|$ and the total value of $\Pi_\ell$ is $|\Pi_\ell| = \sum_{i=1}^{k} |a_{i,\ell}|$. Just before the start of each tax year, asset $i$ has market value $|a_{i,0}|$ and $\Pi$ has total market value $|\Pi_0|$. Assume each asset is sufficiently liquid so our purchases or sales do not impact its market price.
Suppose portfolio $\Pi$ has $T$ total iid uniform and random transactions during the business days of one calendar year. Assume trades are distributed on an asset-weighted basis from the initial weight of each asset in the portfolio just before the trading year commences. Thus, just prior to the first trading day and with no other information, asset $a_i$ is expected to have $t(i) = T\,|a_{i,0}| / |\Pi_0|$ trades in one year. Take $t(i)$ transactions and define the independent Rademacher random variables $\eta_1, \cdots, \eta_{t(i)}$ representing buys or sells of portions of asset class $i$ in portfolio $\Pi$. That is, the $b$ independent Rademacher random variables where $\eta_j = +1$ represent buys (boys) and the $g$ random variables where $\eta_j = -1$ represent sells (gals).
So, for example, take $c = 1$, then $|b - g| \le 1$ holds with high probability as $t(i)$ gets large. Of course, as $t(i)$ gets large, the likelihood of wash sales increases. That is, the total number of buys and sells is expected to converge to be about the same as the total number of transactions grows. However, along the way, the number of buys or sells may not be as balanced [2,40]. Select probabilities that the numbers of buys and sells are the same, given $t(i)$ total trades in asset class $a_i$, are:
Assuming the portfolio $\Pi$ already holds this single asset type, a boy-girl collision is only a necessary condition for a wash sale. A birthday collision must be accompanied by a sale at a loss and a repurchase of substantially the same security within 30 calendar days.
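The asset-weighted trade counts and the chance of exactly balanced buys and sells can be sketched as follows. The portfolio values and function names are hypothetical. The balance probability uses the fact that, for $t$ independent Rademacher variables, the number of buys is binomial with parameters $t$ and $\frac{1}{2}$, so $b = g$ has probability $\binom{t}{t/2} / 2^t$ for even $t$ and 0 for odd $t$.

```python
import math


def expected_trades(total_trades, initial_values):
    """t(i) = T * |a_{i,0}| / |Pi_0| for each asset i."""
    portfolio_value = sum(initial_values)
    return [total_trades * value / portfolio_value for value in initial_values]


def probability_buys_equal_sells(t):
    """Probability that t independent Rademacher variables split evenly into buys and sells."""
    if t % 2 == 1:
        return 0.0
    return math.comb(t, t // 2) / 2**t


# Hypothetical three-asset portfolio with initial market values 50, 30, 20 and T = 100 trades.
print(expected_trades(100, [50, 30, 20]))

for t in (2, 4, 10, 50, 100):
    print(t, round(probability_buys_equal_sells(t), 4))
```

The probability of an exact tie shrinks roughly like $1/\sqrt{t}$, even though the proportion of buys concentrates around one half.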

General Wash Sales
Necessary conditions are given here for wash sales where a purchase and sale are within $\pm d$ calendar days. These conditions are only necessary, since a $\pm d$ birthday collision does not establish that the sale is at a loss or that the portfolio is substantially the same before and after the collision.
Definition 7 (Boy-Girl $\pm d$ Birthdays) Consider $n$ days in a year, spans of $\pm d$ days, and two sets of distinctly labeled iid uniform random variables all with range $[n]$: $g$ random variables are girls and $b$ random variables are boys. Then $B_d(n, g, b)$ is the probability at least one girl and one boy are mapped to less than $d$ days of each other.
For example, starting with $n$, $d$ and $k = g + b$ and $g = b$, then $\arg\min_k \{B_d(n, \frac{k}{2}, \frac{k}{2}) \ge \frac{1}{2}\}$ gives the least $k$ so there is a probability of at least $\frac{1}{2}$ that at least one girl and one boy have a $\pm d$ birthday-collision.
Theorem 3 Consider $n$ days in a year, a span of $\pm d$ days, and two sets of distinctly labeled iid uniform random variables all with range $[n]$: $g$ random variables are girls and $b$ random variables are boys. Then $B_d(n, g, b)$ is the probability at least one girl and one boy have a $\pm d$ birthday-collision and:
Proof: This proof calculates one minus the probability of no boy-girl $\pm d$ birthday collisions. This gives the probability of at least one boy-girl $\pm d$ birthday collision. Consider $n$ days, a $\pm d$ span, and iid uniform random variables separated into $g$ (girls) random variables and $b$ (boys) random variables. Then the total number of unconstrained mappings of the $b$ and $g$ variables to $[n]$ is $n^{b+g}$, giving the denominator in front of the double sum.
The value $B_d(n, g, b)$ is not impacted if either any number of boys have the same birthday or separately any number of girls have the same birthday. Rather, $B_d(n, g, b)$ is impacted by boy-girl collisions. Therefore, consider partitions of $b$ boys and $g$ girls. To prevent the girls' partitions and boys' partitions from colliding into $\pm d$ spans of the same range, count the number of places these $i$ and $j$ non-empty partitions may be mapped so there is no $\pm d$ birthday-collision for $d > 1$. Lemma 1 gives this count. This completes the proof.

Wash sale Example 2: d = ±30 Calendar Days
Start with the same setup as the previous wash sale example from subsection 2.1. Let $h$ be half the total trades $t(i)$ in asset $i$. That is, $h \leftarrow t(i)/2$. Assuming $n \in \{252, 365\}$ trading days and $d = \pm 30$ calendar days, the probabilities of girl-boy $\pm 30$-day birthday-collisions for a single asset type are:
Consider only a single asset type. The intuition behind these probabilities is straightforward. For instance, consider $n = 365$ days. To avoid boy-girl collisions, each girl and boy must be separated by at least 30 days before and after their birthday from the other gender. So the 365 days may be broken into about six blocks of about 60 days.
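Probabilities of this kind can also be estimated by simulation. The sketch below is a Monte Carlo estimate, not the closed form of Theorem 3. It assumes a linear (non-cyclic) year, treats a boy and a girl as colliding when their days differ by fewer than $d$, and uses hypothetical function names and a fixed seed for repeatability.

```python
import bisect
import random


def has_boy_girl_collision(boys, girls, d):
    """True if some boy and some girl have birthdays fewer than d days apart."""
    girls_sorted = sorted(girls)
    for x in boys:
        # First girl at day >= x - (d - 1); a collision needs her at day <= x + (d - 1).
        idx = bisect.bisect_left(girls_sorted, x - d + 1)
        if idx < len(girls_sorted) and girls_sorted[idx] <= x + d - 1:
            return True
    return False


def estimate_collision_probability(n, b, g, d, trials=20000, seed=0):
    """Monte Carlo estimate of B_d(n, g, b) for iid uniform birthdays on 1..n."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        boys = [rng.randint(1, n) for _ in range(b)]
        girls = [rng.randint(1, n) for _ in range(g)]
        hits += has_boy_girl_collision(boys, girls, d)
    return hits / trials


# h buys and h sells of a single asset, d = 30, for both choices of n.
for n in (252, 365):
    for h in (1, 2, 3, 4, 5):
        print(n, h, estimate_collision_probability(n, h, h, 30))
```

Even a handful of buys and sells makes a $\pm 30$ collision likely, which matches the intuition of about six blocks of about 60 days.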

Wash Sale and Integral Capital Gains and Losses
Capital gains or capital losses may be rounded to the nearest integer for US tax calculations, provided all trades are rounded. Rounding drops the cents portion for gains whose cents portion is below 50 cents. Rounding adds a dollar to the dollar portion of gains whose cents portion is 50 cents or more while dropping the cents portion. Losses work the same way. Gains and losses must all be rounded or none must be rounded. So, from here on, let all gains or losses be integers.
Long term capital gains and losses are aggregated and at the same time short term capital gains and losses are aggregated. At the end of the tax year the long term and short term aggregates are added together to get the final capital gain or loss for taxation.
The focus here is capital gains or losses for capital assets that may have wash sales. Wash sales are losses, but losses may offset gains. The study of options and their associated premiums is classical [10] and we do not address it here. So, option premiums are ignored.
In a portfolio, individual capital gain values and individual capital loss values are usually distinct. Though rare, identical capital gains and capital losses are possible. Identical capital gains or losses are possible for portfolios built using options. We are ignoring option premiums. That is, asset purchases may be done via the exercise of cash-covered American-style put options. Also, asset sales may be done via the exercise of American-style covered-call options. In these cases with options that become in-the-money, a portfolio manager has no control of the asset sales or purchases or the timing of such trades. See Figure 1.
Most often, put or call option strike prices are at discrete increments. For example, many put and call equity options have strike prices in $5 or $10 increments. Suppose a portfolio is built only using the exercise of American-style options. Many asset gains and losses may be for identical amounts. Of course, this depends on the size of the underlying positions or the number of options written. Options with the same expiry on identically sized underlying assets may have very different values [10].
In such option-based portfolios assume uniform, independent, and random capital gains and capital losses. This may be modeled by the Littlewood-Offord Problem.

Definition 8 (Littlewood-Offord Problem) The integer Littlewood and Offord's problem is given an integer multi-set
Assume equal probability of gains and losses and no drift [10]. Take an integer multi-set $V = \{v_1, v_2, \cdots, v_n\}$ so that $v_i \ge 1$, $\forall i \in [n]$. The multi-set $V$ represents capital gains and capital losses. Capital gains and capital losses are all from sales. The iid Rademacher random variables $\xi_i \in \{+1, -1\}$ determine if a $v_i$ is a capital gain or loss.
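Over a tax year, the total capital gain or loss is $S_v = \sum_{i=1}^{n} \xi_i v_i$. In an optimal solution of this version of the Littlewood-Offord problem, [5] showed the $n$-element multi-set $V = \{1, 1, \cdots, 1\}$ maximizes the probability that $S_v$ takes any single value, and this probability is on the order of $1/\sqrt{n}$. The next lemma's proof follows immediately from the linearity of expectation given Rademacher random variables. See, for example, [39].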

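Lemma 3 Consider any integer multi-set $V = \{v_1, v_2, \cdots, v_n\}$ and the random variable $S_v = \sum_{i=1}^{n} \xi_i v_i$. Then $\mathbb{E}[S_v] = 0$.
For any Rademacher random variable $\xi_i$, it must be $\mathbb{E}[\xi_i] = 0$ and $\mathbb{E}[\xi_i^2] = 1$. Thus, a proof of the next theorem follows since the variance of a sum of independent random variables is the sum of the variances.
Theorem 4 For $V$ and $S_v$ as in Lemma 3, $\sigma_{S_v} = \sqrt{v_1^2 + v_2^2 + \cdots + v_n^2}$.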
Theorem 4 implies the next corollary.
Corollary 1 highlights an exceptional case where all capital gains and capital losses are the same. Wash sales require the loss and gain to be from essentially the same security.
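For small multi-sets, the moments of the signed sum can be checked by enumerating all $2^n$ sign patterns. The sketch below, with hypothetical function names and sample multi-sets, confirms a zero mean and a variance equal to the sum of the squares, and it also tabulates the sums for the exceptional case where every gain and loss is 1.

```python
from collections import Counter
from itertools import product


def signed_sums(values):
    """All 2^n sums of xi_i * v_i over sign patterns xi in {+1, -1}^n."""
    return [
        sum(sign * v for sign, v in zip(signs, values))
        for signs in product((+1, -1), repeat=len(values))
    ]


def mean_and_variance(values):
    """Mean and variance of the signed sum under uniform random signs."""
    sums = signed_sums(values)
    mean = sum(sums) / len(sums)
    variance = sum((s - mean) ** 2 for s in sums) / len(sums)
    return mean, variance


# Distinct gains and losses: mean 0 and variance 1 + 4 + 9.
print(mean_and_variance([1, 2, 3]))

# All gains and losses equal to 1: odd n never sums to 0, even n has 0 as the mode.
print(sorted(Counter(signed_sums([1, 1, 1])).items()))
print(sorted(Counter(signed_sums([1, 1, 1, 1])).items()))
```

Enumeration is exponential in $n$, so this is only a check for small examples.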
Starting from s v [1] and going to s v [2 n ] contains 2 n − 1 intervals. Since all s v [i], for i ∈ [2 n ], are different and their differences must be even so Assuming wash sales occur with the same random and uniform probability among all losses, the expected disallowed loss is 2 n −1 n . This is  Figure 2 are 0, but if V has an even number of 1s, then the most common value is 0.
The following tail bound is given by [42]: $\mathbb{P}[S_v > t\sigma_{S_v}] \le e^{-t^2/2}$, where $\|(v_1, v_2, \cdots, v_n)\|_2 = \sqrt{v_1^2 + v_2^2 + \cdots + v_n^2}$ and, by Theorem 4, $\sigma_{S_v} = \sqrt{v_1^2 + v_2^2 + \cdots + v_n^2}$.
Suppose $V = \{1, 1, \cdots, 1\}$ and $|V|$ is odd. Since no sum of $V$ is 0, there are $\frac{n}{2}$ capital gains and $\frac{n}{2}$ capital losses. This means if $S_v = t\sigma$, then there are $\frac{n}{2} + \frac{t\sigma}{2}$ capital gains and $\frac{n}{2} - \frac{t\sigma}{2}$ capital losses. Losses are necessary for wash sales. Therefore, the bound $\mathbb{P}[S_v > t\sigma_{S_v}] \le e^{-t^2/2}$ gives the probability there are at least $\frac{t\sigma_{S_v}}{2}$ more gains than losses. The term $s_v[1]$ is excluded since it has no losses, hence no wash sales.
The boy-girl $\pm 30$ birthday problem gives a necessary condition for wash sales of substantially identical securities. Recall $B_{30}(252, g, b)$ is the probability of at least one boy-girl $\pm 30$ birthday collision, so $1 - B_{30}(252, g, b)$ is the probability of no such birthday collision.
Given any number of boy-girl $\pm 30$ birthday collisions of the same security, suppose these birthday collisions produce at most a single wash sale. In this case let $G$ be a total taxable gain or loss where all gains and losses are the same. Suppose these gains and losses are all 1. This gives,

Conclusions and further directions
Wash sales may be modeled in a number of ways. These include variations of the birthday problem. The impact of wash sales on the capital gains of portfolios may be modeled using the Littlewood-Offord problem. The k-armed bandit, see for example [43] or [44], appears to apply to wash sales and the birthday problem. Robbins' discussion of maximizing expected value of sums of random variables selected from different distributions is applicable to constructing portfolios by writing options.

Acknowledgement
Thanks to Noga Alon for insightful comments.