The Impact of Stock Names on the Expected Stock Return —A Study Based on the Behavior Finance

Every investor in the market has access to stock names, making them the most widely available piece of information. However, this information is often ignored and considered insignificant in the decision process. In fact, the stock name almost always gives investors their first impression of a stock and thus, psychologically speaking, should affect the decision process. In this paper, we score (Chinese) stock names according to their meaning and their efficiency in passing information, so that we can carry out a quantitative analysis of the stock names. Theoretically, we derive the relationship between the stock name scores and the expected stock returns. Practically, we build an imaginary market-neutral portfolio and analyze its return with historical data. From both the theoretical and the practical aspects we discuss two hypotheses, the liking theory and the information theory, and we show that in the Chinese stock market the liking theory dominates, which is opposite to the result from the US stock market.


Introduction
In the framework of traditional microeconomics and finance, investors are assumed to be rational decision makers whose activities are directed by public information and by the profit and risk calculated from mathematical models. The well-known efficient market hypothesis is an example of this framework. In recent years, as behavioral finance has gained acceptance, the irrational behavior of investors has started to be taken seriously. Robert C. Shiller's [1] (2000) research shows the collective impact of a large number of irrational investors on the US stock market, and proposes that the behavior of irrational investors caused multiple violent stock market fluctuations. Allen M. Poteshman and Vitaly Serbin [2] (2003) distinguish rational from irrational investment activities. They study the stock futures market in the United States, identify many irrational investment decisions, and discuss the pricing mechanism. J. Bradford De Long et al. [3] (1990) describe the behavior of noise traders and discuss the situation in which the only market participants are irrational investors and arbitrageurs who build strategies against those irrational investors.
How to cite this paper: Song, S. and Li, R. (2019) The Impact of Stock Names on the Expected Stock Return. Journal of Mathematical Finance.
When considering stock names, it is interesting to ask whether rational investors should take them as a meaningful input when making decisions. The paper of van den Assem [4] (2018) discusses the impact of stock names on expected stock returns in detail. Two competing models, the "liking" theory and the "information" theory, are examined with US stock market data, and the conclusion supports the "information" theory, which indicates that stock names contain useful information and that this information is used by rational and sophisticated investors rather than by irrational noise traders.
In our research, we re-examine the two theories with data from the Chinese stock market, because we see several important differences between the Chinese and the US stock markets. First, the proportion of individual investors (relative to institutional investors) is much larger in China than in the US. Second, arbitrageurs' activity is limited by Chinese stock market rules, especially on the short side. Moreover, compared with English letters, the Chinese characters in stock names can potentially pass more information or suggestion to investors. All these factors make it meaningful to examine the Chinese stock market, which is a good representative of emerging markets.
Before expanding our discussion, we present an example in which Chinese stock names affected stock prices. On the trading day following the 2016 US presidential election, two stocks in China behaved unusually. "Chuan-Da-Zhi-Sheng" (002253), which is pronounced like "Trump wins" in Chinese, rose in price and hit the upper limit. Meanwhile, another stock, "Xi-Yi-Gu-Fen" (002265), which is pronounced like "Aunt Hillary", plunged and hit the lower limit. Since the two stocks have no financial connection to either presidential candidate, we can only trace the reason to the stock names and the imagination of investors.
Research on Chinese stock names is not a popular topic and needs further exploration. Some relevant papers are L. Liu [5] (2004), Y. Liu [6] (2008), G. Li [7] (2011), and Z. Chen [8] (2011). What these papers have in common is that they focus mainly on the correlation between stock names and actual returns, with little theoretical discussion of how this correlation arises.
In our paper, we apply the method of van den Assem [4] (2018) to the Chinese stock market and analyze how stock names affect expected stock returns. The results show that the Chinese stock market data support the liking theory, which indicates that stock names play a purely psychological role in the decision process of investors and that noise traders dominate in the Chinese stock market. This result is opposite to the result from the US stock market, and in the following sections we discuss the reasons in detail.
The novelty of this paper lies in a new rating system for stock names that quantifies the goodness of each stock name. An imaginary market-neutral portfolio then gives readers a clear view of the effect of stock names.

Scoring of Stock Names in Chinese
To quantify our problem, we need to score the stock names. For each stock, we set the basic score to zero and define three conditions worth one bonus point each, so every stock name has a score ranging from 0 to 3.
Condition 1: the stock name accurately describes the main business of the company.
Condition 2: the stock name contains a territory name.
Condition 3: the stock name carries auspicious connotations.
Among these three conditions, the first considers the effectiveness of information transfer. If the stock name accurately shows the main business of the company, it helps investors learn the background of the company quickly, which can contribute to the decision process. The second focuses on the territory of a stock and company, which we believe could attract the attention of some investors, especially individual investors. Notably, the word "China" is also considered a territory description that affects potential investors. Related discussion can be found in Y. Qiao's [9] (2012) paper. Treating "China" this way is consistent, because no stock name contains both "China" and another territory name, so no stock gains two bonus points from this condition. The third condition comes from the psychological aspect: some investors may have faith in auspicious words and rely partially on good luck. L. Zhang [10] (2016) studies this effect and shows that such investors do exist. Some examples of auspicious Chinese characters are "fu" (happiness), "tai" (safety), and "jin" (gold). For this reason, the management of some companies may purposely add these words to the stock names to attract investors.
The full scoreboard of stock names is shown in Appendix III.
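The scoring rule above can be sketched in a few lines of Python. This is only an illustration: the `NameFeatures` structure and its field names are hypothetical, and the three yes/no judgments are assumed to be labelled by hand for each stock name, since the text does not describe an automatic procedure.

```python
from dataclasses import dataclass


@dataclass
class NameFeatures:
    """Hand-labelled judgments for one stock name (hypothetical structure)."""
    describes_main_business: bool  # Condition 1
    contains_territory: bool       # Condition 2
    has_auspicious_meaning: bool   # Condition 3


def name_score(features: NameFeatures) -> int:
    """Return the 0-3 name score: one bonus point per satisfied condition."""
    return (int(features.describes_main_business)
            + int(features.contains_territory)
            + int(features.has_auspicious_meaning))


# Illustrative labels only; they are assumptions, not scores from Appendix III.
example = NameFeatures(describes_main_business=True,
                       contains_territory=True,
                       has_auspicious_meaning=False)
print(name_score(example))  # 2
```

Because Condition 2 is a single yes/no flag, a name cannot earn more than one point for territory, which matches the remark that no stock gains two bonus points from that condition.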

The Theories of Stock Pricing
To understand how stock names affect stock prices, we compare two theories. The first is the "liking" theory, which is discussed in multiple papers (Alter and Oppenheimer [11], 2006; Alter and Oppenheimer [12], 2008; Alter and Oppenheimer [13], 2009). The core idea is that ordinary investors like certain stock names and therefore prefer to buy the corresponding stocks without deep research, which causes overpricing and low long-term returns. The second is the "information" theory, discussed by T. Clifton Green and Russell Jame [14] (2013). In this view, a good stock name indicates good operating skill on the part of the management, and thus the long-term growth of the company is promising. However, this piece of information is captured only by sophisticated investors, while ordinary investors tend to overlook it for lack of research. The corresponding stocks then combine good fundamentals with a "low-key" position, generating good returns. The two theories reach completely opposite conclusions. An earlier paper [4] (van den Assem et al., 2018) uses data from the US stock market to check both theories and concludes that the "information" theory matches reality. In our paper, we examine the two theories with Chinese stock market data, as this market is representative of emerging markets and differs from the US market in many important aspects.
Now we briefly go through the mathematical model described in [4] (van den Assem et al., 2018). Consider an economy in which a company operates one and only one risky project, with a one-time income f at the end of the project. Assume there are only two types of investors in the market who can invest in this company. One type is the arbitrageurs, sophisticated and rational investors who make investment decisions based on analysis of the company's fundamental data. The other type is ordinary investors, sometimes called noise traders, who are heavily affected by personal preference and emotion and do not have a good understanding of the public information. These are irrational investors. We assume the proportion of noise traders in the market is λ.
Using the method proposed by Hong and Sraer [15] (2013), the investors make decisions by maximizing the following function.
In this function, $n_{ij}$ is the volume of stock j that investor i trades (either long or short), $p_j$ is the price of stock j, the value term is investor i's subjective estimate of the stock value, and γ is the trading cost. Here we assume the discount rate is zero, which does not affect our discussion. The investor index can be either i = A (arbitrageur) or i = N (noise trader), and the stock index can be either j = H (high-score stock names) or j = L (low-score stock names).
Taking the first-order derivative, we obtain the equation for the equilibrium price, shown as Equation (1).
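The objective function and Equation (1) are not reproduced in this text. As a hedged sketch only, one quadratic-cost form consistent with the definitions above is given below. The symbol $v_{ij}$ for investor i's subjective value of stock j, the quadratic shape of the cost term, and the market-clearing condition with q shares outstanding are all assumptions made for illustration, not the authors' stated formulas.

```latex
% Assumed objective: subjective profit minus a quadratic trading cost.
\max_{n_{ij}} \; n_{ij}\,(v_{ij} - p_j) - \frac{\gamma}{2}\, n_{ij}^{2}

% First-order condition with respect to n_{ij}:
v_{ij} - p_j - \gamma\, n_{ij} = 0
\quad\Longrightarrow\quad
n_{ij} = \frac{v_{ij} - p_j}{\gamma}

% Assumed market clearing: \lambda\, n_{Nj} + (1-\lambda)\, n_{Aj} = q, which gives
p_j = \lambda\, v_{Nj} + (1-\lambda)\, v_{Aj} - \gamma\, q \tag{1}
```

Under these assumptions the equilibrium price is a λ-weighted average of the two investor types' value estimates, minus a term that depends only on the trading cost and the supply of shares.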
Now we discuss the liking theory and the information theory respectively.
In the liking theory, there is no correlation between stock names and the companies' operations, but a "good" (high-score) stock name can attract many irrational investors. Those noise traders believe that a high-score stock name means a higher project income, $f + b_H$, and that a low-score stock name means a lower project income, $f - b_L$. This variation in income is purely the subjective judgment of these investors, without any rational foundation, so the real project income is still f, which the rational investors (arbitrageurs) estimate correctly.
Recall that the proportion of irrational investors is λ. Then for a high-score stock name, the stock value estimate is a λ-weighted average of the noise traders' estimate and the arbitrageurs' estimate, and similarly for a low-score stock name. Assuming the number of shares outstanding is q and using Formula (1), we obtain Formulas (2) and (3). Formula (2) gives the difference in equilibrium price between high-score and low-score stocks: under the liking theory, the equilibrium price of high-score stocks is higher than that of low-score stocks, and the difference is proportional to the number of noise traders.
Formula (3) gives the difference in expected stock return: the expected return of the high-score stocks is lower, because they are overpriced. The difference in returns is also proportional to the number of noise traders.
Generally speaking, the irrational demand from noise traders pushes up the trading price of the high-score stocks, while the real value of the companies does not change. This in turn leads to a lower expected stock return.
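Formulas (2) and (3) are not legible in this text. The following sketch shows one way the stated conclusions can be obtained, assuming an equilibrium price of the form $p_j = \lambda v_{Nj} + (1-\lambda) v_{Aj} - \gamma q$, where $v_{Nj}$ and $v_{Aj}$ are hypothetical symbols for the noise traders' and arbitrageurs' value estimates. The bias symbols $b_H$ and $b_L$ and the definition of expected return as true income minus price are likewise assumptions.

```latex
% Value estimates under the liking theory (assumed notation):
v_{NH} = f + b_H, \qquad v_{NL} = f - b_L, \qquad v_{AH} = v_{AL} = f

% Equilibrium prices from the assumed pricing equation:
p_H = f + \lambda\, b_H - \gamma\, q, \qquad
p_L = f - \lambda\, b_L - \gamma\, q

% Price difference, proportional to \lambda:
p_H - p_L = \lambda\,(b_H + b_L) > 0 \tag{2}

% Expected-return difference (true income f minus price):
(f - p_H) - (f - p_L) = -\lambda\,(b_H + b_L) < 0 \tag{3}
```

Both differences scale linearly with λ, which is the sense in which the text calls them proportional to the number of noise traders.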
Under the information theory, the stock name reflects the ability of the company's management, because a good manager would deliberately select a good stock name to attract the attention of investors. This piece of information can be uncovered only by sophisticated investors, who research the company in depth. Those investors (arbitrageurs) realize that companies with good names could have a high income from the operating project because of the good management. We label the higher income as φf, where φ > 1. On the other hand, the noise traders do not recognize the information and still treat all stocks equally. Assuming that companies with good stock names (and thus good management) make up a proportion π of the market, the average value of all stocks in the market is $\pi \varphi f + (1 - \pi) f$, which is also the subjective value estimate of the noise traders. Meanwhile, the arbitrageurs correctly estimate the value of good-name stocks to be φf and that of other stocks to be f. Again we assume the proportion of noise traders is λ.
Now we can use Formula (1) to get the equilibrium price and the expected return of stocks with either high-score or low-score names.
In Formula (4), we see that the equilibrium price of high-score stocks is higher than that of low-score stocks, as in the liking theory. However, from Formula (5) we find that the expected stock return for high-score stocks is now also higher than that of the low-score stocks. This is because, in the information theory, the companies with good stock names indeed have a better future income (φ > 1). The difference in returns is proportional to the number of noise traders.
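Formulas (4) and (5) are likewise missing from this text. A sketch consistent with the description above, again assuming the pricing equation $p_j = \lambda v_{Nj} + (1-\lambda) v_{Aj} - \gamma q$ with hypothetical value-estimate symbols $v_{Nj}$ and $v_{Aj}$, and taking expected return as true income minus price:

```latex
% Noise traders value every stock at the market average (assumed notation):
v_{NH} = v_{NL} = \pi \varphi f + (1-\pi) f \equiv \bar{f}

% Arbitrageurs value the two groups correctly:
v_{AH} = \varphi f, \qquad v_{AL} = f

% Equilibrium prices:
p_H = \lambda \bar{f} + (1-\lambda)\,\varphi f - \gamma q, \qquad
p_L = \lambda \bar{f} + (1-\lambda)\, f - \gamma q

% Price difference:
p_H - p_L = (1-\lambda)(\varphi - 1)\, f > 0 \tag{4}

% Expected-return difference (true income minus price):
(\varphi f - p_H) - (f - p_L)
  = (\varphi - 1) f - (1-\lambda)(\varphi - 1) f
  = \lambda\,(\varphi - 1)\, f > 0 \tag{5}
```

In this sketch the return difference is positive and linear in λ, which agrees with the text: noise traders underprice good-name stocks, and the more of them there are, the larger the return left for arbitrageurs.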
As a summary of this section and as a direction of our data analysis, we write three hypotheses here for verification.
Hypothesis 1: If the liking theory works, then the stock returns are negatively related to the stock name scores; if the information theory is correct, then the stock returns are positively related to the stock name scores.
Hypothesis 2: The correlation between the stock names and the stock returns increases as the number of noise traders increases.
Interestingly, the first hypothesis tries to show the difference of the two theories, while the second hypothesis shows a similarity of the two theories.
In [4] (van den Assem et al., 2018), the authors conclude from US stock market data that the information theory prevails. To verify whether the assumption behind the information theory is indeed correct, we add a third hypothesis.
Hypothesis 3: The score of a stock name is positively correlated with the value of the company.

Data Analysis
In this paper, all stock price and volume data come from the RESSET dataset, and all fundamental data come from the Compustat database of S&P Global. In our data analysis, we refresh the fundamental data on a yearly basis.
Specifically, we use the component stocks of the (China) CSI300 index and choose the period from 2011 to 2013. The reasons for this selection are as follows. First, the component stocks of the CSI300 index have good volatility. Second, the three years from 2011 to 2013 were a period in which the Chinese economy developed steadily; the impact of the 2008 financial crisis had almost disappeared, which means a stable and healthy macroeconomic environment. Third, CSI300 index futures started trading in April 2010, so we choose data after that date.
Imagine an arbitrage portfolio in which we long the high-score stocks and short the low-score stocks. We call it a name-arbitrage portfolio, as it tries to profit from the difference in stock name scores. The net position is zero, and the money is evenly distributed among all stocks. We observe the monthly returns of this portfolio and obtain the risk-adjusted return by subtracting the risk-free rate, $\tilde{r}_t = r_t - r_0$. The analysis is done with real historical data from Jan 2011 to Dec 2013, and we decompose the risk-adjusted returns in the following three ways.
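As a minimal sketch of the name-arbitrage portfolio for a single month, the function below takes equal-weighted long and short legs and subtracts the risk-free rate. The function name, the dictionary inputs, and the score cut-offs (`high=2`, `low=0`) are assumptions for illustration; the tickers and numbers in the example are made up.

```python
from statistics import mean


def name_arbitrage_excess_return(returns, scores, risk_free, high=2, low=0):
    """Monthly risk-adjusted return of an equal-weighted long/short portfolio.

    returns   : dict ticker -> stock return for the month (decimal)
    scores    : dict ticker -> stock name score
    risk_free : risk-free rate for the same month (decimal)
    high, low : assumed score cut-offs for the long and the short leg
    """
    long_leg = [returns[t] for t in returns if scores[t] >= high]
    short_leg = [returns[t] for t in returns if scores[t] <= low]
    if not long_leg or not short_leg:
        raise ValueError("both legs need at least one stock")
    # Equal weights within each leg; long minus short gives a zero net position.
    portfolio_return = mean(long_leg) - mean(short_leg)
    return portfolio_return - risk_free


# Made-up data for four hypothetical tickers.
returns = {"A": 0.02, "B": 0.04, "C": 0.01, "D": -0.01}
scores = {"A": 2, "B": 3, "C": 0, "D": 1}
excess = name_arbitrage_excess_return(returns, scores, risk_free=0.002)
print(round(excess, 6))  # 0.018
```

Stocks with a middle score (here ticker "D") are left out of both legs under these assumed cut-offs.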
First we use the simple CAPM model, Equation (6). In Equation (6), $MKT_t$ is the risk-adjusted market return; in practice we use the average return of all stocks in the China A-share market minus the risk-free rate. $\varepsilon_t$ is the regression error term.
We also use the more detailed Carhart [16] (1997) four-factor model, Equation (7).
Again $MKT_t$ is the market return. SMB (small minus big) is the extra return of small companies, HML (high minus low) is the extra return of high-book-value companies, and UMD is the momentum return of the portfolio. This model decomposes the total return into four factors, and the residual return can be attributed to our name-arbitrage portfolio.
Finally, we use the Fama-French [17] (2015) five-factor model, Equation (8). The market, SMB, and HML factors are the same as in Equation (7). RMW is the extra return due to the company's profitability, and CMA is the extra return due to the company's investment activity.
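Equations (6) to (8) do not appear in this text. The standard regression forms that match the factor definitions above are sketched below; the coefficient symbols $\beta$ and $\beta_k$ are assumed names, while $\alpha$, $MKT_t$, and $\varepsilon_t$ follow the surrounding discussion.

```latex
\begin{align}
\tilde{r}_t &= \alpha + \beta\, MKT_t + \varepsilon_t \tag{6} \\
\tilde{r}_t &= \alpha + \beta_1 MKT_t + \beta_2 SMB_t + \beta_3 HML_t
              + \beta_4 UMD_t + \varepsilon_t \tag{7} \\
\tilde{r}_t &= \alpha + \beta_1 MKT_t + \beta_2 SMB_t + \beta_3 HML_t
              + \beta_4 RMW_t + \beta_5 CMA_t + \varepsilon_t \tag{8}
\end{align}
```

In each case the intercept $\alpha$ is the part of the name-arbitrage return that the listed factors do not explain.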
In practice, we calculate the factor returns by building the corresponding portfolios. For example, to get the data for SMB, we rank the CSI300 component stocks by assets and then long the smallest 30% and short the largest 30%, which gives us a portfolio that arbitrages on the small-company preference. We then use historical data to get the monthly risk-adjusted return of this portfolio and take it as the SMB factor return. Similarly, we get the HML factor returns by ranking on the asset/cap ratio, and CMA by ranking on the liability/asset ratio. The UMD (momentum) factor comes from the past-year performance of the name-arbitrage portfolio.
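The ranking procedure for a factor such as SMB can be sketched as follows. The function name and inputs are hypothetical, equal weighting within each group is an assumption, and the ten tickers with their values are made-up illustration data.

```python
from statistics import mean


def long_short_factor_return(characteristic, returns, fraction=0.3):
    """Long the bottom `fraction` of stocks by a characteristic, short the top.

    characteristic : dict ticker -> ranking variable (e.g. total assets)
    returns        : dict ticker -> stock return for the month (decimal)
    """
    ranked = sorted(characteristic, key=characteristic.get)  # ascending order
    k = round(len(ranked) * fraction)
    if k == 0:
        raise ValueError("not enough stocks to form the two groups")
    bottom_group = ranked[:k]   # e.g. the smallest companies
    top_group = ranked[-k:]     # e.g. the largest companies
    long_return = mean([returns[t] for t in bottom_group])
    short_return = mean([returns[t] for t in top_group])
    return long_return - short_return


# Made-up example: ten hypothetical stocks, assets increase with the index.
assets = {f"S{i}": 100.0 * (i + 1) for i in range(10)}
month_returns = {f"S{i}": 0.01 * i for i in range(10)}
smb = long_short_factor_return(assets, month_returns)
print(round(smb, 6))  # -0.07
```

For a factor where the long leg should hold the high-ratio stocks instead, the sign of the result would simply be flipped.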

The Test of Hypothesis 1
Under Hypothesis 1, α is the extra return brought by the name-arbitrage portfolio. If the liking theory is correct, we should have α < 0, and if the information theory is correct, we should have α > 0. We run the regressions on the monthly data between 2011 and 2013 according to Formulas (6), (7), and (8), respectively.
The results are summarized in Table 2.
From the results we see that the α values are negative in all three regression models. The CAPM model is simple, as it separates only the market factor, and the monthly arbitrage return of our portfolio is −0.563%, or about −6.55% annually when compounded. The latter two models separate multiple factors, and the monthly returns attributable to the name-arbitrage portfolio are still −0.307% and −0.351%, respectively. According to the t-statistics, the three regression results are significant at the 2.5% significance level. The negative α values support the liking theory.
Notably, the total return of the name-arbitrage portfolio between 2011 and 2013 is positive, but this positive return comes from the market and the fundamental factors, while the arbitrage activity on stock names gives negative returns. This is shown in more detail in Appendix II.
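For the one-factor case, the α estimate and its t-statistic can be computed with ordinary least squares in plain Python. This is a sketch, not the authors' code: the function names are hypothetical and the five data points are synthetic. The second helper compounds a monthly return over twelve months, so a monthly α can be converted to an annual figure.

```python
import math
from statistics import mean


def capm_alpha(portfolio_excess, market_excess):
    """One-factor OLS: portfolio_excess = alpha + beta * market_excess + error.

    Returns (alpha, beta, t_alpha), where t_alpha is the t-statistic of alpha.
    """
    n = len(portfolio_excess)
    if n != len(market_excess) or n < 3:
        raise ValueError("need two equally long series with at least 3 points")
    x_bar, y_bar = mean(market_excess), mean(portfolio_excess)
    sxx = sum((x - x_bar) ** 2 for x in market_excess)
    sxy = sum((x - x_bar) * (y - y_bar)
              for x, y in zip(market_excess, portfolio_excess))
    beta = sxy / sxx
    alpha = y_bar - beta * x_bar
    # Residual variance uses n - 2 degrees of freedom.
    sse = sum((y - alpha - beta * x) ** 2
              for x, y in zip(market_excess, portfolio_excess))
    se_alpha = math.sqrt(sse / (n - 2) * (1.0 / n + x_bar ** 2 / sxx))
    return alpha, beta, alpha / se_alpha


def annualize(monthly_return):
    """Compound a monthly return over twelve months."""
    return (1.0 + monthly_return) ** 12 - 1.0


# Synthetic data: true alpha = -0.005, true beta = 0.5, small noise.
market = [-0.02, -0.01, 0.0, 0.01, 0.02]
noise = [0.001, -0.002, 0.002, -0.002, 0.001]
portfolio = [-0.005 + 0.5 * x + e for x, e in zip(market, noise)]
alpha, beta, t_alpha = capm_alpha(portfolio, market)
print(round(alpha, 6), round(beta, 6))   # -0.005 0.5
print(round(annualize(-0.00563), 4))     # -0.0655
```

The multi-factor models need a matrix solve rather than these closed-form expressions, so they are outside this small sketch.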

The Test of Hypothesis 2
Under the second hypothesis, in both the liking theory and the information theory, the price deviation caused by stock names becomes more significant with a higher number of noise traders. To verify this hypothesis, we apply the method described by Baker and Wurgler [18] (2006) and classify the stocks by two ratios, asset/price and revenue/price, which we believe are interesting to individual investors and more likely to attract the attention of noise traders. We rank the stocks according to each of the two ratios and divide each ranking into three subgroups: the upper 30%, the middle 40%, and the lower 30%. This gives six subgroups in total. We build a name-arbitrage portfolio for each of the six subsamples and repeat the regressions with the CAPM, Carhart-four-factor, and Fama-five-factor models. The results are summarized in Table 3.
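The 30%-40%-30% split on one ratio can be sketched as below. The function name and the dictionary input are hypothetical, and the ten tickers and ratio values are made up.

```python
def split_30_40_30(ratio_by_stock):
    """Split tickers into lower 30%, middle 40%, and upper 30% by a ratio."""
    ranked = sorted(ratio_by_stock, key=ratio_by_stock.get)  # ascending order
    n = len(ranked)
    k = round(n * 0.3)
    return {
        "lower": ranked[:k],
        "middle": ranked[k:n - k],
        "upper": ranked[n - k:],
    }


# Made-up asset/price ratios for ten hypothetical tickers.
ratios = {f"S{i}": 0.1 * (i + 1) for i in range(10)}
groups = split_30_40_30(ratios)
print(groups["upper"])  # ['S7', 'S8', 'S9']
```

Running the same split once per ratio gives the two sets of three subgroups, six in total, on which the name-arbitrage portfolios are then rebuilt.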
We make the following observations from Table 3.
1) All the α values in the regressions (six subgroups, three models each) are negative, which again supports the liking theory.
2) The CAPM models give significant α values in all the subsamples at the 1% significance level. This is similar to the observation from the big sample. However, we should also keep in mind that the CAPM model is too simple to effectively separate different risk factors.
3) In the Carhart-four-factor and Fama-five-factor models, only the two "upper 30%" subgroups have significant α values, while the other four subgroups do not. This result agrees with the second hypothesis, as the "upper 30%" subgroups correspond to the stocks with more noise trader attention. In these two subgroups, the impact of the name-arbitrage strategy is more obvious.

More Details on the Regression Results
In the previous sections we focused on the α values, because we mostly care about the extra returns brought by the stock names. The goodness of fit of the regression models is also important, as we want to know whether it is reasonable to use each model for the analysis. We summarize the results for the total sample in Table 4.
From Table 4 we see that the simple CAPM model does not explain the returns of the name-arbitrage portfolio very well, as the F-statistic is not significant and the $R^2$ is low. This is why we rely mainly on the Carhart-four-factor and Fama-five-factor models in our previous discussions. With these two models, the F-tests are significant at the 0.1% level, and the $R^2$ values are as high as 68%.

The Test of Hypothesis 3
The third hypothesis points to a fundamental question: is the stock name related to the value of the company? This test works as an independent check on the liking theory and the information theory.
We use cross-sectional data as in the paper of Mueller, Ouimet and Simintzi [19] (2017), and run a regression analysis on Equation (9).
In Equation (9), $V_i$ is Tobin's q ratio, proposed by James Tobin in 1969 to measure the value of a company. $F_i$ is a dummy variable that takes the value 0 for low-score stocks (name score = 0) and 1 for high-score stocks (name score = 2). $Z_i$ is the logarithm of the company's revenue. The index i runs over the component stocks of the CSI300 index whose stock name score is either 0 or 2. If Hypothesis 3 is correct, we should have a positive coefficient β.
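Equation (9) itself is not legible in this text. A linear cross-sectional form consistent with the variables just defined would be the following; the coefficient symbol $\delta$ on the control variable and the error term $\varepsilon_i$ are assumed names.

```latex
V_i = \alpha + \beta\, F_i + \delta\, Z_i + \varepsilon_i \tag{9}
```

In this form $\beta$ measures the difference in Tobin's q between high-score and low-score stocks after controlling for company size through $Z_i$.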
A company's assets, revenue, and market value change every year, but stock names do not change very frequently. For consistency, we run the regression separately for each year from 2011 to 2013, and the results are shown in Table 5.
From Table 5, we observe that none of the β values is significantly larger than zero, and the intercept (α) has a large absolute value. This result does not support Hypothesis 3.

Further Discussion about the Stock Name Scores
If a stock name contains both a location and the business, it is more likely to evoke a sense of familiarity, which is not a rational feeling but may play a role in the decision process. Take the stock "jiang-xi-tong-ye" (600362) as an example: the first two characters, "jiang-xi", are a province in China, and the last two, "tong-ye", mean copper mining. Individual investors cannot look through all the stocks in the market, so they tend to spend more time on stocks that feel familiar. An investor from Jiangxi province would likely take a few moments to look at the stock "jiang-xi-tong-ye" because of the familiar location. The word "tong-ye" (copper mining) in the stock name further conveys the company's main business and may trigger associations for investors, who may start to think about recent news relevant to mining. At this point, the stock has already attracted some attention from individual investors.
We discuss this example to show one mechanism by which a stock name can affect an investor's decision process. In contrast, a stock with zero points fails to pass on any useful information and leaves little first impression on investors.

The Duel between the Liking Theory and the Information Theory
It should be noted that the above discussion of stock names relates only to investors' psychological state, which leads to certain behavioral finance outcomes. However, the fact that investors (noise traders) prefer certain stocks does not increase the actual company values. On the contrary, the preference causes overpricing and a lower expected return. From the results in Table 2, we see clearly that when we long the high-score stocks and short the low-score stocks, the resulting α is negative. This tells us that in the Chinese stock market, the liking theory is the one that matches reality. Stock names affect noise traders and skew their judgment of stock values, so that stocks with good names are overpriced and stocks with ordinary names are less popular.
In principle, there should be an arbitrage opportunity here for sophisticated investors, who could run the reverse strategy by going long the low-score stocks and shorting the high-score stocks. However, short selling is strictly limited in China, so this arbitrage opportunity persists and consistently gives us a negative α value.
Previous research concludes that the information theory is supported by US stock market data [4] (van den Assem, 2018), but our analysis of Chinese stock market data supports the liking theory. One of the fundamental differences between the two markets is that liquidity is much better in the United States than in China, especially for short sales, so arbitrageurs can promptly react to any arbitrage opportunity and bring the stock price to an equilibrium point. Another major difference is that in the United States almost 90% of investors are institutions, according to data from the US Securities and Exchange Commission, and institutions are sophisticated and rational decision makers. In the Chinese stock market, by contrast, the percentage of individual investors is as high as 40%, and they play the role of noise traders.
In Hypothesis 2, we assume that the difference in return between stocks with high-score names and stocks with low-score names is proportional to the number of noise traders. To verify this assumption, we select some fundamental items of companies as indicators of noise trader attention. Specifically, we use the asset/price and revenue/price ratios. The logic is that these two fundamental items are the ones most commonly discussed by both institutional and individual investors. Stocks with good fundamental data attract much more attention than those with bad fundamental performance, so these two ratios are good representatives of investors' attention. Moreover, institutional investors have access to multiple data sources, and their decisions are not based solely on these two fundamental items. Individual investors, on the other hand, have very limited information sources, so they rely more on the common fundamental items that are easy to obtain. As a result, the asset/price and revenue/price ratios attract more attention from individual investors than from institutional investors, and thus they can work as a separator for the two types of investors. In practice, we rank the CSI300 component stocks and divide the sample into 30%-40%-30% segments, where the upper 30% has a higher proportion of individual (noise) investors.
The results shown in Table 3 clearly support the second hypothesis. We further notice that only the subsamples with the higher noise trader ratio (upper 30%) have a statistically significant value of α, which is negative and therefore supports the liking theory. When we compare the results in Table 3 with those in Table 2, we find that the p values in the total sample are 0.0159 (four-factor model) and 0.00390 (five-factor model), while in the two upper 30% subsamples the corresponding numbers are 9.52e−5 (four-factor model) and 0.000257 (five-factor model) for one, and 0.000288 (four-factor model) and 1.14e−5 (five-factor model) for the other. The level of significance is clearly higher in the subsamples than in the total sample. This comparison is not mathematically rigorous, but it does help us see the role of noise traders in the impact of stock names.

Explanation of Hypothesis 3
In Hypothesis 3, we investigate the correlation between the real value of a company and its stock name. The so-called "real value of a company" is the expected one-time income from the risky project in our model, which is eventually the return to the investors. A correlation between the stock name and the company value would not mean that the stock name changes the company's value. Instead, the name would be just an indicator of the company value, and the information theory assumes that arbitrageurs are able to extract useful information about the company from its stock name.
In our data analysis, we use Tobin's q to measure the company value. However, the results show no evidence of a correlation between Tobin's q and the stock names. This result again opposes the information theory and supports the liking theory.

Conclusions
This paper analyzes the component stocks of the CSI300 index in the Chinese stock market and studies the relation between stock names and risk-adjusted returns. We compare two candidate theories, the liking theory and the information theory. The liking theory holds that stock names affect stock returns in a purely psychological way and cause overpricing. The information theory holds that stock names reflect true information about the companies' management, so that good stock names are related to good company value. The two theories give opposite predictions for stock returns, but both predict that the impact of stock names gets stronger with more participation by noise traders.
The novel contribution of this paper is a scoring system that rates every stock in the CSI300 index, together with a market-neutral strategy based on the scores. We further apply three different models (CAPM, Carhart-four-factor, and Fama-five-factor) to decompose the total portfolio return, and show that stocks with high name scores have statistically lower risk-adjusted returns. The conclusions of our data analysis are summarized below:
1) The liking theory matches the reality of the Chinese stock market, in which irrational investors can be attracted by good stock names and therefore make emotional decisions that push the stocks to be overpriced.
2) Noise traders are the key factor in the price deviation caused by stock names. A high number of noise traders can magnify the impact of stock names.
3) In China, the proportion of individual investors is much higher than in the United States, and individual investors usually take the role of noise traders, so the liking theory works for the Chinese market, while other research shows that the information theory holds for the US market.
Figure A1 shows the monthly risk-adjusted returns of each portfolio and factor, including the name-arbitrage portfolio, MKT, SMB, HML, CMA, and RMW. The figure is plotted by imagining 10 million dollars invested in one of the portfolios and calculating the monthly profit and loss (after subtracting the risk-free return).
We first notice that the name-arbitrage portfolio has a lower risk-adjusted return than the market portfolio (MKT), which gives us a negative α in the CAPM model.
We further notice that the SMB and RMW portfolios both have higher returns than the name-arbitrage portfolio, again indicating a negative α. Interestingly, the only positive risk-adjusted return comes from the small-cap preference strategy (SMB), while the high-book-value strategy and the high-profit-rate strategy cannot beat the risk-free rate.