## Microsoft Word - Stock comovement_2015_01_28_Anonymous.docx

### BibTeX

@MISC{Greenwood_microsoftword,

author = {R Greenwood and S Grossman and J Stiglitz},

title = {Microsoft Word - Stock comovement_2015_01_28_Anonymous.docx},

year = {}

}

### Abstract

We study the comovement of asset returns caused by communication among investors. We develop an equilibrium model of investor communication and trading and derive a number of testable predictions. We use a novel dataset on an active online stock forum in China to measure investor communication. For each stock, we consider its "related stocks" that are frequently discussed on the sub-forum dedicated to the given stock. We find that there is substantial excess comovement among the returns of a stock and its related stocks. Excess comovement is greater when related stocks are more frequently discussed. Furthermore, the effect of frequent communication on excess comovement is stronger for stocks with higher information asymmetry.

#### Introduction

One fundamental question in financial economics is how asset prices are determined. In the rational expectations paradigm, price changes reflect changes in fundamental values. However, the empirical literature documents that there can be excess comovement in stock prices that is difficult to explain by fundamentals.[1] Understanding the source and extent of excess comovement can shed light on the structure of asset prices and facilitate the design of portfolio management strategies.

[1] See, for example, Lee, Shleifer, and Vishny (1991), Pindyck and Rotemberg (1993), and ...

In this paper, we study whether investors' communication can generate comovement of stock returns. In particular, we directly measure investor communication using a novel dataset of online stock forums in China. We document substantial excess comovement among stocks that are discussed together by investors on online forums and study factors that influence such comovement. To motivate our empirical tests, we develop a Grossman-Stiglitz-type (1980) model.

The model also predicts that excess comovement in asset returns is positively related to the frequency with which investors communicate before trading.
Intuitively, more frequent communication leads to greater dependence of investors' beliefs on common signals and thus greater comovement. Further, the model predicts that the effect of communication on excess comovement is more pronounced when investors have less accurate beliefs, i.e., for stocks associated with greater information asymmetry. The intuition is that for stocks subject to greater information asymmetry, communication among investors exerts a greater influence on investors' beliefs.

We test these predictions using a unique dataset from one of the most active online stock forums in China. The Chinese stock market provides an ideal environment to study investor behavior. Established in the 1990s, the modern Chinese stock market has been developing rapidly but still suffers from a number of issues, such as the irrationality and immaturity of individual investors (e.g., Xu (2001) and ...).

For any given stock, there is a sub-forum of the online forum devoted to discussion about it. We will refer to the stock that the sub-forum focuses on as the target stock of the sub-forum. Investors are also free to discuss other stocks in a sub-forum. Based on the model, we expect the returns of stocks discussed together to exhibit excess comovement. To test this hypothesis, for any target stock, we consider the most frequently discussed stocks (henceforth "most related stocks") on its sub-forum. We construct a related portfolio that consists of the five stocks most related to a target stock in each month. We then estimate regressions of target stock returns on the returns of their related portfolios to examine the correlation between these returns. We find that the correlation between a stock's and its related portfolio's returns is highly significant, even after controlling for market returns and industry returns, suggesting that there is excess comovement among these returns.
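
The related-portfolio regression described above can be sketched as follows. This is a minimal illustration, not the authors' code: the return series are made-up numbers, the portfolio is assumed to be equal-weighted, only a market control is included, and the helper names (`solve_linear`, `ols`, `related_portfolio_return`) are hypothetical.

```python
from statistics import mean

def solve_linear(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(y, regressors):
    """Return OLS coefficients [intercept, b1, b2, ...] via the normal equations."""
    rows = [[1.0] + list(r) for r in zip(*regressors)]
    k = len(rows[0])
    xtx = [[sum(row[i] * row[j] for row in rows) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(rows, y)) for i in range(k)]
    return solve_linear(xtx, xty)

def related_portfolio_return(related_returns):
    """Equal-weighted daily return across related stocks (one list per stock)."""
    return [mean(day) for day in zip(*related_returns)]

# Hypothetical daily returns for three related stocks and the market.
related = [
    [0.010, -0.020, 0.015, 0.000, 0.005, -0.010],
    [0.020, -0.010, 0.005, 0.010, -0.005, 0.000],
    [0.000, -0.030, 0.010, 0.020, 0.010, -0.020],
]
market = [0.005, -0.010, 0.002, 0.004, 0.001, -0.003]
mrr = related_portfolio_return(related)

# Build the target series with known coefficients so the fit can be checked.
target = [0.001 + 0.2 * r + 0.9 * m for r, m in zip(mrr, market)]

intercept, b_related, b_market = ols(target, [mrr, market])
print(round(b_related, 6), round(b_market, 6))
```

Here `b_related` plays the role of the excess-comovement coefficient: the loading on the related-portfolio return that remains after the market return is controlled for.
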
The excess comovement is also economically significant; for example, a 1% increase in the related portfolio return is associated with a 0.2% increase in the target stock return.

To address the concern that the correlation may be spuriously generated by a temporal trend or comovement among industries, we conduct the following falsification test. We first create for each target stock a "placebo" portfolio that consists of several placebo stocks randomly selected from the industries of the related stocks. We then estimate the same regressions, replacing the returns of the related portfolios with those of the placebo portfolios. We find the coefficients on the placebo portfolio returns in these regressions to be insignificant, suggesting that the excess comovement we find is unlikely to be caused by temporal or industry factors.

We next examine the prediction on the relation between the frequency of communication and stock comovement. We create a proxy variable for communication frequency by computing the number of investors' posts about the top related stocks in the sub-forum for a target stock. We then include the frequency and its interaction with the related stock portfolio return as independent variables in the regressions of the target stock returns. We find that more frequent communication leads to higher excess comovement between the returns of the target stock and its related stocks, consistent with the model's prediction.

We then examine the prediction that the effect of communication on return comovement is greater for stocks associated with greater asymmetric information. We use three proxy variables for the information asymmetry of stocks: stock illiquidity, market capitalization, and analyst coverage. We divide our sample of stocks into five quintile groups according to each of the information asymmetry variables and conduct our regressions separately for each group.
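
The quintile split described above can be illustrated with a short sketch. It assumes a simple rank-based rule with equal-sized groups; the function name `assign_quintiles` and the sample capitalizations are hypothetical, and the paper does not say how ties or uneven group sizes are handled.

```python
from collections import Counter

def assign_quintiles(values):
    """Map each key to a quintile 1..5 by the rank of its value (1 = lowest)."""
    ranked = sorted(values, key=values.get)
    n = len(ranked)
    return {key: (rank * 5) // n + 1 for rank, key in enumerate(ranked)}

# Hypothetical market capitalizations for ten stocks.
caps = {f"S{i:02d}": float(i) for i in range(1, 11)}
groups = assign_quintiles(caps)
sizes = Counter(groups.values())
print(groups["S01"], groups["S10"], dict(sizes))
```

The same function could be applied to any of the three proxies (illiquidity, market capitalization, analyst coverage), with the regressions then run within each group.
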
We find that for more illiquid, smaller, and less covered stocks, the frequency of forum discussion has a greater effect on excess comovement among stocks, consistent with the theoretical prediction.

We conduct a number of robustness tests. First, we carry out a time-series robustness test by conducting our tests separately for two equal sub-periods of our sample period. Our results continue to hold for each of the two sub-periods. Second, we include a number of industry, market, and macroeconomic variables in our regressions and find our results to be robust. Third, we use the number of clicks the posts receive (instead of the number of posts) to proxy for the frequency of investor communication and to define the portfolio of related stocks. We obtain similar results. Fourth, we control for Fama-French factors in our tests to address the possibility that comovement can arise from style investing and find our main results to be qualitatively unchanged.

To alleviate the concern about endogeneity in our results, we employ an exogenous variation in the extent of investor communication caused by the most important holiday period in China, the Spring Festival holidays. We show that communication in online forums in the month that contains the Spring Festival is significantly lower than in the months immediately before and after. We reestimate our tests of comovement separately for the festival month and the neighboring months and find that the comovement in the festival month is the lowest, suggesting causality in our main results.

Our paper contributes to the literature that studies excess comovement in asset returns and its relation to investor behavior. To the best of our knowledge, our paper is the first to document excess comovement of stock returns generated by communication in a social network. Pindyck and Rotemberg (1993) find excess comovement in stock prices. Green and Hwang (2009) document that, after splits, stocks comove more with other lower-priced stocks.
We complement this literature by using a unique database on individual investors' communication to study the effects of communication and its frequency on excess comovement. Our paper is also related to the literature on information transmission in social networks and its effects on economic agents' beliefs and behavior (e.g., Hong, Kubik, and Stein (2006), Malloy (2008, 2010)). Similar to this literature, we show that communication among investors can have a substantial impact on financial markets. Finally, our paper is related to the literature on the effects of internet message board discussions on stock returns and volatility (e.g., Antweiler and Frank (2004) and ...).

The rest of the paper is organized as follows. Section 2 develops the model and derives empirical predictions. Section 3 presents our data construction and empirical analysis. Section 4 provides the results of additional robustness tests. Section 5 concludes. All proofs are included in the Appendix.

#### The Model

In this section, we develop a Grossman-Stiglitz-type (1980) model to analyze the effects of communication on comovement in stock prices. The basic structure of our model is similar to that of Veldkamp (2006).

Consider an economy with two dates, $t = 0, 1$. There is a continuum of investors of unit mass with identical preferences. The preference function depends on the terminal wealth $W$ at date 1 as follows: ...

There is a risk-free asset and two risky assets in the economy. For simplicity, the risk-free rate is assumed to be zero. The values of the two risky assets at date 1 are stochastic and take the form $x + y_i$, $i = 1, 2$, where $x$ is a common component and the $y_i$ are idiosyncratic components. Note that, without loss of generality, we assume the coefficients on $x$ to be 1 for both assets. The shocks $x$ and $y_i$ are independent and normally distributed. We assume that investors have identical, normally distributed prior beliefs about $x$ and $y_i$, $i = 1, 2$.
Investors are endowed with initial wealth $W_0$ and trade after they form their posterior beliefs about the assets at date 0. The aggregate supply of asset $i$ is $S_i$ for $i = 1, 2$. The equilibrium is defined by the usual market clearing conditions and the optimization of investors' problem.

At date 0, all investors receive signals about the asset values before they trade the assets. For simplicity, we assume that they receive a sequence of signals $z_1, z_2, \ldots, z_N$.

... the information publicly will help stock prices converge to the fundamental values faster, thus helping him realize his profits earlier. Indeed, van Bommel (2003) shows that it can be optimal for informed investors with limited investment capacity to release noisy private information to the public.

We begin by assuming that the signals $z_j$, $j = 1, 2, \ldots, N$, are independent. Therefore, investors choose their portfolios to solve the following optimization problem: ... where the expectation is taken with respect to investors' information set $I_N$ after receiving all signals at date 0. The market clearing conditions together with (7) allow us to solve for the asset prices.

Proposition 2. In equilibrium, the asset prices after communication are given by ...

Using ... Note that the covariance of the intrinsic asset values is ... We have the following proposition that compares the covariances in fundamental values and asset prices.

[5] Since the initial asset prices are constant, the covariance of prices here is equal to the covariance of changes in asset prices from the initial time. We follow the convention of studying changes in asset prices and their covariances in the framework of investors with CARA preferences and asset values with normal distributions; see, e.g., Veldkamp (2006) and Banerjee (2011).

Proposition 3.
The covariances of fundamental values and asset prices satisfy ...

Therefore, when the signals received by investors are independent of each other and investors are fully rational, there is no excess comovement in asset prices beyond that in the fundamental values.

Next, we assume that the signals $z_j$ are not independent of each other but investors still regard them as independent.[6] The motivation is that there are unlikely to be many independent signals about firm values in a short time period. Investors, however, have incomplete information about, or pay limited attention to, the sources of the signals (especially on online forums) and regard them as independent.[7] Our assumption is also similar to the persuasion bias of agents in ... For simplicity, we assume that all the signals $z_j$ are identical and equal to a single signal $z$. This assumption does not change our results qualitatively. We now have the covariance of asset prices equal to ... The following proposition describes the properties of excess comovement in asset prices.

[6] Our results and intuition still hold in the case where investors treat the signals as correlated, as long as they underestimate the correlation among the signals. The results are available upon request from the authors.

[7] There is a large theoretical literature that studies incomplete information, limited investor attention, and asset prices; see, for example, Merton (1987), ...

... ii) The following is always true: ...

By part (iii) of Proposition 4, the model also predicts that the effect of communication on asset comovement is more pronounced for stocks subject to greater information asymmetry (i.e., noisier prior beliefs).[8] The intuition is that for stocks with greater information asymmetry, communication among investors has a greater effect on their posterior beliefs and thus exerts a larger influence on stock return comovement.

[8] The condition in part (iii) of Proposition 4 holds when the signals are not too precise relative to the prior beliefs of investors, which is likely to be the case for the online communications that we study in this paper.

#### Empirical Analysis

##### Data and Variables

We collect our data on investor communication by tracking all the messages posted on an online forum: the East Money Stock Forum (http://guba.eastmoney.com/). We choose this forum because it is the earliest stock forum in China and also one of the most active and influential. When we search for the keywords "stock forum" on the most popular search engines in China (Baidu or Google (Hong Kong)), the East Money Stock Forum always ranks among the top results. Moreover, the forum is fully compatible with the East Money trading software that is widely used by investors in China for placing orders to trade stocks. Investors can thus easily access the information posted on the stock forum when they use the software to trade. Therefore, the East Money Forum provides a relatively representative and comprehensive dataset of communications among investors that can influence stock trading and prices.

On the East Money Forum (henceforth the "forum"), there is a sub-forum for every stock on which investors can discuss and exchange information about the given stock. We will refer to the designated stock of a sub-forum as the target stock. On each such sub-forum, investors can also discuss other stocks, which we define as related stocks of the target stock of the sub-forum. Below are two example messages that discuss related stocks on the sub-forum for the target stock:

"The best sector in 2008 will be railroad industries; the undisputable leader in railroad stocks is Guangzhou-Shenzhen Railroad (601333)."

"Since FAW Automobile (000800) tumbles, the prospect for Wuhan Iron and Steel won't be great."

As discussed in Section 2, communication on a sub-forum can potentially lead to excess comovement among the returns of a target stock and its related stocks.
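
The example messages above tag each related stock with a six-digit code. A minimal sketch of counting such mentions and ranking related stocks is shown below. It rests on assumptions: that mentions can be identified by a six-digit code, that every occurrence counts once, and that the target code `600005` passed in is only a placeholder. The paper's own retrieval program was written in Perl and its matching rules are not described.

```python
import re
from collections import Counter

# Six consecutive digits that are not part of a longer digit run.
CODE_PATTERN = re.compile(r"(?<!\d)\d{6}(?!\d)")

def top_related(messages, target_code, k=5):
    """Rank stock codes other than the target by number of mentions."""
    counts = Counter()
    for text in messages:
        for code in CODE_PATTERN.findall(text):
            if code != target_code:
                counts[code] += 1
    return counts.most_common(k)

messages = [
    "The undisputable leader in railroad stocks is Guangzhou-Shenzhen Railroad (601333).",
    "Since FAW Automobile (000800) tumbles, the prospect for Wuhan Iron and Steel won't be great.",
    "Still watching 601333 this week.",  # hypothetical extra message
]
ranking = top_related(messages, target_code="600005")
print(ranking)
```

Summing the counts of the top-ranked codes over a month would give a frequency measure in the spirit of the communication proxy used in the paper.
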
Due to limited availability of the forum data prior to 2008, we study the period from 2008 to 2012 in this paper. To ensure that there is sufficient discussion by investors on the forum, we also focus on the sub-forums devoted to the component stocks of the Shanghai Stock Exchange (SSE) 180 Index, one of the most important benchmarks for the Chinese stock market. Similar to the S&P 500 index in the US, the SSE 180 index consists of stocks with large market capitalization. Besides being representative of the Chinese stock market, the SSE 180 stocks are associated with high trading volume, which helps them attract investors' attention. Therefore, there are large numbers of messages on the sub-forums dedicated to these stocks.

We use stock returns data from the Resset Database (http://www.resset.cn). During the period from 2008 to 2012, the composition of the SSE 180 index experienced several adjustments, and a total of 296 stocks have been included in the index. Our sample of stock return data includes 255,844 stock-trading-day observations for these stocks.

We download investors' messages on the forum using a Perl program. Our program can retrieve from each message information such as the identifiers of stocks mentioned in the message and the posting time of the message. Messages can be posted on both trading and non-trading days. Since the messages posted on non-trading days also convey information to investors, we include them in our sample. We retrieve a total of 13,528,136 messages for our sample of stocks in the period from 2008 to 2012.

We use the daily return of stocks, Ret, the daily market return, MKTRet, and the daily industry sector return for a given stock, INDRet, in our empirical tests. To capture the returns of other stocks discussed on a sub-forum, we define a related-stock return variable as follows. For each stock-month, we consider all the messages posted on a target stock's sub-forum during the month.
We record the frequency with which each related stock is mentioned in these messages and rank the related stocks by these frequencies. We form the portfolio of the five most related stocks on a monthly basis. Note that although we require the target stock to be included in the SSE 180 index, we do not impose the same restriction on its related stocks. We calculate the daily Mean Related-Stock Return, or MRR, of target stock $m$ as the daily average stock return of this portfolio, i.e.,

$$MRR_{m,t} = \frac{1}{5} \sum_{j=1}^{5} Ret_{j,t},$$

where $Ret_{j,t}$ is the date $t$ daily return of related stock $j$.

... five related stocks for one target stock in the SSE index during a six-month period in our sample. In this example, a top related stock is mentioned on the sub-forum from 2 to 15 times each month. We use the total number of times that the top five related stocks are mentioned on a sub-forum in a month, Freq, as a proxy for the intensity of communication among investors.

Furthermore, we consider a number of (Chinese) market and macroeconomic factors in our analysis: Inflation, the monthly growth rate of the Consumer Price Index; GDP Growth, the monthly growth rate of real gross domestic product, interpolated from quarterly data; Term Spread, the difference between the long-term (10-year) treasury bond yield and the short-term (3-month) treasury yield ...

[Insert ...]

##### Communication and Comovement of Stock Returns

In this section, we study the comovement of returns of target stocks of sub-forums and their related stocks discussed on these sub-forums. As discussed in Section 2, our model predicts that investors' communication about a group of stocks can generate excess comovement among these stocks. We first conduct time-series regressions of each stock's returns on the returns of its ...

The comovement among stocks studied in Model 1 can be generated by market-wide stock movements that drive the returns of both the forum-target stock and its related stocks.
To alleviate this concern, we include market returns on the right-hand side of the regressions and estimate the following model: ...

In Model 2, the coefficient on the related-stock return, indexed $1m$, indicates the excess comovement between the stock and related-stock returns after controlling for market returns. Panel B of ... Furthermore, this coefficient is positive and significant at the 1% level for 161, or 68%, out of 296 regressions and insignificant (or negative) only in 86, or 19%, of the regressions. Therefore, after controlling for market-level changes, we find significant excess comovement among forum-target stocks and related stocks.

[Insert ...]

When examining the coefficients in Models 1 and 2 across all target stocks, it is possible to compute overall t-statistics to assess the joint significance of the stock-by-stock regressions. However, the simple t-statistic (following the Fama-MacBeth method) for the average coefficient is calculated under the premise that the estimation errors are independent across regressions, which may be violated in the cross-sectional setting, leading to potential biases. To allow for cross-sectional correlation across residuals, we calculate overall t-statistics using the ...

##### Placebo Test

In the previous section, we document the existence of excess comovement between the returns of stocks discussed on the online forum. However, it is still possible that temporal trends or other unobservable temporal factors, rather than information sharing among investors, drive the correlations between stock returns. We address this potential concern by conducting a placebo test. For each forum-target stock and month, we randomly select five stocks from the same industries as the top five related stocks in that month to form a placebo portfolio of stocks. Similar to the construction of the actual related-stock portfolios, we adjust the composition of the placebo portfolios on a monthly basis.
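
The placebo construction described above can be sketched as follows. The sketch assumes one placebo stock is drawn per related stock from that stock's industry, and that the related stocks themselves (and any codes in `exclude`, such as the target) are not eligible; the paper states only that five stocks are drawn from the same industries. The function name, the industry map, and the stock codes are all hypothetical.

```python
import random

def placebo_portfolio(related, industry_of, members_of, rng, exclude=()):
    """Draw one random same-industry stock for each related stock."""
    blocked = set(related) | set(exclude)
    picks = []
    for code in related:
        pool = [c for c in members_of[industry_of[code]]
                if c not in blocked and c not in picks]
        if not pool:
            raise ValueError(f"no placebo candidates left in the industry of {code}")
        picks.append(rng.choice(pool))
    return picks

# Hypothetical industry classification.
members_of = {"rail": ["A1", "A2", "A3"], "steel": ["B1", "B2", "B3"]}
industry_of = {code: name for name, codes in members_of.items() for code in codes}

related = ["A1", "B1"]
rng = random.Random(0)  # fixed seed so the draw is reproducible
picks = placebo_portfolio(related, industry_of, members_of, rng, exclude=("T0",))
print(picks)
```

Repeating the draw each month, as the text describes for the actual related-stock portfolios, would only require calling the function once per target stock and month.
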
We define $RANDRet_{m,t}$ as the average date $t$ return of the stocks in the placebo portfolio of target stock $m$. We then conduct stock-by-stock time-series regressions by replacing the related-stock returns in Model 2 with the placebo portfolio returns: ...

[Insert ...]

##### Communication Intensity and Return Comovement

According to our model, as the rounds of communication between investors increase, investors update their beliefs about the stocks more, leading to greater comovement among stock returns. Therefore, we expect the excess comovement to be higher for stocks subject to more intense discussion. In this section, we use the frequency with which stocks are discussed on sub-forums as a proxy for communication intensity and test this prediction. We include the logarithm of the frequency variable (Freq) and its interaction with the return of related stocks (MRR) in our time-series regressions and estimate the following model for each target stock: ...

The coefficient of the interaction term between Log(Freq) and MRR in Model 4 captures the marginal effect of more frequent discussion on the comovement between target stock and related-stock returns. We report the results of these stock-by-stock regressions in ...

Taken together, the evidence in this section suggests that excess comovement is concentrated among stocks that are more frequently discussed by investors, consistent with the theoretical prediction.

[Insert ...]

##### Information Asymmetry, Communication, and Return Comovement

In this section, we examine the relation among information asymmetry, communication, and excess comovement of stock returns. Our model generates the cross-sectional prediction that the noisier investors' prior beliefs are, the stronger the effect of communication on excess comovement. To test this prediction, we examine whether stocks with higher information asymmetry have higher levels of return correlation with their related stocks.
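
For reference, the interaction specification that the text calls Model 4 might take the following form. The symbols $\alpha_m$, $\beta_{km}$, and $\varepsilon_{m,t}$ are assumed notation, not taken from the source; the set of regressors follows the verbal description (MRR, the market return, Log(Freq), and the interaction).

$$
Ret_{m,t} = \alpha_m + \beta_{1m}\, MRR_{m,t} + \beta_{2m}\, MKTRet_{t} + \beta_{3m} \log(Freq_{m,t}) + \beta_{4m} \log(Freq_{m,t}) \times MRR_{m,t} + \varepsilon_{m,t}
$$

Under this form, the sensitivity of the target return to the related-stock return is

$$
\frac{\partial Ret_{m,t}}{\partial MRR_{m,t}} = \beta_{1m} + \beta_{4m} \log(Freq_{m,t}),
$$

so a positive $\beta_{4m}$ means that comovement rises with discussion frequency, and $\beta_{1m}$ alone no longer measures overall excess comovement.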
We use three variables to proxy for information asymmetry: illiquidity, firm size, and analyst coverage. First, we employ the widely used Amihud illiquidity measure (Amihud and Mendelson, 1986; Amihud, 2002), calculated as follows:

$$Illiq_{i,t} = \frac{|Ret_{i,t}|}{P_{i,t} \times Vol_{i,t}},$$

where $P_{i,t}$ and $Vol_{i,t}$ are the daily price and trading volume of stock $i$. We use the natural logarithmic transformation of the Amihud measure to mitigate the effect of any outliers. Second, we use the logarithm of stock market capitalization as a proxy for firm size and information asymmetry. We average all daily measures to obtain quarterly measures. Third, we use the number of analysts who cover a stock in the previous year as an additional proxy, since greater analyst coverage provides more information to the public.

We use the above three proxy variables for information asymmetry to construct subsamples. Specifically, we divide the 296 target stocks into five quintile groups according to the value of the information asymmetry variable in the lagged quarter. We readjust the composition of the five groups quarterly. We then estimate the regression of Model 4 separately for each quintile over time and compare the differences in stock return comovement among the different groups.

[Insert ...]

Panel A of ... (Note that since we include the interaction term in these regressions, the coefficients of MRR should not be interpreted as the overall excess comovement as before. Therefore, we focus on the interaction term and do not compare the coefficients of MRR across the subsamples.) Panel B shows that the coefficient of the interaction term decreases as stock market capitalization increases (from 0.102 in the bottom quintile to 0.04 in the top quintile; the difference is statistically significant with a t-value of -3.40). Panel C shows that the difference between the coefficients of Log(Freq)×MRR for stocks with the lowest analyst coverage and those with the highest analyst coverage is negative but insignificant.
Since stocks with higher illiquidity, smaller sizes, and less analyst coverage are subject to higher information asymmetry, these results suggest that the effects of communication on return comovement are more pronounced for stocks with higher information asymmetry, consistent with the model's prediction.

#### Robustness Tests

##### 4.1. Time-Series Robustness Tests

In this section, we perform a robustness test by conducting our main regressions in two equal sub-periods of our sample, i.e., the periods from January 2008 to June 2010 and from July 2010 to December 2012. We estimate the regressions of Models 1 to 4 separately for the two sub-periods and report the results in ...

[Insert ...]

##### 4.2. Industry and Macroeconomic Conditions

In our tests of Models 2 to 4 in the previous sections, we included the market return among the independent variables in order to control for the effects of market-wide factors on the comovement of stock returns. To address the possibility that stock prices may move together in response to industry-wide information, other changes in the financial markets, and macroeconomic conditions, we consider various additional controls in this section.

Next, we control for other aggregate factors in the financial markets in the model. We include several aggregate market-level variables: IPO Activity, to capture whether the market is "hot" or "cool"; Log(Turnover), to proxy for the trading activity in the market; and Term Spread, to represent the effects from the bond markets. In particular, we estimate the following model: ...

Finally, we include all the control variables and estimate the following model: ...

[Insert ...]

We next include the frequency of discussion, Log(Freq), and its interaction with MRR in Models 5 to 8 to examine whether the results in Section 3.4 continue to hold with the additional industry, market, and macroeconomic variables.
Panel B of ...

As an additional robustness check, we remove related stocks that are in the same industry as the target stock in our construction of the related portfolio and then repeat our tests in Models 1, 2, and 4. The results are again qualitatively similar (see ...).

[Insert ...]

Our dataset allows us to define an alternative measure of the degree of investor communication by the number of clicks the messages receive on the forum as of the end of 2012. We use the total number of clicks received on messages about related stocks to rank and obtain the top five related stocks, and form the portfolio of related stocks each month. We then reestimate Models 1 and 2 using this new definition of related portfolio returns. We also repeat the estimation of Model 4 by replacing the number of posts (Freq) with the total number of clicks on the posts (Clicks).

[Insert ...]

##### Communication, Style Investing, and Comovement

The literature on comovement shows that comovement can arise when investors follow defined investment styles, such as large- vs. small-cap and growth vs. value investing (see, for example, Vijh (1994), ...).

In particular, we perform the following two groups of tests. First, we augment Models 2 and 4 with the Fama-French small-minus-big and high-minus-low factors. To be consistent with factor models, we replace the dependent variable Ret and the independent variables MRR and MKTRet by excess returns, i.e., returns in excess of the risk-free rate. We calculate the Fama-French factors and risk-free rates in China following Fama and French (1993). Second, we modify Models 2 and 4 by replacing the dependent variable Ret with the Fama-French 3-factor Alpha and reestimate the models. We obtain the Fama-French 3-factor Alphas as residuals of 3-factor regressions of daily returns over the entire sample period.

We report the results of these tests in ... We find in Panel A of ... interaction of Log(Freq) with Excess MRR is positive and significant at the 1% level in the augmented Model 4.
In Panel B, we observe similarly that the corresponding coefficients are positive and significant at the 1% level. These results are in line with our main findings.

[Insert ...]

##### Lagged Communication and Comovement

In the previous tests, we form the portfolio of the most related stocks in the same month in which we examine the correlations of stock returns. One alternative explanation of our findings is that the communication among investors could instead arise from the excess comovement among the target stock and its related stocks. The results of the cross-sectional tests in Section 3.5 can help to partially alleviate this potential concern about reverse causality. In this section, we form the related-stock portfolios using the top five related stocks of the target stock in the previous month and investigate the comovement of stock returns in the current month. Since returns in the current month cannot affect the communication among investors in the previous month, this helps to further address the above concern. We estimate the regressions in Models 1 to 4 with the above modification and report the results in ...

... our findings that there exists excess comovement among stocks discussed together on online stock forums and that such comovement is stronger when accompanied by more intensive communication among investors.

[Insert ...]

We report in Panel A of ... messages about related stocks in a festival month, compared to 10.8 (11.1) in the previous ...

[Insert ...]

We estimate the regressions in Model 2 separately for the festival month and the months before and after, and report the results in Panel B of ...

#### Conclusion

In this paper, we use a novel dataset of online forum discussions in China to study stock comovement and communication among investors in a social network. We develop a model in which investors receive informative signals through communication before trading. The model predicts that communication can generate excess comovement in stock returns.
We find that there exists substantial excess comovement among the returns of a forum's target stock and its related stocks, i.e., stocks that are discussed on the same forum. Excess comovement is greater when related stocks are more frequently discussed. Furthermore, the effect of frequent discussion on excess comovement is stronger for stocks with higher information asymmetry, i.e., small, illiquid stocks and stocks covered by fewer analysts. These findings are consistent with our model's predictions. We use the exogenous variation in communication in the Spring Festival month to establish causality in our results. Finally, we find our results to be robust in a host of different specifications, including tests in different sub-periods and tests that control for additional industry, investment style, market, and macroeconomic factors.

#### Appendix

The basic Bayesian updating formula implies that ... From ...

Proof of Proposition 2. For simplicity, we use the vector notations below, i.e., ... Maximizing the above with respect to ..., we obtain the investors' optimal portfolio ... The market clearing condition implies that ... Therefore, we obtain from (18) that ... Therefore, (8) follows from ...

Proof of Proposition 3. By ... Now the LHS minus the RHS of (23) is equal to ... Therefore, (23) holds.

Proof of Proposition 4. i) The first statement follows directly from (9) and (12). By (12) and (10), ... Therefore, by calculation, ...

| Variable | Definition |
| --- | --- |
| Ret | The daily stock return |
| MRR | The average daily return of the top 5 related stocks of each target stock |
| MKTRet | The market daily average weighted return |
| INDRet | The daily average weighted return of all stocks in the same industry as the target stock. We use the industry sector definitions of the China Securities Regulatory Commission (CSRC) |
| RANDRet | The mean return of the 5 randomly chosen stocks for each target stock |
| **Other Variables** | |
| Freq | The total number of times that related stocks are mentioned on the forum for a target stock in a month |
| Inflation | The monthly growth rate of the CPI (Consumer Price Index) |
| GDP Growth | The monthly growth rate of GDP (Gross Domestic Product), interpolated from quarterly data |
| Term Spread | The difference between the long-term yield (10-year) and the short-term yield (3-month) on national debt |
| IPO Activity | The number of new firms that make an initial public offering in a month |
| Log(Turnover) | The log of the value-weighted monthly turnover rate of all stocks in the market |
| Economic Index | The indicator for economy status calculated by the National Bureau ... |