The Virtual Repeat Sale Model for the House Price Index for New Building in China

Using the characteristics of new buildings in China, this article constructs a virtual repeat sale method that produces virtual repeat data analogous to the data used by the repeat sale model for house price indexes. The Case-Shiller procedure and the OFHEO method are then used to calculate a house price index for new buildings in China. The results are discussed, and further models are needed to take full advantage of the virtual repeat sale data.


Introduction
There are currently two main house price indexes in China: the China Real Estate Index System and the 70 Large and Medium-Sized City Real Estate Price Index. Both have data quality problems. For example, the China Real Estate Index System uses only the prices offered by developers instead of actual transaction prices, and the 70 Large and Medium-Sized City Real Estate Price Index relies on sampling rather than covering all buildings. Moreover, there are shortcomings in the theory and method used to compile the indexes: they compute the price index from average prices, as the first- and second-generation methods do.
The most prominent characteristic of real estate is heterogeneity [1]. Practitioners always emphasize that location and the surrounding environment have a great influence on house prices. As an extreme example, suppose the average price of newly sold houses in the city center is ten thousand yuan per square meter, and that a year later the newly sold houses, which happen to be located in a different area because the old area is being rebuilt and no new sites are available, sell at the same average price. Can we say that prices have not risen over the year? Judged by the average price there is no change, which is misleading. Given the strong heterogeneity of houses, the scientific way to calculate a price index is to track price changes for houses of the same quality and to separate the influence of quality changes from transaction prices, so that real market demand is reflected and the right market signal is transmitted.
Developed countries pay close attention to these two points in order to overcome the problems of the first- and second-generation methods. The third-generation methods comprise two models. One is the hedonic method, which relates the house price to a series of main quality attributes (such as location, floor, area, orientation, materials, and surrounding environment). The other is the repeat sale method [2]-[6], which considers the change in transaction prices of the same building across different periods. However, Chinese cities change very fast and new buildings dominate the market, so the public pays most attention to the prices of new buildings, and the repeat sale method used abroad cannot be applied directly. The hedonic method is also hard to use, because its accuracy cannot be ensured and the cost of collecting complete attribute data for every house is enormous.
However, Chinese real estate has its own characteristics: new residences have stable structures, and almost all communities are developed over large areas. These differences between Chinese and foreign real estate should be exploited to construct a house price index for new buildings in China.

Virtual Repeat Sale Data
Information analogous to repeat trades in the second-hand market can be mined from the product structure and the pricing process of Chinese residences. For structural stability, houses in the same building, unit and orientation are almost identical except for their floor. In the pricing process for high-rise and small high-rise houses, the price rises with the floor, which can be used to construct quasi-repeat data.
Suppose that, for the same building, unit and orientation (simply called "three common"), two floors $l_1$ and $l_2$ with $l_1 < l_2$ have been sold in month $t$ at prices $p_1$ and $p_2$, respectively.
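The grouping of sales into "three common" sets can be sketched as follows. This is a minimal illustration only: the record layout and the key names (`building`, `unit`, `orientation`, `floor`, `month`, `price`) are assumptions made for the example and do not come from the data set used in this article.

```python
from collections import defaultdict

def group_three_common(records):
    """Group sale records by (building, unit, orientation).

    Each record is a dict with hypothetical keys:
    'building', 'unit', 'orientation', 'floor', 'month', 'price'.
    """
    groups = defaultdict(list)
    for rec in records:
        key = (rec["building"], rec["unit"], rec["orientation"])
        groups[key].append(rec)
    return dict(groups)

# Illustrative records only (not real data).
records = [
    {"building": "A1", "unit": 2, "orientation": "S", "floor": 5, "month": 3, "price": 5200.0},
    {"building": "A1", "unit": 2, "orientation": "S", "floor": 9, "month": 3, "price": 5400.0},
    {"building": "A1", "unit": 2, "orientation": "S", "floor": 7, "month": 6, "price": 5600.0},
    {"building": "B3", "unit": 1, "orientation": "N", "floor": 4, "month": 3, "price": 4800.0},
]
groups = group_three_common(records)
```

Within each group, the sales of a given month supply the reference floors and prices, and a sale in another month supplies the floor for which a virtual price is sought.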

Rule 1: Interpolation Method
If a "three common" house was sold on the $l$-th floor in another month, the virtual price on the $l$-th floor in month $t$ is computed from $p_1$ and $p_2$ by formula (1). Computing the virtual price under the condition $l_1 < l < l_2$ is called the interpolation method.
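As a hedged illustration of how a virtual price might be computed, the sketch below assumes that formula (1) is a straight line through the two observed points $(l_1, p_1)$ and $(l_2, p_2)$. This linear form is an assumption made for the example; the function name `virtual_price` is hypothetical.

```python
def virtual_price(l, l1, p1, l2, p2):
    """Price on floor l, read off the straight line through (l1, p1) and (l2, p2).

    Assumed linear form of formula (1); not confirmed by the source.
    """
    if l1 == l2:
        raise ValueError("the two reference floors must differ")
    slope = (p2 - p1) / (l2 - l1)  # price change per floor in month t
    return p1 + slope * (l - l1)

# Floors 5 and 9 sold in month t at 5200 and 5400 yuan per square meter
# (illustrative numbers); virtual price for floor 7 in the same month.
p = virtual_price(7, 5, 5200.0, 9, 5400.0)
```

Under this assumption the same function also covers a floor outside the two reference floors, since the line is simply extended beyond them.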

Rule 2: Extrapolation Method
If a "three common" house was sold on the $l$-th floor in another month and $l$ lies outside the range between $l_1$ and $l_2$, formula (1) can also be used. Computing the virtual price under the condition $l < l_1 < l_2$ or $l_1 < l_2 < l$ is called the extrapolation method.

Rule 3: Strengthen Method
Suppose that $k$ floors $l_1 < l_2 < \cdots < l_k$ of a "three common" building have been sold in month $t$ at prices $p_1, p_2, \ldots, p_k$. If a "three common" house was sold on the $l$-th floor in another month, extrapolation is used when $l < l_1$ or $l_k < l$. In the other cases, interpolation or extrapolation is chosen as appropriate, as detailed in Rule 4 below.

Rule 4: Setting Threshold Values
The condition $l_2 - l_1 \le \delta_1$ is necessary when applying the interpolation method under the inequality $l_1 < l < l_2$. Besides these conditions, there are other conditions for using interpolation or extrapolation. For example, the building must be a high-rise or small high-rise, or the total number of floors (denoted $l_{\mathrm{total}}$) should be at least 15, with $l_1 \ge 3$ and $l_k < l_{\mathrm{total}}$.
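The choice between interpolation and extrapolation under threshold conditions can be sketched as below. Several details are assumptions made for illustration: the threshold value `delta=4`, the use of the same threshold for extrapolation as for interpolation, and the function name `choose_method`. Only the structure of the checks follows the rules above.

```python
from bisect import bisect_left

def choose_method(l, floors, total_floors, delta=4, min_total=15, min_floor=3):
    """Return 'interpolation', 'extrapolation', or None for target floor l.

    floors: floors of the "three common" set sold in month t.
    delta:  hypothetical threshold on the floor gap (assumed value).
    """
    floors = sorted(floors)
    if len(floors) < 2 or l in floors:
        return None  # too few reference floors, or a real price already exists
    # Building-level conditions: l_total >= 15, l_1 >= 3, l_k < l_total.
    if total_floors < min_total or floors[0] < min_floor or floors[-1] >= total_floors:
        return None
    if l < floors[0]:
        return "extrapolation" if floors[0] - l <= delta else None
    if l > floors[-1]:
        return "extrapolation" if l - floors[-1] <= delta else None
    i = bisect_left(floors, l)          # floors[i-1] < l < floors[i]
    lower, upper = floors[i - 1], floors[i]
    return "interpolation" if upper - lower <= delta else None

method = choose_method(7, [5, 9], total_floors=18)
```

Returning `None` when a condition fails means that no virtual price is constructed for that sale, which keeps doubtful pairs out of the virtual repeat data.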

Virtual Repeat Sale Method
Similar to the BMN model and the OFHEO house price index [5] [7], the virtual repeat sale model is represented by equation (2), where $X_{i\tau}$ is a dummy variable that equals 1 if the price of house $i$ was observed for the second time at time $\tau$, $-1$ if the price of house $i$ was observed for the first time at time $\tau$, and zero otherwise.
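The dummy variable $X_{i\tau}$ defined above can be assembled into a design matrix as in the following sketch. Dropping the column of the base period (so that its coefficient is normalized to zero) is an assumption of the example, as are the function name `design_matrix` and the zero-based period numbering.

```python
def design_matrix(pairs, n_periods):
    """Build rows of X for (first_period, second_period) pairs.

    Periods are numbered 0 .. n_periods-1; period 0 is taken as the
    base period and its column is dropped (an assumed normalization).
    """
    rows = []
    for first, second in pairs:
        if not 0 <= first < second < n_periods:
            raise ValueError("need 0 <= first < second < n_periods")
        row = [0] * n_periods
        row[first] = -1   # price observed for the first time
        row[second] = 1   # price observed for the second time
        rows.append(row[1:])
    return rows

# Three illustrative pairs over four periods.
X = design_matrix([(0, 2), (1, 3), (0, 3)], n_periods=4)
```

With virtual repeat data, one price of each pair is an actual transaction and the other is the virtual price constructed for the same floor in a different month.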
In the Case-Shiller procedure, $H_{it}$ is a Gaussian random walk, so each step of the random walk is assumed to be independent of the previous step. This is not the case for the OFHEO index, where the steps are assumed to be dependent. This means that the errors in the regression when fitting model (2) are not independent, which is a violation of a standard regression assumption.
For two sales of a house at times $t$ and $s$, the variance specification (3) applies, where $\mathrm{Var}(\cdot)$ denotes the variance of a random variable. Their three-step procedure is described below.
Step 1. Fit the model in (2) by OLS. This step gives the BMN index.
Step 2. Compute the residuals of the regression in (2), and denote these as $\hat{d}_i$. Fit the model in (3). The index numbers for the periods are the estimates of the $\beta$ parameters in (4).
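The overall estimation procedure (OLS, a regression of squared residuals, then weighted least squares) can be sketched with NumPy as follows. The form of the variance regression is an assumption: squared residuals are regressed on the time gap and its square, as is commonly done for OFHEO-style indexes. The function name `three_step_index`, the synthetic data, and the clipping guard are likewise illustrative choices.

```python
import numpy as np

def three_step_index(y, first, second, n_periods):
    """y: log price ratios ln(P_second / P_first); periods are 0-based, first < second."""
    y = np.asarray(y, dtype=float)
    first = np.asarray(first)
    second = np.asarray(second)
    n = len(y)
    X = np.zeros((n, n_periods))
    X[np.arange(n), first] = -1.0
    X[np.arange(n), second] = 1.0
    X = X[:, 1:]  # base period 0 dropped (assumed normalization)

    # Step 1: OLS fit, which gives the BMN index.
    beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)

    # Step 2: squared residuals regressed on the gap and gap^2 (assumed form).
    d = y - X @ beta_ols
    gap = (second - first).astype(float)
    Z = np.column_stack([gap, gap ** 2])
    coef, *_ = np.linalg.lstsq(Z, d ** 2, rcond=None)
    fitted = np.clip(Z @ coef, 1e-8, None)  # guard against non-positive variances

    # Step 3: weighted least squares with weights 1 / sqrt(predicted variance).
    w = 1.0 / np.sqrt(fitted)
    beta_gls, *_ = np.linalg.lstsq(X * w[:, None], y * w, rcond=None)
    return beta_ols, beta_gls

# Synthetic, noise-free log index used only to exercise the code.
b = np.array([0.0, 0.02, 0.05, 0.04])
first = [0, 0, 1, 2, 0, 1]
second = [1, 2, 3, 3, 3, 2]
y = b[second] - b[first]
beta_ols, beta_gls = three_step_index(y, first, second, n_periods=4)
index = 100 * np.exp(beta_gls)  # index level relative to base period = 100
```

Because the synthetic data contain no noise, both estimates recover the assumed log index; with real data the two sets of estimates would generally differ.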

An Example
There were 6354 original records in a sample data set for 2012 from Xiangyang city, Hubei, China. From these, 1463 virtual sale records were constructed by the method described in Section 2. Using the virtual repeat sale method in Section 3.1, index numbers for months $t = 1, 2, \ldots, 12$ were computed, as given in Table 1. Figure 1 shows the residuals of the regression for the BMN model, and Figure 2 shows the residuals of the regression for the OFHEO model.
Comparing Figure 1 and Figure 2, it is clear that the BMN model performs better than the OFHEO model. This is the opposite of what is usually found with the repeat sale model. Thus the use of virtual repeat sale data differs significantly from the original repeat sale model and needs further research.
Figure 3 depicts the indexes of the BMN and OFHEO models. In this example, the red curve is more convincing than the blue one.

Concluding Remarks
This article proposes a virtual repeat sale method that produces virtual repeat data, together with a calculation method similar to OFHEO, and gives a virtual repeat trade model for computing a house price index. Based on the characteristics of new buildings in Chinese cities, the model is applied to one year of data from one city.
As for the weaknesses of the traditional repeat sale methods, perhaps the most obvious issue is that single sales are excluded, which reduces the sample size significantly. The number of observations eliminated is staggering. Further research is therefore needed to use all the data together with the virtual repeat data.
A corresponding threshold condition is necessary when applying the extrapolation method under the inequality $l < l_1$, and likewise when applying it under the inequality $l_k < l$.
Step 3. The predicted values of the squared deviations from (3), $\hat{d}_i^2$, are used to derive the weights needed to obtain GLS estimates of the $\beta_t$ parameters in regression (4).

Figure 3. The indexes of BMN and OFHEO.

Table 1. The indexes given by BMN and OFHEO.