Research on the Effect of Artificial Intelligence Real Estate Forecasting Using Multiple Regression Analysis and Artificial Neural Network: A Case Study of Ghana

To transition from conventional to intelligent real estate, the real estate industry must more fully embrace disruptive technology. Even though the real estate auction market has grown in importance in the financial, economic, and investment sectors, few artificial intelligence-based studies have tried to predict the auction values of real estate. This research uses artificial intelligence and statistical methods to create forecasting models for real estate auction prices: a multiple regression model and an artificial neural network. For the empirical study, data from Ghana apartment auctions from 2016 to 2020 are used to forecast auction prices and evaluate the forecasting accuracy of the models. Compared to conventional Multiple Regression Analysis (MRA), using artificial intelligence systems for real estate appraisal is becoming a more viable option. The Artificial Neural Network (ANN) model exhibits the best performance, and efficient zonal segmentation based on the auction evaluation price further enhances the model's prediction accuracy. There is a statistically significant difference between the two models in forecasting the values of real estate auctions.


Introduction
The accuracy and timeliness of real estate value forecasts are critical to prospective homeowners, developers, investors, appraisers, and tax assessors, as well as other real estate market participants, such as mortgage lenders and insurance companies. Long-established real estate valuation techniques, such as the cost and sales comparison approaches, lack a standard and a certification process [1]. A real estate value prediction model therefore helps close a crucial information gap while also improving the efficiency of the real estate market [2].
Over the last two decades, there has been an explosion of empirical research assessing residential real estate pricing. In the early 1980s, with the advent of information systems technology, computers were first used to evaluate real estate [3]. Various statistical techniques were then employed to assess market data, with MRA proving especially helpful [4]. MRA models are the most widely used for valuing real estate and have been applied in several residential real estate appraisals to complement the traditional sales comparison technique [5]. However, MRA techniques have come under fire from both the academic and practitioner communities. MRA has often caused significant difficulties for real estate valuation, mainly because of multicollinearity among the independent variables and the presence of "outlier" properties in the sample.
Real estate policies are strongly influenced by the understanding of housing markets. Therefore, delivering insight into those markets is critical to a state's efforts to build a solid real estate policy foundation [6] [7]. For essential economic activities such as banking, insurance, and urban development, it is critical to create reliable models for forecasting real estate values [8] [9]. Real estate has been modelled as a complex system because of its considerable uncertainties and dynamic factors [10].
Additionally, nonlinearity within the data makes multiple regression analysis difficult to apply when a market demands quick and accurate answers. While MRA models remain the industry standard for real estate valuation, the application of AI systems has emerged as a superior and increasingly feasible option, and the number of new models has risen steadily as a result. While several AI systems exist, artificial neural networks (ANN) and econometric models (ES) are used for real estate appraisal. This study sought to investigate how a statistical method (MRA) and an artificial intelligence system (ANN) predict the price valuation of real estate in the five main zones (Northern, Southern, Eastern, Western, and Urban) of Ghana.

Literature Review
Numerous previous studies have used artificial intelligence to aid in real estate forecasting, most of which utilized ANNs and statistical analysis. Stevenson et al. [11] compared two alternative sale techniques, assuming that the valuation process varies between auctioned and private treaty transactions. To determine whether agents alter suggested pricing to promote homes, they examined intentionally underpriced properties used to aid marketing efforts [12] [13] [14] [15] [16]. Worzala et al. [17] applied neural network technology to real estate appraisal, deploying two ANN models to see how well they predicted residential property sales prices. They raised concerns about the reproducibility and consistency of findings and the neural network's general "black box" character. Their results contradicted prior studies showing that ANNs are a better tool for appraisal analysis. Nghiep and Al [4] predicted home prices using MRA and ANN and compared the predictive abilities of the models, demonstrating that ANN outperformed MRA when a moderate to large data sample was used. Limsombunchao [18] tested two models for property price prediction, the hedonic pricing model and an artificial neural network, using a random sample of 200 homes in Christchurch, New Zealand. The research considered various factors, including the size, age, and design of the property; the number of bedrooms, bathrooms, and garages; neighbourhood amenities; and the geographical location. The results demonstrated the ANN's ability to forecast home values.
In the mass appraisal industry, the authors of [19] evaluated the prediction accuracy of an ANN and different multiple regression models. They found that a non-linear regression model outperformed the ANN in terms of prediction accuracy and that the ANN's output was insufficiently transparent to provide a clear assessment model. McCluskey et al. [20] compared several geostatistical techniques for mass appraisal with an ANN model and the traditional linear hedonic pricing model. They found that ANNs beat conventional multiple regression models and performed on par with spatially weighted regression models. ANNs, on the other hand, retain a "black box" design, limiting their use by practitioners in the area. In relatively recent research, Núñez Tabales et al. [21] advocated the use of ANNs when sufficient statistical information and an extensive collection of data spanning years are available. The authors considered exogenous variables and circumstances, such as neighbouring structures and the surrounding area of a home. Zhou et al. [22] attempted to enhance current ANN-based prediction models and made many recommendations for mass real estate assessment in China.
Along [25] attempted to develop a forecasting model for real estate evaluation prices using ANN regression and compared the model to ANN and multiple regressions. The study is important because it tried to forecast a time series index for the African real estate industry using a macroeconomic indicator and machine learning techniques. Kwon [26] examined a non-linear macroeconomic time series forecasting model, one of several empirical studies attempting to anticipate real estate values. Unlike earlier research that relied on macroeconomic indicator data to forecast real estate, that study constructed a simulated housing market and integrated ANN with agent-based modeling (ABM). Chung [27] tried to create an ANN-based apartment price index forecasting model. Although the model predicted time series using the real estate price index and macroeconomic data, it was constrained by its small number of variables. Taken together, these studies show that several scholars have investigated how forecasting models such as MRA and ANN perform in the real estate industry. This provides sufficient grounds for conducting the present study in a developing country such as Ghana, where few studies have examined the phenomenon under investigation.

Conceptual Framework and Hypothesis Development
The main aim of the study was to ascertain how forecasting models such as MRA and ANN predict real estate valuation prices in the five main zones of Ghana. Ghana's real estate sector is in its infancy, yet it makes a significant contribution to the economy's development in terms of housing supply. The industry includes both private and public developers of residential and commercial real estate. Based on the hypotheses that guided the study, a conceptual model was generated to serve as a blueprint for the study. Figure 1 illustrates the research model and the composition of the research hypotheses that informed the study.

Data Collection Procedure
The data for this research pertain to apartment real estate transactions in Ghana during the previous five years. The information was gathered from real estate data firms, banks, statistics agencies, and other relevant institutions, including Sabonkudi Estate Ltd and Pinnacle Solution Limited in Tamale, in the Northern Region. Because the structure and scope of the collected data differ, a process of standardization is also needed. The sample period is January 1, 2016, to December 31, 2020, with a total of 2544 data points. The data set spans five years and includes all apartment auction items in Ghana. The Accra region was chosen for analysis because apartment prices in Accra, Ghana's capital, are more uniform than in other locations.

Regression Model
A linear regression model is used to ascertain the features of, and the relationship between, the independent and dependent variables. A simple linear regression model has just one independent variable, while a multiple linear regression model contains several [28] [29]. Multiple panel regression analysis with fixed effects and descriptive statistics (means, standard deviations, maxima, and minima) were adopted to analyze the collected data. More precisely, stepwise multiple regression analysis was used to determine the degree of association between real estate price valuation and individual characteristics, based on the physical, legal, and economic qualities that distinguish real estate properties in the same region. The linear regression analysis produces the following linear equation, which specifies the plane nearest to the data points on a scatter plot:

$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \varepsilon$$

where $Y$ is the dependent variable, $\beta_0$ is the intercept, $\beta_1$ to $\beta_3$ are the regression coefficients, and $\varepsilon$ is the error term, while $X_1$, $X_2$, and $X_3$ represent the independent variables, decomposed into Auction Data (AD), Physical Data (PD), and Macroeconomic Data (MD), respectively, that predict the outcome of the dependent variable, Real Estate Price valuation (REP). By substitution,

$$REP = \beta_0 + \beta_1 AD + \beta_2 PD + \beta_3 MD + \varepsilon$$
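As an illustration of how a regression of REP on AD, PD, and MD could be estimated, the following is a minimal sketch in plain Python that solves the ordinary least squares normal equations. It is not the study's STATA procedure and omits the panel fixed effects and stepwise selection. The function names (`solve_linear_system`, `fit_ols`, `predict`) and all data values are hypothetical and chosen only for illustration.

```python
# Minimal OLS sketch for REP = b0 + b1*AD + b2*PD + b3*MD.
# Assumption: all data below are synthetic, not the study's auction data.

def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix so that row operations update both sides together.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(features, target):
    """Return [b0, b1, ..., bk] from the normal equations (X'X) b = X'y."""
    design = [[1.0] + list(row) for row in features]  # intercept column first
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(design, target)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(coefficients, row):
    """Evaluate b0 + b1*x1 + ... + bk*xk for one observation."""
    return coefficients[0] + sum(b * x for b, x in zip(coefficients[1:], row))


# Hypothetical (AD, PD, MD) observations, assumed already scaled to [0, 1].
X = [
    (0.2, 0.5, 0.1),
    (0.4, 0.1, 0.9),
    (0.6, 0.8, 0.3),
    (0.8, 0.3, 0.7),
    (0.1, 0.9, 0.5),
    (0.9, 0.6, 0.2),
]
# Noise-free target built from known coefficients so the fit can be checked.
true_b = [0.05, 0.6, 0.2, 0.1]
y = [predict(true_b, row) for row in X]

coef = fit_ols(X, y)
print([round(b, 6) for b in coef])
```

Because the synthetic target contains no noise, the recovered coefficients should match `true_b` up to floating-point error. With real auction data the residual term would be non-zero, and a statistical package would normally be used to obtain standard errors and diagnostics.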

Artificial Neural Network
ANN is a method that replicates the intelligent behavior of people [30] [31] [32]. Because a neural network model can express non-linear relationships, it has garnered interest as a means of circumventing the constraints of conventional statistical methods, which describe forecasting models as linear combinations of independent variables. The model is based on human brain cells and makes no assumptions about the probability or variable distributions. As a result, ANNs may be applied to a broader range of data than conventional statistical techniques. A neural network is composed of nodes and weights. A layer is a group of nodes with similar characteristics; there are usually three kinds of layers: input, hidden, and output [30] [31] [32]. The input layer is composed of input nodes that receive data as input values. The hidden layer is formed of hidden nodes; each node in the hidden layer takes as input the output values of the preceding layer. The output layer is composed of output nodes that give the network's final output values. Figure 2 summarizes the ANN model graphically. The nodes of different layers are linked through their associated weights, by which a node's output values are multiplied when they are transmitted to other nodes. Each network node calculates its output value by applying an activation function to its input values. There are many activation functions; the study uses the sigmoid function:

$$f(x) = \frac{1}{1 + e^{-x}}$$
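The layer-by-layer computation described above can be sketched as a forward pass in plain Python. This is a minimal illustration under stated assumptions, not the study's implementation: the bias terms, the tiny 3-2-1 layout, and every weight value are hypothetical (the text mentions only nodes, weights, and the sigmoid activation), and training of the weights is not shown.

```python
import math


def sigmoid(x):
    """Logistic function 1 / (1 + e^-x), written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def layer_forward(inputs, weights, biases):
    """One fully connected layer: each node applies sigmoid(w . inputs + b)."""
    return [
        sigmoid(sum(w * v for w, v in zip(node_weights, inputs)) + bias)
        for node_weights, bias in zip(weights, biases)
    ]


def network_forward(inputs, layers):
    """Feed the input values through every (weights, biases) layer in order."""
    activations = list(inputs)
    for weights, biases in layers:
        activations = layer_forward(activations, weights, biases)
    return activations


# Hypothetical network: 3 inputs -> 2 hidden nodes -> 1 output node.
layers = [
    ([[0.4, -0.2, 0.1], [0.3, 0.8, -0.5]], [0.0, 0.1]),  # hidden layer
    ([[0.7, -0.6]], [0.05]),                             # output layer
]

output = network_forward([0.2, 0.5, 0.1], layers)
print(output)
```

Since the output node also uses the sigmoid, the network's prediction always lies strictly between 0 and 1, which is why this activation pairs naturally with targets rescaled to that range.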

Empirical Results and Discussions of the Study
Using the STATA software, multiple linear panel regression analysis with fixed effects was computed to analyze the predictors of real estate price valuation in Ghana. The study first tested the basic assumptions that must be met before estimating a panel regression model.

Test of Key Assumptions
Standardization is required before experimenting because of the variability in the kind and amount of data gathered. Standardization here means rescaling each variable's values to the range between 0 and 1; the research used the min-max technique. This kind of data preparation is well suited to model development and forecasting. All variables are included in the regression model when it is developed. The ANN model has two hidden layers with 60 hidden nodes in each layer. Because the study forecasts values between 0 and 1, the sigmoid function is used as the activation function. Finally, the linear regression model's coefficients are estimated. The two models' mean absolute percentage error (MAPE) and root mean square error (RMSE) are used to evaluate their performance:

$$\mathrm{MAPE} = \frac{100}{n} \sum_{i=1}^{n} \left| \frac{P_i - F_i}{P_i} \right|$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (P_i - F_i)^2}$$

where $P_i$ and $F_i$ denote the actual and forecast auction sale values of the i-th real estate item, respectively, and $n$ denotes the number of auction items. Without grouping, the training and holdout sets include 1467 and 1077 auction cases, respectively. They are selected at random from all Ghanaian auction cases.
The STATA software was used to test the key assumptions required for computing multiple panel regression. Diagnostic tests for multicollinearity, autocorrelation, and heteroscedasticity, together with a panel unit root test and a Hausman test, were conducted on the dependent and independent variables before the primary regression analysis.
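The min-max rescaling and the two error measures named above can be written in a few lines of Python. This is a minimal sketch using the standard definitions; expressing MAPE as a percentage (the factor of 100) is an assumption, and the function names and sample values are hypothetical.

```python
import math


def min_max_scale(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map it to 0 by convention.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def mape(actual, forecast):
    """Mean absolute percentage error, in percent. Undefined if any actual is 0."""
    n = len(actual)
    return 100.0 / n * sum(abs((p - f) / p) for p, f in zip(actual, forecast))


def rmse(actual, forecast):
    """Root mean square error, in the same units as the inputs."""
    n = len(actual)
    return math.sqrt(sum((p - f) ** 2 for p, f in zip(actual, forecast)) / n)


# Hypothetical actual and forecast auction sale values.
actual = [100.0, 200.0, 400.0]
forecast = [110.0, 180.0, 400.0]

print(min_max_scale(actual))
print(mape(actual, forecast))
print(rmse(actual, forecast))
```

Note that MAPE divides by the actual value, so it cannot be computed for an item whose actual value is exactly 0, which can occur for the minimum of a min-max scaled variable.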

Empirical Results and Discussions
The study formulated three research hypotheses and tested them using the two main models: multiple panel regression analysis with fixed effects and ANN. The study tested whether unique errors are correlated with the regressors. The average forecast auction sale price rate is compared with the average market auction sale price rate to determine how close the predicted values are to market values. Because the auction appraisal price is fixed, the forecast price is sufficiently close to the market price if the average predicted auction sale price rate is comparable to the market auction sale price rate.
The average of market auction selling prices and the average prediction of auction sale prices derived using the multiple regression analysis and ANN models are shown in Table 1.
The next phase involves a series of grouping operations designed to enhance forecasting performance. To begin, the area is divided into five zones based on the 2020 Ghana City Basic Plan: north, south, east, west, and urban. The north zone has 475 auction cases, the south zone has 654 auction cases, the east zone has 502 auction cases, the west zone has 490 auction cases, and the urban zone has 970 auction cases. The training and holdout sets are distributed in the same proportion as in the prior experiment. Table 2 summarizes the MAPE and RMSE of the suggested forecasting models when applied to the holdout set using the five-zone grouping method.
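Evaluating a model separately for each zone amounts to grouping the holdout cases by zone and computing the error measures within each group. The following is a minimal sketch of that bookkeeping, assuming each holdout case is available as a (zone, actual, forecast) tuple; the function name `zone_metrics` and all values are hypothetical, not results from the study.

```python
import math
from collections import defaultdict


def zone_metrics(records):
    """Group (zone, actual, forecast) records and return {zone: (mape, rmse)}."""
    by_zone = defaultdict(list)
    for zone, actual, forecast in records:
        by_zone[zone].append((actual, forecast))

    metrics = {}
    for zone, pairs in by_zone.items():
        n = len(pairs)
        # MAPE in percent and RMSE in the units of the (scaled) price rate.
        mape = 100.0 / n * sum(abs((p - f) / p) for p, f in pairs)
        rmse = math.sqrt(sum((p - f) ** 2 for p, f in pairs) / n)
        metrics[zone] = (mape, rmse)
    return metrics


# Hypothetical holdout cases for two of the five zones.
holdout = [
    ("north", 0.80, 0.76),
    ("north", 0.50, 0.55),
    ("urban", 0.90, 0.90),
    ("urban", 0.40, 0.36),
]

per_zone = zone_metrics(holdout)
for zone, (zone_mape, zone_rmse) in sorted(per_zone.items()):
    print(zone, round(zone_mape, 2), round(zone_rmse, 4))
```

In a full grouping experiment a separate model would typically also be trained on each zone's training cases; the sketch covers only the per-zone evaluation step.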
The MAPE and RMSE of the MRA model are the lower of the two forecasting models in all five zones of Ghana, as shown in Table 2. On average across the five zones, the MAPE and RMSE of the MRA model are 15.31 and 0.0042, respectively. This finding indicates an improvement over the prior experimental results obtained without grouping.
Additionally, Table 3 shows the averages of market auction sale price rates and projected sale price rates for the five zones using the regression and ANN models. The MRA model offers the best performance for the north zone, with an average value of 0.9511. This differs from the south zone, where the ANN model shows the best performance, with a mean value of 0.9686. Similarly, the performance of the two models differs in the east and west zones. This was amply reflected in the MRA performance of the east zone. Finally, the study conducted an independent sample t-test on the prediction values of the models to validate the outcomes of the trials with different grouping methods. The independent sample t-test p-values for the forecasting models without a grouping procedure are shown in Table 4. The findings show that the models perform significantly differently. Similarly, the t-test is used to compare the models under the five-zone grouping. The results in Table 5 indicate that all zones have significant p-values, meaning that the difference between the ANN and MRA predictions is statistically significant in every zone.
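For readers unfamiliar with the independent sample t-test, the test statistic can be sketched with the Python standard library. This is the classical pooled-variance (equal-variance) form, which is an assumption: the study does not state which variant was used. The function name and the predicted sale price rates below are hypothetical.

```python
import math
from statistics import mean, variance


def independent_t_statistic(sample_a, sample_b):
    """Return (t, degrees of freedom) for a pooled-variance two-sample t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    # Pooled estimate of the common variance, weighted by each sample's df.
    pooled_var = ((n_a - 1) * variance(sample_a) + (n_b - 1) * variance(sample_b)) / df
    std_err = math.sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    t_stat = (mean(sample_a) - mean(sample_b)) / std_err
    return t_stat, df


# Hypothetical predicted sale price rates from the two models.
mra_predictions = [0.95, 0.93, 0.96, 0.94, 0.97]
ann_predictions = [0.97, 0.98, 0.96, 0.99, 0.97]

t_stat, df = independent_t_statistic(mra_predictions, ann_predictions)
print(t_stat, df)
```

The p-value is then obtained by comparing the statistic with a Student t distribution with the returned degrees of freedom, for example from a statistical table or a statistics package. A significant p-value shows only that the two sets of predictions differ; which model is more accurate is judged from the error measures.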

Conclusions
The research examines how artificial intelligence may aid in our comprehension of real estate market movements. A two-stage model was proposed to account for all variables affecting property values while also reflecting the many ways in which investors may engage in the market. When combined with AI techniques, each phase generates extra output that aids in comprehending both the present situation and near-future possibilities. Because artificial intelligence and machine learning techniques may be used to gain a competitive edge in real estate markets in various ways, the research focused on one of them: using them as a valuation assistance tool.
The research offers two forecasting models for the selling price of real estate at auctions using artificial intelligence and statistical methodologies: a regression model and an ANN model. Our empirical research demonstrates that the ANN model performs the best. Additionally, three grouping procedures are used to enhance the ANN models' performance. The ANN model with auction appraisal price grouping is much more efficient than any other forecasting model developed in this study. These empirical findings indicate that selecting suitable grouping criteria is critical for improving the prediction accuracy of a forecasting model. They also have significant ramifications for forward-thinking investors in real estate auction markets, as well as real estate fund managers.
Today's financial markets are inextricably linked to the real estate sector, and many academics and practitioners have used statistical and artificial intelligence methods to investigate the real estate industry. To provide a holistic picture of the auction markets, a significant real estate business segment, this research builds forecasting models to estimate future values of specific real estate auction goods. To our knowledge, this is the first study to use data from individual apartment auctions to build forecasting models for real estate auction prices. Our ANN approach enables real estate fund managers to develop more effective investment strategies. It helps improve the investment efficiency of real estate auction markets and contributes to efficient financial markets.
Additionally, it contributes to the long-term economic advantages of the stakeholders associated with real estate auction markets. In this way, the model proposed in this article contributes to the sustainability of economic growth. In summary, each step of the model introduces fresh perspectives and information that may aid in pricing a specific asset. A significant benefit of AI algorithms is that they can be customized to represent market segments and automatically retrained and modified to include new data. The case studies show that ANN and MRA have been used effectively for real estate assessment. Comparative results from different researchers indicate that ANN outperforms MRA in terms of accuracy. However, a few studies demonstrate the advantage of MRA over ANN, while others reach an ambiguous conclusion. Hybrid systems overcome constraints and capitalize on possibilities to create more powerful systems than those built on just one intelligent system.

Limitations
This research may have certain drawbacks. The model created in this research is based on data collected throughout the sample period from apartment auction markets in Ghana. The empirical findings are restricted to apartment auction markets in Ghana that were active during a specific time.

Future Works
Future research may extend this work by creating a model, based on the concept of our model, that is applicable to other real estate sectors. Predictive capability is expected to increase with more varied data, allowing the models to be used more widely. Additionally, the study may be expanded by examining the critical variables affecting the grouping process to enhance model performance.