Methodology to Predict NPA in Indian Banking System

The banking sector is the backbone of any economy and plays a pivotal role in the development of a nation. With the ever-increasing involvement of banks in the economic growth process, the issue of non-performing assets (NPAs) has assumed mammoth proportions. As governments in emerging countries push for financial inclusion, the risk associated with bank assets increases. Against this background, this paper analyses the problem of NPAs in the Indian banking system. The authors devise a way to forecast the NPAs of the Indian banking system in 2020. The focus was on a model that could forecast future NPAs in the Indian banking sector. This was achieved by examining various methodologies and settling on a model that helps explain how NPAs can be predicted.


Introduction
The banking sector is one of the most crucial parts of an economy, and it is high time the Indian banking sector started to deliver. Schemes such as Jan Dhan, Make in India, and Direct Benefit Transfer require the Indian banking sector to expand its spectrum of services and widen its capabilities to cater to a larger population. The necessity is clear from the fact that India's vision is to provide banking services to even its remotest areas and to accommodate people at both ends of the economic spectrum. However, the road towards fulfilling this vision will not be easy, as expansion on this scale brings major roadblocks and challenges. The perennial problem of NPAs will be the biggest issue to tackle as far as expansion is concerned.
Stringent RBI guidelines have added to the woes: banks are now required to declare stressed assets, which has put them under greater stress. This has led to the creation of provisions for bad loans and to the transfer of NPAs into the bad loans category. It has diminished profits and created negative sentiment towards the banking sector.
Determining NPAs is critical for assessing a bank's financial health and liquidity. The health of a bank's assets is measured by its NPA ratio. Given that public banks in India are under considerable financial stress due to accumulated non-performing assets, it is essential that they are in a position to plan for the future so that remedial steps can be taken.
It is crucial to understand what causes NPAs to rise before delving into further details. Understanding the reasons for NPAs provides a platform to better analyse the methodologies used to predict them.
NPAs can be attributed to many factors, such as bribery and corruption. One cause is the lack of effort from the service provider during the pre-sanction process. Skipping due diligence, or conducting it haphazardly, leads to increasing cases of NPAs. Corporations often blame the economic scenario for defaulting; however, forensic audits paint a different picture. In the majority of cases, borrowers have wilfully defaulted or transferred funds to other parties. The integrity of the promoter also comes into question in some high-profile cases such as that of Kingfisher. Obtaining tangible collateral security is also a major difficulty during credit enhancement, as the loan amount outweighs the sale value of the attached security by a considerable margin, thereby diluting the provisions of the SARFAESI Act (Securitisation and Reconstruction of Financial Assets and Enforcement of Security Interest Act). The post-disbursement monitoring process is also porous, as in most cases there is no forensic audit after the initial sanction of the loan. Due diligence is lacking for loans disbursed to sectors in which the promoter has an interest, and poor monitoring after disbursement results in NPAs [1].
The corporate debt restructuring scheme is another scheme that is misused by both corporations and banks. Its objective was to help genuinely debt-ridden companies by offering flexible repayment terms and more time, in line with the economic scenario, but corporations have often misused it to cajole banks into lowering interest rates, extending repayment dates, and so on.
A sluggish legal system, inappropriate technology, poor-quality management, improper monitoring and follow-up processes, managerial deficiencies, and ineffective recoveries are other causes of rising NPAs [2].
With this paper, the authors propose a methodology for forecasting NPAs, which can be used to forecast the NPAs of an Indian bank and to gauge the crisis that the Indian banking system is facing. We intend to forecast the percentage of NPAs that banks will have in 2020. The forecasted value can be used by the RBI to assess whether the system is likely to survive or collapse.
In order to develop a holistic model for forecasting NPAs, data from all Scheduled Commercial Banks (SCBs) were considered, and various forecasting models were tried. Iterations were performed to arrive at a model that was comprehensive and provided satisfactory results. Subsequently, the model was applied to SBI to forecast its NPAs for 2020. Ideally, banks of different sizes and levels of risk exposure would be chosen to analyse NPAs. However, we chose SBI as a reference point because it is a credible public sector bank. SBI is mammoth in size, and its data are not affected by minor fluctuations.

Literature Review
Researchers have made many attempts to forecast a bank's performance based on macroeconomic factors. Roger et al. (2011) [3] focus on forecasting recessions based on banks' commercial and industrial loan commitments. The paper emphasises that a bank's loan commitments depend on future interest rates and hence can be used as an indicator of a nation's growth. Tobback et al. (2014) [4] forecast the impact of losses from bad loans on a country's performance using linear regression models. The magnitude of NPAs is comparatively higher in public sector banks than in private sector banks (Singh, 2013) [5]. A high level of NPAs also suggests a high probability of a large number of credit defaults, which affect the profitability and liquidity of banks. A study of trends in NPAs in India from various dimensions explains how mere recognition of the problem and self-monitoring have been able to reduce it to a great extent (Meenakshi et al., 2010) [6]. Along similar lines, it was found that public sector banks in India, which function to some extent with welfare motives, have as good a record in reducing NPAs as their counterparts in the private sector (ibid). The literature suggests that the lending policies of various banks were deficient because of improper financing. Banks should provide detailed information to customers about their lending policy (Jain and Sheikh, 2012) [7]. Moreover, various private banks do not grant loans outside India and could do so to expand their business. They pointed out that instead of focusing only on urban areas, banks should also set up branches in rural regions, which could improve their profitability.
Beck et al. (2013) [8] study the macroeconomic determinants of non-performing loans (NPLs) across 75 countries during the past decade. According to their dynamic panel estimates, the following variables significantly affect NPL ratios: real GDP growth, share prices, the exchange rate, and the lending interest rate. In the case of exchange rates, the direction of the effect depends on the extent of foreign exchange lending to unhedged borrowers, which is particularly high in countries with pegged or managed exchange rates. In the case of share prices, the impact is larger in countries that have a large stock market relative to GDP.
None of these models has tried to forecast NPA values themselves using macroeconomic indicators; we attempt to do so using both stepwise regression and logarithmic regression.

Research Methodology
The main objective of this research is to propose a methodology for forecasting NPAs, which can be used to forecast the NPAs of an Indian bank and to gauge the crisis that the Indian banking system is facing.
Many forecasting methods can be used to predict the variable of interest. They are broadly divided into two categories: 1) Regression models: Simple Linear Regression (SLR) and Multiple Linear Regression (MLR); 2) Time-series-based methods: Moving Average, Simple Exponential Smoothing, Holt, and Holt-Winters.
Regression models are used for prediction. Based on the trend of the independent variables, they predict the trend of the dependent variable. Since we predict NPAs in 2020 based on external factors, regression models are the best fit for this situation. SLR could not be used because it expresses the dependent variable (gross NPAs) in terms of only one independent variable. NPAs usually depend on several factors, so this method is unlikely to give satisfactory results.
We therefore conducted the analysis using the MLR method.

Variables Used in Forecasting Technique
To use MLR, we had to determine the factors on which NPAs in India depend. Qualitative factors cannot be modelled in the forecasting equations and were therefore excluded from this study.
After deliberation, we arrived at the following four quantitative variables, which were used to predict the NPAs of the banking sector in 2020.

1) Repo rate
The repo rate is the rate at which the central bank of a country (the RBI in the case of India) lends money to commercial banks in the event of any shortfall of funds. The repo rate is generally related to the bank prime lending rate as well as the reverse repo rate and MLR. It is an indicator of the prevailing interest rate in the country. Interest rates and inflation have a cumulative effect on the economy and on the ability of borrowers to repay. Hence, the repo rate is a crucial factor affecting NPAs.

2) Gross domestic product
A country's GDP is the total market value of all final goods and services produced in the country in a given year, which is equal to total consumer, investment, and government spending plus net exports. GDP is a growth indicator of an economy. As GDP grows, loans and advances also grow, and hence it directly affects NPAs. Moreover, when the economy is in poor shape, corporates are unable to repay their debts, which leads to an increase in NPAs.

3) Loans and advances
Loans and advances are considered the most important factor in forecasting NPAs. As the size of loans and advances increases, the proportion of NPAs increases because of the higher risk involved.

4) Inflation rate
The consumer price index (CPI) measures the average price that consumers pay for a market-based "basket" of goods and services. Inflation based on the CPI is the main inflation indicator in most countries. The inflation rate in India is based on the Indian Consumer Price Index. As inflation rises, it becomes cheaper for borrowers to borrow money, while consumers' purchasing power falls, resulting in a drop in profits for companies. The combination of these two factors results in a rise in NPAs [9].

Standard Error Terms
To choose the best forecasting model, standard error measures must be calculated and compared across the candidate models. MAPE and bias are two statistics widely used by organizations for choosing a forecasting model, typically by combining the MAPE figures with subjective knowledge. We did the same, and MAPE was our deciding criterion.
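As a minimal sketch of how these error measures can be computed, the Python function below returns MAD, MAPE, and bias for paired actual and forecast values. The paper does not define bias explicitly, so taking it as the mean signed error (actual minus forecast) is an assumption, and the input numbers are hypothetical values chosen only for illustration.

```python
def forecast_errors(actual, forecast):
    """Return (MAD, MAPE in percent, bias) for paired actual/forecast values."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [a - f for a, f in zip(actual, forecast)]
    n = len(errors)
    # Mean absolute deviation: average unsigned error, in the units of the data.
    mad = sum(abs(e) for e in errors) / n
    # Mean absolute percentage error: average unsigned error relative to the actual value.
    mape = 100.0 * sum(abs(e) / abs(a) for e, a in zip(errors, actual)) / n
    # Bias (assumed definition): mean signed error; positive means under-forecasting.
    bias = sum(errors) / n
    return mad, mape, bias


# Hypothetical values, for illustration only (not data from the paper).
actual = [100.0, 200.0, 400.0]
forecast = [110.0, 180.0, 400.0]
mad, mape, bias = forecast_errors(actual, forecast)
print(f"MAD={mad:.2f}  MAPE={mape:.2f}%  bias={bias:.2f}")
```

A model comparison then amounts to calling the function once per candidate model on the same actual series and preferring the lower MAPE, with bias used as a check for systematic over- or under-forecasting.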

Research Process
Research was conducted in two steps.
1) Annual data (four factors) for the last 11 years (2004-2011) were taken for Indian banks. MLR models were developed using combinations of the factors. The best-fit model was used to predict the NPAs of the Indian banking sector in 2020.
2) The same methodology was then used to predict the NPAs of SBI. For SBI we took quarterly data, which allowed us to check the impact of seasonality via the Holt-Winters method. Finally, the best-fit model was used to predict the NPAs of SBI in 2020.

Data for SBI
Data for all the above quantitative factors were collected for SBI from 2010 Q2 onwards. In total there are 23 data points.

MLR Analysis (Linear, Stepwise and Logarithmic) and Results for Overall Banking Sector
In Multiple Linear Regression we used a combination of independent variables (CPI, GDP, repo rate, and loans and advances) to find their relationship with NPAs. The fitted model can be used to predict the NPAs of a given bank on the basis of the independent variables [10] [11]. The MLR model is:
$$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \cdots + \beta_k X_{ki} + \varepsilon_i$$
where $Y_i$ = gross NPA for the quarter $i$ (dependent/response variable), $X_{ki}$ = independent/explanatory variable taken for regression, such as GDP, $\beta_0$ = Y intercept, $\beta_k$ = slope of $Y$ with respect to $X_{ki}$, holding the other variables $X_{1i}, X_{2i}, X_{3i}, \ldots$ constant, and $\varepsilon_i$ = random error in $Y$ for observation $i$.
MLR models were developed using combinations of all variables. The results are shown below.
Multicollinearity can arise in this kind of situation. For example, we can reasonably expect loans and advances to grow with GDP. Hence, rather than relying directly on the full MLR analysis, we ran a stepwise regression, which yields a model containing only the significant independent variables and also keeps in check issues arising from multicollinearity. As expected, stepwise regression retained only three significant variables: CPI%, GDP, and Loans & Advances.
For MLR we choose the equation given by stepwise regression for the reasons mentioned above. While performing stepwise MLR, the other assumptions (homoscedasticity etc.) were also checked. We also explored the possibility of a nonlinear relationship between NPA and the variables identified from stepwise regression. A logarithmic regression model was run with Ln (NPA) as the dependent variable and Ln (CPI), Ln (GDP), and Ln (Loans & Advances) as independent variables to check whether a polynomial relationship exists between NPA and these variables. The table below shows results from the logarithmic regression model. It can be clearly seen from the above results that the MAPE of logarithmic regression is very high, and hence it cannot be used. We use the stepwise regression equation as our final model [11].
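The idea of fitting an MLR model and then keeping only the predictors that matter can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the authors' procedure: statistical packages usually decide entry and removal with F-tests or p-values, whereas this sketch uses a simpler rule (add the predictor that reduces the sum of squared errors most, and stop when the gain falls below a hypothetical threshold `min_gain`). The predictor names `x1`, `x2`, `x3` and all numbers are toy values, not data from the paper.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system (perfectly collinear predictors?)")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_ols(columns, y):
    """Least-squares fit of y on the predictor columns plus an intercept.

    Returns (coefficients with the intercept first, sum of squared errors).
    """
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    return beta, sse


def forward_stepwise(predictors, y, min_gain=0.05):
    """Greedy forward selection based on SSE reduction relative to total variation."""
    mean_y = sum(y) / len(y)
    tss = sum((yi - mean_y) ** 2 for yi in y)
    chosen, remaining, best_sse = [], list(predictors), tss
    while remaining and tss > 0:
        trial = {name: fit_ols([predictors[n] for n in chosen + [name]], y)[1]
                 for name in remaining}
        name = min(trial, key=trial.get)
        if (best_sse - trial[name]) / tss < min_gain:
            break  # no remaining predictor adds enough explanatory power
        chosen.append(name)
        remaining.remove(name)
        best_sse = trial[name]
    return chosen


# Toy data: y depends on x1 and x2 only; x3 is unrelated to y.
predictors = {
    "x1": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    "x2": [4.0, 1.0, 6.0, 2.0, 8.0, 3.0, 7.0, 5.0],
    "x3": [1.0, -1.0, 1.0, -1.0, -1.0, 1.0, -1.0, 1.0],
}
y = [2.0 + 3.0 * a + 2.0 * b for a, b in zip(predictors["x1"], predictors["x2"])]
selected = forward_stepwise(predictors, y)
coefficients, _ = fit_ols([predictors[name] for name in selected], y)
print(selected, [round(c, 6) for c in coefficients])
```

On real data the threshold rule would normally be replaced by a significance test, and the selected model would still need the usual residual checks such as homoscedasticity.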

Analysis and Recommendations for Overall Banking Sector
To compare the results in each methodology we have calculated the MAPE for each one of them.
The MAPE of MLR is 5.50% when all four variables are taken into consideration. The R² of MLR is 0.995, which is considered a very good indicator. However, this high value partly reflects the fact that the data set covers only 11 years: the RBI website provides NPA data only for the last 11 years. Since the purpose was only to devise a methodology, 11 data points serve the purpose.
As discussed above, to improve on the results of MLR, stepwise regression models were run for both the linear and the polynomial relationship. The MAPE of the linear stepwise model is 6.25%, whereas for the polynomial relationship it is 36%. A MAPE of 6.25% is very good, and linear stepwise regression also takes care of the multicollinearity issue. Hence we use the linear stepwise regression model for NPA prediction as it is the best fit.
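A logarithmic regression of the kind compared here fits a straight line to log-transformed variables, which on the original scale corresponds to a multiplicative power-law form y = exp(a) * x^b. The sketch below shows this for a single predictor only; the paper's model uses several predictors, so this is a simplification, and the data are toy values generated from a known power law rather than NPA figures.

```python
import math


def fit_log_log(x, y):
    """Fit ln(y) = a + b * ln(x) by simple least squares and return (a, b)."""
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    n = len(lx)
    mean_x, mean_y = sum(lx) / n, sum(ly) / n
    sxx = sum((u - mean_x) ** 2 for u in lx)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(lx, ly))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def predict_log_log(a, b, x_new):
    """Back-transform the fitted line to the original scale: y = exp(a) * x**b."""
    return math.exp(a) * x_new ** b


# Toy data generated from y = 2 * x**1.5 (illustration only).
x = [1.0, 2.0, 4.0, 8.0]
y = [2.0 * v ** 1.5 for v in x]
a, b = fit_log_log(x, y)
print(f"exponent b={b:.3f}, scale exp(a)={math.exp(a):.3f}")
```

Because the fit minimizes error in log space, the MAPE should be computed on the back-transformed predictions when comparing against a linear model.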

Final Forecasting for Overall Banking Sector
The final forecasting model to be used will be linear stepwise regression model as discussed above.The forecasting equation is:

MLR Analysis (Linear, Stepwise and Logarithmic) and Results for SBI
As discussed above, the same model was applied to data from SBI to check the accuracy of the model and to predict the NPA of SBI in 2020.
The results are shown below: To address multicollinearity, we performed linear stepwise regression. As expected, stepwise regression retained only two significant variables: CPI% and Loans & Advances. An important point is that, for stepwise regression and accurate prediction, we first remove the seasonality component from the raw data and then run the stepwise regression. We then predict values from the model and finally re-seasonalize them to obtain the final output. Throughout the process we assume multiplicative seasonality. The table below shows results from stepwise regression. It can be clearly seen from the above results that stepwise regression and logarithmic regression give comparable results; since the MAPE figures are marginally better for linear stepwise regression, we use it for prediction of NPA.
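The de-seasonalize, predict, re-seasonalize sequence described above can be sketched as follows for quarterly data with multiplicative seasonality. The paper does not state how the seasonal indices were obtained, so the ratio-to-cycle-mean method used here is an assumption, and the regression step is replaced by a placeholder list of de-seasonalized predictions. All numbers are hypothetical.

```python
def seasonal_indices(series, period=4):
    """Multiplicative indices: average ratio of each season's value to its cycle mean."""
    n_cycles = len(series) // period
    if n_cycles == 0:
        raise ValueError("need at least one full seasonal cycle")
    totals = [0.0] * period
    for c in range(n_cycles):
        cycle = series[c * period:(c + 1) * period]
        cycle_mean = sum(cycle) / period
        for season, value in enumerate(cycle):
            totals[season] += value / cycle_mean
    return [t / n_cycles for t in totals]


def deseasonalize(series, indices):
    """Divide each observation by the index of its season."""
    p = len(indices)
    return [v / indices[t % p] for t, v in enumerate(series)]


def reseasonalize(values, indices, start=0):
    """Multiply model output by the seasonal index; start is the time offset of values[0]."""
    p = len(indices)
    return [v * indices[(start + t) % p] for t, v in enumerate(values)]


# Hypothetical quarterly series: two years with levels 100 and 200 and a fixed pattern.
pattern = [0.8, 1.0, 1.1, 1.1]
quarterly = [level * f for level in (100.0, 200.0) for f in pattern]

indices = seasonal_indices(quarterly)
adjusted = deseasonalize(quarterly, indices)       # input to the regression step
model_output = [300.0, 300.0, 300.0, 300.0]        # placeholder for regression predictions
final = reseasonalize(model_output, indices, start=len(quarterly))
print([round(v, 2) for v in indices], [round(v, 1) for v in final])
```

The ratio-to-cycle-mean indices are only unbiased when the trend within a year is small; a centred moving average is a common alternative when the series trends strongly.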

Analysis and Recommendations for SBI
To compare the results of each methodology, we calculated the MAPE for each of them. Contrary to our expectations, the MAPE is high for these models.
The MAPE of MLR is 10.47% when all four variables are taken into consideration. The R² of MLR is 0.8644, which indicates that the chosen variables explain about 86% of the variability in NPA.
The MAPE of the linear stepwise model is 7.78%, whereas for the polynomial relationship it is 9.08%. A MAPE of 7.78% is very good, and hence we use the linear stepwise regression model for NPA prediction as it is the best fit.

Final Forecasting for SBI
The final forecasting model is the linear stepwise regression model discussed above. The forecasting equation is:
$$Y_1 = -69720.095 + 0.096 \times \text{L\&A} + 2994.97 \times \text{CPI}$$
We calculate the NPA in 2020 based on the above equation.
For 2020, the assumptions regarding the independent variables are: CPI (X1) = 6% (assuming stable inflation), and Loans and Advances = INR 2,093,704.87 crore (taking loans and advances growth of 10% year on year as banking services expand).
With these inputs, the forecasted NPA is INR 147,921 crore, which is 7.07% of the Average Loans and Advances of SBI.
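The arithmetic behind this forecast can be checked with a short script. The function below evaluates the SBI equation as it reads in the text (intercept -69720.095, coefficient 0.096 on loans and advances in INR crore, coefficient 2994.97 on CPI in percent); reading CPI as 6 rather than 0.06 is an assumption. With these rounded coefficients the expression evaluates to about INR 149,245 crore, roughly 1% above the reported INR 147,921 crore. The paper does not explain the difference; it may stem from rounding of the printed coefficients.

```python
def sbi_npa_forecast(loans_and_advances_crore, cpi_percent):
    """Evaluate the SBI stepwise-regression equation with the coefficients as printed."""
    return -69720.095 + 0.096 * loans_and_advances_crore + 2994.97 * cpi_percent


loans_2020 = 2_093_704.87   # INR crore, assumption stated in the text
cpi_2020 = 6.0              # percent, assumption stated in the text

npa_2020 = sbi_npa_forecast(loans_2020, cpi_2020)
npa_share = 100.0 * npa_2020 / loans_2020          # share from the printed coefficients
reported_share = 100.0 * 147_921 / loans_2020      # share implied by the reported NPA

print(f"NPA from printed coefficients: {npa_2020:,.0f} crore ({npa_share:.2f}%)")
print(f"Reported NPA: 147,921 crore ({reported_share:.2f}%)")
```

The reported figure of INR 147,921 crore divided by the assumed loans and advances reproduces the 7.07% share quoted in the text.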

Holt-Winter Method for SBI to Account for Seasonality
Holt's method of forecasting was employed to estimate the NPAs of SBI based on the past five years of quarterly data. The actual data are first de-seasonalized by considering the seasonality in the NPA values of the bank. Then, as per Holt's method, the level- and trend-adjusted forecast is calculated using nominal values of "alpha" and "beta", and the sum of squared errors (SSE) is calculated. This SSE is then minimized using the "Solver" function of Excel, which finds the optimal values of alpha and beta for the data. This helps us forecast the next quarters' NPA with the least error [12]. Forecast equation:
$$\hat{y}_{t+h|t} = \ell_t + h b_t$$
$$\ell_t = \alpha y_t + (1 - \alpha)(\ell_{t-1} + b_{t-1})$$
$$b_t = \beta(\ell_t - \ell_{t-1}) + (1 - \beta) b_{t-1}$$
where $y_t$ is the observed value at time $t$, $h$ is the forecast horizon, $\ell_t$ denotes an estimate of the level of the series at time $t$, $b_t$ denotes an estimate of the trend (slope) of the series at time $t$, $\alpha$ is the smoothing parameter for the level, $0 \le \alpha \le 1$, and $\beta$ is the smoothing parameter for the trend, $0 \le \beta \le 1$.
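A minimal Python sketch of this procedure is given below. It runs the level and trend recursions of Holt's method, accumulates the sum of squared one-step-ahead errors, and searches for the smoothing parameters that minimize it. Two choices are assumptions rather than the authors' setup: a coarse grid search stands in for the Excel Solver, and the recursions are initialized with the first observation as the level and the first difference as the trend. The series is hypothetical.

```python
def holt_fit(series, alpha, beta):
    """Run Holt's linear-trend recursions and return (sse, level, trend).

    sse is the sum of squared one-step-ahead forecast errors.
    Initialisation (first value as level, first difference as trend) is an assumption.
    """
    if len(series) < 2:
        raise ValueError("need at least two observations")
    level, trend = series[0], series[1] - series[0]
    sse = 0.0
    for y in series[1:]:
        error = y - (level + trend)          # one-step-ahead forecast error
        sse += error ** 2
        new_level = alpha * y + (1 - alpha) * (level + trend)
        trend = beta * (new_level - level) + (1 - beta) * trend
        level = new_level
    return sse, level, trend


def best_parameters(series, steps=20):
    """Grid-search alpha and beta over [0, 1] for the smallest SSE."""
    grid = [i / steps for i in range(steps + 1)]
    return min(((a, b) for a in grid for b in grid),
               key=lambda ab: holt_fit(series, ab[0], ab[1])[0])


def holt_forecast(series, alpha, beta, horizon):
    """Forecast `horizon` steps past the end of the series: level + horizon * trend."""
    _, level, trend = holt_fit(series, alpha, beta)
    return level + horizon * trend


# Hypothetical de-seasonalized quarterly values (illustration only).
series = [100.0, 108.0, 113.0, 125.0, 131.0, 138.0, 150.0, 155.0]
alpha, beta = best_parameters(series)
next_quarter = holt_forecast(series, alpha, beta, 1)
print(f"alpha={alpha:.2f}, beta={beta:.2f}, next quarter forecast={next_quarter:.1f}")
```

A finer grid or a numerical optimizer would refine the parameters further; the grid is used here only to keep the example dependency-free.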
We do not use Holt's method for the final forecast because it does not account for seasonality. We therefore introduce a third equation to handle seasonality (sometimes called periodicity). The resulting set of equations is called the "Holt-Winters" (HW) method, after its inventors. The Holt-Winters additive technique applies exponential smoothing to a series with a trend and a constant seasonal component.
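The additive Holt-Winters recursions can be sketched in Python as follows. Each new observation updates a level, a trend, and the seasonal term of its season, and a forecast adds the level, the projected trend, and the most recent seasonal term for the target season. The initialization from the first two seasonal cycles is an assumption (several conventions exist), the smoothing parameters are arbitrary illustrative values, and the series is synthetic.

```python
def holt_winters_additive(series, period, alpha, beta, gamma, horizon):
    """Additive Holt-Winters smoothing; returns forecasts for 1..horizon steps ahead."""
    if len(series) < 2 * period:
        raise ValueError("need at least two full seasonal cycles")
    # Assumed initialisation: trend from the change between the first two cycle means,
    # level placed at the end of the first cycle, seasonals as deviations from that line.
    first = sum(series[:period]) / period
    second = sum(series[period:2 * period]) / period
    trend = (second - first) / period
    level = first + trend * (period - 1) / 2
    seasonals = [series[t] - (first + trend * (t - (period - 1) / 2))
                 for t in range(period)]
    for t in range(period, len(series)):
        y, s_prev = series[t], seasonals[t % period]
        new_level = alpha * (y - s_prev) + (1 - alpha) * (level + trend)
        trend = beta * (new_level - level) + (1 - beta) * trend
        seasonals[t % period] = gamma * (y - new_level) + (1 - gamma) * s_prev
        level = new_level
    n = len(series)
    return [level + h * trend + seasonals[(n + h - 1) % period]
            for h in range(1, horizon + 1)]


# Synthetic quarterly series: linear trend plus a fixed additive seasonal pattern.
pattern = [3.0, -1.0, -2.0, 0.0]
series = [10.0 + 2.0 * t + pattern[t % 4] for t in range(12)]
forecasts = holt_winters_additive(series, 4, 0.4, 0.3, 0.2, 2)
print([round(f, 2) for f in forecasts])
```

Because the synthetic series follows the additive structure exactly, the forecasts continue the trend and pattern without error; on real data the three smoothing parameters would be tuned by minimizing the forecast error, as in Holt's method.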

Remedies
The model proposed here predicts that 7.07% of the Average Loans and Advances of SBI will fall into the NPA bracket. The sheer size of the figure, INR 147,921 crore, points towards a major problem in the making. It is imperative to implement remedial measures to limit the repercussions. Some of them are discussed below.

1) Strengthening the pre-sanction and post-disbursement monitoring process by using technology
Banks need to monitor their internal and external processes and constantly look for ways to fortify them. This is where technology can stake its claim. Data analytics has the potential to provide keen insights into the NPA problem. Our PSUs can use data analytics to build state-of-the-art processes to assess the cash flow generation capacity of a company, examine borrowers' past records, track a sector's growth projections, track loan payment schedules, etc. Creating a central data repository for all corporate borrowers of all banks would go a long way towards establishing coordination and would provide a way to track defaulters, thus restricting their entry into the system. Detailed analysis of the fixed assets claimed by borrowers needs to be done, with proper processes in place to verify them. Techniques like surprise visits to factories will keep borrowers on their toes. Mid-term audits will also provide a status check on borrowers, allowing banks to take preventive measures [13].

2) Sector-wise planning
Certain sectors need to be monitored more closely than others. The RBI financial stability report states that sectors such as infrastructure, aviation, iron and steel, mining, and textiles make up 24.8% of the total advances of commercial banks, whereas they account for a staggering 51.1% of total stressed advances. These sectors should be dealt with extreme caution, and extensive due diligence should be carried out before sanctioning any loan. Going forward, banks can consider creating credit appraisal teams that cater to specific industries, monitor new applications and their status, and raise a red flag if any anomalies are found [14] [15].

3) Standardisation of third-party vendors
Third-party agencies such as financial analysts, engineers, and verification agencies play a pivotal role in confirming the veracity of borrowers' claims. It is easy to obtain the required documents, provided the third parties are also on the same page. It is important for the government to intervene and create a mechanism that is conducive to both the borrower and the lender. Monitoring of third-party vendors and standardisation via government authorisation can pave the way for reducing forgery and be a step towards a transparent system [13].

Conclusion
The linear stepwise regression model was used to forecast NPAs for the Indian banking sector and, as a special case, for SBI. The model predicts that, on the current path, the issue of NPAs will snowball into a major problem with serious repercussions. For the Indian banking sector, bad loans are forecast to amount to INR 7808 billion, which is 4.94% of the Average Loans and Advances of all scheduled commercial banks. The forecast for SBI shows a grim picture in which bad debt rises to INR 1479.21 billion, which would be 7.07% of the Average Loans and Advances of SBI. The results indicate that NPA control is the need of the hour in order to avoid serious consequences. We also covered various causes of NPAs and possible remedies that could prevent a loan from turning into an NPA.

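1) Mean Absolute Deviation (MAD)
It measures the size of the error in units. It is calculated as the average of the unsigned errors:
$$\mathrm{MAD} = \frac{1}{n} \sum_{i=1}^{n} \left| y_{a,i} - y_{f,i} \right|$$
2) Mean Absolute Percentage Error (MAPE)
The MAPE measures the size of the error in percentage terms:
$$\mathrm{MAPE} = \frac{100}{n} \sum_{i=1}^{n} \left| \frac{y_{a,i} - y_{f,i}}{y_{a,i}} \right|$$
where $y_a$ = actual value of the NPA, $y_f$ = forecasted value of the NPA, and $n$ = number of observations.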
Forecasting equation for the overall banking sector: Y1 1410.2689135.33 CPI 1.7918 GDP 0.091402 L&A. We calculate the NPA in 2020 based on the above equation. For 2020, the assumptions regarding the independent variables are: CPI (X1) = 6% (assuming stable inflation), Loans and Advances = INR 158,127 billion (taking a loans and advances growth rate of 20% year on year as banking services expand), and GDP = 4189 billion USD (taking a growth rate of 7% year on year). With these inputs, the forecasted NPA is INR 7808 billion, which is 4.94% of the Average Loans and Advances of all scheduled commercial banks.
CPI%, GDP, and Loans & Advances were the significant variables retained by stepwise regression for the overall banking sector. The table below shows results from stepwise regression.
For SBI, a logarithmic regression model was run with Ln (NPA) as the dependent variable and Ln (CPI) and Ln (Advances) as independent variables to check whether a polynomial relationship exists between NPA and these variables. The table below shows results from the logarithmic regression model.
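Holt-Winters additive equations:
$$a_t = \alpha (Y_t - s_{t-p}) + (1 - \alpha)(a_{t-1} + b_{t-1})$$
$$b_t = \beta (a_t - a_{t-1}) + (1 - \beta) b_{t-1}$$
$$s_t = \gamma (Y_t - a_t) + (1 - \gamma) s_{t-p}$$
where $Y_t$ is the observed value at time $t$, $\alpha$, $\beta$, and $\gamma$ are smoothing parameters, $a_t$ is the smoothed level at time $t$, $b_t$ is the change in the trend at time $t$, $s_t$ is the seasonal smooth at time $t$, and $p$ is the number of seasons per year.
Holt-Winters was used to forecast SBI's NPAs and to see whether there is any effect of seasonality in the data; the results are given below. The error indices are on the higher side, which indicates that NPAs depend mainly on external macroeconomic factors.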