A Preliminary Study on Parkinson’s Disease with Regularized Logistic Regression Method

Parkinson’s disease is a long-term degenerative disorder of the central nervous system. In this paper, we use gradient descent with regularized logistic regression to suggest possible causes of the disease. Results obtained with different regularization strengths are presented and compared. We hope this opens a new way of studying complex diseases with data-driven methods.


Introduction
Parkinson's disease is very common around the world and has been known since ancient times. In the ancient Indian medical system it is called "Kampavata"; in Western countries it is called "Parkinson's disease". The disease arises when damage to nerve cells in the brain causes dopamine levels to drop, and the loss of dopamine leads to tremor. Tremor often begins in one hand and appears as strong, frequent shaking, especially in a finger, hand or foot. Other early signs of Parkinson's disease include voice changes, uncontrollable movements during sleep, irregular handwriting, and limb stiffness or slow movement.
Here, we try a new approach to investigating the causes of Parkinson's disease: data analysis with regularized logistic regression (this method is described later). There are certain factors that cause Parkinson's disease. The

Background
With global aging, the rapid growth of the elderly population is increasing the number of Parkinson's disease (PD) patients, especially in China. According to the figures in "Prevalence of Parkinson's disease in China", the rates were 8-18 per 100,000 person-years for all ages, 50 per 100,000 person-years for people over 65, 150 per 100,000 person-years for people over 75, and 400 per 100,000 person-years for people over 85 [1]. In terms of cumulative incidence by age, the risk that a 60-year-old develops Parkinson's disease by age 80 is about 2.5%. These data are drawn from other studies, and they show that the population with Parkinson's disease is predominantly elderly.
However, we still need to collect data from different countries to prove that Par-   [8]. Regression measures the relation between the mean value of one variable and the corresponding values of other variables. As a first step, we set up two variables, x and y, to load the data from the Parkinson's data table. Both hold the data in order, from the first record to the last. The horizontal and vertical axes are then labeled: x label ("Measure 1") and y label ("Measure 2"). Finally, the points are plotted in a specified order, with a distinct marker for each class, positive or negative. This setup makes the Parkinson's group easy to recognize in the plot, which can then be used in an actual psychological experiment. For example, with the boundary x1 + x2 = 10, we predict y = 1 if x1 + x2 > 10 and y = 0 if x1 + x2 < 10. The legend, legend("y = 1", "y = 0"), uses 1 and 0 to represent healthy people and Parkinson's patients, respectively (see Figure 1). These steps lay the foundation for feature mapping, which is needed because Parkinson's disease involves many complicated factors. Logistic regression uses the sigmoid function because it maps any input between negative infinity and positive infinity onto a smooth curve with values in (0, 1). Equation (1) is the sigmoid function, also called the logistic function.
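The sigmoid mapping and the example threshold rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the parameter vector theta = (-10, 1, 1) are hypothetical, chosen so that the boundary matches the example x1 + x2 = 10, and the sigmoid is written in its standard form g(z) = 1 / (1 + e^(-z)).

```python
import math

def sigmoid(z):
    """Logistic function: maps a real z to a value between 0 and 1."""
    # Branch on the sign of z so math.exp never receives a large positive argument.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def predict_label(x1, x2, theta=(-10.0, 1.0, 1.0)):
    """Return 1 when sigmoid(theta0 + theta1*x1 + theta2*x2) >= 0.5, else 0.

    The default theta is a hypothetical choice that reproduces the
    example boundary x1 + x2 = 10 from the text.
    """
    z = theta[0] + theta[1] * x1 + theta[2] * x2
    return 1 if sigmoid(z) >= 0.5 else 0

print(sigmoid(0.0))         # 0.5, the midpoint of the S-shaped curve
print(predict_label(7, 6))  # 1, since 7 + 6 > 10
print(predict_label(2, 3))  # 0, since 2 + 3 < 10
```

Thresholding the sigmoid at 0.5 is the same as checking the sign of the linear term, which is why the decision boundary in this example is a straight line.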

Logistic Regression Analysis of Parkinson's Disease
Logistic regression has the benefit of producing an appropriate S-shaped curve.
The sigmoid function returns a probability value, which makes it easier to map the

Predict the Size and Accuracy of the Feature
The previous steps provided the basic structure of the graph. The next part (Part one) matches the size of the feature, so we set the value of x according to its size. The purpose of gradient descent is to find the minimum of a function of several variables, in order to determine which data indicate Parkinson's disease.
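To make the idea of gradient descent concrete, the sketch below minimizes a simple function of two variables. It is not the study's cost function: the bowl-shaped test function, the step size alpha and the iteration count are all hypothetical choices made so that the true minimum, (3, -1), is known in advance.

```python
def gradient_descent(grad, theta, alpha=0.1, iters=200):
    """Repeat theta <- theta - alpha * grad(theta) for a fixed number of steps."""
    theta = list(theta)
    for _ in range(iters):
        g = grad(theta)
        theta = [t - alpha * gi for t, gi in zip(theta, g)]
    return theta

# Hypothetical test function with a known minimum at (3, -1):
# f(a, b) = (a - 3)^2 + (b + 1)^2, so grad f = (2(a - 3), 2(b + 1)).
def bowl_grad(theta):
    a, b = theta
    return [2.0 * (a - 3.0), 2.0 * (b + 1.0)]

minimum = gradient_descent(bowl_grad, [0.0, 0.0])
print([round(v, 4) for v in minimum])  # close to [3.0, -1.0]
```

The same loop applies to logistic regression once `grad` returns the gradient of the cost with respect to the parameters.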
Then we compute and display the initial cost and gradient for regularized logistic regression. In Equation (3), if we regard the part before lambda as the "objective", then the lambda term keeps the parameters small. On the one hand, a nonzero lambda reduces the risk of overfitting, which is highest when lambda is close to zero. On the other hand, lambda must stay small, because as lambda grows larger and larger the decision boundary is no longer accurate.
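The initial cost and gradient can be sketched as follows. This assumes that the regularized cost has the standard form J(theta) = (1/m) * sum[-y*log(h) - (1 - y)*log(1 - h)] + (lambda/(2m)) * sum of theta_j^2 for j >= 1, with the intercept theta_0 left unpenalized. The function name and the toy data are hypothetical and are not taken from the study.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def cost_and_gradient(theta, X, y, lam):
    """Regularized logistic-regression cost J(theta) and its gradient.

    X is a list of rows whose first entry is 1 (the intercept term);
    theta[0] is not penalized, a common convention assumed here.
    """
    m = len(y)
    cost = 0.0
    grad = [0.0] * len(theta)
    for row, label in zip(X, y):
        h = sigmoid(sum(t * x for t, x in zip(theta, row)))
        cost += -label * math.log(h) - (1 - label) * math.log(1.0 - h)
        for j, x in enumerate(row):
            grad[j] += (h - label) * x
    # Average the data term, then add the penalty on theta[1:].
    cost = cost / m + lam / (2.0 * m) * sum(t * t for t in theta[1:])
    grad = [g / m for g in grad]
    for j in range(1, len(theta)):
        grad[j] += lam / m * theta[j]
    return cost, grad

# Made-up toy data: intercept column plus two measures per subject.
X = [[1.0, 0.5, 1.2], [1.0, 1.5, 0.3], [1.0, -0.7, -1.1], [1.0, -1.2, 0.4]]
y = [1, 1, 0, 0]

initial_theta = [0.0, 0.0, 0.0]
cost, grad = cost_and_gradient(initial_theta, X, y, lam=1.0)
print(cost)  # ln(2), about 0.693, whenever theta is all zeros
print(grad)
```

With all parameters at zero the penalty term vanishes, so lambda only starts to matter once the parameters move away from zero during optimization.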
After setting lambda, we set the optimizer options and minimize the cost function, passing t, x, y, lambda, initial_theta and options. The last step in this part is the most important: plotting the decision boundary. First, the coordinate axes are given labels and a legend, with x label ("Measure 1") and y label ("Measure 2"). Second, the same rule as in Part one is applied: "y = 1" for healthy people and "y = 0" for patients with Parkinson's disease. Third, we compute the predictions and the training accuracy, shown in Figure 3 and Figure 4.
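The final step, predicting labels and measuring training accuracy, can be sketched as below. The parameter vector and the four data rows are hypothetical: theta is fixed by hand to the example boundary x1 + x2 = 10 rather than obtained from an optimizer, and the function names are illustrative.

```python
def predict(theta, X):
    """Label each row 1 when theta . x >= 0, i.e. when sigmoid(theta . x) >= 0.5."""
    return [1 if sum(t * x for t, x in zip(theta, row)) >= 0 else 0 for row in X]

def training_accuracy(theta, X, y):
    """Percentage of training examples whose predicted label equals y."""
    predictions = predict(theta, X)
    correct = sum(1 for p, label in zip(predictions, y) if p == label)
    return 100.0 * correct / len(y)

# Hypothetical parameters and data: intercept column plus two measures.
theta = [-10.0, 1.0, 1.0]
X = [[1.0, 7.0, 6.0], [1.0, 2.0, 3.0], [1.0, 9.0, 4.0], [1.0, 8.0, 1.0]]
y = [1, 0, 1, 1]

print(predict(theta, X))               # [1, 0, 1, 0]
print(training_accuracy(theta, X, y))  # 75.0, the last row is misclassified
```

Training accuracy is measured on the same data used for fitting, so it tends to overstate how well the boundary would separate new subjects.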

Conclusion
In this paper, we have proposed a method based on regularized logistic regression to study the origin of Parkinson's disease. Although the method is preliminary, it yields a decision boundary, based on two measures, for deciding whether a patient has Parkinson's disease. More measures could be included in the future with high-performance computers. Because the number of patients with Parkinson's disease increases every year, investigating the disease is important. Since psychology is a discipline that relies on statistics and experiments, it is very important to analyze the independent variables in order to determine the outcome. Through calculation, the data directly reflect the changes that certain factors cause in the real population. We hope that the method and the preliminary results presented here help open a new way for further investigation.