Modeling of Imperfect Data in Medical Sciences by Markov Chain with Numerical Computation

In this paper we consider sequences of observations that are irregularly spaced at infrequent time intervals. We discuss one of the most important topics in stochastic processes, namely Markov chains. We reconstruct the collected imperfect data as a Markov chain and obtain an algorithm for finding the maximum likelihood estimate of the transition matrix. This approach is known as the EM algorithm.

One of the most important stochastic processes is the Markov chain. A Markov chain is often used to describe how a system changes over time. If we are able to model the system well enough to form a Markov chain model, we can then evaluate its various theoretical results through system analysis. In this article, the transition matrix of the Markov chain for imperfect observations within a given time is obtained through the EM algorithm.
The EM algorithm has two stages: the E stage and the M stage. The algorithm basically works as follows. We first choose an initial value for the parameter, which in this case is the transition matrix.
In the E stage, the full set of data (X) is reconstructed from the missing data using the current estimate of the parameter. In the M stage, the likelihood of the reconstructed data is maximized to get a new estimate of the parameter. Then, based on this new estimate, the E stage is repeated and we get the second estimate in the M stage. We continue this process until the estimates converge.
Dempstr [1], Raj [2] and Johnson et al. [3] have obtained the maximum likelihood estimate of the transition matrix for a sequence of observations. Researchers who have examined statistical inference for Markov chains and published articles in this area include Melichson [4], Capee [5], Chip [6] and Robert [7].

Modeling of Missing Data Based on Markov Chain
The observations in Table 1 are a series of real data about a disease, recorded for 3 patients at different times during 6 months in a hospital. Some of the data are missing because the patients did not attend the hospital on some occasions. The number 1 indicates that the patient's condition is satisfactory and the number 2 that the patient is in a very bad condition. An asterisk indicates that no data were observed at that time point.
First, the patient's condition is reported as (1). The next month, no data are available. Two months later the condition of that patient is reported as (2). One month later his condition is reported as (1) again, and so on. We call such a set of data incomplete and denote it by Y.
The incomplete data are contrasted with the complete data (X), which are gathered in all intervals between the first and the last observation.
Such a system can be modeled as a Markov chain with condition space E and transition matrix P.
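The way an incomplete record is reduced to counts of observed condition changes can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `observed_transition_counts`, the use of "*" as the missing-value marker, and the sample record are hypothetical and are not taken from Table 1. Each key (i, j, t) counts how often condition i was followed by condition j at the next available observation, t time units later.

```python
from collections import Counter


def observed_transition_counts(sequences, missing="*"):
    """Count pairs of consecutive observed conditions (i, j) that are t time units apart."""
    counts = Counter()
    for seq in sequences:
        last_state, last_time = None, None
        for time, value in enumerate(seq):
            if value == missing:
                continue  # no observation at this time point
            if last_state is not None:
                counts[(last_state, value, time - last_time)] += 1
            last_state, last_time = value, time
    return counts


# Hypothetical monthly record for one patient (not the paper's Table 1).
patient = [1, "*", "*", 2, 1, "*", 1]
counts = observed_transition_counts([patient])
```

For this record the observed pairs are 1 to 2 after three months, 2 to 1 after one month, and 1 to 1 after two months.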
We use the following notation: $O_{ijt}$ is the number of observed condition changes from $i$ to $j$ at the time interval $t$; $P_{ij}$ is the probability of a transition from condition $i$ to condition $j$ in one time unit; $N_{ij}$ is the number of condition changes from $i$ to $j$ in one time unit over the whole complete data $X$; and $p_{ij}^{(r)}$ is the $(i, j)$ element of the matrix $P^r$, i.e. the probability of a change from condition $i$ to condition $j$ in $r$ stages. Given the incomplete data $Y$ and a suitable initial transition matrix with nonzero entries, there are different complete data sets $X$, each with its own probability. We therefore estimate the transition matrix by the EM algorithm as follows:
1- First, we turn the given incomplete data ($Y$) into complete data ($X$) by assigning suitable values to the intervals for which we have no observation. In other words, we reconstruct the complete data ($X$) from the incomplete data.
2- We choose an initial transition matrix $P^{(0)}$ whose entries are arbitrary but nonzero.
3- We compute the expected log-likelihood of the observations $X$ under the initial transition matrix $P^{(0)}$ as follows:
$$E\left[\log L(X; P) \mid Y, P^{(0)}\right] = \sum_{i,j \in E} E\left(N_{ij} \mid Y, P^{(0)}\right) \log P_{ij}. \qquad (1)$$
4- We calculate the relative maximum of (1) and denote it by $P^{(1)}$.
5- We repeat steps 3 and 4 until the resulting sequence of matrices $P^{(0)}, P^{(1)}, P^{(2)}, \ldots$ converges.

Theorem 1: Suppose that the transition matrix of the imperfect data in Table 1 is $P$; then
$$E(N_{ij} \mid Y, P) = \sum_{t} \sum_{m,n \in E} O_{mnt} \sum_{k=1}^{t} \frac{p_{mi}^{(k-1)} P_{ij}\, p_{jn}^{(t-k)}}{p_{mn}^{(t)}}.$$
Proof: We assume that the system is in condition $m$ at time $t_0$ and in condition $n$ at the next observation time $t_0 + t$. The probability that a change from condition $i$ to condition $j$ occurs in the $k$-th time unit, given that the system reaches condition $n$ at $t_0 + t$ starting from condition $m$ at $t_0$, is computed according to Figure 1.
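First, the system starting from condition $m$ reaches condition $i$ with probability $p_{mi}^{(k-1)}$ just before time unit $k$. It then changes from $i$ to $j$ in time unit $k$ and goes on to reach condition $n$ at $t_0 + t$ with probability $P_{ij}\, p_{jn}^{(t-k)}$. Suppose that we define $A$ and $B$ as follows:
$A$: cases in which a change from condition $i$ to condition $j$ occurs in time unit $k$ between $t_0$ and $t_0 + t$.
$B$: cases in which the system, starting from condition $m$ at $t_0$, reaches condition $n$ at $t_0 + t$.
Then we have:
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{p_{mi}^{(k-1)} P_{ij}\, p_{jn}^{(t-k)}}{p_{mn}^{(t)}}.$$

Theorem 2: Suppose that the transition matrix of the imperfect data in Table 1 is $P$; then there is a unique relative maximum whose entries are as follows:
$$\hat{P}_{ij} = \frac{E(N_{ij} \mid Y, P)}{\sum_{j \in E} E(N_{ij} \mid Y, P)}.$$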

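Proof: We maximize the expected log-likelihood (1) with Lagrange multipliers $\lambda_i$:
$$\sum_{i,j \in E} E(N_{ij} \mid Y, P) \log P_{ij} - \sum_{i \in E} \lambda_i \Big( \sum_{j \in E} P_{ij} - 1 \Big). \qquad (2)$$
Equation (2) has been obtained because the sum of the entries in each row is one.
Setting the derivative of (2) with respect to $P_{ij}$ equal to zero gives
$$E(N_{ij} \mid Y, P) = \lambda_i P_{ij}. \qquad (3)$$
Summing Equation (3) over $j$ for each fixed $i$, we have:
$$\lambda_i = \sum_{j \in E} E(N_{ij} \mid Y, P). \qquad (4)$$
By substituting Equation (4) into Equation (3), the proof is completed.
Remark: the fourth step of the algorithm is computed, and the algorithm is completed, as follows:
1) We compute $E(N_{ij} \mid Y, P^{(0)})$.
2) We compute the transition matrix of the first stage through this formula:
$$P_{ij}^{(1)} = \frac{E(N_{ij} \mid Y, P^{(0)})}{\sum_{j \in E} E(N_{ij} \mid Y, P^{(0)})}.$$
The $P_{ij}$ obtained for all $i$ and $j$ form another matrix, $P^{(1)}$, for the next iteration.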
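One iteration of the procedure described above can be sketched in Python. This is a minimal illustration rather than the authors' R program: the function names, the 0-based coding of conditions, and the counts are assumptions made for the example, and the counts are not those of Table 2. The E stage accumulates, for every observed pair (m, n, t), the probability that a change from i to j occurs in each time unit k; the M stage divides each expected count by its row total.

```python
def mat_mul(a, b):
    """Product of two square matrices stored as lists of rows."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def matrix_powers(p, max_power):
    """Return [P^0, P^1, ..., P^max_power]; P^0 is the identity matrix."""
    size = len(p)
    powers = [[[float(i == j) for j in range(size)] for i in range(size)]]
    for _ in range(max_power):
        powers.append(mat_mul(powers[-1], p))
    return powers


def expected_counts(p, counts):
    """E stage: E(N_ij | Y, P) from counts[(m, n, t)] = O_mnt (conditions are 0-based)."""
    size = len(p)
    powers = matrix_powers(p, max(t for (_, _, t) in counts))
    expected = [[0.0] * size for _ in range(size)]
    for (m, n, t), o_mnt in counts.items():
        for k in range(1, t + 1):  # time unit in which the change i -> j happens
            for i in range(size):
                for j in range(size):
                    expected[i][j] += (o_mnt * powers[k - 1][m][i] * p[i][j]
                                       * powers[t - k][j][n] / powers[t][m][n])
    return expected


def m_stage(expected):
    """M stage: divide each expected count by the total of its row."""
    return [[e / sum(row) for e in row] for row in expected]


# Hypothetical counts, NOT the paper's Table 2; conditions 1 and 2 are coded 0 and 1.
counts = {(0, 0, 1): 4, (0, 1, 1): 1, (0, 1, 2): 2, (1, 0, 1): 2, (1, 0, 3): 1}
p0 = [[0.5, 0.5], [0.5, 0.5]]
p1 = m_stage(expected_counts(p0, counts))
```

A useful sanity check on the E stage is that the expected counts add up to the total number of elapsed time units, i.e. the sum of $t \cdot O_{mnt}$ over all observed pairs, because every time unit between two observations contains exactly one transition.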

Numerical Computation
In this section, the theory presented above is applied to the data in Table 1 through a program written in R.
For the actual observations we have Table 2, which gives the number of condition changes from i to j at the time interval t. Running the computer program, written on the basis of the EM algorithm, to estimate the transition matrix of the system from the initial transition matrix $P^{(0)}$, we obtain the output (5). Running the EM algorithm with another initial matrix gives the output (6). As can be seen, the results (5) and (6) are the same in either case.

Conclusions
In using the EM algorithm, we face some practical issues that deserve attention in order to speed up the work and save time.
1) If the arbitrarily chosen initial transition matrix $P^{(0)}$ has a zero entry, or if the number of transitions from one condition to another in a unit of time is zero, the corresponding entries of all subsequent matrices are zero, too. The initial matrix entries must be nonzero to avoid this situation.
2) An increase in the number of time units between successive observations increases the power of the transition matrix that must be computed. If the number of observed condition changes is relatively low, the time interval between observations is long. Accordingly, with respect to the time needed for computation, it is not economical to include all of these in the analysis. It is better to keep the maximum time interval at a level beyond which the probability of a condition change can be ignored. If the maximum time interval is too large, increasing the time unit can be effective.
3) If the maximum time intervals are relatively short, the computation time of each stage is short, although several iterations may be necessary before convergence is achieved.
4) In practical computation, we accept that convergence has been achieved when the maximum difference between successive entries of the transition matrix is smaller than a predetermined value $\varepsilon$. In other words, convergence is achieved when, for all values of $i$ and $j$ and the given value of $r$, we have:
$$\left| P_{ij}^{(r+1)} - P_{ij}^{(r)} \right| < \varepsilon.$$
The limiting matrix $P$ is our optimal estimate; its calculation and the unique maximum are stated through Theorems 1 and 2.
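Remarks 1 and 4 can be illustrated together with a short Python sketch. It is an assumption-laden illustration, not the authors' program: the function names, the tolerance `eps=1e-6`, the iteration cap, the counts, and the initial matrix are all hypothetical. The loop stops when the largest change in any entry falls below `eps`, and the initial matrix deliberately contains a zero entry to show that the entry stays zero in every later matrix.

```python
def mat_mul(a, b):
    """Product of two square matrices stored as lists of rows."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def em_step(p, counts):
    """One EM iteration: expected transition counts, then row normalisation."""
    size = len(p)
    powers = [[[float(i == j) for j in range(size)] for i in range(size)]]
    for _ in range(max(t for (_, _, t) in counts)):
        powers.append(mat_mul(powers[-1], p))
    expected = [[0.0] * size for _ in range(size)]
    for (m, n, t), o_mnt in counts.items():
        for k in range(1, t + 1):
            for i in range(size):
                for j in range(size):
                    expected[i][j] += (o_mnt * powers[k - 1][m][i] * p[i][j]
                                       * powers[t - k][j][n] / powers[t][m][n])
    return [[e / sum(row) for e in row] for row in expected]


def max_abs_diff(a, b):
    """Largest absolute difference between corresponding entries."""
    return max(abs(x - y) for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


def run_em(p0, counts, eps=1e-6, max_iter=500):
    """Iterate until successive matrices differ by less than eps (assumed tolerance)."""
    p = p0
    for r in range(1, max_iter + 1):
        p_next = em_step(p, counts)
        if max_abs_diff(p_next, p) < eps:
            return p_next, r, True
        p = p_next
    return p, max_iter, False  # stopped by the iteration cap, not by convergence


# Hypothetical counts (not Table 2); conditions 1 and 2 are coded 0 and 1.
counts = {(0, 0, 1): 4, (0, 1, 1): 1, (0, 1, 2): 2, (1, 0, 1): 2, (1, 0, 3): 1}
p_start = [[0.5, 0.5], [1.0, 0.0]]  # deliberately contains a zero entry
p_hat, iterations, converged = run_em(p_start, counts)
zero_entry = p_hat[1][1]  # remains exactly 0.0, as remark 1 states
```

The zero persists because every term of the expected count for a pair (i, j) carries the factor $P_{ij}$, so a zero in the current matrix produces a zero expected count and hence a zero in the next matrix.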
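It is clear that the probability of a condition change from $i$ to $j$ in time unit $k$, together with a change from condition $m$ to condition $n$ at the time interval $t$, is equal to:
$$\frac{p_{mi}^{(k-1)} P_{ij}\, p_{jn}^{(t-k)}}{p_{mn}^{(t)}},$$
and the expected number of such condition changes in all time units of the interval is:
$$\sum_{k=1}^{t} \frac{p_{mi}^{(k-1)} P_{ij}\, p_{jn}^{(t-k)}}{p_{mn}^{(t)}}.$$
The change from condition $m$ to condition $n$ occurs $O_{mnt}$ times at intervals of length $t$. Therefore, the expected number of transitions from condition $i$ to condition $j$, conditional on the observed data $Y$ and the transition matrix $P$ and accumulated over all observed data, is obtained as follows:
$$E(N_{ij} \mid Y, P) = \sum_{t} \sum_{m,n \in E} O_{mnt} \sum_{k=1}^{t} \frac{p_{mi}^{(k-1)} P_{ij}\, p_{jn}^{(t-k)}}{p_{mn}^{(t)}}.$$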

Figure 1. System gets to n condition.

Table 1. Missing data for patients.

Table 2. Condition changes from i to j at the time interval t for missing data.
$O_{ijt}$: number