Analysis of a POMDP Model for an Optimal Maintenance Problem with Multiple Imperfect Repairs ()
1. Introduction
With recent advances in science and technology, many systems have become more complex and more massive. Accordingly, the breakdown or deterioration of such systems could frequently result in economic and social damage in our communities. To address those problems mathematically, preventive maintenance models for stochastically deteriorating systems have been widely investigated in the literature. Osaki [1] summarized the mathematical approach for reliability and maintenance. And the papers by Cho and Parlar [2] , Dekker [3] , and Wang [4] are excellent reviews of the subject. Since various techniques for detecting the deterioration level of systems have been developed and available without expensive cost, one can adopt condition-based maintenance (CBM) rather than time-based maintenance (TBM) in many cases. The above three reviews include studies on CBM as well as TBM. Jardine et al. [5] summarized and reviewed the recent research and developments in diagnostics and prognostics of mechanical systems implementing CBM with algorithms and technologies for data processing. Also, Olde Keizer et al. [6] and Alaswad and Xiang [7] reviewed stochastic models for addressing issues on CBM policies.
For constructing a stochastic model for CBM, I need first express a deterioration of a system as some kind of stochastic process. If one can assume that the deterioration levels of a system correspond to finite non-negative integers, then the deterioration of the system could be described as a Markov chain with an absorbing state which expresses a failure of the system. This model is called Markovian deteriorating system and Markov decision processes (MDP) are used for addressing maintenance problems. Derman [8] , Douer and Yechiali [9] , Hopp [10] , Kurt and Kharoufeh [11] [12] , and Tamura [13] studied the discrete-time Markovian deteriorating system and derived the sufficient conditions that some monotone properties including the control-limit policy hold. Also, Lam and Yeh [14] [15] [16] , and Yeh [17] [18] studied the continuous-time Markovian deteriorating system. These studies have some barriers from realistic viewpoints because it is too costly or too technically difficult to detect the true state of a system in practical situations. In these cases, a system is monitored and some incomplete information on the true state of the system is obtained to select an optimal action. When I observe alternative characteristics instead of a true deterioration level of a system at each inspection, the result has some kind of relationship with the true deterioration level although it does not express the true deterioration level. Then, partially observable Markov decision process (POMDP) models are available to express the behavior of such systems. POMDP models use a so-called state vector whose element means the probability of a true state. If an element of a state vector is always 1 in a POMDP model, it coincides with an MDP model. Ross [19] incorporated the concept of incomplete information into the Markovian system with two states. However, it was a little different from the POMDP model. Afterwards, Ohnishi et al. [20] , Grosfeld-Nir [21] [22] , and Chen et al. [23] studied the POMDP model for the maintenance problem where only replacement can be selected as a maintenance action. Srinivasan and Parlikad [24] studied the extended POMDP model which is called a POSMDP (Partially Observable Semi-Markov Decision Process) for optimizing maintenance decisions. Oosterom [25] used the new stochastic order to analyze the POMDP model. And Kivanc et al. [26] applied the POMDP model to real problems. On the other hand, Ohnishi et al. [27] proposed the POMDP model with minimal repair as well as replacement and showed that the monotone properties hold under several assumptions. Pham [28] reviewed the studies on imperfect maintenance. Then, Ivy and Pollock [29] introduced the concept of repair, which is different from minimal repair, in addition to replacement for the POMDP model and derived the marginal monotonicity for the two-state model. Tamura et al. [30] and Fan et al. [31] also studied the POMDP model with repair and replacement. Then, these models assumed one option for repair. In practical situations, however, I can choose some options for repair of which the effects are different.
In this paper, I construct a POMDP model where one of H imperfect repairs can be selected as a maintenance action as well as replacement. For the model, I theoretically investigate the structure of an optimal maintenance policy minimizing the total expected discounted cost for unbounded horizon via the theory of stochastic order relations. In particular, I derive sufficient conditions for monotone properties of the optimal maintenance policy.
2. Model Description
Now I consider a system whose deterioration follows a discrete-state and discrete-time Markov chain with an absorbing state. The system can be classified into one of
states, and then, state 0 represents the process before any deterioration takes place that is, it is an initial new state, whereas,
is a failure state of the system. The intermediates state 1 ∙∙∙ N are ordered to reflect the relative degree of deterioration in ascending order. Thus, I let
denote the transition probability of the Markov chain and let
denote the transition probability matrix.
expresses that state space of the Markov chain. And I let
denote the probability that the true state of the system is
. Also,
means the
dimensional vector that the ith element is 1 and other elements are all zero. The model supposes that an action is selected based on the state vector
instead of the true state i. Thus, I introduce
which means the probability that an outcome
is obtained through monitor given that the true state of the system is i, where
and
. To express that the result of repair is uncertain as well as imperfect, I let
denote the repair probability for the system when repair h is selected, where
and
. That is,
is a set of options for imperfect repair.
Under these settings, at each time epoch, I can select wait, repair h (
), or replacement as follows:
Wait: The system is operated for one period and it isn’t directly inspected after operation. Thus, the system in state i moves to state j with probability
after one period with operation. Instead of inspection, I monitor the system and obtain an outcome
with probability
.
Imperfect repair: I decided on an option among the H possible imperfect repairs. If repair h is selected as an optimal option, then the true state of the system is identified through inspection before repair and the system in state i is repaired to state j with probability
.
Imperfect repair: The system is replaced by a new and identical one. Thus, after replacement, the system in state i is certainly returned to state 0.
In the next, I assume some costs and probabilities for the model. Hereafter, I consider that increasing means non-decreasing and decreasing means non-increasing, respectively. And
means a transpose of a vector
. Before explanation, two stochastic order relations are defined below.
Definition 1. For a non-negative matrix
on
,
means
has a property of totally positive of order 2, that is,
holds for
and
. Similarly,
means
has a property of usual stochastic order, that is, for any h,
holds when
. Furthermore, for non-negative vectors
and
, if a matrix
then I write
. And if
, then I write
.
For the above stochastic order relations, I provide some important results as a lemma.
Lemma 1.
(i) If
then
.
(ii) For any increasing function
on i where
, if
then
.
(iii) If
then
.
The details of the stochastic order relations including
and SI are explained by Kijima [32] and Shaked and Shanthikumar [33] . Thus, I omit the proof of Lemma 1.
I introduce the following assumptions for the probabilities
,
, and
.
Assumption 1.
and
.
Assumption 2. For any
,
.
In Assumption 1, the assumption on the transition probability matrix means that as the system deteriorates, it is more likely to move to worse states, and the assumption on
means that the higher deterioration of the system gives higher outcomes from the monitor. These two assumptions in Assumption 1 have been used in the concerned previous studies to derive the monotone properties. Assumption 2 means that as the system deteriorates, it is less likely to be returned to better states through repair.
When each action is selected, a cost is incurred as follows:
: operating cost for the system in state
.
: repair cost for the system in state
under repair
.
: replacement cost for the system in state
.
And I define
where
means a set of
dimensional vectors whose elements are real numbers. For these costs, I introduce the following assumptions.
Assumption 3. For any
,
,
, and
.
Assumption 4. For any
,
,
, and
.
Assumption 3 indicates that as the system deteriorates, it is more costly to operate, repair, or replace the system. Also, the first relation of Assumption 4 means that, as the system deteriorates, the merit of repair becomes bigger than that of operation. Similarly, in terms of the latter two relations, the merit of replacement becomes bigger than those of operation and repair with deterioration, respectively. The previous studies have also assumed these relations for the analysis of the POMDP models.
3. Formulation
This section provides the formulation of our problem as a partially observable Markov process model and derives a total expected discounted cost for an unbounded horizon. For the formulation, I define
where x is the probability that the true state of the system is i as a state vector. And I let
denote the probability that the true state is j when the current state vector is
and the outcome is obtained as
through monitor after operating the system. And
is a discount factor. Then, by using Bayes’ formula, I have
(1)
and I define the jth element of
is
. Also, I let
denote the probability that the outcome is
through monitor given that the current state vector is
and I have
(2)
Hence I let
,
, and
denote the optimal expected discounted cost when wait, repair h, and replacement are selected at state vector
, respectively. When
is expressed as
(3)
Also, I have
(4)
(5)
where
Now I suppose that
is the optimal expected discounted cost when an optimal action is selected at state vector
and the optimal maintenance policy is adopted afterwards. Then I have
(6)
This problem can be treated as a Markov decision process with a state space
where
Thus, by solving Equation (6), I can find the optimal maintenance policy which minimizes the total expected discounted cost for an unbounded horizon. Hereafter, for simplicity, when a function
satisfies
for
, I describe this relation as
is increasing
with respect to
.
4. Structure of the Optimal Maintenance Policy
In this section, I provide some results on the structure of the optimal maintenance policy. Before proceeding discussion, I summarize several results as the following lemm in preparation.
Lemma 2.
(i) If
then
and
.
(ii) If
then
.
(iii) If
then
.
(iv) If
then
.
(v) If a real valued function
is increasing in
for any
and is increasing in
with respect to
, then
The results (i)-(v) are proved by Ohinishi et al. [20] .
4.1. Basic Results
Next, some lemmas which are necessary to derive the main results of this study are developed. In previous studies, similar lemmas were presented and used to derive the monotone properties.
Theorem 1.
is an increasing function of
in terms of
and is concave in
.
Proof. First, I show that the characteristics of
,
, and
, respectively.
is assumed to be the expected cost when an optimal action is selected for the n-period version of our problem and I use the mathematical induction method, where
. Then, I have
When
,
,
, and
are increasing in
from Assumption 3. Thus,
is also increasing in
. Thus, I suppose that
,
,
are increasing in
. The first terms of these three functions are increasing in
from Assumption 3. Since the second term of
is increasing in
from Lemma 2 (v) and the assumption on
,
is increasing in
. The second term of
is a linear function of
and
for
from
for
and Lemma 1 (i)-(ii) and Lemma 2 (i). Thus,
is increasing in
. And
is increasing in
since the second term is constant, that is, it does not depend upon
.
Second, I consider
through the results on
,
, and
. When I use the argument similar to those of Ohnishi et al. [20] , the second term of
is increasing and concave in
. Since
and
are increasing and linear in
,
is increasing and concave in
. Hence the proof of this theorem completes. □
In addition, I have a result on the difference of the costs as follows.
Lemma 3. For any h,
is concave in
.
and
are increasing in
with respect to
.
Proof. Since
(7)
the first and the third terms are linear, the second term is concave in
from the proof of Theorem 1. As a result, the difference between the second and the third term is concave in
. Thus. Equation (7) is concave in
for any h. And since
(8)
the first term is increasing in
from Assumption 4, the second term is also increasing in
from the proof of Theorem 1, and the third term is constant for
. Thus, Equation (8) is increasing in
. And finally, since
(9)
the first term is increasing in
from Assumption 4, the second term is also increasing in
from the proof of Theorem 1, and the third term is constant for
. Thus, Equation (9) is increasing in
for any h. Hence, the proof of this lemma completes. □
4.2. Monotone Properties
Next, I provide some structural properties of the optimal maintenance policy. For the explanation, I let
denote an optimal action selected at state vector
. And I define
Then, I have the optimal maintenance policy has the structure expressed by Theorem 2.
Theorem 2. There exist real numbers
,
,
such that
where
for
and
.
Proof. I use Theorem 1 and Lemma 3 for the proof. Since
is concave,
and
intersect twice at most. Also, since
and
are increasing in
, as a result, each pair of
and
and
and
intersect once at most. Hence the proof of Theorem 2 completes.
Theorem 2 means that the optimal maintenance policy may be classified into 4 regions at most. In particular, since repair includes inspection proposed by Ohnishi et al. [20] if I suppose
for any
and any
, the structure of the optimal maintenance policy is also similar to that of Ohnishi et al. [20] . Except for Ohnishi et al. [20] , some studies have previously derived monotone properties by introducing several assumptions on costs and probabilities. However, those do not consider that an optimal repair is selected from multiple imperfect repairs whose respective effects are different.
Also, Theorem 2 gives us the following interpretation. In this section, hereafter, I use
instead of
. When a state vector is less than
or between
and
with respect to
order, I should operate the system, that is, maintenance actions are not necessary. Focused on maintenance actions, if a state vector stays between
and
then an optimal imperfect repair is selected and finally, the system should be replaced whenever a state vector exceeds
.
Theorem 2 is useful for numerically finding an optimal maintenance policy. However, if the sequence of optimal imperfect repairs has some characteristics when
, then it would be possible to solve the optimal maintenance problem further efficiently. Thus, by introducing some additional assumptions, I obtain structural properties focused on imperfect repair.
Assumption 5. For any
such that
, I suppose that,
(i)
,
(ii)
,
(iii)
,
(iv)
.
Assumption 5-(i) and 5-(iii), and Assumptions 5-(ii) and 5-(iv) do not hold together, respectively. Then, Assumptions 5-(i) indicates that it is less costly to select repair h compared with repair h' and Assumption 5-(ii) indicates that the system is more likely to be returned to better states through repair h compared with repair h', as the system deteriorates. Thus, Assumptions 5-(i) and 5-(ii) mean that the merit of repair h becomes bigger than that of repair h’ with deterioration of the system. Also, Assumptions 5-(iii) and 5-(iv) have the contrary interpretation.
Under some of the assumptions given by Assumption 5, I have a property on the relation of the optimal repairs.
Theorem 3. I suppose that
and
for
. If Assumptions 5-(i) and 5-(ii) hold then
. And if Assumptions 5-(iii) and (iv) hold then
.
Proof. I define that, for
and
,
Also, I suppose that
If Assumptions 5-(i) and 5-(ii) hold then
and this means that there exists a vector
which satisfies
Hence, if
and
then
since I suppose that
. And if Assumptions 5-(iii) and 5-(iv) hold then
This indicates that there exists a vector
which satisfies
Thus, if
and
then
Hence, the proof of this theorem completes.
Theorem 3-(i) means that I should select larger numbered repairs in lower state vectors in terms of
and Theorem 3-(ii) has the contrary interpretation. This theorem is also useful for finding an optimal maintenance policy more efficiently. For example, I suppose that there are H options on imperfect repair and
for
in terms of
In this case, Theorem 3 tells us that, under Assumption 5, the number of options on imperfect repair gradually reduces from H as a state vector in
increases, that is, the system deteriorates.
5. Discussion
In this section, I explain the significance of Theorems 2 and 3 to find an optimal maintenance policy. Generally, I know that monotone properties are available to find an optimal maintenance policy numerically.
Now this section supposes that the system has two states as good (0) and bad (1) and there are three options for imperfect repair (
). And the state vector
becomes a scalar x which means a probability that the system stays in a bad state. In this case, since I have to decide on an optimal action
for
, our problem has an infinite state space. However, if Theorem 2 holds, that is, I can know the problem satisfies Assumptions 1, 2, 3 and 4 in advance, then I may find 3 real numbers
,
,
in Theorem 2. Then, I first compare
and
for
and find real numbers
and
such that
Second, similarly, I compare
and
for
and find
such that
, since
for
. Thus, Theorem 2 can reduce the computational effort for finding an optimal maintenance policy numerically. This usefulness is conserved for more than two states.
To find the optimal maintenance policy in the above, I need obtain
, that is, the comparison of
for
arises in this problem. Then Theorem 3 is useful to address the problem. Since I have 3 options for imperfect repair, if the costs and the probabilities satisfy Assumptions 5-(iii) and 5-(iv) then Theorem 3-(ii) holds. In this case, comparing
with
, I find
such that
for
as a first step. In the next, I find
that
, that is,
for
through comparison of
and
. And finaly, I find
that
for
through comparison of
and
. As a result of this procedure, I can find the real numbers
,
,
,
and
, and determine the optimal maintenance policy that satisfy
. On the other hand, if the costs and the probabilities satisfy Assumptions 5-(i) and 5-(ii) then I may use Theorem 3-(i) to find an optimal maintenance policy where
.
Even if the number of states is more than 2, I can similarly use Theorems 2 and 3. However, the computational effort becomes much harder with larger states.
6. Conclusions
In this paper, I considered a system whose deterioration follows a discrete-time and discrete-state Markov chain under incomplete information on the true state and constructed a POMDP model for an optimal maintenance problem. I have several options for repair which has different recovery effects on the system in addition to replacement as maintenance actions. Then, I derived sufficient conditions that the monotone property holds as per Theorem 2. As described in Section 4.2, since this theorem is a generalization of some previous results and is useful for efficiently solving the optimal maintenance problem since the optimal maintenance policy may be classified into 4 regions at most.
Furthermore, I established the sequence of optimal repairs under several assumptions in Theorem 3. Since the previous studies on POMDP models for optimal maintenance problems do not focus on the concept of multiple imperfect repairs that the repair probabilities are different, Theorem 3 is a significant property in this area even if the assumptions are so strict. In particular, the feature that the number of options on imperfect repair gradually reduces with the increase of a state vector in terms of
is so interesting.
I can check whether the costs for the respective actions and the transition and repair probabilities satisfy the assumptions for Theorems 2 and 3 before solving an optimal maintenance problem. Thus, as discussed in Section 5, the theorems obtained in this study are available to reduce the computational time since the number of regions of the optimal maintenance policy can be limited.
However, I do not provide quantitative analysis through numerical experiments. While there exist algorithms for POMDP models (see e.g. Monahan [34] , Guo and Lian [35] ), as stated by the previous studies, POMDP models with many states have an aspect of time-consuming computation. Thus, future works are to construct an efficient algorithm for solving our problem under the assumptions for the monotone properties and to obtain some characteristics of the optimal maintenance policy through numerical analysis. Also, I need to consider how to relax the sufficient conditions for the theorems. In particular, four conditions in Assumption 5, which are used for deriving Theorem 3 and are originally proposed in this study, are so strict that it would be difficult for them to be satisfied in real situations. Hence, it is important to construct a numerical method for addressing the optimal maintenance problems without Assumption 5.
Acknowledgements
This work was supported by JSPS KAKENHI Grant Number JP20K04989.