Energy and Power Engineering, 2013, 5, 1377-1382
doi:10.4236/epe.2013.54B261 Published Online July 2013 (http://www.scirp.org/journal/epe)
Research on Hidden Failure Reliability Modeling of
Electric Power System Protection
Jingjing Zhang, Ming Ding, Xianjun Qi, Yi Guo
School of electrical engineering and automation, Hefei University of Technology, Hefei, China
Email: dragonzjj@126.com
Received February, 2013
ABSTRACT
For digital relay protection systems, a novel hidden failure Markov reliability model is presented for
single main protection and double main protection systems, based on hidden failure and protection
function under Condition-Based Maintenance (CBM). Reliability indices such as the probability of the
protection system hidden failure state are calculated. The impacts of different parameters (including
human error) on the hidden failure state probability, and the best measures to improve reliability, are
analyzed by the variable parameter method. It is demonstrated that, compared with a single main
protection, a double main protection system has a higher hidden failure probability, so the probability
of the truly good state decreases; the reliability of both main protections must be improved at the same
time, and the configuration of the whole protection system for the protected component should not be
overly complicated. By improving the on-line self-checking and monitoring system in digital protection
systems and improving human reliability, the practical application of CBM can decrease the hidden
failure state probability. Only in this way can the protection systems be assured to work in a good
state. The results have reference value for protection system reliability engineering.
Keywords: Double Main Protection System; Hidden Failure; Markov Method; Condition-Based Maintenance (CBM);
Human Error
1. Introduction
Refs. [1-5] were the first to examine hidden failures in protection systems in detail. Later, many
researchers studied protection hidden failures and their contribution to protection system reliability
and power system reliability, and obtained many useful results [6-14]. CBM (Condition-Based
Maintenance) is now being applied to power systems and protection systems in China. Previously, a
hidden failure of protection was defined as a functional defect of the protection device. Under CBM
[15,16], a hidden failure is defined as a hidden defect of protection that cannot be detected by CBM
means such as the on-line self-checking and monitoring system, and that may result in mal-operation or
non-operation of the protection system under certain conditions; for example, the protection settings
are not changed to match the operation mode of the protected equipment. CBM is based on the condition
of the protection device instead of its operating time, so it can decrease test time and test cost.
CBM targets the hidden failure state of the protection system; how well it is put into practice
determines how well the protection system stays in a good state.
When studying protection system reliability with the Markov method, it is usually assumed that the
failure rate and repair rate of protection are constant. Because CBM substitutes on-line self-checking
and monitoring for the routine test, the routine test interval does not need to be considered. In the
following, for digital relay protection systems, a novel hidden failure Markov reliability model is
presented separately for a single main protection system and a double main protection system, based on
hidden failure and protection function under CBM. Reliability indices such as the probability of the
protection system hidden failure state are calculated. The impacts of different parameters (including
human error) on the hidden failure state probability, and the best measures to improve reliability, are
analyzed by the variable parameter method. The results have reference value for protection system
reliability engineering and for the application of CBM in protection systems.
2. Hidden Failure Reliability Model of Single
Protection System
First, the hidden failure reliability model of a single main protection is presented as Model 1,
shown in Figure 1.
Copyright © 2013 SciRes. EPE
J. J. ZHANG ET AL.
1378
The protected component has two states: normal state UP and outage state DN. The protection has four
states: normal state UP, failure state DN, hidden non-operation state DUN and hidden mal-operation
state DUM. It is assumed that CBM cannot detect all failures of the protection system, so the
protection system may stay in a hidden failure state. A hidden failure state is not a failure state:
it has no fault consequence and does not belong to the mal-operation state or the non-operation state.
It only shows that the protection system is in a hidden unhealthy state and may malfunction under some
circumstances. For example, a protection system in a hidden failure state may mal-operate when a fault
happens outside the protected zone, or may refuse to operate when a fault happens inside the protected
zone.
In a reliability study of a protection system, each state of the system must be considered, together
with the probability of each state and the transition rates between states. The Markov process is a
useful tool for this analysis. In Figure 1, state 1 is the normal state of both the protected
component and the protection equipment; state 2 is that the component fails and its protection
operates correctly (after the component is repaired, the system returns to state 1); state 3 is that
the component is good and the protection has a self-checkable failure; state 4 is that the component
is good and the protection has a non-self-checkable mal-operation failure; state 5 is that the
component is good and the protection has a non-self-checkable non-operation failure; state 6 is that
the hidden mal-operation is triggered by an external fault or by the protection's own fault, and
non-self-checkable mal-operation of the protection happens; state 7 is that the component fails and
non-self-checkable non-operation of the protection happens (if the component is repaired first, the
system goes to state 3; if the protection is repaired first, it goes to state 2); state 8 is that the
component fails and the protection's mal-operation is taken as a correct operation (after the
component is repaired, the system goes to state 4). The hidden mal-operation state (state 4) can
convert to the hidden non-operation state (state 5) and vice versa.
In Figure 1, $\lambda_C$ is the failure rate of the component being protected, $\mu_c$ is the repair
rate of the component being protected, $\lambda_P$ is the failure rate of protection (it consists of
the hardware failure rate and the software failure rate), $C_1$ is the self-checkable success rate of
protection, $C_3$ is the mal-operation percentage of protection, $C_5 = C_3 \lambda_P (1 - C_1)$ is the
non-self-checkable mal-operation rate of protection, $C_6 = (1 - C_3) \lambda_P (1 - C_1)$ is the
non-self-checkable non-operation rate of protection, $\mu_1$ is the repair rate of protection, and
$\lambda_{ext}$ is the failure rate of external faults of the component being protected.
$$P(n)\,B = 0 \tag{1}$$
$$\sum_{i=1}^{8} p_i = 1 \tag{2}$$
Figure 1. Hidden failure reliability model of single main protection system.
(State-transition diagram. State labels as drawn: 1 C:UP/P:UP, 2 C:DN/P:UP, 3 C:DN/P:DN,
4 C:UP/P:DUM, 5 C:UP/P:DUN, 6 C:DN/P:DN, 7 C:DN/P:DN, 8 C:DN/P:DUM. Transition labels:
$\lambda_C$, $\mu_c$, $\mu_1$, $C_1 \lambda_P$, $C_5$, $C_6$, $\lambda_{ext} + \lambda_P$.)
From Equations (1) and (2), we can obtain the steady-state transition probability matrix $B$ and the
state probabilities $P(n) = [p_1, p_2, \ldots, p_8]$.
The hidden failure state probability of protection is defined as
$$p_{hidden} = p_4 + p_5 \tag{3}$$
The hidden mal-operation failure state probability of protection is defined as
$$p_{hw} = p_4 \tag{4}$$
The hidden non-operation failure state probability of protection is defined as
$$p_{hj} = p_5 \tag{5}$$
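As a rough illustration of how Equations (1) and (2) can be solved numerically, the following Python sketch builds a transition-rate matrix for Model 1 and solves the balance equations together with the normalisation condition. The transition list is an assumption read off the state descriptions above and the labels of Figure 1 (in particular, the rates between states 4 and 5 and the return from state 6 to state 1 at $\mu_1$ are guesses, not confirmed by the paper). The default parameter values are those of Table 1, with repair rates converted from per hour to per year, and all function and variable names are illustrative only.

```python
HOURS_PER_YEAR = 8760.0


def steady_state(rates, n):
    """Solve pi * Q = 0 with sum(pi) = 1 for an n-state continuous-time chain.

    rates maps (i, j) -> transition rate from state i to state j (1-based).
    """
    # Generator matrix Q: off-diagonal entries are rates, diagonal is minus the row sum.
    q = [[0.0] * n for _ in range(n)]
    for (i, j), r in rates.items():
        q[i - 1][j - 1] += r
        q[i - 1][i - 1] -= r
    # Balance equations are the rows of Q transposed; the last one is replaced
    # by the normalisation condition sum(pi) = 1.
    a = [[q[row][col] for row in range(n)] for col in range(n)]
    a[-1] = [1.0] * n
    b = [0.0] * (n - 1) + [1.0]
    # Gaussian elimination with partial pivoting.
    for k in range(n):
        piv = max(range(k, n), key=lambda r: abs(a[r][k]))
        a[k], a[piv] = a[piv], a[k]
        b[k], b[piv] = b[piv], b[k]
        for r in range(k + 1, n):
            f = a[r][k] / a[k][k]
            for c in range(k, n):
                a[r][c] -= f * a[k][c]
            b[r] -= f * b[k]
    # Back substitution.
    pi = [0.0] * n
    for k in reversed(range(n)):
        s = sum(a[k][c] * pi[c] for c in range(k + 1, n))
        pi[k] = (b[k] - s) / a[k][k]
    return pi


def model1_rates(lam_c=0.04, lam_p=0.08, lam_ext=0.005, c1=0.9, c3=0.5,
                 mu_c=0.25 * HOURS_PER_YEAR, mu_1=0.25 * HOURS_PER_YEAR):
    """Assumed transition rates (per year) for the eight states of Model 1."""
    c5 = c3 * lam_p * (1.0 - c1)          # non-self-checkable mal-operation rate
    c6 = (1.0 - c3) * lam_p * (1.0 - c1)  # non-self-checkable non-operation rate
    return {
        (1, 2): lam_c, (2, 1): mu_c,          # component fault cleared correctly
        (1, 3): c1 * lam_p, (3, 1): mu_1,     # self-checked protection failure
        (1, 4): c5, (1, 5): c6,               # entry into the hidden states
        (4, 5): c6, (5, 4): c5,               # assumed conversion between hidden states
        (4, 6): lam_ext + lam_p, (6, 1): mu_1,  # hidden mal-operation triggered (return rate assumed)
        (4, 8): lam_c, (8, 4): mu_c,          # mal-operation taken as correct operation
        (5, 7): lam_c, (7, 3): mu_c, (7, 2): mu_1,  # non-operation on a component fault
    }


p = steady_state(model1_rates(), 8)
p_hw = p[3]            # state 4, Equation (4)
p_hj = p[4]            # state 5, Equation (5)
p_hidden = p_hw + p_hj  # Equation (3)
print(f"p_hidden={p_hidden:.4f}  p_hw={p_hw:.4f}  p_hj={p_hj:.4f}")
```

Under these assumptions the result comes out close to the Model 1 row of Table 2 ($p_{hidden}$ of about 0.126), which suggests, but does not prove, that the assumed transition structure matches the diagram.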
3. Hidden Failure Reliability Model of
Double Main Protection System
The reliability model of the double main protection system is presented as Model 2, shown in
Figure 2. The model is similar to Model 1 but more complicated; in a double main protection system,
protections P1 and P2 have equal standing. Define $\lambda_P$ as the failure rate of protection P1;
the parameters of main protection P1 are identical to those of Model 1.
As for protection P2, $\lambda_{P2}$ is the failure rate of protection, $C_2$ is the self-checkable
success rate of protection, $C_4$ is the mal-operation percentage of protection,
$C_7 = C_4 \lambda_{P2} (1 - C_2)$ is the non-self-checkable mal-operation rate of protection,
$C_8 = (1 - C_4) \lambda_{P2} (1 - C_2)$ is the non-self-checkable non-operation rate of protection,
$\mu_2$ is the repair rate of protection, and $\mu$ is the repair rate of both protections at the
same time. Define $C_9 = C_1 \lambda_P$ and $C_{10} = C_2 \lambda_{P2}$.
The reliability indices are defined similarly to Model 1:
$$p_{hidden} = p_5 + p_6 + p_7 + p_8 + p_9 + p_{10} + p_{11} + p_{12} + p_{13} + p_{14} + p_{15} + p_{16} \tag{6}$$
$$p_{hw} = p_5 + p_6 + p_7 + p_8 + p_9 + p_{10} + p_{14} \tag{7}$$
$$p_{hj} = p_{13} + p_{15} + p_{16} \tag{8}$$
Figure 2. Hidden failure reliability model of double main protection system.
(State-transition diagram with 32 states, each labelled with the state of the component C and of
protections P1 and P2. The hidden failure states with C:UP are, as P1/P2: 5 DUM/UP, 6 UP/DUM,
7 DUM/DUM, 8 DUM/DUN, 9 DUM/DN, 10 DN/DUM, 11 DUN/UP, 12 UP/DUN, 13 DUN/DUN, 14 DUN/DUM,
15 DN/DUN, 16 DUN/DN.)
4. Hidden Failure Reliability Model of Single
Protection System Considering Human
Error
Human error can be defined as any improper action that results in events affecting the proper action
of the system. From a system point of view, even with reliable hardware and software, human error
remains a great threat to system safety [17-20]. For example, incorrect operation by operating
personnel occurred in the cascading outage of the interconnected power grid of the southwestern
United States and northern Mexico on Sept. 8, 2011, so human error has become an important factor
that deserves attention.
Causes of human error include fatigue and sleeplessness, anger, emotional upsets, lack of skill,
hunger, letdown from low blood sugar, medication, drugs and so on. Human error can be divided into
seven kinds: design error, operator error, fabrication error, maintenance error, contributory error,
inspection error and handling error.
There are numerous techniques available for conducting human reliability assessment, such as THERP
(technique for human error rate prediction), HEART (human error assessment and reduction technique)
and so on. With these methods we can obtain the failure probability of human operation. Here human
error is described by a constant mean failure probability.
The two fault modes of a protection system are mal-operation and non-operation, so the impact of
human error on the protection system is also of two kinds: mal-operation and non-operation. In the
following analysis, it is assumed that human error appears after some operation or repair.
The hidden failure reliability model of a single main protection system considering human error is
presented as Model 3, shown in Figure 3. This model is based on Model 1, and two kinds of human
error are considered: 1) protection system mal-operation owing to incorrect operation by operating
personnel, for example, dispatching personnel or the operator on duty failing to follow the correct
procedure; 2) the protection system not being completely good after repair, for example, the
protection settings not being changed after repair, which may cause hidden mal-operation or
non-operation of the protection system.
In Figure 3, when protection P trips incorrectly owing to human error, state 1 goes to state 6; when
protection P is not repaired completely owing to human error, state 3 goes to state 4 (hidden
mal-operation state) or state 5 (hidden non-operation state). For protection P, $K_{h1}$ is the mean
human error rate and $v_1$ is the mal-operation percentage owing to human error. The reliability
indices are defined identically to those of Model 1.
5. Case Studies
Taking the data of Table 1 as an example, we calculate the reliability indices of the three models
and analyze the results; the computation results are shown in Table 2. Using the variable parameter
method, the $p_{hidden}$ curve of Model 1 under different $C_1$ is shown in Figure 4 (that is, for a
given $C_1$, the curve of $p_{hidden}$ is obtained as $\lambda_P$ increases), the $p_{hidden}$ curve
of Model 2 under different $C_1$ is shown in Figure 5 (for Model 2, when $\lambda_{P2}$ increases,
the $p_{hidden}$ curve under different $C_2$ is the same as Figure 5), and the impact of human error
on $p_{hidden}$ of Model 3 is shown in Figure 6.
Figure 3. Hidden failure reliability model of single main protection system considering human error.
(State-transition diagram with the same eight states as Figure 1. Transition labels: $\lambda_C$,
$\mu_c$, $\mu_1$, $C_1 \lambda_P$, $C_5$, $C_6$, $\lambda_{ext} + \lambda_P$, $v_1 K_{h1}$,
$(1 - v_1) K_{h1}$.)
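Table 1. Reliability base data for the computation.
Parameter                      Value     Parameter                 Value
$\lambda_C$ / y$^{-1}$         0.04      $\mu_c$ / h$^{-1}$        0.25
$\lambda_P$ / y$^{-1}$         0.08      $\mu_1$ / h$^{-1}$        0.25
$\lambda_{P2}$ / y$^{-1}$      0.08      $\mu_2$ / h$^{-1}$        0.25
$v_1 = C_3 = C_4$              0.5       $\mu$ / h$^{-1}$          0.25
$\lambda_{ext}$ / y$^{-1}$     0.005     $K_{h1}$ / y$^{-1}$       0.001
$C_1 = C_2$                    0.9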
Table 2. Reliability index calculation results.
Model      $p_{hidden}$   $p_{hw}$   $p_{hj}$
Model 1    0.1263         0.0430     0.0833
Model 2    0.2318         0.0842     0.0119
Model 3    0.1263         0.0430     0.0833
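A quick consistency check on Table 2 may help when reading the Model 2 row. It assumes only the index definitions in Equations (3) to (8) and the fact that Markov states are mutually exclusive. For Model 1 the two hidden sub-indices add up to the total:

$$p_{hw} + p_{hj} = 0.0430 + 0.0833 = 0.1263 = p_{hidden}$$

For Model 2 they do not, because states 11 and 12 appear in Equation (6) but in neither Equation (7) nor Equation (8):

$$p_{11} + p_{12} = p_{hidden} - p_{hw} - p_{hj} = 0.2318 - 0.0842 - 0.0119 = 0.1357$$

On this reading, more than half of the Model 2 hidden failure probability sits in states where one protection has a hidden non-operation defect while the other is still healthy.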
Figure 4. $p_{hidden}$ curve of Model 1 under different $C_1$.
(Plot: horizontal axis $\lambda_P$ (y$^{-1}$), logarithmic from $10^{-2}$ to $10^{0}$; vertical axis
$p_{hidden}$, from 0 to 0.6; one curve each for $C_1$ = 0.7, 0.8, 0.9 and 0.99.)
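The variable parameter method behind this kind of curve can be sketched in a few lines of Python. The sketch below does not solve the full Markov model; it uses a closed-form approximation obtained from the balance equations of states 4 and 5 under two assumptions that are not stated in the paper: repairs (rates per hour) are so much faster than failures (rates per year) that the probability of the repair states is negligible, and the transitions out of the hidden states are 4 to 5 at $C_6$, 5 to 4 at $C_5$, 4 to 6 at $\lambda_{ext} + \lambda_P$ and 5 to 7 at $\lambda_C$. The function name and the grid of $\lambda_P$ values are illustrative.

```python
def p_hidden_approx(lam_p, c1, c3=0.5, lam_c=0.04, lam_ext=0.005):
    """Approximate hidden failure state probability of Model 1 (rates per year).

    Assumes fast repair, so that p1 + p4 + p5 is close to 1.
    """
    c5 = c3 * lam_p * (1.0 - c1)          # non-self-checkable mal-operation rate
    c6 = (1.0 - c3) * lam_p * (1.0 - c1)  # non-self-checkable non-operation rate
    p4 = c5 / (c5 + c6 + lam_ext + lam_p)  # hidden mal-operation state
    p5 = c6 / (c5 + c6 + lam_c)            # hidden non-operation state
    return p4 + p5


# Sweep lambda_P over two decades for each self-checking success rate C1.
lam_p_grid = [10 ** (-2 + 0.5 * k) for k in range(5)]  # 0.01 ... 1 per year
sweep = {
    c1: [p_hidden_approx(lam_p, c1) for lam_p in lam_p_grid]
    for c1 in (0.7, 0.8, 0.9, 0.99)
}

for c1, values in sweep.items():
    row = "  ".join(f"{v:.4f}" for v in values)
    print(f"C1={c1:<4}  {row}")
```

With the Table 1 base values ($\lambda_P$ = 0.08, $C_1$ = 0.9) the approximation gives a $p_{hidden}$ close to the Model 1 entry of Table 2, and the sweep rises with $\lambda_P$ and falls with $C_1$, in line with the trend the text reports for Figure 4.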
Figure 5. $p_{hidden}$ curve of Model 2 under different $C_1$.
(Plot: horizontal axis $\lambda_P$ (y$^{-1}$), logarithmic from $10^{-2}$ to $10^{0}$; vertical axis
$p_{hidden}$, up to 0.6; one curve each for $C_1$ = 0.7, 0.8, 0.9 and 0.99.)
Figure 6. Impact of human error on $p_{hidden}$ of Model 3.
(Plot: horizontal axis $K_{h1}$ (y$^{-1}$), logarithmic from $10^{-3}$ to $10^{0}$; vertical axis
$p_{hidden}$, from 0.1263 to 0.1271; curves for $v_1$ = 0.1, 0.3, 0.5, 0.7 and 0.9.)
From Table 2 and Figures 4 to 6, we can draw the following conclusions:
1) Compared with Model 1, Model 2 has a higher $p_{hidden}$ and $p_{hw}$ and a lower $p_{hj}$. This
shows that redundant protection can decrease the hidden non-operation state probability, but at the
same time it increases the hidden mal-operation state probability; the hidden failure state
probability therefore increases, and the probability that the protection system is in a completely
good state decreases. This must be taken into account when using redundant protection.
2) For Model 3, when $K_{h1}$ increases, $p_{hidden}$ increases; when $v_1$ increases as the arrow in
Figure 6 shows, $p_{hidden}$ decreases; compared with Model 1, when $K_{h1}$ is small it has little
impact on these indices. This means that the mean human error rate and the mal-operation percentage
owing to human error affect the hidden failure state probability, so all available measures should
be taken to decrease the human error rate and improve the reliability of the protection system.
3) From Figure 4 and Figure 5, the curves of the hidden failure state probability of Model 1 and
Model 2 under different $C_1$ are similar: when $\lambda_P$ increases, $p_{hidden}$ increases; when
$C_1$ increases, $p_{hidden}$ decreases. This shows that the failure rate of protection and the
self-checkable success rate of protection greatly affect the reliability of the protection system,
and that the reliability of both main protections must be improved at the same time. By improving
the on-line self-checking and monitoring system in digital protection systems, the practical
application of CBM can decrease the hidden failure state probability. When the reliability of a
single main protection system is high, a simplified configuration of the whole protection system can
be considered.
6. Conclusions
For digital protection systems, measures must be taken not only to decrease the mal-operation
probability and the non-operation probability, but also to decrease the hidden failure state
probability. Compared with a single protection, a double main protection system has a higher hidden
failure state probability, so the probability of the truly good state decreases; the reliability of
both main protections must be improved at the same time, and the configuration of the protection
system for the protected component should not be overly complicated (such as two-out-of-three
voting). Human error increases the hidden failure state probability of the protection system, so
human error must be reduced during normal operation and maintenance. By improving the on-line
self-checking and monitoring system in digital protection systems, the practical application of CBM
can decrease the hidden failure state probability. Only in this way can the protection systems be
assured to work in a good state. The results have reference value for protection system reliability
engineering.
7. Acknowledgements
This project is supported by the State Grid Corporation of China Major Projects on Planning and
Operation Control of Large Scale Grid (SGCC-MPLG024-2012), the National Natural Science Foundation
of China under Grant (51007017) and the Specialized Research Fund for the Doctoral Program of Hefei
University of Technology (2012HGBZ0657). The authors express their thanks for this support.
REFERENCES
[1] P. M. Anderson and S. K. Agarwal, “An Improved Model
for Protective-system Reliability,” IEEE Transactions on
Reliability, Vol. 41, No. 3, 1992, pp. 422-426.
doi:10.1109/24.159812
[2] S. Tamronglak, “Analysis of Power System Disturbances
Due to Relay Hidden Failures,” Ph.D. dissertation, Vir-
ginia Polytechnic State University, Blacksburg, 1994.
[3] S. Tamronglak, S. H. Horowitz, A. G. Phadke and J. S.
Thorp, “Anatomy of Power System Blackouts: Preventive
relaying Strategies,” IEEE Transactions on Power Deliv-
ery, Vol. 11, No. 2, 1996, pp. 708-715.
doi:10.1109/61.489327
[4] A. G. Phadke and J. S. Thorp, “Expose Hidden Failures to
Prevent Cascading Outages,” IEEE Computer Applica-
tions in Power, Vol. 9, No. 3, 1996, pp. 20-23.
doi:10.1109/67.526849
[5] P. M. Anderson, G. M. Chintaluri, S. M. Magbuhat and R.
F. Ghajar, “An Improved Reliability Model for Redun-
dant Protective Systems—Markov Models,” IEEE
Transactions on Power Systems, Vol. 12, No. 2, 1997, pp.
573-578. doi:10.1109/59.589606
[6] R. Billinton, M. Fotuhi-Firuzabad and T. S. Sidhu, “De-
termination of the Optimum Routine Test and
Self-checking Intervals in Protective Relaying Using a
Reliability Model,” IEEE Transactions on Power Systems,
Vol. 17, No. 3, 2002, pp. 663-669.
doi:10.1109/TPWRS.2002.800871
[7] D. C. Elizondo, “A Methodology to Assess and Rank the
Effects of Hidden Failures in Protection Schemes based
on Regions of Vulnerability and Index of Severity,” Ph.D.
dissertation, Virginia Polytechnic and State University,
Blacksburg, Virginia, April 2003.
[8] X. F. Xiao, O. Y. F. Qian, Z. Q. Jia, et al., “Probabilistic
Model for the Relay Protection System’s Correct Failure
Removal,” Automation of Electric Power Systems, Vol.
31, No. 7, 2007, pp. 12-14.
[9] S. S. Fu and W. H. Xiong, “A New Method for Reliabil-
ity Analysis of Protection in Power Systems,” Automation
of Electric Power Systems, Vol. 30, No. 16, 2006, pp.
32-35.
[10] Z. S. Xue, W. Chao and C. D. Xiao, “Reliability Analysis
Model for Protective Relaying System of UHV Power
Network Based on Markov State-Space Method,” Power
System Technology, Vol. 33, No. 13, 2008, pp. 94-99.
[11] Z. J. Jing and D. Ming, “Summary of Research on Hidden
Failures in Protection Systems,” Proceedings of Interna-
tional Conference on Electrical Machines and Systems,
Vol. 10, 2008, pp. 870-872.
[12] Z. Tao, W. Fang and J. Naizheng, “A Novel Algorithm of
Determining the Optimal Routine Test Interval of the
Dual-redundant Relay Protection System,” Automation of
Electric Power Systems, Vol. 34, No. 10, 2010, pp. 67-70.
[13] A. H. Etemadi and M. Fotuhi-Firuzabad, “New Consid-
erations in Modern Protection System Quantitative Reli-
ability Assessment,” IEEE Transactions on Power Deliv-
ery, Vol. 24, No. 4, 2010, pp. 2213-2222.
doi:10.1109/TPWRD.2010.2051463
[14] A. H. Etemadi and M. Fotuhi-Firuzabad, “Quantitative
Assessment of Protection System Reliability Incorporat-
ing Human Errors,” Part O: J. Risk and Reliability, Vol.
222, 2008, pp. 255-263.
[15] G. Xiang, “Application Technology of Condition Based
Maintenance of Relay Protection,” China Electric Power
Press, 2008.
[16] G. Xiang and L. Shaojun, “Condition Maintenance and
Implementation of Relay Protection,” Relay, Vol. 33, No.
2, 2005, pp. 23-27.
[17] D. O. Koval and H. L. Floyd, “Human Element Factors
Affecting Reliability and Safety,” IEEE Transactions on
Industry Applications, Vol. 34, No. 2, 1998, pp. 406-414.
doi:10.1109/28.663487
[18] Z. Bingquan and F. Xiang, “Experiment Research on
Cognition Reliability Model of Nuclear Power Plant,”
Journal of Tsinghua University, Vol. 39, No. 5, 1999, pp.
122-125.
[19] Arizona Public Service(APS), “Cause of Widespread
Outage under Investigation [EB/OL],” [2011-9-9].
http://www.aps.com.
[20] M. Anjia, Z. Geli and L. Yuechun, “Analysis on
Large-scale Blackout Occurred in South America and
North Mexico Interconnected Power Grid on Sept. 8,
2011 and Lessons for Electric Power Dispatching in
China,” Power System Technology, Vol. 36, No. 4, 2012,
pp. 74-78.