J. Software Engineering & Applications, 2009, 2:1-12
Published Online April 2009 in SciRes (www.SciRP.org/journal/jsea)
Copyright © 2009 SciRes JSEA
Design Pattern Representation for Safety-Critical
Embedded Systems
Ashraf Armoush*, Falk Salewski*, Stefan Kowalewski*
*Embedded Software Laboratory, RWTH Aachen University, Aachen, Germany
Email: {armoush, salewski, kowalewski}@embedded.rwth-aachen.de
Received January 9th, 2009; revised February 12th, 2009; accepted March 25th, 2009.
ABSTRACT
Design Patterns, which give abstract solutions to commonly recurring design problems, have been widely used in the
software and hardware domain. As non-functional requirements are an important aspect in the design of safety-critical
embedded systems, this work focuses on the integration of non-functional implications in an existing design pattern
concept. We propose a pattern representation for safety-critical embedded application design methods by including
fields for the implications and side effects of the represented design pattern on the non-functional requirements of the
overall system. The considered requirements include safety, reliability, modifiability, cost, and execution time.
Keywords: Design Pattern, Embedded Systems, Non-Functional Requirements, Safety-Critical Systems
1. Introduction
The design pattern, originally proposed in [1] by the architect Christopher Alexander, is a universal approach to describing common solutions to widely recurring design problems. Since then, this concept has been applied in several domains of hardware design (electronics) and has also become popular in the software domain after the success of the book Design Patterns: Elements of Reusable Object-Oriented Software by Gamma et al. [2].
As the concept of design pattern aims at supporting
designers and system architects in their choice of suitable
solutions for commonly recurring design problems, this
concept might also be useful to support the design of
safety-critical embedded systems. The design of these
systems is considered to be a complex process, as hard-
ware and software components have to be considered
during the design as well as potential interactions between
hardware and/or software components. Moreover, not
only functional requirements1 have to be fulfilled by these systems. Failures in safety-critical systems could result in critical situations that may lead to serious injury, loss of life, or unacceptable damage to the environment. Therefore, the non-functional requirement safety also has to be considered in these systems to assure that the risk of hazards is acceptably low in the considered system. To
support the design of safe devices, safety measures are
given by international safety standards such as IEC 61508 [3]. Besides life cycle and process requirements, different measures for the design of software and hardware components are recommended. These safety measures typically have an impact on the cost, the reliability, the real-time behavior, and the modifiability of the resulting system. Depending on the application domain of the final embedded system, these non-functional requirements are of great importance. For this reason,
non-functional requirements should be considered during
the design of any safety-critical system.
While design pattern concepts exist for many different application domains, they typically lack
a consideration of potential side effects on non-func-
tional requirements. In order to integrate these side ef-
fects into the pattern concept, we propose an extended
template for an effective design pattern representation
for safety-critical applications. This pattern representa-
tion includes the traditional pattern concept in combina-
tion with an extension describing the implications and
side effects with respect to the non-functional require-
ments. While this concept has been described briefly in
[4] before, this work focuses on the application of our
approach. Thus, two example patterns are included to
illustrate the proposed representation of design patterns
for software and hardware components in safety-critical
applications.
2. Related Work
The field of design patterns is large and still growing rapidly. Much research has focused on the use of design patterns in the software domain [2,5,6,7,8,9], but
further research is still needed in the domain of safety-
critical embedded systems to integrate the non-functional
requirements in design patterns. In his books [10] and
[11], Bruce Douglass proposed several design patterns
1Note: functional requirements (FR) represent the required functional-
ity itself while non-functional requirements (NFR) describe additional
qualities that have to be achieved by the developed system (e.g. safety,
reliability, modifiability, cost and execution time)
for safety-critical systems based on well-known fault-tolerant design methods, integrating some modifications to increase the safety level of these patterns. Gross and Yu [12] discuss the relationship be-
tween non-functional requirements and design patterns,
and propose a systematic approach for evaluating de-
sign patterns with respect to non-functional require-
ments. They propose the use of design patterns for
establishing traces between non-functional goals in a
goal tree such as a soft goal interdependency graph
(SIG) and the system design. Cleland-Huang et al.
[13,14] enhance the patterns defined by Gross and Yu
[12] through defining a model for establishing trace-
ability between certain types of non-functional re-
quirements and design and code artifacts, through the
use of design patterns as intermediary objects. Xu et al.
[15] classified the dependability needs into three types
of requirements and proposed an architectural pattern
that allows requirements engineers and architects to
map dependability requirements into three corre-
sponding types of architectural components. Konrad et al. [16,17] describe research on how the principle of design patterns can be applied to requirements specifications, which they term requirements patterns for
embedded systems. They include a constraints field in
the pattern template to show the functional and
non-functional restrictions that are applied to the sys-
tem.
In comparison to our work, none of the aforementioned approaches clearly shows the implications on the non-functional requirements as part of the pattern.
These patterns and the other developed patterns focus
on the traditional structure of the pattern that includes:
context, problem and solution. The use of non-func-
tional requirements in these approaches is restricted to
the requirements analysis phase of the design process. In these approaches, neither a relative measure nor an indication of the implications of the patterns on the non-functional requirements is given. To improve
these approaches, we propose a new template repre-
sentation in Section 4 to show the implications of the
represented patterns on the non-functional require-
ments.
3. Design Pattern Template
In this section, the pattern template we propose for the representation of design patterns for safety-critical embedded applications is described. As depicted in Figure 1,
the upper part of the template includes the traditional
representation of a design pattern while a listing of the
pattern implications on the non-functional requirements
is given in the Implication section. Moreover, further
support is given by stating implementation issues, summarizing the consequences and side effects, and listing related patterns.
Figure 1. The design pattern template
Figure 2. Basic system without specific safety requirements
The proposed design template includes a part for pat-
tern implications on the non-functional requirements reli-
ability, safety, cost, modifiability and execution time. To
allow a suitable description of these implications, the
changes/improvements of using the corresponding design
pattern are represented relative to a basic simple system
(see Figure 2). This basic system has a given reliability
(Rold), a given cost, and a given modifiability, and it results in a given execution time. Moreover, this basic system has no specific safety measures.
4. The Implications on Non-Functional Requirements
While the main part of the design pattern proposed does
not differ from well known approaches [18,19,20,21], the
part for the implications on the non-functional require-
ments is described in this section. As mentioned above,
the implications are stated relative to the basic system
without any specific safety method. In the following, the
determination of the five implications on non-functional
requirements is described:
Reliability: In this context, reliability is defined as the probability that a system or component performs its required functions correctly under stated conditions for a specified period of time. This part of the implications de-
scribes the relative improvement in the system’s reliabil-
ity relative to the maximum possible improvement2 in reli-
ability, which is defined in the following equation:
$$\text{Reliability Improvement} = \frac{R_{new} - R_{old}}{1 - R_{old}} \times 100\% \qquad (1)$$
Rnew: The reliability after using this pattern.
Rold: The reliability of the basic system.
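As an illustration only (not part of the pattern template), Equation (1) can be evaluated as in the following sketch; the function name and the example values are ours.

```python
def reliability_improvement(r_new: float, r_old: float) -> float:
    """Relative reliability improvement of Equation (1), in percent.

    The improvement is measured against the maximum possible
    improvement, i.e. the gap between the old reliability and 1.
    """
    if not (0.0 <= r_old < 1.0 and 0.0 <= r_new <= 1.0):
        raise ValueError("reliabilities must lie in [0, 1] with r_old < 1")
    return (r_new - r_old) / (1.0 - r_old) * 100.0


# Example: raising the reliability from 0.990 to 0.998 exploits 80% of
# the maximum possible improvement.
print(reliability_improvement(0.998, 0.990))  # ~80.0
```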
Safety: The safety of a system is usually determined by
the residual risk of operating this system (see e.g. [3]).
Therefore, the notion of risk can be used as a measure for
the assessment of safety-critical systems. The problem
concerning design patterns is that they describe an ab-
stract solution to a commonly recurring design problem.
As it is not related to a specific application or to a spe-
cific case, it is difficult to determine an actual value for
the possible residual risk without considering a concrete
application. To allow an indication of the safety that can
be achieved by the application of a specific design pattern,
existing recommendations given in safety standards are
used. In detail, it is stated to which Safety Integrity Level
(SIL) the pattern is recommended in a given safety stan-
dard. The safety integrity levels used here include the
levels SIL1 to SIL4 as they are defined in the standard
IEC61508 [3]. Additionally, the notation SIL0 is used in
this template to describe a system without specific safety
requirements. If measures are described in design pat-
terns that are not included in current safety standards,
these measures have to be assessed in an appropriate
manner, e.g. by comparing them to measures with known
recommendations.
Cost: The implications on costs include:
·The recurring cost per unit, which reflects the addi-
tional costs resulting from additional or specific hardware
components required by the design pattern.
·The development cost of applying this pattern.
Modifiability: This implication describes the degree to
which the system developed according to this design pat-
tern can be modified and changed.
Impact on execution time: With this implication, the
effect of the pattern on the total time of execution at run-
time is indicated. It shows the execution time overhead that results from the application of this pattern in the worst and the average cases.
The application of the proposed design pattern, especially the use of the implication part introduced briefly in this section, is described in the form of two example patterns in the following section.
5. Example Patterns
Two example patterns are presented in this section to
illustrate the application of the proposed approach: The
first pattern is a hardware and software pattern that is
expected to be suitable for complex and highly safety-
critical systems (Safety Executive Pattern). The second
pattern is a hybrid3 software fault tolerance method in-
tended to increase the reliability of the standard N-ver-
sion programming approach (Acceptance Voting Pat-
tern).
5.1 EXAMPLE 1
In this example, the pattern originally described in [10] is presented in our extended pattern representation, which also includes the implications on non-functional requirements.
Pattern Name
Safety Executive Pattern (SEP)
Other Names
Safety Kernel Pattern
2Note: the maximum possible improvement in reliability is the differ-
ence between the reliability of the basic system and the maximum value
for reliability which is equal to 1 (Ideal case without failures).
3Note: A pattern is called hybrid if the pattern is composed of at least two other patterns.
Type
Hardware and Software
Abstract
The Safety Executive Pattern can be considered as an
extension of the Watchdog Pattern4 targeting the problem
that a shutdown of the system by the actuation channel
itself might be critical in the presence of faults (shutdown
might fail or take too long). This problem occurs espe-
cially in those systems in which a complicated series of
steps involving several components is necessary to reach
a fail-safe state. Therefore, the Safety Executive Pattern
uses a watchdog in combination with an additional safety
executive component, which is responsible for the shut-
down of the system as soon as the watchdog sends a
shutdown signal (see also Figure 3). If the system has a safe state, the actuation channel is shut down via the safety executive component. Otherwise, the safety executive component has to delegate all necessary actuations to an additional fail-safe processing channel.
Context
The application of this pattern is suitable in the following
context:
·The considered actuation channel requires a risk re-
duction by safety measures.
·The considered system has at least one safe state. If this is not the case, an additional fail-safe processing channel has to be applied to take over the necessary actions.
·A shutdown of the actuation channel is complex. As
an example, this is the case if several safety-related sys-
tem actions have to be controlled simultaneously.
·A sufficient detection of failures in the actuation channel can be achieved by a watchdog.
Problem
Provision of a centralized and consistent method for
monitoring and controlling the execution of a complex
safety measure (shutdown or switch over to redundant
unit in case of failures).
Pattern Structure
The Safety Executive Pattern is based on an actuation
channel to perform the required functionality and an op-
tional fail-safe processing channel that is dedicated to the
execution and control of the fail-safe processing (see also
Figure 3). The central part of this pattern is the existence
of a centralized safety executive component coordinating
all safety-measures required to shut down the system or
to switch over to the fail-safe processing channel. The safety executive component can also be used to control multiple actuation channels in the system.
The components of the pattern depicted in Figure 3 are described below; a minimal sketch of their interplay is given after this list:
·Input Data Source: This component represents the
source of information that is used as input to the system
under consideration. Typically, this data comes either
from the system user or from external sensors that are
used to monitor environmental variables such as temperature, pressure, speed, or light.
·Actuator(s): Actuators are the physical devices that
perform the action of the channel, such as motors, switches, heaters, signals, or any other device that performs a specific action. Often, there is more than one actuator in a single channel.
·Actuation Channel: This channel represents a sub-
system that performs dedicated tasks in the overall system
by taking input data from the input data source, performing some transformation on this data, and then using the results to generate suitable commands to drive the actuators.
·Fail-Safe Processing Channel: This is an optional
channel; it is dedicated to the execution and control of the
fail-safe processing. In the presence of a fault in the ac-
tuation channel, the safety executive turns off the actua-
tion channel, and the fail-safe channel takes over. If the
system does not have a fail-safe channel, then the actuation channel must have at least one reachable safe state.
·Data Acquisition (Input Processing): This part of the
channel collects the raw data from the input data source and
may perform some data formatting or transformations.
·Data Processing (Transformation): This part may
contain multiple data transformation components, where
each one performs a single transformation or processing
on the received data to execute the desired algorithm in
order to generate the required control signals. The final
component of this part sends the computed output to the
output processing unit.
·Output Processing: This unit takes the computed data
from the data transformation unit and generates the final
data and the control signals to drive the actuators. It can
be considered as a device driver for the actuator.
·Integrity Check (Optional): This is an optional
component that is invoked by the watchdog to run a pe-
riodic Built-In Test (BIT) to verify all or a portion of the
internal functionality of the actuation channel.
·Time Base: This is an independent timing source
(timing circuit) that is used to drive the watchdog. This
time source has to be separate from the one used to drive
the actuation channel.
·Watchdog (WD): The watchdog receives liveness
messages (strokes) from the components of the actuation
channel in a predefined timeframe. If a stroke comes too
late or out of sequence, the watchdog considers this situa-
tion as a fault in the actuation channel and it issues a
shutdown signal to the actuation channel or initiates a
4Note: The Watchdog Pattern is based on two components, the actuation channel and a supervisor, called the watchdog. The actuation channel typically triggers the watchdog at defined time intervals to demonstrate that the actuation channel is still active. More advanced approaches include more sophisticated interactions between the actuation channel and the watchdog to allow a higher degree of fault coverage (see e.g. [11]).
corrective action through sending a command signal to
the optional integrity check. If the system contains multi-
ple actuation channels, then it may contain multiple
watchdogs, one per actuation channel.
·Safety Executive: This is the main component in this
safety executive pattern. It tracks and coordinates all
safety monitoring to ensure the execution of safety actions
when appropriate. It consists of a safety coordinator that
controls safety measures and safety policies. The safety
executive component captures the shutdown signal from the
watchdog in the case of failure in the actuation channel.
·Safety Coordinator: The safety coordinator is used to
control and coordinate the safety processing that is man-
aged by the safety measures. It also executes the control
algorithms that are specified by the safety policies.
·Safety Measures: Include the detailed description of
the safety measures. The safety coordinator may control
multiple safety measures.
·Safety Policies: Each safety policy specifies a strat-
egy or control algorithm for the safety coordinator. It may involve a complicated sequence of steps involving multiple safety measures. The reason for the separation of the
coordinator from the safety policies is to make the process
of changing and adapting a safety policy easier.
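To make the interplay of the listed components more concrete, the following is a minimal, heavily simplified sketch of the pattern's control flow. It is our illustration, not code prescribed by the pattern: class and method names are hypothetical, and a real system would run the watchdog, the safety executive, and the channels on separate processing units with independent time bases.

```python
import time


class Watchdog:
    """Supervises the actuation channel via periodic liveness strokes."""

    def __init__(self, timeout_s: float):
        self.timeout_s = timeout_s
        self.last_stroke = time.monotonic()

    def stroke(self) -> None:
        # Called by the actuation channel to signal liveness.
        self.last_stroke = time.monotonic()

    def expired(self) -> bool:
        return time.monotonic() - self.last_stroke > self.timeout_s


class ActuationChannel:
    """Performs the actual control task and strokes the watchdog."""

    def __init__(self, watchdog: Watchdog):
        self.watchdog = watchdog
        self.active = True

    def cycle(self, input_data) -> None:
        # Data acquisition -> data processing -> output processing (elided).
        self.watchdog.stroke()

    def shutdown(self) -> None:
        self.active = False  # drive the actuators into the safe state


class SafetyExecutive:
    """Central component coordinating the safety measures."""

    def __init__(self, watchdog, actuation_channel, fail_safe_channel=None):
        self.watchdog = watchdog
        self.actuation_channel = actuation_channel
        self.fail_safe_channel = fail_safe_channel

    def supervise(self) -> None:
        # Invoked periodically, driven by an independent time base.
        if self.watchdog.expired():
            self.actuation_channel.shutdown()
            if self.fail_safe_channel is not None:
                # No reachable safe state: delegate to the fail-safe channel.
                self.fail_safe_channel.take_over()
```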
Implication
This section describes the implication of this pattern rela-
tive to the basic system without a specific safety method.
·Reliability
Let us use the following notation:
RAC: the reliability of the actuation channel. (Rold =
RAC)
RSC: the reliability of the fail-safe processing channel.
RSE: the reliability of the safety executive component.
C: the coverage factor, defined as the probability that a fault in an actuation channel will be identified by the safety executive and the fail-safe processing channel will be activated.
Assume that the watchdogs are carefully designed and have a reliability of 1.
The safety executive pattern will continue to work
without system failure as long as one of the following
two conditions holds:
- There is no fault in the actuation channel.
- There is a fault in the actuation channel, the watchdog detects this fault, and the safety executive initiates a shutdown or activates the fail-safe processing channel.
The new reliability after using this pattern (Rnew) is
equal to:
$$R_{new} = R_{AC} + (1 - R_{AC}) \cdot C \cdot R_{SE} \cdot R_{SC} \qquad (2)$$
In this equation, the first term represents the reliability
of the actuation channel while the second term represents
the reliability of the remaining parts in the case of failure
in the actuation channel.
Figure 3. The safety executive pattern
The percentage improvement in reliability relative to
the maximum possible improvement is equal to:
$$\frac{R_{AC} + (1 - R_{AC}) \cdot C \cdot R_{SE} \cdot R_{SC} - R_{AC}}{1 - R_{AC}} \times 100\% \qquad (3)$$
$$= C \cdot R_{SE} \cdot R_{SC} \times 100\% \qquad (4)$$
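As a hedged numerical illustration of Equations (2) to (4), the following sketch uses example values chosen by us (they are not taken from the pattern description):

```python
def sep_reliability(r_ac: float, r_se: float, r_sc: float, c: float) -> float:
    """New system reliability of the Safety Executive Pattern, Equation (2)."""
    return r_ac + (1.0 - r_ac) * c * r_se * r_sc


# Illustrative values: the relative improvement of Equations (3)/(4)
# reduces to C * R_SE * R_SC.
r_ac, r_se, r_sc, c = 0.95, 0.999, 0.99, 0.98
r_new = sep_reliability(r_ac, r_se, r_sc, c)
improvement = (r_new - r_ac) / (1.0 - r_ac) * 100.0
print(round(improvement, 2), round(c * r_se * r_sc * 100.0, 2))  # both ~96.92
```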
·Safety:
The safety executive pattern includes the following
four design techniques: program sequence monitoring
with a watchdog, test by redundant hardware (the watch-
dog that initiates the integrity check and BITs), safety
bag techniques5, and graceful degradation6. According to
the standard IEC 61508 [3], the recommendations for these techniques are shown in Table 1.
In general, we think that the combination of these
techniques and the development cost makes the safety
executive pattern suitable and highly recommended only for highly critical applications with high safety integrity levels (SIL4 and SIL3), and recommended for lower levels (SIL2 and SIL1).
·Cost:
This is an expensive pattern with very high cost, since it consists of several components that involve high recurring and development costs.
Recurring cost: It includes the cost of the following:
The actuation channel.
The fail-safe processing channel (if present).
The safety executive component.
Watchdogs and their independent timing source.
Development cost: In general, the development cost
for this pattern is very high since it includes a develop-
ment of three different systems (channels) that include
different architectures and different designs.
·Modifiability:
There are two types of possible modifications:
1) Actuation Channel: It is very simple to modify this
pattern by adding extra functionality to the actuation
channel. The only thing that has to be determined is whether the new components need to send stroke messages to the watchdog.
2) Safety policy: One of the main features of this pat-
tern is the centralized safety processing which is per-
formed by the Safety Executive Component. The Safety
Executive separates the coordinator from the safety policies to simplify the change and modification of the safety policy.
Table 1. Recommendations for safety integrity levels
Techniques                          SIL1  SIL2  SIL3  SIL4
Program sequence monitoring (WD)    HR    HR    HR    HR
Test by redundant hardware          R     R     R     R
Safety bag techniques               -     R     R     R
Graceful degradation                R     R     HR    HR
·Impact on execution time:
The actuation channel and the safety executive have
different CPUs and different memories, and they run si-
multaneously in parallel. Thus, the safety executive component has no effect on the actuation channel during normal operation of the system, except for the execution of the periodic built-in tests (BITs).
Implementation
The following points should be taken into consideration
during the implementation of this pattern:
·The actuation channel, the safety executive, and the
fail-safe processing channel run separately in parallel,
therefore each channel will run on its own processing
unit7 and own memory.
·The safety-critical information must be protected against data corruption, e.g. by using CRCs or any other method to detect data errors (a minimal CRC sketch is given after this list).
·The watchdog component is simple and often im-
plemented as a separate hardware device. It is capable of detecting a variety of hardware and software faults. How-
ever, its actual diagnostic coverage depends on the integ-
rity check implemented in the actuation channel.
·To provide protection from faults in a common time
base, separate timing sources must be used for the
watchdog, the safety executive and the actuation channel.
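For the data-corruption point above, a minimal illustration of protecting a safety-critical record with a CRC is given below; it relies on Python's standard zlib.crc32, and the record layout is hypothetical.

```python
import zlib


def protect(payload: bytes) -> bytes:
    """Append a CRC-32 checksum to a safety-critical record."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def check(record: bytes) -> bytes:
    """Return the payload if the checksum matches, otherwise raise an error."""
    payload, crc = record[:-4], int.from_bytes(record[-4:], "big")
    if zlib.crc32(payload) != crc:
        raise ValueError("data corruption detected")
    return payload


assert check(protect(b"setpoint=17")) == b"setpoint=17"
```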
Consequences and Side Effects
The main drawback of this pattern is its high implementation complexity. Therefore, it is primarily used for complex and highly safety-critical systems.
Related Patterns
The safety executive pattern is used for complex safety-critical applications, and it covers a large set of features provided by other patterns, such as the sequence monitoring provided by a watchdog or the switch-to-backup behavior of the fail-safe channel. For simpler systems with simpler safety requirements, simpler patterns such as the Watchdog pattern, the Sanity Check pattern, and the Monitor-Actuator pattern [11] can be used.
5.2 EXAMPLE 2
This example presents the pattern originally described in [22], including the standard pattern components as well as the implications on non-functional requirements.
Pattern Name
Acceptance Voting Pattern
Other Names
5Note: A safety bag is an external monitor, implemented on an independent hardware component, to ensure that the system performs safe actions [3].
6Note: Graceful degradation is a technique that gives priorities to the various functions to be carried out by the system. The design ensures that if there are insufficient resources to carry out all the system functions, the higher-priority functions (the safety functions in our case) are carried out in preference to the lower ones [3].
7Note: a processing unit is a (programmable) electronic device execut-
ing a certain function. Typical examples are programmable logic con-
trollers, microcontrollers, and FPGAs. Moreover, an electronic device
might include more than one processing unit (e.g. multi-core architec-
tures). In this case, the analysis of the independence between these units
requires special care.
---
Type
Software
Abstract
The Acceptance Voting pattern is a hybrid pattern that
incorporates the N-Version Programming Pattern with
Acceptance Tests (AT) used by the Recovery Block Pat-
tern. Similar to the normal N-version programming ap-
proach, this pattern is based on two or more diverse
software versions. Traditionally, these versions are func-
tionally equivalent and are developed by independent
teams from the same initial specification [23]. Moreover,
approaches to increase the diversity of the resulting soft-
ware versions could be applied already in the specifica-
tion (functional diversity, see e.g. [40]). In the case of this
pattern, the output of each version is presented to an ac-
ceptance test to determine if the output is reasonable. The
outputs that pass the acceptance test are used as inputs to
a dynamic voter, which is executed to produce the correct
output according to a specific voting scheme. Depending on the application, the diverse software versions can be
executed on parallel hardware or sequentially on a single
hardware device.
Context
The application of this pattern is suitable in the following
context:
·Tolerance of software faults is required for safety
reasons (acceptance test)
·High reliability of the system’s output is required
(several software versions)
· The correctness of the results delivered by the diverse
software versions can be checked by an acceptance test.
·The development of diverse software versions is
possible (additional development costs, additional organ-
izational effort, and sufficient number of developers for
the development of each version).
Problem
Enable the tolerance of software faults that may remain
after the software development to target safety and reli-
ability requirements.
Pattern Structure
The Acceptance Voting Pattern (AV) is a hybrid pattern,
which extends the N-version programming approach by combining it with the acceptance test used in the recovery block approach. This
pattern includes N diverse software versions that are
typically executed in parallel to perform the required task.
The output of each version is tested for correctness using
an acceptance test and only those results that pass the
acceptance test are used by the voting algorithm to gen-
erate the final result. The goal of the Acceptance Voting
pattern is to increase the system’s reliability through a
combination of a fault detection scheme provided by the
acceptance test and a fault masking scheme provided by
N-version programming with voting.
The structure of this pattern is shown in Figure 4 and
the function of each component is described below:
·Input Data Source: This instance represents the
source of information that is used as input to the designed
system. Typically, this data comes either from a system user
or from external sensors used to monitor environmental
variables such as temperature, pressure, speed, or light.
·Output Data and Control Signals: The output data of
the voter module represents the final output data of the
designed system. This data may contain control signals to activate actuators such as motors, switches, and heaters, or messages for other components outside the system.
·Version 1, 2...N: These are diverse software versions
implemented to fulfill the specified functionality. Typi-
cally, these versions result from an independent imple-
mentation by independent teams of software developers,
based on the same initial specification. Thus, these versions perform roughly the same functionality on the input data to produce the final result. Further aspects
of generating these diverse software versions can be
found in the implementation section below. Usually,
these diverse versions are executed in parallel on differ-
ent hardware devices to generate N outputs and each of
these outputs is presented to an acceptance test to check
them for correctness. Those results that pass the acceptance
test are processed by the dynamic voter module to deter-
mine the output data by applying a specific voting strategy.
Figure 4. The acceptance voting pattern
·Acceptance Test (AT): This part of the software is
executed on the outcome of each version to confirm that
the result is reasonable and fulfills defined requirements
given in the software specification. The acceptance test
returns either true or false and may have several compo-
nents. Moreover, it may include checks for runtime errors
[24] and mechanisms for implicit error detection. Various
implementations of the acceptance test are possible rang-
ing from simple reasonableness checks to complex high-
coverage validators [25].
The reliability of the system's output depends greatly on the quality of the acceptance test (in particular, an acceptance test that falsely reports faults in correct outputs reduces the system's reliability). Thus, this test should be carefully designed, and it is desirable that the acceptance test is simple as well as easy to verify (a minimal sketch of the acceptance test and the dynamic voter is given after this component list).
·Dynamic Voter: The voter reads the outputs that pass
the acceptance test and uses these results as inputs to the
voting algorithm in order to determine the final output
and control signals. The voter that is used in this pattern
should be dynamic8 due to the variable number of inputs
that ranges from 0 to N. Depending on the number of
outputs that pass the acceptance test, the voter may in-
clude the following different actions:
- In the case when no output passes the acceptance test,
it reports an overall system failure.
- In the case when one output passes the acceptance
test, it just forwards this output.
- In the case when two outputs pass the acceptance test,
the signal is only forwarded if both are equal (or the dif-
ference is within a defined tolerance). In the case of ine-
quality, the action depends on the required level of safety
and reliability, either an output is selected according to a
predefined order or an exception is raised to indicate a
failure.
- When the number of outputs that pass the acceptance
test is more than two, a voting technique is executed to
generate the final result.
Several voting techniques exist that can be used for voting, such as majority voting (the most commonly used technique), consensus voting [26], and maximum likelihood
voting [27]. The selection of these techniques depends
on the type of data, the deviation in the outputs of the
versions, the type of agreement [28], the output space
cardinality size9, the functionality of the voter [29], the
reliability of the different versions, and perhaps even
further factors.
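The acceptance test and the dynamic voter described above can be sketched as follows. This is a simplified illustration under assumptions of our own: numeric outputs, a plausibility range as the acceptance test, a tolerance for the two-output case, and an exception to signal overall failure.

```python
from collections import Counter


def acceptance_test(output, lower=0.0, upper=150.0) -> bool:
    """Simple reasonableness check: True if the output is acceptable.

    A real acceptance test may combine several checks (range, rate of
    change, timing, implicit runtime-error detection).
    """
    return output is not None and lower <= output <= upper


def dynamic_vote(outputs, tolerance=1e-6):
    """Dynamic majority voter over the outputs that passed the acceptance test."""
    passed = [x for x in outputs if acceptance_test(x)]
    n = len(passed)
    if n == 0:
        raise RuntimeError("overall system failure: no output passed the test")
    if n == 1:
        return passed[0]
    if n == 2:
        a, b = passed
        if abs(a - b) <= tolerance:
            return a
        raise RuntimeError("disagreement between the two remaining outputs")
    # n > 2: majority voting on (rounded) outputs.
    counts = Counter(round(x, 6) for x in passed)
    value, votes = counts.most_common(1)[0]
    if votes >= n // 2 + 1:  # agreement number m = (n+1)/2, i.e. a strict majority
        return value
    raise RuntimeError("no majority among the outputs that passed the test")


# Example: one version delivers an implausible value and is masked out.
print(dynamic_vote([42.0, 42.0, 9999.0]))  # 42.0
```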
Implication
This section describes the implication of this pattern us-
ing a majority voting approach relative to the basic sys-
tem without any specific safety method.
·Reliability:
Let us use the following notation:
f: the probability of failure in a version due to a bug in its implementation.
E: the event that the output of a version is erroneous ($P\{E\} = f$).
T: the event that the acceptance test reports that the output is wrong.
N: the total number of different independent versions.
n: the number of versions that pass the acceptance test.
m: the agreement number, which is equal to $(n+1)/2$ for majority voting.
PTP: the probability that a version will pass the acceptance test, given that the outcome of the version is correct (true positive case): $P_{TP} = P\{\bar{T} \mid \bar{E}\}$.
PFN: the probability that a version will fail the acceptance test, given that the outcome of the version is correct (false negative case): $P_{FN} = P\{T \mid \bar{E}\} = 1 - P_{TP}$.
PTN: the probability that a version will fail the acceptance test, given that the outcome of the version is wrong (true negative case): $P_{TN} = P\{T \mid E\}$.
PFP: the probability that a version will pass the acceptance test, given that the outcome of the version is wrong (false positive case): $P_{FP} = P\{\bar{T} \mid E\} = 1 - P_{TN}$.
Rold: the reliability of system software with single ver-
sion (Rold = R).
Rnew: the reliability of system software with acceptance
voting pattern.
Assumptions:
- The voter is carefully designed and can be consid-
ered as fault free.
- The majority voting technique is used in the voter
software component.
- The failures in the different versions are statistically independent10, and the different versions have the same probability of failure (f) and the same reliability (Ri = R).
At any given time, if the number of versions' outputs that pass the acceptance test and participate in the voting is n, then these outputs can be grouped into two parts:
- Correct outputs (true positive outputs that pass the AT), with probability $R \cdot P_{TP}$.
- Incorrect outputs (false positive outputs that contain undetected faults), with probability $(1-R) \cdot (1-P_{TN}) = (1-R) \cdot P_{FP}$.
The probability that an output passes the test is equal to:
$$P\{\bar{T}\} = R \cdot P_{TP} + (1-R) \cdot P_{FP} \qquad (5)$$
8Note: A voter is considered as dynamic if it accepts varying numbers of input signals.
9Note: The cardinality size of an output space is the number of possible
different values for an output.
10Note: While the assumption of failure independence is not realistic for practical software implementations, this assumption eases the calculation presented. The simplified calculation presented here already allows certain reliability evaluations. Moreover, a dependency term could be included into the calculations if dependencies between the software versions should be considered explicitly. Further aspects of software diversity can be found in the implementation section of this pattern description.
The probability that an output does not pass the test is
equal to:
$$P\{T\} = R \cdot P_{FN} + (1-R) \cdot P_{TN} \qquad (6)$$
The probability that the voter gives a correct output, given that n outputs passed the test, is equal to:

$$\sum_{i=m}^{n} \binom{n}{i} \left[ R \cdot P_{TP} \right]^{i} \left[ (1-R) \cdot P_{FP} \right]^{n-i} \qquad (7)$$

The probability that n outputs from the total number of outputs pass the acceptance test and give a correct result in the majority voting is:

$$\binom{N}{n} \left\{ \sum_{i=m}^{n} \binom{n}{i} \left[ R \cdot P_{TP} \right]^{i} \left[ (1-R) \cdot P_{FP} \right]^{n-i} \right\} \left[ R \cdot P_{FN} + (1-R) \cdot P_{TN} \right]^{N-n} \qquad (8)$$

The number of versions n that pass the acceptance test and produce a correct result can be 1, 2, ..., N. Therefore, the new reliability after using this pattern (Rnew) is equal to:

$$R_{new} = \sum_{n=1}^{N} \binom{N}{n} \left\{ \sum_{i=m}^{n} \binom{n}{i} \left[ R \cdot P_{TP} \right]^{i} \left[ (1-R) \cdot P_{FP} \right]^{n-i} \right\} \left[ R \cdot P_{FN} + (1-R) \cdot P_{TN} \right]^{N-n} \qquad (9)$$

Finally, the percentage improvement in software reliability relative to the maximum possible improvement is equal to:

$$\frac{R_{new} - R_{old}}{1 - R_{old}} \times 100\% = \frac{R_{new} - R}{1 - R} \times 100\% \qquad (10)$$
As shown in Equations (9) and (10), the reliability im-
provement in this pattern depends on the reliability and
number of versions N, and on the performance and the
effectiveness of the acceptance test used. The acceptance
test should be carefully designed, reasonably simple,
highly reliable, and with a high error detection coverage
in order to mask the faulty outputs from participating in
the voting step.
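A hedged sketch of Equation (9) is given below; it assumes the stated independence and majority-voting assumptions, and the example parameter values are ours.

```python
from math import comb


def av_reliability(N: int, R: float, p_tp: float, p_tn: float) -> float:
    """New software reliability of the Acceptance Voting pattern, Equation (9).

    Assumes statistically independent version failures, identical version
    reliability R and majority voting; P_FP = 1 - P_TN and P_FN = 1 - P_TP
    follow from the definitions above.
    """
    p_fp, p_fn = 1.0 - p_tn, 1.0 - p_tp
    pass_correct = R * p_tp                  # passes the AT and is correct
    pass_wrong = (1.0 - R) * p_fp            # passes the AT but is wrong
    rejected = R * p_fn + (1.0 - R) * p_tn   # rejected by the AT, Equation (6)
    r_new = 0.0
    for n in range(1, N + 1):
        m = n // 2 + 1                       # agreement number (strict majority)
        inner = sum(comb(n, i) * pass_correct**i * pass_wrong**(n - i)
                    for i in range(m, n + 1))
        r_new += comb(N, n) * inner * rejected**(N - n)
    return r_new


# Illustrative values: three versions, R = 0.9, and a good acceptance test.
print(round(av_reliability(3, 0.9, p_tp=0.99, p_tn=0.95), 4))
```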
·Safety:
The presented pattern includes the concepts of diverse
programming and fault detection with acceptance test and
voting. According to the software requirements in the
standard IEC 61508-3 [3], the recommendations for these
techniques are shown in Table 2.
According to Table 2, we think that this pattern is suitable and highly recommended only for highly critical applications with high safety integrity levels (SIL4 and SIL3), recommended for the lower level SIL2, and given no recommendation for SIL1.
Table 2. Recommendations for safety integrity levels
Techniques                                        SIL1  SIL2  SIL3  SIL4
Diverse programming                               R     R     R     HR
Fault detection and diagnosis (Voting)            -     R     HR    HR
Fault detection and diagnosis (Acceptance Test)   -     R     HR    HR
·Cost:
In comparison to the basic system, this pattern results in high additional costs. These costs can be divided into two parts.
- Recurring cost: includes the cost for the N different
hardware units that are used for the parallel execution of
the N-version software. So, the recurring cost will be N times (N*100%) compared to the recurring cost of the basic system, which includes a single version. In this pattern,
the voter and the acceptance test are implemented in
software. Therefore, this pattern includes additional re-
curring cost for the used memory.
- Development cost: The development of N diverse
software versions will cost more than the development of
single version software. An estimation of the develop-
ment cost of N-version software has to consider the fol-
lowing aspects:
The N versions have the same specification, and only one specification has to be developed [30].
The cost for developing N versions prior to the veri-
fication and the validation phase is N times the cost for
developing a single version [31].
The management of an N-version project imposes
overhead not found in traditional software development
[30].
The different versions can be used to validate each
other [30]. While this approach could be used to decrease
the cost for using verification and validation tools, it is
not recommended as all versions implemented might
contain a similar or even the same fault.
Exact information about the additional costs of creating N versions instead of a single version is limited. Practical estimates of the development cost of multiversion systems showed that the costs increase sub-linearly with the number of components [32]. Moreover, it is stated in
[33] that each additional version costs about 75-80% of a
single version.
In addition to the previous costs, this pattern includes
extra cost for developing and verifying an effective ac-
ceptance test.
·Modifiability:
The following possible modifications have to be con-
sidered:
1) Modification of a single version: It is possible to
modify a single version either to remove a newly discov-
ered fault or to improve a poorly programmed function
[34]. In this case, the initial specification remains without
any modification and the modification of this version is
similar to the modification of single version software
following a standard fault removal procedure.
2) Modification of all member versions: The reason for
modification of all N versions is either to add a new
functionality or to improve the overall performance of the
N-version software [34]. In this case, the initial specifica-
tion has to be modified and all N versions must be modi-
fied and tested independently by independent teams. In
general, the modification of N-version software is re-
markably more difficult than the modification of single
version software.
3) Modification of the acceptance test (AT): The ac-
ceptance test can be considered as an independent soft-
ware module that is checking the output of each of the N
versions. Thus, this acceptance test can be easily modi-
fied without any influence on the different versions.
4) Modification of the voter: The separation of the
voting module from the N versions and the acceptance
test allows easy modification or changes of the voting
technique.
·Impact on Execution Time:
The diverse software versions in this pattern are exe-
cuted in parallel, ideally on N independent hardware de-
vices. As the execution times of these software versions might differ because they are implemented differently, the
voter has to wait for the outputs of all software versions
to be checked by the acceptance test before applying the
voting algorithm. Thus, the total time of execution is de-
termined by the slowest version in addition to the typi-
cally relatively small time to execute the acceptance test
and the voting algorithm. In general, if we can neglect the execution time of the acceptance test and the voter, then the execution time of this pattern is approximately equal to that of single-version software.
It is also possible to execute the independent versions
followed by the acceptance test and voting algorithm
sequentially on a single hardware device. However, the execution time will then increase to about N times that of a single version.
This disadvantage11 makes the sequential execution less
attractive, especially for time critical applications.
Implementation
The acceptance voting pattern is a hybrid pattern that
combines the idea of N-version programming and fault
detection using an acceptance test. Therefore, the success
of this pattern depends on three factors:
1) The quality of the acceptance test is an important
factor. Thus, the acceptance test should be carefully de-
signed to detect most of the possible software faults.
2) The N diverse software versions, and especially the level of diversity between these versions, which is needed to avoid common failures between different versions. In order to in-
crease the level of diversity and the independence of the
designed versions, the following have been recommended
in [30]:
·The use of a complete, correct, and carefully docu-
mented specification to prevent an error in the specifica-
tion from propagating to the different versions.
·The use of independent and isolated teams of pro-
grammers with diversity in their training and experience.
·The use of diverse algorithms and diverse imple-
mentation techniques.
·The use of diverse programming languages.
·The use of diverse compilers, development tools, and
test methods.
With respect to N-version programming, it has to be
noted that experiments have shown that developers who implement the same function independently tend to make the same faults. For this reason, the assumption of statis-
tically independent failure behavior of the software ver-
sions does not hold [35,36]. Approaches modeling this
dependency structure (e.g. [37]) and corresponding em-
pirical studies [38,39] are known which allow certain
(model-based) predictions of failure probabilities in sys-
tems built on N-version programming. The measures
presented above in this section try to decrease the de-
pendencies between the different software versions. The
assumption is that different development methodologies lead to diversity in design decisions and thus to diversity in the behavior of the resulting software. However, even with this
increased effort, the absence of undesired dependencies
between the diverse software versions cannot be guaran-
teed [36,40,41]. For this reason, it is recommended to
apply N-version programming in combination with further software fault-tolerance measures, such as the acceptance test applied in this pattern.
3) The use of a suitable voting technique to implement
the voting component such as:
·Majority Voting: This is the simplest and most commonly used method; it selects the output on which at least $(n+1)/2$ of the variant results agree.
·Plurality Voting (PV): This is a simple voter that implements m-out-of-n voting, where m is less than a strict
majority.
·Consensus Voting (CV) [26]: This voting method is
used for multiversion software with a small output space.
In this method, the result with the largest agreement number is chosen as the correct output.
·Maximum Likelihood Voting (MLV) [27]: In this
method, the voter uses the reliability of each software
version to make a more accurate estimation of the most
likely correct result.
·Adaptive Voting [42]: This technique introduces an
individual weighting factor to each version which is later
included in the voting procedure. These weighting factors
are dynamically changeable to model and manage differ-
ent quality levels of versions.
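As a small, hedged illustration of weight-based selection (the core step of adaptive voting; the dynamic adaptation of the weights themselves is not shown), consider:

```python
def weighted_vote(outputs, weights):
    """Select the output value with the largest total weight of supporters."""
    totals = {}
    for value, weight in zip(outputs, weights):
        totals[value] = totals.get(value, 0.0) + weight
    return max(totals, key=totals.get)


# The deviating third version carries a low weight, so its output loses.
print(weighted_vote([42, 42, 17], [0.9, 0.8, 0.4]))  # 42
```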
Consequences and Side Effects
Similar to the original N-version programming approach,
the drawbacks of the Acceptance Voting Pattern are seen
in the effort of developing N diverse software versions in
addition to the high dependency on the initial specifica-
tion which may propagate faults to all versions. With
respect to safety, the problem of dependent faults in all N
software versions is less critical in this pattern than in the
original N-version programming approach, as the accep-
tance test included represents an additional measure to
detect these faults.
11Note: Another disadvantage is that the execution on a single hardware
can tolerate only a few hardware faults (certain transient faults), while the
approach on N different hardware devices can tolerate most transient
and permanent hardware faults. However, even in this case faults have
to be considered that could affect all N versions simultaneously.
Related Patterns
In comparison to the basic system, the Acceptance Vot-
ing Pattern allows improvements in the reliability and the
safety of a software based system. As it is executed on
different hardware devices, it is possible to combine this
pattern with the Heterogeneous Design Pattern for the
design of these diverse hardware units to deal with sys-
tematic hardware faults. This combination will improve
the reliability and safety of the hardware as well as the
software.
6. Conclusions
The design of safety-critical embedded applications re-
quires an integration of the commonly used software and
hardware design methods. Therefore, the use of design patterns is very promising in this application domain, if
the specific properties of embedded systems are consid-
ered in the pattern representation. In this paper, we pro-
posed an extended pattern representation for the design of
safety-critical embedded applications. This representation
focuses on the implications and side effects of the repre-
sented design method on the non-functional requirements
of the safety-critical embedded system including safety,
reliability, modifiability, cost and execution time. Two
example patterns have been used to show the effective-
ness of the proposed pattern representation. We expect
that this extended representation will guide the selection
of a suitable design as it allows evaluating alternative
patterns with respect to their implications.
7. Future Work
For a successful application of the proposed represen-
tation of design patterns for safety-critical embedded
systems, an integration of a higher number of design pat-
terns is desirable. For this reason, we are currently constructing a pattern catalogue based on the proposed representation
by collecting and classifying commonly used hardware
and software design methods. Moreover, it is intended to
construct the catalogue such that an automatic recom-
mendation of suitable design methods for a given appli-
cation can be achieved in the future.
8. Acknowledgments
This work was supported by the German Academic Ex-
change Service (DAAD) under the program: Research
Grants for Doctoral Candidates and Young Academics
and Scientists.
REFERENCES
[1] C. Alexander, “A Pattern Language: Towns, Buildings,
Construction,” New York: Oxford University Press, 1977.
[2] E. Gamma, R. Helm, R. Johnson, and J. Vlissides, “Design patterns: Elements of reusable object-oriented software,” New York: Addison-Wesley, 1997.
[3] IEC61508 Functional safety for electrical/electronic/ pro-
grammable electronic safety-related systems, International
Electrotechnical Commission, 1998.
[4] A. Armoush, F. Salewski, and S. Kowalewski, “Effective
pattern representation for safety critical embedded sys-
tems,” International Conference on Computer Science and
Software Engineering (CSSE 2008), pp. 91-97, 2008.
[5] F. Buschmann, R. Meunier, H. Rohnert, P. Sommerlad, and
M. Stal “Pattern-oriented software architecture: A system of
patterns,” John Wiley & Sons, Inc., New York, NY, 1996.
[6] P. Coad, “Object-oriented patterns,” Communications of
the ACM, Vol. 35, pp. 152-159, 1992.
[7] K. Beck and W. Cunningham, “Using pattern languages
for object-oriented programs,” Presented at the OOP-
SLA-87 Workshop on Specification and Design for Ob-
ject-Oriented Programming.
[8] J. Coplien, “Idioms and patterns as architectural litera-
ture,” IEEE Software, Vol. 14, pp. 36-42, 1997.
[9] B. Appleton. “Patterns and software: Essential concept
and terminology,” available at <http://www.enteract.com
/~bradapp/docs/patterns-intro.html>.
[10] B. P. Douglass, “Doing hard time: Developing real-time
system with UML, objects, frameworks, and pattern,”
New York: Addison-Wesley, 1999.
[11] B. P. Douglass, “Real-time design patterns,” New York:
Addison-Wesley, 2003.
[12] D. Gross and E. Yu, “From non-functional requirements
to design through patterns,” Requirements Engineering,
Vol. 6, No. 1, pp. 18-36, 2002.
[13] J. Cleland-Huang and D. Schmelzer, “Dynamically trac-
ing non-functional requirements through design pattern
invariants,” Workshop on Traceability in Emerging Forms
of Software Engineering, in conjunction with IEEE Inter-
national Conference on Automated Software Engineering,
2003.
[14] J. Fletcher and J. Cleland-Huang, “Softgoal traceability
patterns,” in Proceedings of the 17th IEEE International
Symposium on Software Reliability Engineering (ISSRE
2006), pp. 363-374, 2006.
[15] L. Xu, H. Ziv, T. A. Alspaugh, and D. J. Richardson, “An
architectural pattern for non-functional dependability re-
quirements,” Journal of Systems and Software, Vol. 79,
No. 10, pp. 1370-1378, 2006.
[16] S. Konrad and B. Cheng, “Requirements patterns for em-
bedded systems,” in Proceedings of the IEEE Joint Inter-
national Requirements Engineering Conference (RE’02),
pp. 127-136, 2002.
[17] S. Konrad, B. Cheng, and L. Campbell, “Object analysis
patterns for embedded systems,” IEEE Transactions on
Software Engineering, Vol. 30, No. 12, pp. 970-992, 2004.
[18] K. Wolf and C. Liu, “New clients with old servers: A pattern language for client/server frameworks,” in Pattern Languages of Program Design, J. Coplien and D. Schmidt, Eds. Reading, MA: Addison-Wesley, pp. 55-64, 1995.
[19] D. Riehle and H. Züllighoven, “A pattern language for
tool construction and integration based on the tools and
materials metaphor,” in Pattern Languages of Program
Design, J. Coplien and D. Schmidt, Eds. Reading, MA:
Addison-Wesley, pp. 55-64, 1995.
[20] S. Adams, “Functionality ala carte,” in Pattern Languages
of Program Design, J. Coplien and D. Schmidt, Eds.
Reading, MA: Addison-Wesley, pp. 55-64, 1995.
[21] R. Lajoie and R. K. Keller, “Design and reuse in object-
oriented frameworks: Patterns, contracts and motifs in
concert,” in Object-Oriented Technology for Database
and Software Systems, V. Alagar and R. Missaoui, Eds.
Singapore: World Scientific Publishing, pp. 295-312,
1995.
[22] A. Athavale, “Performance evaluation of hybrid voting
schemes,” M. S. thesis, North Carolina State University,
Department of Computer Science, 1989.
[23] A. Avizienis and L. Chen, “On the implementation of
N-version programming for software fault tolerance dur-
ing execution,” in Proceedings of IEEE COMPSAC 77,
pp. 149-155, 1977.
[24] N. Storey, “Safety-Critical Computer Systems,” Boston:
Addison-Wesley, 1996.
[25] B. Parhami, “Design of reliable software via general combi-
nation of N-Version Programming and Acceptance Testing,”
in Proceedings of 7th International Symposium on Software
Reliability Engineering ISSRE’96, pp. 104-109, 1996.
[26] D. F. McAllister, C. E. Sun, and M. A. Vouk, “Reliability
of voting in fault-tolerant software systems for small out-
put spaces,” IEEE Transactions on Reliability, Vol. 39,
No. 5, pp. 524-534, 1990.
[27] Y. W. Leung, “Maximum likelihood voting for fault-
tolerant software with finite output space,” IEEE Transac-
tions on Reliability, Vol. 44, No. 3, 1995.
[28] G. Latif-Shabgahi, J. M. Bass, and S. Bennett, “A taxon-
omy for software voting algorithms used in safety-critical
systems,” IEEE Transactions, Reliability, Vol. 53, No. 3,
pp. 319-328, 2004.
[29] B. Parhami, “Voting algorithms,” IEEE Transactions on
Reliability, Vol. 43, pp. 617-629, 1994.
[30] I. Koren and C. M. Krishna, “Fault-tolerant systems,”
Elsevier, 2007.
[31] A. Avizienis, “The N-version approach to fault-tolerant
software,” IEEE Transactions on Software Engineering,
Vol. 11, No. 12, pp. 1491-1501, 1985.
[32] F. Daniels, K. Kim and M. A. Vouk, “The reliable hybrid
pattern: a generalized software fault tolerant design pat-
tern,” in Conference PloP’97, pp. 1-9, 1997.
[33] M. Lyu, “Handbook of software reliability engineering,”
New York: McGraw-Hill and IEEE Computer Society
Press, 1996.
[34] A. Avizienis, “The methodology of N-version programming,”
in Software Fault Tolerance, M. Lyu, Ed. New York:
Wiley, pp. 23-46, 1995.
[35] J. C. Knight and N. G. Leveson, “An experimental
evaluation of the assumption of independence in mul-
tiversion programming,” IEEE Transactions on Software
Engineering, Vol. 12, pp. 96-109, 1986.
[36] F. Salewski, D. Wilking, and S. Kowalewski, “The effect
of diverse hardware platforms on n-version programming in
embedded systems-an empirical evaluation,” in 3rd Inter-
national Workshop on Dependable Embedded Systems
(WDES’06), 2006.
[37] B. Littlewood and D. R. Miller, “Conceptual modeling of
coincident failures in multiversion software,” IEEE
Transactions on Software Engineering, 1989.
[38] J. G. W. Bentley, P. G. Bishop, and M. J. P. van der
Meulen, “An empirical exploration of the difficulty func-
tion,” in Computer Safety, Reliability and Security (Safe-
comp), 2004.
[39] X. Cai and M. R. Lyu, “An empirical study on reliability
modeling for diverse software systems,” 15th International
Symposium on Software Reliability Engineering (ISSRE),
2004.
[40] B. Littlewood, P. Popov and L. Strigini, “A note on mod-
eling functional diversity,” in Reliability Engineering and
System Safety, 1999.
[41] F. Salewski and S. Kowalewski, “Achieving highly reli-
able embedded software: An empirical evaluation of dif-
ferent approaches,” in Proceeding of 26th International Con-
ference on Computer Safety, Reliability and Security
(SAFECOMP’07), pp. 270-275, 2007.
[42] K. Kanoun, M. Kaaniche, C. Beounes, J. C. Laprie, and J.
Arlat, “Reliability growth of fault tolerant software,”
IEEE Transactions on Reliability, Vol. 42, No. 2, 1993.