«Software Fault Reporting Processes in Business-Critical Systems Jon Arvid Børretzen Doctoral Thesis Submitted for the partial fulfilment of the ...»
HazOp is a creative team method, using a set of guidewords to trigger creative thinking among the stakeholders and the cross-functional team in RUP. The guidewords are applied to all parts and aspects of the system concept plan and early design documents, to find possible deviations from design intentions that have to be handled. Examples of guidewords are MORE and LESS. This will mean an increase or decrease of some quantity. For example, by using the “MORE” guideword on “a customer client application”, you would have “MORE customer client applications”, which could spark ideas like “How will the system react if the servers get swamped with customer client requests?” and “How will we deal with many different client application versions making requests to the servers?” A HazOp study is conducted by a team consisting of four to eight persons with a detailed knowledge of the system to be analysed.
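The mechanical part of applying guidewords can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `deviation_prompts` and the element "server capacity" are hypothetical, only the two guidewords named above are used, and the creative discussion that each prompt is meant to trigger is not something code can replace.

```python
from itertools import product

# Guidewords named in the text; a real HazOp study would use a longer list.
GUIDEWORDS = ["MORE", "LESS"]


def deviation_prompts(elements, guidewords=GUIDEWORDS):
    """Pair every guideword with every design element to seed discussion."""
    return [
        f"{gw} {el}: what deviation from the design intent could this cause?"
        for el, gw in product(elements, guidewords)
    ]


# "server capacity" is a hypothetical element added for illustration.
prompts = deviation_prompts(["customer client applications", "server capacity"])
for p in prompts:
    print(p)
```

Each generated line is a starting question for the team, in the same spirit as the "MORE customer client applications" example.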
The main difference between a HazOp and a PHA is that PHA is a lighter method that needs less effort and less available information than the HazOp method. Since HazOp is a more thorough and systematic analysis method, the results will be more specific. If there is enough information available for a HazOp study, and the development team can spare the effort, a HazOp study will most likely produce more precise and more suitable results for the safety requirement specification definition.
4. Integration: Using safety methods in the RUP Inception phase
In the inception phase we will focus on understanding the overall requirements and scoping the development effort. When a project goes through its inception phase, the following artifacts will be established/produced:
• Requirements, leading to a System Test Plan
• Identification of key functionality
• Proposals for possible solutions
• Vision documents
• Internal business case
• Proof of concept
The artifacts in bold are the ones that are interesting from a system safety point of view, and the fact that the RUP inception phase requires development teams to produce such information eases the introduction of safety methods into the process. Because of RUP’s demands on information collection, using these methods does not lead to extensive extra work for the development team.
By using the safety methods we have proposed, we can produce safety requirements for the system. These are high-level requirements, and must be specified before the project goes from the inception to the elaboration phase. When the project moves on from the inception to the elaboration phase, identification of the business-critical aspects should be mostly complete; and we should have high confidence in having identified the requirements for those aspects.
The safety work in the project continues into the elaboration phase, and some of the methods, like Safety Case and Intent Specification, will also be used when the project moves on to this phase.
4.1 Software Safety Case in a RUP context
According to [Bishop98], we need the following information when producing a safety case:
• Information used to construct the safety argument
• Safety evidence
As indicated in 3.1, to implement a safety case we need to:
• make an explicit set of claims about the system
• produce the supporting evidence
• supply a set of safety arguments linking the claims to the evidence, shown in Figure 2
• make clear the assumptions and judgements underlying the arguments
The safety case is broken down into claims about non-functional attributes for subsystems, such as reliability, availability, fail-safety, response time, robustness to overload, functional correctness, accuracy, usability, security, maintainability, modifiability, and so on.
The evidence used to support a safety case argument comes from:
• The design itself
• The development processes
• Simulation of problem solution proposals
• Prior experience from similar projects or problems
Much of the work done early in conjunction with safety cases tries to identify possible hazards and risks, for instance by using methods like Preliminary Hazard Analysis (PHA) and Hazard and Operability Analysis (HazOp). These are especially useful in combination with Safety Case for identifying the risks and safety concerns that the safety case is going to handle. Also, methods like Failure Mode and Effects Analysis, Event Tree Analysis, Fault Tree Analysis and Cause Consequence Analysis can be used as tools to generate evidence for the safety case [Rausand91].
The need for concrete project artefacts as input in the safety case varies over the project phases, and is not strictly defined. Early on in a project, only a general system description is needed for making the safety requirements specification. When used in the inception phase, the Safety Case method will support the definition of a safety requirements specification document by forcing the developers to “prove” that their intended system can be trusted. When doing that, they will have to produce a set of safety requirements that will follow the project through its phases, and which will be updated along with the safety case documents.
The Safety Case method, when used to its full potential, will be too elaborate when not dealing with safety-critical projects. The main concept and structure will, however, help trace the connection between hazards and solutions through the design from top level down to detailed level implementation.
Much of the work that has to be performed when constructing a software safety case is to collect information and arrange this information in a way that shows the reasoning behind the safety case. Thus, the safety case does not in itself bring much new information into the project; it is mainly a way of structuring the information.
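As a minimal sketch of this structuring role, the claims, arguments, evidence and assumptions of a safety case could be held in a small data model. The class and field names below are illustrative assumptions, not notation taken from [Bishop98], and the example claim and evidence are hypothetical, loosely borrowed from the customer database example in Section 5.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Evidence:
    description: str
    source: str  # e.g. "design", "development process", "simulation", "prior experience"


@dataclass
class Argument:
    reasoning: str  # how the evidence supports the claim
    evidence: List[Evidence]
    assumptions: List[str] = field(default_factory=list)


@dataclass
class Claim:
    attribute: str  # e.g. "functional correctness"
    statement: str
    arguments: List[Argument] = field(default_factory=list)

    def unsupported(self) -> bool:
        """True if no argument is backed by at least one piece of evidence."""
        return not any(arg.evidence for arg in self.arguments)


# Hypothetical claim, initially without any argument attached.
claim = Claim("functional correctness", "Stored credit information is correct")
print(claim.unsupported())  # True: no argument has been attached yet

claim.arguments.append(
    Argument(
        reasoning="A consistency check rejects contradictory records",
        evidence=[Evidence("Consistency check in the design", "design")],
        assumptions=["The check runs on every update"],
    )
)
print(claim.unsupported())  # False
```

The model adds no new information about the system; it only makes visible which claims still lack evidence, which is the point made above.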
4.2 Preliminary Hazard Analysis and Hazard and Operability Analysis in a RUP context
By performing a PHA or HazOp we can identify threats attached to both malicious actions and unintended design deviations, for instance as a result of unexpected use of the system or as a result of operators or users without necessary skills executing an unwanted activity.
To perform a PHA or HazOp, we only need a conceptual system description, and a description of the system’s environment. RUP encourages such information to be produced in the inception phase of a project. When a hazard is identified, either by PHA or HazOp, it is categorized and we have to decide if it is acceptable or if it needs further investigation. When trustworthiness is an issue, the hazard should be tracked in a hazard log and subjected to review along the development process. This makes a basis for further analysis, and produces elements to be considered for the safety requirement specification.
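A hazard log of this kind might be sketched as follows. The names `Hazard`, `HazardLog` and `Decision` are assumptions made for illustration, the categorisation is left as free text because no specific scheme is prescribed here, and both logged hazards are hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Decision(Enum):
    ACCEPTABLE = "acceptable"
    INVESTIGATE = "needs further investigation"


@dataclass
class Hazard:
    description: str
    found_by: str   # "PHA" or "HazOp"
    category: str   # free text; the categorisation scheme is an assumption
    decision: Decision


@dataclass
class HazardLog:
    entries: List[Hazard] = field(default_factory=list)

    def record(self, hazard: Hazard) -> None:
        self.entries.append(hazard)

    def for_review(self) -> List[Hazard]:
        """Hazards that stay on the agenda for later reviews."""
        return [h for h in self.entries if h.decision is Decision.INVESTIGATE]


log = HazardLog()
# Both entries are hypothetical examples.
log.record(Hazard("Unskilled operator deletes records", "PHA",
                  "operator error", Decision.INVESTIGATE))
log.record(Hazard("Report layout differs between clients", "HazOp",
                  "usability", Decision.ACCEPTABLE))
for h in log.for_review():
    print(h.found_by, "-", h.description)
```

Keeping accepted hazards in the log, rather than discarding them, preserves the reasoning behind the decision for later reviews.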
The result of a PHA or HazOp investigation is the identification of possible deviations from the intent of the system. For every deviation, the causes and consequences are examined and documented in a table. The results are used to focus work effort and to solve the problems identified. The results of PHA and HazOp are also incorporated into the safety case documents either as problems to be solved, or as evidence used in existing safety claim arguments.
4.3 Combining the methods
By introducing the use of Safety Case and PHA/HazOp into the RUP inception phase, we have a process where the system safety requirements are maintained in the safety case documents. PHA and HazOp studies on the system specification, together with its customer requirements and environment description, produce hazard identification logs that are incorporated into the safety case as issues to be handled. This also leads to revision of the safety requirements. Thus, the deviations found with PHA/HazOp will be covered by these requirements as shown in Figure 3.
From the inception phase of the development process, the safety requirements and safety case documents are used in the remaining phases where the information is used in the implementation of the system.
5. A small example
Let us assume a business needing a database containing information about their customers and the customers’ credit information. When developing a computer system for this business, not only should we ask the business representatives which functions they need and what operating system they would like to run their system on, but we should also use proper methods to improve the development process with regard to business-critical issues. An example of an important requirement for such a system would be ensuring the correctness and validity of customers’ credit information. Any problems concerning this information in a system would seriously impact a company’s ability to operate satisfactorily.
The preliminary hazard analysis method will be helpful here, by making stakeholders think about each part of the planned system and any unwanted events that could occur.
By doing this, we will get a list of possible hazards that have to be eliminated, reduced or controlled. This adds directly to the safety requirements specification. An example is the potential event that the customer information database becomes erroneous, corrupt or deleted. By using a preliminary hazard analysis, we can identify the possible causes that can lead to this unwanted event, and add the necessary safety requirements.
We can use the system’s database as an example. In order to identify possible database problems – Dangers – we can consider each database item in turn and ask: “What will happen if this information is wrong or is missing?” If the identified effect could be dangerous for the system’s users or owner – Effects – we will have to consider how it could happen – Causes – and what possible barriers we could insert into the system. The PHA is documented in a table. The table, partly filled out for our example, is shown below in Table 1.
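The Danger, Effects, Causes and barrier columns described above can also be represented as a simple record, from which candidate safety requirements are derived. This is a hedged sketch and not a reproduction of Table 1: the names `PhaRow` and `candidate_requirements` are assumptions, and the effects and causes in the sample row are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PhaRow:
    danger: str
    effects: str
    causes: List[str]
    barriers: List[str]


def candidate_requirements(rows: List[PhaRow]) -> List[str]:
    """Turn every proposed barrier into a candidate safety requirement."""
    return [
        f"The system shall provide: {barrier} (guards against: {row.danger})"
        for row in rows
        for barrier in row.barriers
    ]


rows = [
    PhaRow(
        danger="Customer credit information is wrong",
        effects="Customers are billed incorrectly",             # hypothetical
        causes=["Data entry error", "Faulty update routine"],   # hypothetical
        barriers=["Credit information consistency check",
                  "Manual check of changes"],
    )
]
reqs = candidate_requirements(rows)
for req in reqs:
    print(req)
```

One requirement is produced per barrier, so a danger with several barriers contributes several entries to the safety requirements specification.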
When using the safety case method, the developers will have to show that the way they want to implement a function or some part of the system is trustworthy. This is done by producing evidence and a reasoned argument that this way of doing things will be safe.
From Table 1, we see that for the customer’s credit information, the safety case should be able to document what the developers are going to do to make sure that the credit information used in billing situations is correct. Figure 4 shows a high-level example of how this might look in a safety case diagram. The evidence may come from earlier experience with implementing such a solution, or the belief that their testing methods are sufficient to ensure safety.
The lowest level in the safety case in Figure 4 contains the evidence. In our case, this evidence gives rise to three types of requirements:
• Manual procedures. These are not realised in software but the need to perform manual checks will put extra functional requirements onto the system.
• The software. An example in Figure 4 is the need to implement a credit information consistency check.
• The process. The safety case requires us to put an extra effort into testing the database implementation. Most likely this will be realised either by allocating more effort to testing overall or by allocating a disproportionate part of the testing effort to testing the database.
After using these methods for eliciting and documenting safety requirements, in the next development stages the developers will have to produce the evidence suggested in the diagram, show how the evidence supports the claims by making suitable arguments and finally document that the claims are supported by the evidence and arguments. Some examples of evidence are trusted components from a component repository, statistical evidence from simulation, or claims about sub-systems that are supported by evidence and arguments in their own right. Examples of relevant arguments are formal proof that two pieces of evidence together support a claim, quantitative reasoning to establish a required numerical level, or compliance with some rules that have a link to the relevant attributes.
Further on in the development process, in the elaboration and construction phases, the evidence and arguments in the safety case will be updated with information as we get more knowledge about the system. Each piece of evidence and argumentation should be directly linked to some part of the system implementation. The responsibility of the safety case is to show that the selected barriers and their implementation are sufficient to prevent the dangerous event from taking place. When the evidence and arguments in the safety case diagram are implemented and later tested in the development process, the safety case documentation is updated to show that the safety case claim has been validated.
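The bookkeeping described here can be sketched as a status check over the evidence attached to a claim. The rule that a claim counts as validated once all of its evidence has been tested is an assumption made for this illustration, as are the names `EvidenceItem`, `Kind`, `Status` and `claim_validated` and the linked parts "database layer" and "test plan".

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Kind(Enum):
    MANUAL_PROCEDURE = "manual procedure"
    SOFTWARE = "software"
    PROCESS = "process"


class Status(Enum):
    PLANNED = 1
    IMPLEMENTED = 2
    TESTED = 3


@dataclass
class EvidenceItem:
    description: str
    kind: Kind
    linked_part: str  # part of the implementation the evidence refers to
    status: Status = Status.PLANNED


def claim_validated(evidence: List[EvidenceItem]) -> bool:
    """Assumed rule: validated once all evidence exists and has been tested."""
    return bool(evidence) and all(e.status is Status.TESTED for e in evidence)


items = [
    EvidenceItem("Credit information consistency check", Kind.SOFTWARE,
                 "database layer"),   # hypothetical linked part
    EvidenceItem("Extra database testing effort", Kind.PROCESS,
                 "test plan"),        # hypothetical linked part
]
print(claim_validated(items))  # False: nothing has been tested yet

for item in items:
    item.status = Status.TESTED
print(claim_validated(items))  # True
```

A claim with no evidence at all is deliberately reported as not validated, so that an empty safety case cannot pass by default.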
By using PHA to find potential hazards and deviations from intended operation, and Safety Case to document how we intend to solve these problems, we produce elements for the safety requirements specification that might have been missed without these methods.
6. Conclusion and further work
We have shown how the Preliminary Hazard Analysis, Hazard and Operability Analysis and Safety Case methods can be used together in the RUP inception phase to help produce a safety requirements specification. The example shown is simple, but demonstrates how the combination of these methods will work in this context. By building on information made available in an iterative development process like RUP, we can use the presented methods to improve the process for producing a safety requirements specification.
As a development project moves into the subsequent phases, safety effort is still needed to ensure the development of a trustworthy system. The other RUP phases contain different development activities and therefore different safety activities.
The BUCS project will make similar descriptions of the other RUP phases and show how safety related methods can be used beneficially also in these phases.
BUCS will also continue the effort in working with methods for improving safety requirements collection, and will make contributions in the following areas: