Software Fault Reporting Processes in Business-Critical Systems. Jon Arvid Børretzen. Doctoral Thesis. Submitted for the partial fulfilment of the ...
Because of the nature of historical data analysis, some of our research was based on bottom-up data collection. That is, we needed to examine the data material before we could formulate research questions and goals. As Basili et al. state in [Basili94], data collection should ideally proceed in a top-down rather than a bottom-up fashion, e.g. by employing GQM to define relevant metrics [Solingen99]. However, some reasons why bottom-up studies are also useful are given in [Mohaghegi04c]:
1. There is a gap between the state of the art (best theories) and the state of the practice (current practices). Therefore, most data gathered in companies’ repositories are not collected following the GQM paradigm.
2. Many projects have been running for a while without having improvement programs and may later want to start one. The projects want to assess the usefulness of the data that is already collected and to relate data to goals (reverse GQM).
3. Even if a company has a measurement program with defined goals and metrics, these programs need improvements from bottom-up studies.
Exploring industrial data repositories can be part of an exploratory study (identifying relations and trends in data) or a formal study (confirmatory) to validate other or newer theories than those originally underlying the collected data.
3.3 Research approach and research design This section explains the research design used to collect and analyze the relevant data.
The thesis combines qualitative and quantitative techniques, mainly by using quantitative studies on historical data sources and qualitative studies on practice and
processes. The reasons for combining these different types of studies are the following:
• By doing quantitative studies of ongoing commercial projects or by reusing historical data, we could collect information about real life projects.
• Results of these studies were confirmed by other studies using other and often qualitative methods, thus triangulating the data and results.
The research design for each individual study has been both bottom-up and top-down.
The chosen design has depended on the maturity of the research and available information. Some of the research questions were a result of our literature studies and common work in the BUCS project, in a top-down manner. Other research questions were bottom-up, because of the available data sets and the actual practices in the organizations we studied.
The research can be split into three phases, as shown in Figure 1-1:
Phase 1: Literature studies of state-of-the-art and industrial interviews to increase the understanding of practice (top-down research questions) (Study 1 and 2).
Phase 2: Quantitative studies of fault reports. This started with a bottom-up exploratory study (Study 3) and continued with a top-down confirmatory study (Study 4).
Phase 3: Qualitative studies to expand the knowledge gained from the quantitative studies (top-down research questions) (Study 5, 6 and 7).
Sections 3.3.1 through 3.3.7 explain the research design and practical setting for each of the studies that make up this thesis.
3.3.1 Study 1: Interviews with company representatives To establish a basis for the most commonly used methods and the most common problems encountered in companies that develop business-critical software, several semi-structured interviews were carried out with representatives from cooperating companies.
These companies were chosen both for their relevance to business-critical issues and, to some extent, for convenience of location and availability.
Before the interviews, a list of topics was discussed and agreed upon, on which the later interviews with the company representatives were based.
Research questions for Study 1:
RQ.S1.a: How may the use of well-known software development methods improve business-critical system development?
RQ.S1.b: Do companies know much about safety-critical methods at all? If so, how do they view the possibility of using such safety methods to improve business-critical system development?
RQ.S1.c: What are the most common reliability/safety-related problems in business-critical system development? – We must identify the most important factors leading to failures or accidents.
We also wanted answers to questions such as:
• What are the most important hindrances for achieving high quality products when developing business-critical software?
• How does industry handle these problems now?
• What are the most important problems encountered during the operation of business-critical software?
• How can we remove or reduce these problems by changing the way business-critical systems are developed, operated and maintained?
Validity comment. The main validity concerns in this study would be the relatively low number of respondents and that the interviews were carried out by four different researchers.
3.3.2 Study 2: Literature review – software criticality techniques, fault reporting and fault management
This study proposed a way to integrate software criticality techniques into a common development regime like RUP. Taking the results from Study 1 into account, together with a literature review of the state-of-the-art in software engineering and safety methods, we sought to combine the common and the special by introducing special techniques from safety-related development into the common way of developing business-critical software.
The research questions for Study 2 were:
RQ.S2.a: Which software criticality analysis techniques were most eligible for introduction into a common development framework?
RQ.S2.b: Where in the development process would introduction of such techniques be most effective or easiest to implement?
Validity comment. Since this was a literature review, we could not validate the findings beyond referring to the literature.
3.3.3 Study 3: First empirical analysis of software faults in industrial projects This study looked at when and how faults were introduced into a system under development, and how they were found and dealt with. By analysing fault/change reports for several (semi-)completed development projects, we wanted to investigate whether there are common causes for faults being introduced and not being discovered early enough. The goal was to improve our knowledge of why and how faults are introduced, and of how we can identify and rectify them earlier in development.
This study is based on historical data collection/data mining, where the data consists of fault reports we have received from four commercial projects in four different
companies. The steps of the study were the following:
1. Define study goals and research questions.
2. Contact eligible companies for cooperation.
3. Select suitable projects for study and agree on cooperation practicalities.
4. Collect and convert data from projects.
5. Filter data – extracting only fault reports from the total data sets (which in some cases included change reports), and removing duplicate data.
6. Categorize faults according to fault type, software module and severity.
7. Analyze resulting data sets by comparing project internal data, as well as projects against each other.
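Steps 4 through 7 above can be sketched as a small data-processing script. The record layout (report id, kind, module, fault type, severity) and the sample data below are illustrative assumptions; the actual repositories in the study used company-specific formats.

```python
from collections import Counter
from dataclasses import dataclass

# Illustrative record layout; the real repositories used
# company-specific fields (assumption).
@dataclass(frozen=True)
class Report:
    report_id: str
    kind: str        # "fault" or "change"
    module: str
    fault_type: str
    severity: str

def filter_reports(reports):
    """Step 5: keep only fault reports and drop duplicates."""
    seen = set()
    faults = []
    for r in reports:
        if r.kind != "fault" or r.report_id in seen:
            continue
        seen.add(r.report_id)
        faults.append(r)
    return faults

def categorize(faults):
    """Steps 6-7: tally faults by type, module and severity."""
    return {
        "by_type": Counter(f.fault_type for f in faults),
        "by_module": Counter(f.module for f in faults),
        "by_severity": Counter(f.severity for f in faults),
    }

reports = [
    Report("1", "fault", "GUI", "logic", "high"),
    Report("1", "fault", "GUI", "logic", "high"),   # duplicate
    Report("2", "change", "DB", "-", "-"),          # change report, filtered out
    Report("3", "fault", "DB", "interface", "low"),
]
faults = filter_reports(reports)
print(categorize(faults)["by_type"])  # Counter({'logic': 1, 'interface': 1})
```

The cross-project comparison in step 7 then amounts to computing such tallies per project and comparing the resulting distributions.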
This investigation was mostly a bottom-up process, because of the initial uncertainty about the available data from the potential participants. After we had established a dialogue with the participating projects and acquired the actual fault reports, we revised our initial research questions and goals accordingly.
Initially, we wanted to find which types of faults are most frequent, and whether some parts of the systems have a higher fault density than others. This also helps to show whether the pre-defined fault taxonomy is suitable. When we know which types of faults dominate and where these faults appear in the systems, we can focus on the most severe faults to identify the most important targets for later improvement work.
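Comparing fault-proneness across system parts of different sizes is commonly done via fault density, e.g. faults per thousand lines of code (KLOC). A minimal sketch, where the module names, sizes and counts are made up for illustration:

```python
# Fault density as faults per KLOC; all values below are
# made-up illustrations, not data from the studied projects.
module_kloc = {"GUI": 12.0, "DB": 4.0, "Logic": 8.0}
fault_counts = {"GUI": 30, "DB": 18, "Logic": 10}

def fault_density(counts, sizes):
    """Faults per KLOC for each module."""
    return {m: counts.get(m, 0) / sizes[m] for m in sizes}

density = fault_density(fault_counts, module_kloc)
# The GUI has the most faults in absolute terms, but the DB module
# has the highest density (4.5 faults/KLOC).
print(max(density, key=density.get))  # DB
```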
The research questions for Study 3 are:
RQ.S3.a: Which types of faults are most typical for the different software components and parts?
RQ.S3.b: Are certain types of faults considered to be more severe than others by the developers?
Validity comment. Since the number of projects would not be large, we knew that external validity was a concern. The differences in domain, environment and fault reporting procedure added to these concerns.
3.3.4 Study 4: Second empirical analysis of software faults in industrial projects This study was based on the lessons learned in Study 3, with somewhat refined metrics to make the data material more suitable for this type of study. The research design was similar to that of Study 3, i.e. it was a historical data collection/data mining study intended to further explore and confirm the issues from Study 3. In this study, we had access to five projects from one company.
This investigation was a top-down study, as we had identified our research goals before
initiating the study. The research questions for Study 4 are:
RQ.S4.a: Which types of faults are the most common for the studied projects?
RQ.S4.b: Which fault types are rated as the most severe faults?
RQ.S4.c: How do the results of this study compare with those of our previous fault report study (Study 3)?
Validity comment. For this study, we had more fault reports and more projects to study, but everything was collected from the same organization. Again, this would affect external validity.
3.3.5 Study 5: Interviews focusing on empirical results Study 5 was a qualitative study in which we interviewed representatives who had been involved in the five projects we studied in Study 4. We performed semi-structured interviews using an interview guide with seven main topics and 32 questions.
We selected interviewees who had been actively involved in some of the five projects we had studied in this organization before and who also had hands-on experience with fault management in the same projects. The interviews were conducted as open-ended interviews, with the same questions asked to each interviewee. However, the interviewees were given room to talk about what they felt was important within the topic of the question.
Each question in the interview guide was related to one or more local research questions, and the different responses for each question were compared to extract answers related to the research questions. In line with the constant comparison method, we coded each answer into groups. The codes were post-formed, i.e. constructed as part of the coding process, since the interviews were open-ended. Additionally, we received feedback about the topic at hand through discussions and comments during two workshops that were held in the organization in conjunction with the fault report study and interviews.
This study is based on the results from Study 4, on fault reports. The main research questions for this study were therefore derived from the researchers’ viewpoint in Study 4.
Firstly, we wished to see whether the experience of the practitioners in the actual projects matched the analysis results we had found. Secondly, we wanted to draw on their experience to hear whether they thought a common fault type classification scheme could help improve their development processes. We also wanted to hear their opinions on possibly increasing the effort spent on data collection and fault report analysis in order to improve their software development processes. Lastly, we wanted to ask them where they saw the greatest potential for improvement in their fault management system, to elicit areas that they felt were lacking in their current fault reporting process.
This led to the following four research questions for Study 5:
RQ.S5.a: How can the large number of identified faults from early development phases be explained?
RQ.S5.b: Can the introduction of a standard fault classification scheme like Orthogonal Defect Classification (ODC) be useful to improve development processes?
RQ.S5.c: Do the practitioners see feedback from fault report analysis as a useful software process improvement tool?
RQ.S5.d: Do they see any potential improvement areas in their fault management system?
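RQ.S5.b refers to ODC, which assigns each fault one of eight defect types (Function, Interface, Checking, Assignment, Timing/Serialization, Build/Package/Merge, Documentation, Algorithm). A toy keyword-based classifier can illustrate the idea; the keyword mapping below is a rough assumption for illustration, not the classification procedure discussed in the interviews:

```python
# The eight ODC defect types (after Chillarege et al.); the keyword
# map is a simplified illustration, not part of the thesis.
ODC_TYPES = [
    "Function", "Interface", "Checking", "Assignment",
    "Timing/Serialization", "Build/Package/Merge",
    "Documentation", "Algorithm",
]

KEYWORDS = {  # hypothetical mapping from report text to ODC type
    "missing feature": "Function",
    "wrong parameter": "Interface",
    "null check": "Checking",
    "uninitialized": "Assignment",
    "race condition": "Timing/Serialization",
    "broken build": "Build/Package/Merge",
    "outdated manual": "Documentation",
    "wrong formula": "Algorithm",
}

def classify(description: str) -> str:
    """Return the first ODC type whose keyword occurs in the text,
    or 'Unclassified' when nothing matches."""
    text = description.lower()
    for keyword, odc_type in KEYWORDS.items():
        if keyword in text:
            return odc_type
    return "Unclassified"

print(classify("Crash due to missing null check in login"))  # Checking
```

In practice, ODC classification is done by a human reader per report; an automated keyword match like this only hints at why a shared, fixed type vocabulary makes cross-project comparison possible.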
Summed up, the main topics covered in the interviews were:
• The results from our quantitative Study 4 of their development projects,
• The organization’s own measurements of faults,
• Their existing quality and fault management system,
• Fault categorization and fault management,
• Communicating feedback from fault reporting to developers,
• Attitudes to process change and quality improvement for fault management.
Validity comment. The interviews, transcription and data coding were all performed by one person, which was a threat to internal validity. In addition, the number of interviews was relatively low, which would affect external validity.
3.3.6 Study 6: Comparing results from hazard analysis and analysis of faults This study was prompted by our experiences with fault report analysis, and by the observation that some of the faults were comparable to hazards identified through hazard analysis.
By conducting a qualitative hazard analysis of the concept/specification of a small existing web application and database, and comparing the results with a quantitative fault report analysis of the actual completed system, we wanted to explore the possibility of using the Preliminary Hazard Analysis (PHA) method to reduce the number of faults being introduced into a system.
The fault report analysis was performed in the same manner as in Studies 3 and 4, and applied to the fault reports we received from the maintainers of the DAIM system. The hazard analysis of the DAIM system was carried out by a group of BUCS project researchers in a series of PHA sessions. Finally, the results of the two analyses were compared.
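The final comparison step can be viewed as a set comparison over the fault types each analysis elicited. The sets below are purely illustrative assumptions, not the study's actual results:

```python
# Hypothetical fault types (in ODC terms) elicited by each analysis;
# illustrative values only, not results from the DAIM study.
pha_types = {"Function", "Checking", "Timing/Serialization"}
report_types = {"Function", "Interface", "Assignment", "Checking"}

both = pha_types & report_types          # elicited by both analyses
only_reports = report_types - pha_types  # missed by the hazard analysis
print(sorted(both))          # ['Checking', 'Function']
print(sorted(only_reports))  # ['Assignment', 'Interface']
```

The overlap indicates which kinds of faults a PHA at the specification stage might have prevented, while the remainder shows where fault report analysis adds information beyond hazard analysis.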
The three research questions for Study 6 were the following:
RQ.S6.a: What kinds of faults, in terms of Orthogonal Defect Classification (ODC) fault types, does the PHA technique help elicit?