Doctoral Thesis (authored and supervised):
"Quality Prediction and Evaluation Models for Products and Processes in Distributed Software Development";
Supervisor, Reviewer: S. Biffl, A. Tjoa;
Institut für Softwaretechnik und Interaktive Systeme,
oral examination: 2008-11-05.
Modern large-scale software development is typically organized in distributed, often globally
dispersed, environments. Distributed Software Development (DSD) projects occur both in
traditional organizations and in increasingly sophisticated open source development initiatives.
Leading roles in DSD, such as the project manager and the quality manager, need to evaluate
actual project progress (e.g., the quality of products produced and activities conducted) and need
trustworthy models for predicting the quality of future products such as release candidates. However,
in a DSD context human reporting of progress becomes increasingly complex and its reliability can
become questionable, particularly when face-to-face meetings, which would allow personally
checking the validity of high-level estimates such as the readiness of a software version for release,
are not possible.
For steering the project, e.g., by deciding to release the current software version or to wait and rework
parts of the software, project managers need 1. a quality evaluation framework that defines
which data to collect and how to convert them into meaningful higher-level numbers; 2. an approach
to check data validity on all levels; and 3. an approach to predict the quality of future products, with
feedback on the likely accuracy of the prediction result.
In this work we focus on the absence of defects as the major quality criterion of a software product,
where defects are deviations from requirements that need to be repaired. The focus on defect
counting and defect prediction is particularly important because defects decrease value for users, and
other quality criteria (e.g., reliability, security, maintainability, usability) can be formulated as
requirements, so that defects also cover these criteria. Hence, quality evaluation in our context
focuses on counting defects and collecting data related to quality improvement (e.g.,
development processes related to defect removal activities), while quality prediction consists
of collecting historical data from project data sources to construct prediction models that
estimate the number of defects or defective work products prior to release.
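As an illustration of this notion of quality prediction, the following sketch fits a simple least-squares model to hypothetical historical per-release metrics in order to estimate the defect count of an upcoming release. The metric, the data, and the model form are illustrative assumptions only, not the actual prediction models developed in this thesis:

```python
# Illustrative sketch: fit a simple linear regression on historical
# per-release process metrics to estimate post-release defects.

def fit_linear(xs, ys):
    """Closed-form simple linear regression: y ~ a + b * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    a = my - b * mx
    return a, b

# Hypothetical historical data: commits per release (x) and
# post-release defects found (y).
commits = [120, 200, 150, 300, 250]
defects = [10, 18, 12, 27, 22]

a, b = fit_linear(commits, defects)
predicted = a + b * 260  # estimate for a planned release with 260 commits
```

In practice such models would be trained on many more observations and validated against held-out releases before being used for release decisions.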
Unfortunately, while there are quality evaluation frameworks in traditional software metrics, to our
knowledge no appropriate framework is available that can be calibrated to modern DSD
environments such as Open Source Software (OSS) environments. Moreover, our systematic
literature review found that prediction models in distributed development settings have to cope with
the following limitations: a) unsystematic quality prediction planning; b) models based on
product metrics alone show sufficient accuracy but poor reliability, in particular for projects with
short development cycles such as many OSS projects; c) insufficient quality of data collected
from heterogeneous project data sources.
Key research questions of this thesis are a) to investigate, for large OSS projects, the most important
development artifact and process attributes that can indicate software product quality for project
progress evaluation, and b) to investigate the accuracy and reliability of advanced models for
objective quality prediction in the context of DSD projects.
Main research contributions of this work are:
1. Process quality metrics, so-called project "health indicators", which capture development
activities correlated with product quality improvement, together with ways to evaluate such
"health indicators" in DSD projects.
2. Quality indicators for software products that provide a range for the likely value of the indicator
rather than a fixed value without any indication of data volatility.
3. A research roadmap for software defect prediction based on a systematic literature review.
4. A structured framework for quality prediction in distributed software development settings.
5. Improved quality prediction methods based on product and process metrics that can be collected
from heterogeneous project repositories (e.g., issue tracker, source code management tool).
6. Empirical evaluation in a range of real-world distributed software engineering environments:
case studies of OSS projects from different contexts.
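To illustrate the idea of reporting an indicator as a range rather than a fixed value, the following sketch computes a bootstrap percentile interval for a hypothetical quality indicator (mean defect-fix time). The indicator, the data, and the percentile-bootstrap method are example assumptions, not the specific technique developed in the thesis:

```python
# Illustrative sketch: a bootstrap percentile interval for a quality
# indicator, here the mean defect-fix time in days (hypothetical data).
import random

random.seed(42)
fix_times = [2.0, 3.5, 1.0, 8.0, 4.5, 2.5, 6.0, 3.0, 5.5, 2.0]

def bootstrap_interval(data, stat, n_resamples=2000, alpha=0.1):
    """Percentile bootstrap interval for the statistic `stat`."""
    estimates = sorted(
        stat([random.choice(data) for _ in data])
        for _ in range(n_resamples)
    )
    lo = estimates[int(alpha / 2 * n_resamples)]
    hi = estimates[int((1 - alpha / 2) * n_resamples) - 1]
    return lo, hi

def mean(xs):
    return sum(xs) / len(xs)

low, high = bootstrap_interval(fix_times, mean)
# The indicator is then reported as the range [low, high] instead of
# the single point estimate mean(fix_times) == 3.8, making the data
# volatility visible to the project manager.
```

A wide interval signals volatile data and hence a less trustworthy indicator value, which is exactly the feedback a decision maker needs.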
The general approach is based on the following assumption: in a sufficiently stable process context,
a sequence of observations of product quality and context parameters allows a) identifying
significant context parameters and b) predicting product quality at a point in time based on the
measured context parameters. We use this approach in the context of DSD (OSS) projects; however,
there are many other potential application areas such as quality evaluation and prediction for
operational software systems.
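The stated assumption can be sketched as follows: given a sequence of observations of context parameters and product quality, rank the context parameters by their correlation with quality and retain the significant ones for prediction. The parameter names and data below are hypothetical examples, not measurements from the thesis:

```python
# Illustrative sketch: identify significant context parameters by
# correlating each parameter's observations with observed quality.

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

quality = [0.9, 0.7, 0.8, 0.5, 0.6]  # e.g., fraction of defect-free modules
context = {
    "review_coverage": [0.8, 0.6, 0.7, 0.3, 0.5],
    "team_turnover":   [0.1, 0.3, 0.2, 0.6, 0.4],
}

significant = {name: pearson(vals, quality)
               for name, vals in context.items()}
# Parameters whose |r| exceeds a chosen threshold would be retained as
# inputs for the quality prediction model.
```

Here review coverage correlates positively with quality and team turnover negatively; in a real setting, significance testing would replace the fixed threshold.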
Created from the Publication Database of the Vienna University of Technology.