3rd DAUIN Lunch Seminar. Data quality: standards and applications
The recent emphasis on big data, linked data, and data analytics has shifted the attention of software professionals from purely software aspects to a growing range of data issues. The basic GIGO (garbage in, garbage out) principle explains the increasing importance of data and information quality.
The recently published standards on data quality (ISO/IEC 25012 and ISO/IEC 25024) within the SQuaRE project (Software product Quality Requirements and Evaluation) make data a first-class quality subject. Such a paradigm shift is relatively recent and is a topic seldom addressed in practice and research.
The importance of data quality lies in several aspects:
- the basic GIGO principle,
- the increasing amount and variety of data processed by software applications,
- the need for a sound basis to provide credentials and possibly certification for data used and produced by software systems.
The challenges for data quality assessment include the practical implementation of the standard measures, the organization of the quality assessment process, the definition of quality frameworks that cover unstructured data, such as semantic knowledge bases, and the automation of measurement activities so that the assessment scales to large data sets.
The goal of this talk is to provide an introduction to data quality, present the relevant ISO standards, and report on the issues that arise when data quality assessment is applied to open data (OD) and open government data (OGD).
Marco Torchiano is an associate professor at the Control and Computer Engineering Dept. of Politecnico di Torino, Italy; he was previously a post-doctoral research fellow at the Norwegian University of Science and Technology (NTNU), Norway. He received an MSc and a PhD in Computer Engineering from Politecnico di Torino. He is a Senior Member of the IEEE and a member of the software engineering committee of UNINFO (part of ISO/IEC JTC 1). He is author or co-author of over 130 research papers published in international journals and conferences, author of the book ‘Software Development—Case studies in Java’ from Addison-Wesley, and co-editor of the book ‘Developing Services for the Wireless Internet’ from Springer. He was recently a visiting professor at Polytechnique Montréal, working on software energy consumption. His current research interests are green software, UI testing methodologies, open-data quality, and modeling notations. His methodological approach is that of empirical software engineering.