ESEC/FSE 2023 CoLos
31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE 2023)

3rd International Workshop on Software Engineering and AI for Data Quality in Cyber-Physical Systems/Internet of Things (SEA4DQ 2023), December 4, 2023, San Francisco, CA, USA

SEA4DQ 2023 – Proceedings




Title Page

Welcome from the Chairs
Welcome to the 3rd International Workshop on Software Engineering and AI for Data Quality (SEA4DQ 2023), taking place on December 4, 2023, and co-located with the ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE) in San Francisco, CA, USA.

SEA4DQ 2023 Organization
Committee listings


A Nearest Neighbor-Based Concept Drift Detection Strategy for Reliable Tool Condition Monitoring
Nicolas Jourdan and Joachim Metternich
(TU Darmstadt, Germany)
Condition monitoring is one of the most prominent industrial use cases for machine learning today. As condition monitoring applications are commonly developed using static training datasets, their long-term performance is vulnerable to concept drift in the form of time-dependent changes in environmental and operating conditions as well as data quality problems or sensor drift. When the data distribution changes, machine learning models can fail catastrophically. We show that two-sample tests of homogeneity, which form the basis of most of the available concept drift detection strategies, fail in this domain, as the live data is highly correlated and does not follow the assumption of being independent and identically distributed (i.i.d.) that is often made in academia. We propose a novel drift detection approach called Localized Reference Drift Detection (LRDD) to address this challenge by refining the reference set for the two-sample tests. We demonstrate the performance of the proposed approach in a preliminary evaluation on a tool condition monitoring case study.
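
The failure mode named in this abstract can be illustrated with a small simulation. The sketch below is not the authors' LRDD method and uses no data from the paper; it assumes a synthetic AR(1) process as a stand-in for correlated live data, a Kolmogorov-Smirnov test as the two-sample test, and illustrative values for the window size and the coefficient `phi`. It estimates how often the test flags drift when no drift is present.

```python
import bisect
import math
import random


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between the two ECDFs."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    d = 0.0
    for x in a_sorted + b_sorted:
        fa = bisect.bisect_right(a_sorted, x) / len(a_sorted)
        fb = bisect.bisect_right(b_sorted, x) / len(b_sorted)
        d = max(d, abs(fa - fb))
    return d


def ar1_series(rng, phi, length):
    """Stationary AR(1) series x_t = phi * x_{t-1} + noise; phi = 0 gives i.i.d. data."""
    x = rng.gauss(0.0, 1.0 / math.sqrt(1.0 - phi * phi))
    series = []
    for _ in range(length):
        series.append(x)
        x = phi * x + rng.gauss(0.0, 1.0)
    return series


def false_alarm_rate(phi, n=200, trials=200, seed=0):
    """Fraction of drift-free runs in which the KS test still flags drift."""
    rng = random.Random(seed)
    # Asymptotic 5% critical value for two samples of size n each.
    critical = 1.358 * math.sqrt(2.0 / n)
    alarms = 0
    for _ in range(trials):
        series = ar1_series(rng, phi, 2 * n)
        reference, live = series[:n], series[n:]
        if ks_statistic(reference, live) > critical:
            alarms += 1
    return alarms / trials


if __name__ == "__main__":
    print("i.i.d. false alarm rate:", false_alarm_rate(0.0))
    print("AR(1), phi=0.95 false alarm rate:", false_alarm_rate(0.95))
```

Under these assumptions the i.i.d. case should stay near the nominal 5% level, while the strongly autocorrelated case raises alarms far more often even though the underlying process never changes.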

Enhancing Data Quality in Large-Scale Software Systems for Industrial Automation
Valentina Golendukhina, Lisa Sonnleithner, and Michael Felderer
(University of Innsbruck, Austria; JKU Linz, Austria; DLR, Germany)
Modern industrial systems have become highly automated and data-driven, generating large volumes of data through sophisticated machinery. However, the quality of the collected data is not always optimal, and monitoring data quality is challenging due to real-time data constraints. While significant research has been done on validating exported and prepared data, there is no research on implementing data quality practices with programming languages and tools that directly interact with hardware in the domain of cyber-physical production systems (CPPSs), such as IEC 61499 and IEC 61131-3, i.e., software on level 1 (L1) of the automation pyramid. By examining a plant-building company, this short paper explores the challenges and opportunities for data quality management at L1, including knowledge transfer, data compression, and metadata formulation, and suggests possible data validation techniques.

Data Pre-processing and Sensor-Fusion for Multivariate Statistical Process Control of an Extrusion Process
Frank Westad, Lars Lodgaard, and Torbjørn Pedersen
(Idletechs, Norway; Norwegian University of Science and Technology, Norway; Benteler Automotive, Norway)
In most manufacturing processes, data related to a product are collected across several process steps. Ensuring good data quality is essential for subsequent process modeling, monitoring, and control. Although data for a given process might already be available in digitized form in the process control systems or industrial databases, in most cases the data cannot be used directly in its original form for process modeling. Pre-processing is often needed before modeling, which may include operations such as time alignment to handle different sampling frequencies and lag times, handling of missing values, and detection of sample outliers. Specific considerations must be made for processes with both continuous and batch process steps due to their different data structures. This paper describes an industrial use case for extrusion monitoring, starting from structured raw data and ending with real-time multivariate statistical process control (MSPC) that applies a sensor-fusion approach and feature extraction. The MSPC also enables in-depth analysis for identifying process variables when samples lie outside the normal operating conditions (NOC).
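
The time-alignment step mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the pipeline from the paper: the function name `align_locf`, the last-observation-carried-forward rule, and the sample readings are all assumptions chosen for the example. It resamples one sensor onto a common time grid, skips missing readings, and compensates for a known lag between sensors.

```python
import bisect


def align_locf(timestamps, values, grid, lag=0.0):
    """Resample a sensor series onto `grid` by last observation carried forward.

    `lag` is the delay (in the same time unit) by which this sensor trails the
    reference: the reading taken at time t + lag is matched to grid time t.
    Missing readings (None) are skipped, so the previous valid value is reused.
    Grid points before the first valid reading yield None.
    """
    valid = [(t, v) for t, v in zip(timestamps, values) if v is not None]
    valid_times = [t for t, _ in valid]
    aligned = []
    for g in grid:
        i = bisect.bisect_right(valid_times, g + lag) - 1
        aligned.append(valid[i][1] if i >= 0 else None)
    return aligned


if __name__ == "__main__":
    grid = [0, 1, 2, 3, 4]  # common 1-second grid (hypothetical)

    # Slow sensor sampled every 2 seconds, with one missing reading.
    print(align_locf([0, 2, 4], [1.0, None, 3.0], grid))

    # Downstream sensor that sees the product 1 second later.
    print(align_locf([1, 2, 3, 4, 5], [5, 6, 7, 8, 9], grid, lag=1))
```

Carrying the last observation forward is only one option; interpolation or window averaging may suit a given process better, and the lag itself usually has to be estimated from the process layout or from the data.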

Challenges for Predictive Quality in Multi-stage Manufacturing: Insights from Literature Review
Beatriz Bretones Cassoli and Joachim Metternich
(TU Darmstadt, Germany)
This paper investigates data quality challenges in applying predictive quality solutions for multi-stage discrete manufacturing. Through an analysis of existing research via a systematic literature search, we highlight key obstacles that affect the implementation of machine learning approaches for quality control, such as the quantity and quality of available datasets for model training and testing and the availability of quality labels for supervised training. Our findings underscore the necessity of addressing these challenges to enhance the accuracy and scalability of predictive quality models.

