Workshop SEA4DQ 2022 – Author Index |
|
Aamodt, Arianeh |
SEA4DQ '22: "Data Quality Issues for Vibration ..."
Data Quality Issues for Vibration Sensors: A Case Study in Ferrosilicon Production
Maryna Waszak, Terje Moen, Sølve Eidnes, Alexander Stasik, Anders Hansen, Gregory Bouquet, Antoine Pultier, Xiang Ma, Idar Tørlen, Bjørn Henriksen, Arianeh Aamodt, and Dumitru Roman (SINTEF, Norway; Elkem, Norway) Digitisation in the mining and metal processing industries plays a key role in their modernisation. Production processes are more and more supported by a variety of sensors that produce large amounts of data that are meant to provide insights into the performance of production infrastructures. In the metal processing industry, vibration sensors are essential in the monitoring of the production infrastructure. In this position paper, we present the installation of vibration sensors in a real industrial environment and discuss the data quality issues we encountered while using such sensors. @InProceedings{SEA4DQ22p22, author = {Maryna Waszak and Terje Moen and Sølve Eidnes and Alexander Stasik and Anders Hansen and Gregory Bouquet and Antoine Pultier and Xiang Ma and Idar Tørlen and Bjørn Henriksen and Arianeh Aamodt and Dumitru Roman}, title = {Data Quality Issues for Vibration Sensors: A Case Study in Ferrosilicon Production}, booktitle = {Proc.\ SEA4DQ}, publisher = {ACM}, pages = {22--23}, doi = {10.1145/3549037.3561273}, year = {2022}, } Publisher's Version |
|
Bouquet, Gregory |
SEA4DQ '22: "Data Quality Issues for Vibration ..."
Data Quality Issues for Vibration Sensors: A Case Study in Ferrosilicon Production
Maryna Waszak, Terje Moen, Sølve Eidnes, Alexander Stasik, Anders Hansen, Gregory Bouquet, Antoine Pultier, Xiang Ma, Idar Tørlen, Bjørn Henriksen, Arianeh Aamodt, and Dumitru Roman (SINTEF, Norway; Elkem, Norway) Digitisation in the mining and metal processing industries plays a key role in their modernisation. Production processes are more and more supported by a variety of sensors that produce large amounts of data that are meant to provide insights into the performance of production infrastructures. In the metal processing industry, vibration sensors are essential in the monitoring of the production infrastructure. In this position paper, we present the installation of vibration sensors in a real industrial environment and discuss the data quality issues we encountered while using such sensors. @InProceedings{SEA4DQ22p22, author = {Maryna Waszak and Terje Moen and Sølve Eidnes and Alexander Stasik and Anders Hansen and Gregory Bouquet and Antoine Pultier and Xiang Ma and Idar Tørlen and Bjørn Henriksen and Arianeh Aamodt and Dumitru Roman}, title = {Data Quality Issues for Vibration Sensors: A Case Study in Ferrosilicon Production}, booktitle = {Proc.\ SEA4DQ}, publisher = {ACM}, pages = {22--23}, doi = {10.1145/3549037.3561273}, year = {2022}, } Publisher's Version |
|
Eidnes, Sølve |
SEA4DQ '22: "Data Quality Issues for Vibration ..."
Data Quality Issues for Vibration Sensors: A Case Study in Ferrosilicon Production
Maryna Waszak, Terje Moen, Sølve Eidnes, Alexander Stasik, Anders Hansen, Gregory Bouquet, Antoine Pultier, Xiang Ma, Idar Tørlen, Bjørn Henriksen, Arianeh Aamodt, and Dumitru Roman (SINTEF, Norway; Elkem, Norway) Digitisation in the mining and metal processing industries plays a key role in their modernisation. Production processes are more and more supported by a variety of sensors that produce large amounts of data that are meant to provide insights into the performance of production infrastructures. In the metal processing industry, vibration sensors are essential in the monitoring of the production infrastructure. In this position paper, we present the installation of vibration sensors in a real industrial environment and discuss the data quality issues we encountered while using such sensors. @InProceedings{SEA4DQ22p22, author = {Maryna Waszak and Terje Moen and Sølve Eidnes and Alexander Stasik and Anders Hansen and Gregory Bouquet and Antoine Pultier and Xiang Ma and Idar Tørlen and Bjørn Henriksen and Arianeh Aamodt and Dumitru Roman}, title = {Data Quality Issues for Vibration Sensors: A Case Study in Ferrosilicon Production}, booktitle = {Proc.\ SEA4DQ}, publisher = {ACM}, pages = {22--23}, doi = {10.1145/3549037.3561273}, year = {2022}, } Publisher's Version |
|
Felderer, Michael |
SEA4DQ '22: "Preliminary Findings on the ..."
Preliminary Findings on the Occurrence and Causes of Data Smells in a Real-World Business Travel Data Processing Pipeline
Valentina Golendukhina, Harald Foidl, Michael Felderer, and Rudolf Ramler (University of Innsbruck, Austria; Software Competence Center Hagenberg, Austria) Detection of poor quality data is crucial for enhancing data-driven systems' quality. Although there is a lot of research on data validation, the topic of potential data quality issues is still underexplored. Such latent issues or data smells can often stay undetected and lead to the poor future performance of data-intensive systems. Detecting data smells is not trivial and requires knowledge about their causes. In this paper, we present the preliminary findings on the causes and severity of data smells based on a study of a real-world business travel data set and the data processing pipeline behind it. The results show that data smells exist in this data set and cause severe problems. Although many data smells already occur in raw data, some smells are created during the transformation and enrichment stages of the data processing pipeline. These findings indicate the importance of the data pipeline itself for future research on data smells. Thus, this article proposes potential future work in this area. @InProceedings{SEA4DQ22p18, author = {Valentina Golendukhina and Harald Foidl and Michael Felderer and Rudolf Ramler}, title = {Preliminary Findings on the Occurrence and Causes of Data Smells in a Real-World Business Travel Data Processing Pipeline}, booktitle = {Proc.\ SEA4DQ}, publisher = {ACM}, pages = {18--21}, doi = {10.1145/3549037.3561275}, year = {2022}, } Publisher's Version |
|
Foidl, Harald |
SEA4DQ '22: "Preliminary Findings on the ..."
Preliminary Findings on the Occurrence and Causes of Data Smells in a Real-World Business Travel Data Processing Pipeline
Valentina Golendukhina, Harald Foidl, Michael Felderer, and Rudolf Ramler (University of Innsbruck, Austria; Software Competence Center Hagenberg, Austria) Detection of poor quality data is crucial for enhancing data-driven systems' quality. Although there is a lot of research on data validation, the topic of potential data quality issues is still underexplored. Such latent issues or data smells can often stay undetected and lead to the poor future performance of data-intensive systems. Detecting data smells is not trivial and requires knowledge about their causes. In this paper, we present the preliminary findings on the causes and severity of data smells based on a study of a real-world business travel data set and the data processing pipeline behind it. The results show that data smells exist in this data set and cause severe problems. Although many data smells already occur in raw data, some smells are created during the transformation and enrichment stages of the data processing pipeline. These findings indicate the importance of the data pipeline itself for future research on data smells. Thus, this article proposes potential future work in this area. @InProceedings{SEA4DQ22p18, author = {Valentina Golendukhina and Harald Foidl and Michael Felderer and Rudolf Ramler}, title = {Preliminary Findings on the Occurrence and Causes of Data Smells in a Real-World Business Travel Data Processing Pipeline}, booktitle = {Proc.\ SEA4DQ}, publisher = {ACM}, pages = {18--21}, doi = {10.1145/3549037.3561275}, year = {2022}, } Publisher's Version |
|
Golendukhina, Valentina |
SEA4DQ '22: "Preliminary Findings on the ..."
Preliminary Findings on the Occurrence and Causes of Data Smells in a Real-World Business Travel Data Processing Pipeline
Valentina Golendukhina, Harald Foidl, Michael Felderer, and Rudolf Ramler (University of Innsbruck, Austria; Software Competence Center Hagenberg, Austria) Detection of poor quality data is crucial for enhancing data-driven systems' quality. Although there is a lot of research on data validation, the topic of potential data quality issues is still underexplored. Such latent issues or data smells can often stay undetected and lead to the poor future performance of data-intensive systems. Detecting data smells is not trivial and requires knowledge about their causes. In this paper, we present the preliminary findings on the causes and severity of data smells based on a study of a real-world business travel data set and the data processing pipeline behind it. The results show that data smells exist in this data set and cause severe problems. Although many data smells already occur in raw data, some smells are created during the transformation and enrichment stages of the data processing pipeline. These findings indicate the importance of the data pipeline itself for future research on data smells. Thus, this article proposes potential future work in this area. @InProceedings{SEA4DQ22p18, author = {Valentina Golendukhina and Harald Foidl and Michael Felderer and Rudolf Ramler}, title = {Preliminary Findings on the Occurrence and Causes of Data Smells in a Real-World Business Travel Data Processing Pipeline}, booktitle = {Proc.\ SEA4DQ}, publisher = {ACM}, pages = {18--21}, doi = {10.1145/3549037.3561275}, year = {2022}, } Publisher's Version |
|
Hansen, Anders |
SEA4DQ '22: "Data Quality Issues for Vibration ..."
Data Quality Issues for Vibration Sensors: A Case Study in Ferrosilicon Production
Maryna Waszak, Terje Moen, Sølve Eidnes, Alexander Stasik, Anders Hansen, Gregory Bouquet, Antoine Pultier, Xiang Ma, Idar Tørlen, Bjørn Henriksen, Arianeh Aamodt, and Dumitru Roman (SINTEF, Norway; Elkem, Norway) Digitisation in the mining and metal processing industries plays a key role in their modernisation. Production processes are more and more supported by a variety of sensors that produce large amounts of data that are meant to provide insights into the performance of production infrastructures. In the metal processing industry, vibration sensors are essential in the monitoring of the production infrastructure. In this position paper, we present the installation of vibration sensors in a real industrial environment and discuss the data quality issues we encountered while using such sensors. @InProceedings{SEA4DQ22p22, author = {Maryna Waszak and Terje Moen and Sølve Eidnes and Alexander Stasik and Anders Hansen and Gregory Bouquet and Antoine Pultier and Xiang Ma and Idar Tørlen and Bjørn Henriksen and Arianeh Aamodt and Dumitru Roman}, title = {Data Quality Issues for Vibration Sensors: A Case Study in Ferrosilicon Production}, booktitle = {Proc.\ SEA4DQ}, publisher = {ACM}, pages = {22--23}, doi = {10.1145/3549037.3561273}, year = {2022}, } Publisher's Version |
|
Henriksen, Bjørn |
SEA4DQ '22: "Data Quality Issues for Vibration ..."
Data Quality Issues for Vibration Sensors: A Case Study in Ferrosilicon Production
Maryna Waszak, Terje Moen, Sølve Eidnes, Alexander Stasik, Anders Hansen, Gregory Bouquet, Antoine Pultier, Xiang Ma, Idar Tørlen, Bjørn Henriksen, Arianeh Aamodt, and Dumitru Roman (SINTEF, Norway; Elkem, Norway) Digitisation in the mining and metal processing industries plays a key role in their modernisation. Production processes are more and more supported by a variety of sensors that produce large amounts of data that are meant to provide insights into the performance of production infrastructures. In the metal processing industry, vibration sensors are essential in the monitoring of the production infrastructure. In this position paper, we present the installation of vibration sensors in a real industrial environment and discuss the data quality issues we encountered while using such sensors. @InProceedings{SEA4DQ22p22, author = {Maryna Waszak and Terje Moen and Sølve Eidnes and Alexander Stasik and Anders Hansen and Gregory Bouquet and Antoine Pultier and Xiang Ma and Idar Tørlen and Bjørn Henriksen and Arianeh Aamodt and Dumitru Roman}, title = {Data Quality Issues for Vibration Sensors: A Case Study in Ferrosilicon Production}, booktitle = {Proc.\ SEA4DQ}, publisher = {ACM}, pages = {22--23}, doi = {10.1145/3549037.3561273}, year = {2022}, } Publisher's Version |
|
Jilani, Muhammad Taha |
SEA4DQ '22: "Effect of Time Patterns in ..."
Effect of Time Patterns in Mining Process Invariants for Industrial Control Systems: An Experimental Study
Muhammad Azmi Umer, Aditya Mathur, and Muhammad Taha Jilani (CodeX, Pakistan; Karachi Institute of Economics and Technology, Pakistan; Singapore University of Technology and Design, Singapore) Machine Learning is playing a crucial role in the design of intrusion detectors for Industrial Control Systems (ICS). Intrusion Detection Systems (IDS) rely on data obtained from an operational ICS. Such datasets contain multiple time series, one for each process variable. In this work, we explore how such time series can be exploited to understand the effect of time patterns in mining the process invariants, i.e., conditions on process state variables. We use the knowledge gained through the time patterns to determine the optimal data collection size for generating the invariants. The study reported here was conducted using the operational data obtained from a water treatment plant. @InProceedings{SEA4DQ22p10, author = {Muhammad Azmi Umer and Aditya Mathur and Muhammad Taha Jilani}, title = {Effect of Time Patterns in Mining Process Invariants for Industrial Control Systems: An Experimental Study}, booktitle = {Proc.\ SEA4DQ}, publisher = {ACM}, pages = {10--17}, doi = {10.1145/3549037.3561274}, year = {2022}, } Publisher's Version |
|
Khomh, Foutse |
SEA4DQ '22: "Data Quality and Model Under-Specification ..."
Data Quality and Model Under-Specification Issues (Keynote)
Foutse Khomh (Polytechnique Montréal, Canada) Nowadays, we are witnessing an increasing demand in both industry and academia for exploiting Deep Learning (DL) to solve complex real-world problems. However, the performance of these high-capacity learners is currently bounded by the quality and volume of their underlying training data. The use of incomplete, erroneous, or inappropriate training data, and the implementation of poor data management practices in a training pipeline often result in unreliable, biased, or under-specified models. In this talk, I will report on some recent research works that we have conducted to identify best practices of data management for DL. I will also report on recent techniques and tools that we have developed to help detect the root cause of model under-specification issues early on during a DL training process. @InProceedings{SEA4DQ22p2, author = {Foutse Khomh}, title = {Data Quality and Model Under-Specification Issues (Keynote)}, booktitle = {Proc.\ SEA4DQ}, publisher = {ACM}, pages = {2--2}, doi = {10.1145/3549037.3570195}, year = {2022}, } Publisher's Version |
|
Mathur, Aditya |
SEA4DQ '22: "Effect of Time Patterns in ..."
Effect of Time Patterns in Mining Process Invariants for Industrial Control Systems: An Experimental Study
Muhammad Azmi Umer, Aditya Mathur, and Muhammad Taha Jilani (CodeX, Pakistan; Karachi Institute of Economics and Technology, Pakistan; Singapore University of Technology and Design, Singapore) Machine Learning is playing a crucial role in the design of intrusion detectors for Industrial Control Systems (ICS). Intrusion Detection Systems (IDS) rely on data obtained from an operational ICS. Such datasets contain multiple time series, one for each process variable. In this work, we explore how such time series can be exploited to understand the effect of time patterns in mining the process invariants, i.e., conditions on process state variables. We use the knowledge gained through the time patterns to determine the optimal data collection size for generating the invariants. The study reported here was conducted using the operational data obtained from a water treatment plant. @InProceedings{SEA4DQ22p10, author = {Muhammad Azmi Umer and Aditya Mathur and Muhammad Taha Jilani}, title = {Effect of Time Patterns in Mining Process Invariants for Industrial Control Systems: An Experimental Study}, booktitle = {Proc.\ SEA4DQ}, publisher = {ACM}, pages = {10--17}, doi = {10.1145/3549037.3561274}, year = {2022}, } Publisher's Version |
|
Ma, Xiang |
SEA4DQ '22: "Data Quality Issues in Solar ..."
Data Quality Issues in Solar Panels Installations: A Case Study
Dumitru Roman, Antoine Pultier, Xiang Ma, Ahmet Soylu, and Alexander G. Ulyashin (SINTEF, Norway; Oslo Metropolitan University, Norway) Solar photovoltaics (PV) is becoming an important source of global electricity generation. Modern PV installations come with a variety of sensors attached to them for monitoring purposes (e.g., maintenance, prediction of electricity generation, etc.). Data collection (and implicitly the quality of data) from PV systems is becoming essential in this context. In this position paper, we introduce a modern PV mini power plant demo site setup for research purposes and discuss the data quality issues we encountered in operating the power plant. @InProceedings{SEA4DQ22p24, author = {Dumitru Roman and Antoine Pultier and Xiang Ma and Ahmet Soylu and Alexander G. Ulyashin}, title = {Data Quality Issues in Solar Panels Installations: A Case Study}, booktitle = {Proc.\ SEA4DQ}, publisher = {ACM}, pages = {24--25}, doi = {10.1145/3549037.3564120}, year = {2022}, } Publisher's Version
SEA4DQ '22: "Data Quality Issues for Vibration ..."
Data Quality Issues for Vibration Sensors: A Case Study in Ferrosilicon Production
Maryna Waszak, Terje Moen, Sølve Eidnes, Alexander Stasik, Anders Hansen, Gregory Bouquet, Antoine Pultier, Xiang Ma, Idar Tørlen, Bjørn Henriksen, Arianeh Aamodt, and Dumitru Roman (SINTEF, Norway; Elkem, Norway) Digitisation in the mining and metal processing industries plays a key role in their modernisation. Production processes are more and more supported by a variety of sensors that produce large amounts of data that are meant to provide insights into the performance of production infrastructures. In the metal processing industry, vibration sensors are essential in the monitoring of the production infrastructure. In this position paper, we present the installation of vibration sensors in a real industrial environment and discuss the data quality issues we encountered while using such sensors. @InProceedings{SEA4DQ22p22, author = {Maryna Waszak and Terje Moen and Sølve Eidnes and Alexander Stasik and Anders Hansen and Gregory Bouquet and Antoine Pultier and Xiang Ma and Idar Tørlen and Bjørn Henriksen and Arianeh Aamodt and Dumitru Roman}, title = {Data Quality Issues for Vibration Sensors: A Case Study in Ferrosilicon Production}, booktitle = {Proc.\ SEA4DQ}, publisher = {ACM}, pages = {22--23}, doi = {10.1145/3549037.3561273}, year = {2022}, } Publisher's Version |
|
Metzger, Andreas |
SEA4DQ '22: "Data Quality Issues in Online ..."
Data Quality Issues in Online Reinforcement Learning for Self-Adaptive Systems (Keynote)
Andreas Metzger (University of Duisburg-Essen, Germany) Online reinforcement learning is an emerging machine learning approach that addresses the challenge of design-time uncertainty faced when building self-adaptive systems. Online reinforcement learning means that the self-adaptive system can learn from data only available at run time. After introducing the fundamentals of self-adaptive systems and reinforcement learning, the keynote discusses three relevant issues and recent solutions related to data quality in online reinforcement learning for self-adaptive systems. @InProceedings{SEA4DQ22p1, author = {Andreas Metzger}, title = {Data Quality Issues in Online Reinforcement Learning for Self-Adaptive Systems (Keynote)}, booktitle = {Proc.\ SEA4DQ}, publisher = {ACM}, pages = {1--1}, doi = {10.1145/3549037.3570194}, year = {2022}, } Publisher's Version |
|
Moen, Terje |
SEA4DQ '22: "Data Quality Issues for Vibration ..."
Data Quality Issues for Vibration Sensors: A Case Study in Ferrosilicon Production
Maryna Waszak, Terje Moen, Sølve Eidnes, Alexander Stasik, Anders Hansen, Gregory Bouquet, Antoine Pultier, Xiang Ma, Idar Tørlen, Bjørn Henriksen, Arianeh Aamodt, and Dumitru Roman (SINTEF, Norway; Elkem, Norway) Digitisation in the mining and metal processing industries plays a key role in their modernisation. Production processes are more and more supported by a variety of sensors that produce large amounts of data that are meant to provide insights into the performance of production infrastructures. In the metal processing industry, vibration sensors are essential in the monitoring of the production infrastructure. In this position paper, we present the installation of vibration sensors in a real industrial environment and discuss the data quality issues we encountered while using such sensors. @InProceedings{SEA4DQ22p22, author = {Maryna Waszak and Terje Moen and Sølve Eidnes and Alexander Stasik and Anders Hansen and Gregory Bouquet and Antoine Pultier and Xiang Ma and Idar Tørlen and Bjørn Henriksen and Arianeh Aamodt and Dumitru Roman}, title = {Data Quality Issues for Vibration Sensors: A Case Study in Ferrosilicon Production}, booktitle = {Proc.\ SEA4DQ}, publisher = {ACM}, pages = {22--23}, doi = {10.1145/3549037.3561273}, year = {2022}, } Publisher's Version |
|
Myrseth, Per |
SEA4DQ '22: "Data Quality as a Microservice: ..."
Data Quality as a Microservice: An Ontology and Rule Based Approach for Quality Assurance of Sensor Data in Manufacturing Machines
Jørgen Stang, Dirk Walther, and Per Myrseth (DNV, Norway) The manufacturing industry is continuously looking for production improvements resulting in high quality production, reduced waste and competitive advantages. In this article, ontologies, semantic rule logic and microservices have been deployed to suggest a system for quality assurance of manufacturing machine data. The existing upper ontology for manufacturing service description has been used to define both the physical assets and the data quality requirements. The system is used both to operationalize data quality monitoring by semantic technology and to enable up-front modelling of data quality requirements. The approach is illustrated by a specific speed-feed case for manufacturing machines but could easily be extended to other manufacturing use-cases or even to other industries. @InProceedings{SEA4DQ22p3, author = {Jørgen Stang and Dirk Walther and Per Myrseth}, title = {Data Quality as a Microservice: An Ontology and Rule Based Approach for Quality Assurance of Sensor Data in Manufacturing Machines}, booktitle = {Proc.\ SEA4DQ}, publisher = {ACM}, pages = {3--9}, doi = {10.1145/3549037.3561272}, year = {2022}, } Publisher's Version |
|
Pultier, Antoine |
SEA4DQ '22: "Data Quality Issues in Solar ..."
Data Quality Issues in Solar Panels Installations: A Case Study
Dumitru Roman, Antoine Pultier, Xiang Ma, Ahmet Soylu, and Alexander G. Ulyashin (SINTEF, Norway; Oslo Metropolitan University, Norway) Solar photovoltaics (PV) is becoming an important source of global electricity generation. Modern PV installations come with a variety of sensors attached to them for monitoring purposes (e.g., maintenance, prediction of electricity generation, etc.). Data collection (and implicitly the quality of data) from PV systems is becoming essential in this context. In this position paper, we introduce a modern PV mini power plant demo site setup for research purposes and discuss the data quality issues we encountered in operating the power plant. @InProceedings{SEA4DQ22p24, author = {Dumitru Roman and Antoine Pultier and Xiang Ma and Ahmet Soylu and Alexander G. Ulyashin}, title = {Data Quality Issues in Solar Panels Installations: A Case Study}, booktitle = {Proc.\ SEA4DQ}, publisher = {ACM}, pages = {24--25}, doi = {10.1145/3549037.3564120}, year = {2022}, } Publisher's Version
SEA4DQ '22: "Data Quality Issues for Vibration ..."
Data Quality Issues for Vibration Sensors: A Case Study in Ferrosilicon Production
Maryna Waszak, Terje Moen, Sølve Eidnes, Alexander Stasik, Anders Hansen, Gregory Bouquet, Antoine Pultier, Xiang Ma, Idar Tørlen, Bjørn Henriksen, Arianeh Aamodt, and Dumitru Roman (SINTEF, Norway; Elkem, Norway) Digitisation in the mining and metal processing industries plays a key role in their modernisation. Production processes are more and more supported by a variety of sensors that produce large amounts of data that are meant to provide insights into the performance of production infrastructures. In the metal processing industry, vibration sensors are essential in the monitoring of the production infrastructure. In this position paper, we present the installation of vibration sensors in a real industrial environment and discuss the data quality issues we encountered while using such sensors. @InProceedings{SEA4DQ22p22, author = {Maryna Waszak and Terje Moen and Sølve Eidnes and Alexander Stasik and Anders Hansen and Gregory Bouquet and Antoine Pultier and Xiang Ma and Idar Tørlen and Bjørn Henriksen and Arianeh Aamodt and Dumitru Roman}, title = {Data Quality Issues for Vibration Sensors: A Case Study in Ferrosilicon Production}, booktitle = {Proc.\ SEA4DQ}, publisher = {ACM}, pages = {22--23}, doi = {10.1145/3549037.3561273}, year = {2022}, } Publisher's Version |
|
Ramler, Rudolf |
SEA4DQ '22: "Preliminary Findings on the ..."
Preliminary Findings on the Occurrence and Causes of Data Smells in a Real-World Business Travel Data Processing Pipeline
Valentina Golendukhina, Harald Foidl, Michael Felderer, and Rudolf Ramler (University of Innsbruck, Austria; Software Competence Center Hagenberg, Austria) Detection of poor quality data is crucial for enhancing data-driven systems' quality. Although there is a lot of research on data validation, the topic of potential data quality issues is still underexplored. Such latent issues or data smells can often stay undetected and lead to the poor future performance of data-intensive systems. Detecting data smells is not trivial and requires knowledge about their causes. In this paper, we present the preliminary findings on the causes and severity of data smells based on a study of a real-world business travel data set and the data processing pipeline behind it. The results show that data smells exist in this data set and cause severe problems. Although many data smells already occur in raw data, some smells are created during the transformation and enrichment stages of the data processing pipeline. These findings indicate the importance of the data pipeline itself for future research on data smells. Thus, this article proposes potential future work in this area. @InProceedings{SEA4DQ22p18, author = {Valentina Golendukhina and Harald Foidl and Michael Felderer and Rudolf Ramler}, title = {Preliminary Findings on the Occurrence and Causes of Data Smells in a Real-World Business Travel Data Processing Pipeline}, booktitle = {Proc.\ SEA4DQ}, publisher = {ACM}, pages = {18--21}, doi = {10.1145/3549037.3561275}, year = {2022}, } Publisher's Version |
|
Roman, Dumitru |
SEA4DQ '22: "Data Quality Issues in Solar ..."
Data Quality Issues in Solar Panels Installations: A Case Study
Dumitru Roman, Antoine Pultier, Xiang Ma, Ahmet Soylu, and Alexander G. Ulyashin (SINTEF, Norway; Oslo Metropolitan University, Norway) Solar photovoltaics (PV) is becoming an important source of global electricity generation. Modern PV installations come with a variety of sensors attached to them for monitoring purposes (e.g., maintenance, prediction of electricity generation, etc.). Data collection (and implicitly the quality of data) from PV systems is becoming essential in this context. In this position paper, we introduce a modern PV mini power plant demo site setup for research purposes and discuss the data quality issues we encountered in operating the power plant. @InProceedings{SEA4DQ22p24, author = {Dumitru Roman and Antoine Pultier and Xiang Ma and Ahmet Soylu and Alexander G. Ulyashin}, title = {Data Quality Issues in Solar Panels Installations: A Case Study}, booktitle = {Proc.\ SEA4DQ}, publisher = {ACM}, pages = {24--25}, doi = {10.1145/3549037.3564120}, year = {2022}, } Publisher's Version
SEA4DQ '22: "Data Quality Issues for Vibration ..."
Data Quality Issues for Vibration Sensors: A Case Study in Ferrosilicon Production
Maryna Waszak, Terje Moen, Sølve Eidnes, Alexander Stasik, Anders Hansen, Gregory Bouquet, Antoine Pultier, Xiang Ma, Idar Tørlen, Bjørn Henriksen, Arianeh Aamodt, and Dumitru Roman (SINTEF, Norway; Elkem, Norway) Digitisation in the mining and metal processing industries plays a key role in their modernisation. Production processes are more and more supported by a variety of sensors that produce large amounts of data that are meant to provide insights into the performance of production infrastructures. In the metal processing industry, vibration sensors are essential in the monitoring of the production infrastructure. In this position paper, we present the installation of vibration sensors in a real industrial environment and discuss the data quality issues we encountered while using such sensors. @InProceedings{SEA4DQ22p22, author = {Maryna Waszak and Terje Moen and Sølve Eidnes and Alexander Stasik and Anders Hansen and Gregory Bouquet and Antoine Pultier and Xiang Ma and Idar Tørlen and Bjørn Henriksen and Arianeh Aamodt and Dumitru Roman}, title = {Data Quality Issues for Vibration Sensors: A Case Study in Ferrosilicon Production}, booktitle = {Proc.\ SEA4DQ}, publisher = {ACM}, pages = {22--23}, doi = {10.1145/3549037.3561273}, year = {2022}, } Publisher's Version |
|
Soylu, Ahmet |
SEA4DQ '22: "Data Quality Issues in Solar ..."
Data Quality Issues in Solar Panels Installations: A Case Study
Dumitru Roman, Antoine Pultier, Xiang Ma, Ahmet Soylu, and Alexander G. Ulyashin (SINTEF, Norway; Oslo Metropolitan University, Norway) Solar photovoltaics (PV) is becoming an important source of global electricity generation. Modern PV installations come with a variety of sensors attached to them for monitoring purposes (e.g., maintenance, prediction of electricity generation, etc.). Data collection (and implicitly the quality of data) from PV systems is becoming essential in this context. In this position paper, we introduce a modern PV mini power plant demo site setup for research purposes and discuss the data quality issues we encountered in operating the power plant. @InProceedings{SEA4DQ22p24, author = {Dumitru Roman and Antoine Pultier and Xiang Ma and Ahmet Soylu and Alexander G. Ulyashin}, title = {Data Quality Issues in Solar Panels Installations: A Case Study}, booktitle = {Proc.\ SEA4DQ}, publisher = {ACM}, pages = {24--25}, doi = {10.1145/3549037.3564120}, year = {2022}, } Publisher's Version |
|
Stang, Jørgen |
SEA4DQ '22: "Data Quality as a Microservice: ..."
Data Quality as a Microservice: An Ontology and Rule Based Approach for Quality Assurance of Sensor Data in Manufacturing Machines
Jørgen Stang, Dirk Walther, and Per Myrseth (DNV, Norway) The manufacturing industry is continuously looking for production improvements resulting in high quality production, reduced waste and competitive advantages. In this article, ontologies, semantic rule logic and microservices have been deployed to suggest a system for quality assurance of manufacturing machine data. The existing upper ontology for manufacturing service description has been used to define both the physical assets and the data quality requirements. The system is used both to operationalize data quality monitoring by semantic technology and to enable up-front modelling of data quality requirements. The approach is illustrated by a specific speed-feed case for manufacturing machines but could easily be extended to other manufacturing use-cases or even to other industries. @InProceedings{SEA4DQ22p3, author = {Jørgen Stang and Dirk Walther and Per Myrseth}, title = {Data Quality as a Microservice: An Ontology and Rule Based Approach for Quality Assurance of Sensor Data in Manufacturing Machines}, booktitle = {Proc.\ SEA4DQ}, publisher = {ACM}, pages = {3--9}, doi = {10.1145/3549037.3561272}, year = {2022}, } Publisher's Version |
|
Stasik, Alexander |
SEA4DQ '22: "Data Quality Issues for Vibration ..."
Data Quality Issues for Vibration Sensors: A Case Study in Ferrosilicon Production
Maryna Waszak, Terje Moen, Sølve Eidnes, Alexander Stasik, Anders Hansen, Gregory Bouquet, Antoine Pultier, Xiang Ma, Idar Tørlen, Bjørn Henriksen, Arianeh Aamodt, and Dumitru Roman (SINTEF, Norway; Elkem, Norway) Digitisation in the mining and metal processing industries plays a key role in their modernisation. Production processes are more and more supported by a variety of sensors that produce large amounts of data that are meant to provide insights into the performance of production infrastructures. In the metal processing industry vibration sensors are essential in the monitoring of the production infrastructure. In this position paper we present the installation of vibration sensors in a real industrial environment and discuss the data quality issues we encountered while using such sensors. @InProceedings{SEA4DQ22p22, author = {Maryna Waszak and Terje Moen and Sølve Eidnes and Alexander Stasik and Anders Hansen and Gregory Bouquet and Antoine Pultier and Xiang Ma and Idar Tørlen and Bjørn Henriksen and Arianeh Aamodt and Dumitru Roman}, title = {Data Quality Issues for Vibration Sensors: A Case Study in Ferrosilicon Production}, booktitle = {Proc.\ SEA4DQ}, publisher = {ACM}, pages = {22--23}, doi = {10.1145/3549037.3561273}, year = {2022}, } Publisher's Version |
|
Tørlen, Idar |
SEA4DQ '22: "Data Quality Issues for Vibration ..."
Data Quality Issues for Vibration Sensors: A Case Study in Ferrosilicon Production
Maryna Waszak, Terje Moen, Sølve Eidnes, Alexander Stasik, Anders Hansen, Gregory Bouquet, Antoine Pultier, Xiang Ma, Idar Tørlen, Bjørn Henriksen, Arianeh Aamodt, and Dumitru Roman (SINTEF, Norway; Elkem, Norway) Digitisation in the mining and metal processing industries plays a key role in their modernisation. Production processes are more and more supported by a variety of sensors that produce large amounts of data that are meant to provide insights into the performance of production infrastructures. In the metal processing industry vibration sensors are essential in the monitoring of the production infrastructure. In this position paper we present the installation of vibration sensors in a real industrial environment and discuss the data quality issues we encountered while using such sensors. @InProceedings{SEA4DQ22p22, author = {Maryna Waszak and Terje Moen and Sølve Eidnes and Alexander Stasik and Anders Hansen and Gregory Bouquet and Antoine Pultier and Xiang Ma and Idar Tørlen and Bjørn Henriksen and Arianeh Aamodt and Dumitru Roman}, title = {Data Quality Issues for Vibration Sensors: A Case Study in Ferrosilicon Production}, booktitle = {Proc.\ SEA4DQ}, publisher = {ACM}, pages = {22--23}, doi = {10.1145/3549037.3561273}, year = {2022}, } Publisher's Version |
|
Ulyashin, Alexander G. |
SEA4DQ '22: "Data Quality Issues in Solar ..."
Data Quality Issues in Solar Panels Installations: A Case Study
Dumitru Roman, Antoine Pultier, Xiang Ma, Ahmet Soylu, and Alexander G. Ulyashin (SINTEF, Norway; Oslo Metropolitan University, Norway) Solar photovoltaics (PV) is becoming an important source of global electricity generation. Modern PV installations come with a variety of sensors attached to them for monitoring purposes (e.g., maintenance, prediction of electricity generation, etc.). Data collection (and implicitly the quality of data) from PV systems is becoming essential in this context. In this position paper, we introduce a modern PV mini power plant demo site setup for research purposes and discuss the data quality issues we encountered in operating the power plant. @InProceedings{SEA4DQ22p24, author = {Dumitru Roman and Antoine Pultier and Xiang Ma and Ahmet Soylu and Alexander G. Ulyashin}, title = {Data Quality Issues in Solar Panels Installations: A Case Study}, booktitle = {Proc.\ SEA4DQ}, publisher = {ACM}, pages = {24--25}, doi = {10.1145/3549037.3564120}, year = {2022}, } Publisher's Version |
|
Umer, Muhammad Azmi |
SEA4DQ '22: "Effect of Time Patterns in ..."
Effect of Time Patterns in Mining Process Invariants for Industrial Control Systems: An Experimental Study
Muhammad Azmi Umer, Aditya Mathur, and Muhammad Taha Jilani (CodeX, Pakistan; Karachi Institute of Economics and Technology, Pakistan; Singapore University of Technology and Design, Singapore) Machine Learning is playing a crucial role in the design of intrusion detectors for Industrial Control Systems (ICS). Intrusion Detection Systems (IDS) rely on data obtained from an operational ICS. Such datasets contain multiple time series, one for each process variable. In this work, we explore how such time series can be exploited to understand the effect of time patterns in mining the process invariants, i.e., conditions on process state variables. We use the knowledge gained through the time patterns to determine the optimal data collection size for generating the invariants. The study reported here was conducted using the operational data obtained from a water treatment plant. @InProceedings{SEA4DQ22p10, author = {Muhammad Azmi Umer and Aditya Mathur and Muhammad Taha Jilani}, title = {Effect of Time Patterns in Mining Process Invariants for Industrial Control Systems: An Experimental Study}, booktitle = {Proc.\ SEA4DQ}, publisher = {ACM}, pages = {10--17}, doi = {10.1145/3549037.3561274}, year = {2022}, } Publisher's Version |
|
Walther, Dirk |
SEA4DQ '22: "Data Quality as a Microservice: ..."
Data Quality as a Microservice: An Ontology and Rule Based Approach for Quality Assurance of Sensor Data in Manufacturing Machines
Jørgen Stang, Dirk Walther, and Per Myrseth (DNV, Norway) The manufacturing industry is continuously looking for production improvements resulting in high quality production, reduced waste and competitive advantages. In this article, ontologies, semantic rule logic and microservices have been deployed to suggest a system for quality assurance of manufacturing machine data. The existing upper ontology for manufacturing service description has been used to define both the physical assets as well as the data quality requirements. The system is used to both operationalize data quality monitoring by semantic technology as well as enabling up-front modelling of data quality requirements. The approach is illustrated by a specific speed-feed case for manufacturing machines but could easily be extended to other manufacturing use-cases or even to other industries. @InProceedings{SEA4DQ22p3, author = {Jørgen Stang and Dirk Walther and Per Myrseth}, title = {Data Quality as a Microservice: An Ontology and Rule Based Approach for Quality Assurance of Sensor Data in Manufacturing Machines}, booktitle = {Proc.\ SEA4DQ}, publisher = {ACM}, pages = {3--9}, doi = {10.1145/3549037.3561272}, year = {2022}, } Publisher's Version |
|
Waszak, Maryna |
SEA4DQ '22: "Data Quality Issues for Vibration ..."
Data Quality Issues for Vibration Sensors: A Case Study in Ferrosilicon Production
Maryna Waszak, Terje Moen, Sølve Eidnes, Alexander Stasik, Anders Hansen, Gregory Bouquet, Antoine Pultier, Xiang Ma, Idar Tørlen, Bjørn Henriksen, Arianeh Aamodt, and Dumitru Roman (SINTEF, Norway; Elkem, Norway) Digitisation in the mining and metal processing industries plays a key role in their modernisation. Production processes are more and more supported by a variety of sensors that produce large amounts of data that are meant to provide insights into the performance of production infrastructures. In the metal processing industry vibration sensors are essential in the monitoring of the production infrastructure. In this position paper we present the installation of vibration sensors in a real industrial environment and discuss the data quality issues we encountered while using such sensors. @InProceedings{SEA4DQ22p22, author = {Maryna Waszak and Terje Moen and Sølve Eidnes and Alexander Stasik and Anders Hansen and Gregory Bouquet and Antoine Pultier and Xiang Ma and Idar Tørlen and Bjørn Henriksen and Arianeh Aamodt and Dumitru Roman}, title = {Data Quality Issues for Vibration Sensors: A Case Study in Ferrosilicon Production}, booktitle = {Proc.\ SEA4DQ}, publisher = {ACM}, pages = {22--23}, doi = {10.1145/3549037.3561273}, year = {2022}, } Publisher's Version |
26 authors