Automotive E-SSENTIALS

Your regular update for technical and industry information

The openGENESIS collaboration releases first results on how to achieve reliable AI data

openGENESIS, the collaboration platform for the assessment of Artificial Intelligence, has finished its first spotlight project, "A reliable AI data labeling process". openGENESIS is hosted as a working group within the Eclipse Foundation.


As part of this project, Incenda AI GmbH and TÜV SÜD published a white paper presenting a lifecycle and development process for AI systems, with a special focus on data quality and the creation of high-quality labels.


This publication marks the first freely available process for labeling data for machine learning, a prerequisite for developing high-quality, data-based AI systems. With it, openGENESIS has taken a first step towards safe and trustworthy AI, an important requirement for autonomous vehicles.


For further information or to discuss this topic, please contact our TÜV SÜD experts directly.


Download our white paper


White paper abstract:

Due to its wide success in recent years, Artificial Intelligence (AI) is being used in more and more systems. As established Software Engineering practices, including development processes, fail to capture the complexity and additional challenges of developing AI systems, many software developers struggle to use AI, especially in safety-critical areas such as healthcare or automotive.

One of the AI methods with the highest impact so far is supervised Machine Learning (ML). The performance of supervised ML is determined to a large extent by the data used to train and evaluate the developed models and by the application of established Software Engineering practices. Common issues include data and label quality, immature frameworks and processes for supervised ML development, a lack of traceability from requirements to implementation, and the limited transparency of some models.

The contribution of this white paper is the discussion and establishment of a sound supervised ML lifecycle, with a focus on data quality, from the intent to develop a system using supervised ML to the decommissioning of the developed system. The individual steps of the lifecycle are detailed, and a deep dive into the labeling steps is provided by defining a labeling process. The discussion includes activities recommended for creating high-quality labels and highlights typical issues that arise during labeling.
