Domain experts usually set standards – but not by standard procedure this time! Human judgements are indispensable for training and evaluating machine learning systems. To make terminological entries from different standards more accessible for harmonisation, the Harbsafe project – a cooperation between VDE|DKE and TU Braunschweig – aims to group semantically similar entries using techniques from computational linguistics. Standardisation experts in particular can benefit from such assistance.
A solid foundation is the new gold standard Harbsafe-162 – a data set that was developed with volunteer domain experts and serves as a yardstick for evaluating text similarity measures. It differs from comparable data sets in content and structure: the items belong to the realm of languages for special purposes and are complex, concept-oriented terminological entries. The data set may therefore be of interest to the international research community in computational linguistics. VDE|DKE is accordingly presenting Harbsafe-162 as a gold standard for representations of concepts.
Several domain experts participated in a rating task and judged the conceptual similarity of 162 pairs of terminological entries. The entries were extracted from technical standards in the areas of functional safety, cybersecurity, and dependability – domains that are increasingly converging as a result of digitalisation. The rating scale ranges from 4 (very similar) to 0 (totally dissimilar and unrelated). The inter-rater reliability of Harbsafe-162 is competitive with that of similar lexical data sets (Krippendorff’s α = 0.7, Spearman’s ρ = 0.74). Harbsafe-162 includes the median and average ratings of the domain experts as well as the standard deviation. A detailed discussion of Harbsafe-162 is currently in preparation.
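To illustrate how a gold standard of this kind is typically used, the following minimal Python sketch summarises expert ratings for each pair and compares a similarity measure with the gold ratings using Spearman’s rank correlation. It is not the project’s own code: the function names, the example ratings, and the system scores are invented for illustration, and the use of the sample standard deviation is an assumption, since the announcement does not say which variant Harbsafe-162 reports.

```python
from statistics import mean, median, stdev


def summarise(ratings):
    """Median, mean and sample standard deviation of one pair's ratings (0-4 scale)."""
    return {"median": median(ratings), "mean": mean(ratings), "stdev": stdev(ratings)}


def ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    return pearson(ranks(x), ranks(y))


# Hypothetical ratings by four experts for three pairs of entries (NOT real data).
expert_ratings = [[4, 3, 4, 4], [1, 0, 2, 1], [2, 3, 2, 2]]
gold = [summarise(r)["median"] for r in expert_ratings]

# Hypothetical scores from a text similarity measure for the same three pairs.
system_scores = [0.91, 0.12, 0.55]

print("gold medians:", gold)
print("Spearman's rho vs. gold:", spearman(gold, system_scores))
```

A rank correlation is a natural choice here because the ratings are ordinal: it only asks whether the measure orders the pairs in the same way as the experts, not whether its scores lie on the same 0 to 4 scale.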
The ongoing Harbsafe project will focus on the software solution that is intended to assist standardisation experts: the Terminology Harmonisation Dashboard. Project developments will be presented at a Harbsafe workshop with standardisation experts in Frankfurt am Main on 30 January 2019.
The project is funded by BMWi, the German Federal Ministry for Economic Affairs and Energy (Grants: 03TNG006A and 03TNG006B).