linkky in software (2022-05-09)
Manuel Atencia, Jérôme David, Jérôme Euzenat, Amedeo Napoli, Jérémy Vizzini, Link key candidate extraction with relational concept analysis, Discrete applied mathematics 273:2-20, 2020
Linked data aims at publishing data expressed in RDF (Resource Description Framework) at the scale of the worldwide web. These datasets interoperate by publishing links which identify individuals across heterogeneous datasets. Such links may be found by using a generalisation of keys in databases, called link keys, which apply across datasets. They specify the pairs of properties to compare for linking individuals belonging to different classes of the datasets. Here, we show how to recast the proposed link key extraction techniques for RDF datasets in the framework of formal concept analysis. We define a formal context, where objects are pairs of resources and attributes are pairs of properties, and show that formal concepts correspond to link key candidates. We extend this characterisation to the full RDF model including non functional properties and interdependent link keys. We show how to use relational concept analysis for dealing with cyclic dependencies across classes and hence link keys. Finally, we discuss an implementation of this framework.
Formal Concept Analysis, Relational Concept Analysis, Linked data, Link key, Data interlinking, Resource Description Framework
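The formal context described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy datasets `D1` and `D2`, the property names, and the helpers `build_context`, `extent` and `concepts` are hypothetical. It also assumes a single incidence condition (a pair of properties holds for a pair of resources when their value sets share at least one value) and enumerates concepts naively, which is only practical on very small data.

```python
from itertools import product

# Hypothetical toy datasets: resource -> property -> set of values.
D1 = {
    "a1": {"name": {"Ann"}, "city": {"Paris"}},
    "a2": {"name": {"Bob"}, "city": {"Lyon"}},
}
D2 = {
    "b1": {"label": {"Ann"}, "town": {"Paris"}},
    "b2": {"label": {"Bob"}, "town": {"Nice"}},
}

def build_context(d1, d2):
    """Objects are resource pairs; attributes are property pairs.

    Assumption: (p, q) holds for (a, b) when the p-values of a and the
    q-values of b share at least one value.
    """
    props1 = sorted({p for desc in d1.values() for p in desc})
    props2 = sorted({q for desc in d2.values() for q in desc})
    context = {}
    for (a, da), (b, db) in product(d1.items(), d2.items()):
        context[(a, b)] = frozenset(
            (p, q)
            for p in props1
            for q in props2
            if da.get(p, set()) & db.get(q, set())
        )
    return context

def extent(context, intent):
    """All resource pairs that satisfy every property pair of the intent."""
    return frozenset(o for o, attrs in context.items() if intent <= attrs)

def concepts(context):
    """Naive enumeration: intents are the set of all attributes plus every
    intersection of object intents."""
    all_attrs = frozenset().union(*context.values())
    intents = {all_attrs}
    for attrs in context.values():
        intents |= {attrs & i for i in intents}
    return {(extent(context, i), i) for i in intents}

ctx = build_context(D1, D2)
candidates = concepts(ctx)
for ext, intent in sorted(candidates, key=lambda c: len(c[1])):
    print(sorted(intent), "->", sorted(ext))
```

Each intent printed by the loop is a set of property pairs that plays the role of a link key candidate in this sketch, and the associated extent is the set of resource pairs it would link.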
Manuel Atencia, Jérôme David, Jérôme Euzenat, Amedeo Napoli, Jérémy Vizzini, A guided walk into link key candidate extraction with relational concept analysis, in: Claudia d'Amato, Lalana Kagal (eds), Proc. on journal track of the International semantic web conference, Auckland (NZ), 2019
Data interlinking is an important task for linked data interoperability. One of the possible techniques for finding links is the use of link keys which generalise relational keys to pairs of RDF models. We show how link key candidates may be directly extracted from RDF data sets by encoding the extraction problem in relational concept analysis. This method deals with non-functional properties and circularly dependent link key expressions. As such, it generalises those presented for non-dependent link keys and link keys over the relational model. The proposed method is able to return link key candidates involving several classes at once.
Formal Concept Analysis, Relational Concept Analysis, Linked data, Link key, Data interlinking, Resource Description Framework
Jérôme David, Jérôme Euzenat, Jérémy Vizzini, Linkky: Extraction de clés de liage par une adaptation de l'analyse relationnelle de concepts, in: Actes 29e journées francophones sur Ingénierie des connaissances (IC), Nancy (FR), pp271-274, 2018
RDF, Link key, Data interlinking, Relational concept analysis, Formal concept analysis, Network of ontologies
Jérémy Vizzini, Data interlinking with relational concept analysis, Master's thesis, Université Grenoble Alpes, Grenoble (FR), 2017
Vast amounts of RDF data are made available on the web by various institutions providing overlapping information. To be fully exploited, different representations of the same object across various data sets have to be identified. This is what is called data interlinking. One novel way to generate such links is to use link keys. Link keys generalise database keys by applying them across two data sets. The structure of RDF makes this problem much more complex than for relational databases for several reasons. An instance can have multiple values for a given attribute. Moreover, values of properties are not necessarily datatype values but may be instances of the graph. A first method has been designed to extract and select link keys from two classes of objects which deals with multiple values but not object values. Moreover, the extraction step has been rephrased in formal concept analysis (FCA), which makes it possible to generate link keys across relational tables. Our aim is to extend this work so that it can deal with multiple values. Then, we show how to use it to deal with object values when the data set is cycle free. This encoding does not necessarily generate the optimal link keys. Hence, we use relational concept analysis (RCA), an extension of FCA taking relations between concepts into account. We show that a new expression of this problem is able to extract the optimal link keys even in the presence of circularities. Moreover, the elaborated process does not require information about the alignments of the ontologies to find out for which pairs of classes to extract link keys. We implemented these methods and evaluated them by reproducing the experiments made in previous studies. This shows that the method extracts the expected results, but also exhibits the (equally expected) scalability issues.
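To make concrete how a link key applies across two data sets when attributes are multi-valued, here is a minimal Python sketch. It is an illustration under stated assumptions, not the method implemented in the thesis: the data sets `D1` and `D2`, the property names and the function `generate_links` are hypothetical, and a pair of properties is taken to be satisfied when the two value sets share at least one value.

```python
from itertools import product

# Hypothetical toy data sets: resource -> property -> set of values.
D1 = {
    "a1": {"author": {"Ann", "Bob"}, "title": {"RCA"}},
    "a2": {"author": {"Cy"}, "title": {"FCA"}},
}
D2 = {
    "b1": {"creator": {"Bob"}, "name": {"RCA"}},
    "b2": {"creator": {"Cy"}, "name": {"Keys"}},
}

def generate_links(d1, d2, link_key):
    """Return the resource pairs linked by a set of property pairs.

    Assumption: (a, b) is linked when, for every (p, q) in link_key,
    the p-values of a and the q-values of b share at least one value.
    """
    links = set()
    for (a, da), (b, db) in product(d1.items(), d2.items()):
        if all(da.get(p, set()) & db.get(q, set()) for p, q in link_key):
            links.add((a, b))
    return links

strict = generate_links(D1, D2, {("author", "creator"), ("title", "name")})
loose = generate_links(D1, D2, {("author", "creator")})
print(sorted(strict))
print(sorted(loose))
```

Adding property pairs to the key can only shrink the set of generated links, which is why candidate keys can be compared by the links they produce.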