Data-driven ontology engineering with Relational Concept Analysis

Formal Concept Analysis (FCA) provides a knowledge discovery framework enabling
both (1) conceptual clustering of data objects and (2) pattern/association discovery.
It was conceived as a mathematical approach to the design of concept hierarchies (called
concept lattices) from a set of observations (given as object × attribute tables,
called formal contexts). FCA, like most data mining approaches, focuses on a single data table,
whereas Linked Data are inherently multi-table, a.k.a. multi-relational. **Relational
Concept Analysis** (RCA) is a multi-relational data mining (MRDM)
method that extends FCA to such data.

RCA has been applied to practical problems from a wide range of fields such as software
engineering, hydroecology, neurology,
data interlinking, and linguistics. **In this tutorial**, we will focus on the way RCA
can support various ontology engineering tasks. In the first part, we give the audience an understanding
of the mathematical foundations of the RCA method and of the algorithms used in the iterative
lattice construction. We present existing tools as well as examples of RCA applications
from the literature. In the second part, the focus will be on the intricate links between
RCA and ontologies. We present a small number of ontology engineering scenarios and show
how RCA-based tools support them through proper analysis of the data.

#### Formal Concept Analysis

- After providing some background on lattices, we show how FCA
organizes objects into a lattice of conceptually described clusters. We
also present the way the lattice yields patterns and association
rules, and introduce interest metrics to score them.
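
As a rough illustration of the ideas above, the following sketch enumerates the formal concepts of a toy context and computes the confidence of an association rule. The animal data, the function names and the naive enumeration over all attribute subsets are assumptions made for this example; they are not taken from the tutorial material, and practical FCA tools rely on far more efficient algorithms.

```python
from itertools import combinations

# Toy formal context (hypothetical data): object -> set of attributes it has.
CONTEXT = {
    "duck": frozenset({"flies", "swims", "lays_eggs"}),
    "eagle": frozenset({"flies", "lays_eggs"}),
    "shark": frozenset({"swims"}),
}
ATTRIBUTES = frozenset().union(*CONTEXT.values())


def extent(attrs):
    """Objects that have every attribute in attrs."""
    return frozenset(o for o, has in CONTEXT.items() if attrs <= has)


def intent(objs):
    """Attributes shared by every object in objs (all attributes if objs is empty)."""
    shared = set(ATTRIBUTES)
    for o in objs:
        shared &= CONTEXT[o]
    return frozenset(shared)


def concepts():
    """Naive enumeration: close every attribute subset into an (extent, intent) pair."""
    found = set()
    for size in range(len(ATTRIBUTES) + 1):
        for attrs in combinations(sorted(ATTRIBUTES), size):
            ext = extent(frozenset(attrs))
            found.add((ext, intent(ext)))
    return found


def confidence(premise, conclusion):
    """Share of the objects matching premise that also match conclusion."""
    covered = extent(frozenset(premise))
    if not covered:
        return 0.0
    both = extent(frozenset(premise) | frozenset(conclusion))
    return len(both) / len(covered)


if __name__ == "__main__":
    for ext, itt in sorted(concepts(), key=lambda c: -len(c[0])):
        print(sorted(ext), sorted(itt))
    print(confidence({"flies"}, {"lays_eggs"}))
    print(confidence({"swims"}, {"flies"}))
```

Each pair returned by `concepts()` is a cluster of objects together with its shared description. Ordering the pairs by inclusion of their extents gives the concept lattice, and `confidence` is one simple example of an interest metric for rules read off that lattice.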

#### Relational Concept Analysis

- We show how RCA encodes Linked Data into a family of contexts and object-to-object binary relations. We then explain the bootstrapping step of the iterative RCA process, i.e. the construction of the initial lattices, and clarify the way the evolving lattices on individual contexts interact with each other. In particular, we illustrate the relational scaling mechanism: given a relation between two contexts and a scaling schema (roughly, a logical quantifier), concepts of the range context are turned into predicate-like attributes of the domain context. Finally, we discuss various ways to extract patterns and associations from class definitions while avoiding potential cycles.
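
The relational scaling step can be sketched as follows, under simplifying assumptions. The `eats` relation, the dish concepts and the `scale` function are hypothetical names introduced for this example. The two schemas shown correspond to what the RCA literature calls existential and strict universal scaling; this is not the code of any existing RCA tool.

```python
# Hypothetical relation from the domain context (persons) to the range context (dishes).
EATS = {
    "alice": frozenset({"salad"}),
    "bob": frozenset({"salad", "steak"}),
    "carol": frozenset(),
}

# Extents of concepts assumed to have been built already on the range context.
DISH_CONCEPTS = {
    "C_all": frozenset({"salad", "steak", "burger"}),
    "C_veg": frozenset({"salad"}),
    "C_meat": frozenset({"steak", "burger"}),
}


def scale(relation, range_concepts, name, schema="exists"):
    """Turn range concepts into relational attributes of the domain objects.

    exists:        the object is linked to at least one member of the concept.
    forall_strict: the object has links, and all of them fall inside the concept.
    """
    scaled = {}
    for obj, targets in relation.items():
        attrs = set()
        for concept_id, ext in range_concepts.items():
            if schema == "exists":
                holds = bool(targets & ext)
            elif schema == "forall_strict":
                holds = bool(targets) and targets <= ext
            else:
                raise ValueError(f"unknown scaling schema: {schema}")
            if holds:
                attrs.add(f"{schema} {name}.{concept_id}")
        scaled[obj] = attrs
    return scaled


if __name__ == "__main__":
    print(scale(EATS, DISH_CONCEPTS, "eats"))
    print(scale(EATS, DISH_CONCEPTS, "eats", schema="forall_strict"))
```

In the iterative process, the attributes produced by `scale` would be appended to the domain context, the domain lattice rebuilt, and the step repeated for every relation until no lattice changes any more.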

#### Ontology design and refinement

- We show how, given a dataset and an ontology, the ontology can be refined using RCA output, which can suggest new classes that
would be specializations of existing ontological classes.

#### Static Analysis

- We show how RCA detects discrepancies between data and the ontology classes these data are assigned to. By pointing out such discrepancies, RCA acts as a recommendation mechanism for the ontologist, suggesting, among other things, missing properties and/or property restrictions in a class, potential assignments of data to a more specialized subclass, and potentially missing specializations of an existing class.
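
One of the recommendations listed above, a property that is missing from a class, can be illustrated with a heavily simplified, single-context sketch. The wine data, the property names and the function `suggest_missing_properties` are hypothetical. The sketch only compares the properties shared by all instances of a class with those the class declares, which is far less than what an RCA-based analysis would do.

```python
def suggest_missing_properties(instances, declared):
    """Report, per class, properties used by all of its instances but not declared.

    instances: class name -> {individual name -> set of properties it uses}
    declared:  class name -> set of properties declared for the class
    """
    suggestions = {}
    for cls, members in instances.items():
        if not members:
            continue  # no data for this class, hence nothing to compare
        # Properties common to every instance (the intent of the instance set).
        shared = set.intersection(*(set(props) for props in members.values()))
        missing = shared - declared.get(cls, set())
        if missing:
            suggestions[cls] = missing
    return suggestions


if __name__ == "__main__":
    # Hypothetical data and schema.
    instances = {
        "Wine": {
            "w1": {"hasColor", "hasRegion"},
            "w2": {"hasColor", "hasRegion", "hasVintage"},
        },
        "Winery": {},
    }
    declared = {"Wine": {"hasColor"}, "Winery": {"hasName"}}
    print(suggest_missing_properties(instances, declared))
```

Here `hasRegion` is suggested for `Wine` because every instance uses it, whereas `hasVintage` is not, since only one instance does. Instances sharing such an extra property would instead hint at a possible subclass.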

#### Ontology Restructuring

- We present an RCA-based method to improve the quality of an ontology by reorganizing the specialization links between classes as well as between properties, while discovering potentially missing abstract classes and properties. The method requires no data, as it operates on the ontological schema, treated as meta-data, instead.

Petko Valtchev is an Associate Professor in the Computer Science department of the University of Quebec at Montreal (UQAM). His Ph.D. was awarded in 1999 by J. Fourier University, Grenoble. He is a member of the Editorial Board of the International Conference on Formal Concept Analysis (FCA) and has served as a member of the program committees of top-tier conferences (AAAI, IJCAI, ISWC). His research addresses knowledge discovery and data mining with/from ontologies and knowledge bases. In this context, he has designed a number of methods and practical tools exploiting concept analysis.

Mickael Wajnberg is a PhD student enrolled at both the University of Quebec at Montreal (Québec, Canada) and Université de Lorraine (France), where he works on RCA and knowledge extraction. After preparatory classes (prépa) in mathematics and physics, he obtained an Engineering Degree (M.Sc. equivalent) at Telecom Nancy (France) and an M.Sc. in Computer Science at the University of Quebec at Chicoutimi (Québec, Canada), specializing in algorithms and theoretical computer science.
