This thesis addresses the task of formalizing and implementing the process of semi-automatic ontology construction.
We propose a theoretical framework for formalizing the ontology construction process. The process is described as a sequence of operators applied to the ontology. Several types of common operators are identified, and each type is abstracted so that it can be discovered by a combination of machine learning algorithms and user interaction. The proposed ontology learning framework is generic and can handle various domains. The requirement is that domain data can be provided in a format supported by the learning algorithms.
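The view of construction as a sequence of operators can be illustrated with a minimal Python sketch. It is not the thesis implementation: the `Ontology` class, the `add_concept` and `add_relation` operators, the `construct` function, and the example concepts are all hypothetical names chosen for illustration. In the framework the operators would be proposed by learning algorithms and confirmed by the user, whereas here they are simply listed by hand.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Iterable, Tuple

# Hypothetical, simplified ontology: a set of concept names and a set of
# (source, label, target) relations between them.
Relation = Tuple[str, str, str]


@dataclass(frozen=True)
class Ontology:
    concepts: FrozenSet[str] = frozenset()
    relations: FrozenSet[Relation] = frozenset()


# An operator maps one ontology to the next one.
Operator = Callable[[Ontology], Ontology]


def add_concept(name: str) -> Operator:
    """Return an operator that adds a concept (illustrative only)."""
    def apply(ontology: Ontology) -> Ontology:
        return Ontology(ontology.concepts | {name}, ontology.relations)
    return apply


def add_relation(source: str, label: str, target: str) -> Operator:
    """Return an operator that links two existing concepts."""
    def apply(ontology: Ontology) -> Ontology:
        if source not in ontology.concepts or target not in ontology.concepts:
            raise ValueError("both concepts must exist before they are related")
        return Ontology(
            ontology.concepts,
            ontology.relations | {(source, label, target)},
        )
    return apply


def construct(initial: Ontology, operators: Iterable[Operator]) -> Ontology:
    """Apply the operators in order, yielding the final ontology."""
    ontology = initial
    for operator in operators:
        ontology = operator(ontology)
    return ontology


ontology = construct(
    Ontology(),
    [
        add_concept("Animal"),
        add_concept("Dog"),
        add_relation("Dog", "subConceptOf", "Animal"),
    ],
)
```

Keeping each ontology immutable means every intermediate state in the sequence stays available, which is one possible way to support inspecting or undoing a step.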
Operators defined as part of the ontology construction process are implemented using several machine learning algorithms. Clustering, active learning and large-scale classification are used to learn operators for adding concepts and relations. A novel approach for visualizing instances, concepts and ontologies is developed, using a combination of dimensionality reduction techniques. The ability to incorporate additional background data is implemented using a novel feature weighting scheme, and the addition of new instances to the ontology is translated into a standard classification task.
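To make the last point concrete, adding a new instance can be read as choosing the existing concept whose instances it most resembles. The sketch below uses a nearest-centroid classifier over bag-of-words vectors with cosine similarity. This is a stand-in chosen for brevity, not the classifier used in the thesis, and the function names and the toy "Sports" and "Finance" concepts are assumptions made for illustration.

```python
import math
from collections import Counter
from typing import Dict, List


def bag_of_words(text: str) -> Counter:
    """Represent a text instance as lowercase token counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_centroids(concept_instances: Dict[str, List[str]]) -> Dict[str, Counter]:
    """Sum the instance vectors of each concept.

    Cosine similarity ignores vector length, so the sum serves as the
    centroid without dividing by the number of instances.
    """
    centroids = {}
    for concept, texts in concept_instances.items():
        total = Counter()
        for text in texts:
            total.update(bag_of_words(text))
        centroids[concept] = total
    return centroids


def assign_instance(text: str, centroids: Dict[str, Counter]) -> str:
    """Return the concept whose centroid is most similar to the instance."""
    vector = bag_of_words(text)
    return max(centroids, key=lambda concept: cosine(vector, centroids[concept]))


# Hypothetical ontology with two concepts and their known instances.
centroids = build_centroids({
    "Sports": ["football match goal", "tennis match score"],
    "Finance": ["stock market price", "bank interest rate"],
})
assigned = assign_instance("goal scored in the match", centroids)
```

In this toy example the new instance shares the tokens "goal" and "match" only with the "Sports" instances, so it is assigned to that concept.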
We also developed a system that implements the framework together with the proposed machine learning algorithms. The system takes domain data as input and guides the user through the process of constructing the ontology for the given domain. To showcase its capabilities, the system was applied in several use cases in which domain data was provided as a text corpus or a social network.
The system was also evaluated in two user studies, which assessed the user interface and compared the resulting ontologies against manually constructed ones. The results of the user studies show that the system is user-friendly enough to be used by domain experts. Users can construct ontologies that are comparable to manually constructed ontologies, and can do so in less time.