Radon (222Rn, half-life 3.82 days) is a natural radioactive noble gas that originates from the radioactive decay of radium (226Ra) in the Earth's crust. It is a known hazard to humans because of its radioactivity. However, radon can also be used as a versatile tool in geophysical research. In order …
Most machine learning, data mining and statistical methods rely on the assumption that the analyzed data are independent and identically distributed (i.i.d.). More specifically, the individual examples included in the training data are assumed to be drawn independently of each other from the same probability distribution. However, cases where this …
In this thesis, we address the task of polynomial regression, i.e., inducing regression models based on polynomial equations from data. We aim at improving and extending the existing approaches to learning polynomial regression models in several directions. First, we improve the existing methods for addressing the issue of over-fitting and …
In this thesis we address the problem of learning various types of decision trees from time-changing data streams. In particular, we study online machine learning algorithms for learning regression trees, linear model trees, option trees for regression, multi-target model trees, and ensembles of model trees from data streams. These are …
Feature ranking is the machine learning task of inducing an ordering of features in a given dataset according to some notion of relevance. We consider the feature ranking task in the context of supervised learning, where the notion of feature relevance is defined with respect to a target concept. Feature …
Can a model constructed by machine learning or data mining programs be trusted? For example, it is known that a decision tree model can contain less-credible parts caused by pathologies in induction algorithms, noise and missing values in data, or simply because of the complexity of a domain. Such models …
This thesis addresses the task of formalizing and implementing the process of semi-automatic ontology construction. We propose a theoretical framework for formalizing the ontology construction process. The process is described as a sequence of operators applied to the ontology. Several types of common operators are identified and each type is …
Simulation models are a widely used tool for modelling and simulating systems for which it is hard to obtain real data. However, simulation models are usually complex, and it is not easy to induce new knowledge and find relationships and dependencies among different parts (parameters, processes, modules) …
Data analysis with machine learning methods, when applied to large collections of text data, enables us to discover new knowledge. This knowledge, once put together, might describe still-unknown connections among phenomena and thus contribute to the formation of new hypotheses in different fields, including medicine. Also, connectivity and …
The goal of knowledge discovery in databases is to construct models or discover interesting patterns in data. Model construction and pattern discovery are frequently performed by rule learning, as the induced rules are easy for human experts to interpret. The standard classification rule learning task is to induce classification/prediction …