In this thesis, we address the task of polynomial regression, i.e., inducing regression models
based on polynomial equations from data. We aim to improve and extend the existing
approaches to learning polynomial regression models in several directions. First, we improve
the existing methods for addressing over-fitting and for ordering the search space of
candidate polynomial equations. Second, we extend the scope
of existing methods to learning piecewise, multi-target, and classification-via-regression
polynomial models. The central hypothesis of the thesis is that these improvements and
extensions will improve the performance of polynomial
models on regression and classification tasks. We also conjecture that their performance will
be comparable to the performance of models obtained with other state-of-the-art regression
and classification approaches.
To accomplish these aims and test the hypotheses, we start by surveying
existing research on learning regression models, with a focus on the evaluation metrics used
for regression. We then develop new heuristics and refinement operators and implement
them in Ciper, an algorithm for inducing polynomial regression models. The algorithm
is capable of learning piecewise and multi-target polynomial models and polynomial models
for classification via regression. Finally, we perform an empirical evaluation and a comparative
analysis of the performance of polynomial models obtained with Ciper and the performance
of models obtained with other approaches.
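The search over candidate equations, in which refinement operators generate new polynomial structures and a heuristic ranks them, can be sketched roughly as follows. This is a minimal illustration and not the Ciper implementation: the two refinement operators (add a linear term, multiply an existing term by a variable), the BIC-style penalized score, the greedy strategy, and all function names are assumptions made for the example.

```python
import numpy as np

def design_matrix(X, terms):
    """Constant column plus one column per term; a term is a tuple of exponents."""
    cols = [np.ones(X.shape[0])]
    for exps in terms:
        cols.append(np.prod(X ** np.array(exps), axis=1))
    return np.column_stack(cols)

def score(X, y, terms):
    """Assumed BIC-style heuristic (lower is better): fit error plus a size penalty."""
    Phi = design_matrix(X, terms)
    coef, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    mse = np.mean((y - Phi @ coef) ** 2)
    n = len(y)
    return n * np.log(mse + 1e-12) + Phi.shape[1] * np.log(n)

def refinements(terms, n_vars, max_degree):
    """Assumed operators: add a linear term, or multiply one term by a variable."""
    current = set(terms)
    for v in range(n_vars):
        unit = tuple(int(i == v) for i in range(n_vars))
        if unit not in current:
            yield tuple(sorted(current | {unit}))
        for t in terms:
            if sum(t) < max_degree:
                grown = tuple(e + int(i == v) for i, e in enumerate(t))
                if grown not in current:
                    yield tuple(sorted((current - {t}) | {grown}))

def greedy_search(X, y, max_degree=3):
    """Start from the constant model and refine while the score keeps improving."""
    terms = ()
    best = score(X, y, terms)
    while True:
        candidates = set(refinements(terms, X.shape[1], max_degree))
        if not candidates:
            return terms
        cand_score, cand = min((score(X, y, c), c) for c in candidates)
        if cand_score >= best:
            return terms
        best, terms = cand_score, cand

# Noise-free toy data generated from y = 1 + 2 * x0 * x1.
rng = np.random.default_rng(0)
X = rng.uniform(1.0, 2.0, size=(200, 2))
y = 1.0 + 2.0 * X[:, 0] * X[:, 1]
found = greedy_search(X, y)
print(found)
```

On this toy data the search should end with the single product term of the generating equation. The penalty term in the score is what discourages needlessly large equations, which is one simple way to counter over-fitting.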
The results of the empirical evaluation and the comparative analysis show that the newly
developed search heuristics and refinement operators lead to improved performance of the
learned regression models. The performance of models induced with Ciper is comparable
to the performance of models induced with other commonly used regression algorithms.
Also, classification models based on multi-target polynomials have predictive performance
comparable to the performance of models obtained with other classification approaches.
Finally, we also show that piecewise polynomial models of limited degree perform comparably
to polynomial models of higher (unlimited) degrees.
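To make the notion of a piecewise polynomial model concrete, the sketch below fits one low-degree polynomial on each side of a split point and compares it with a single polynomial of higher degree. It is only an illustration under stated assumptions: the split point is supplied by hand (how the pieces are chosen is not described here), the function names are hypothetical, and the toy target was constructed for the example, so the numbers say nothing about the empirical results of the thesis.

```python
import numpy as np

def fit_piecewise(x, y, split, degree):
    """Fit separate polynomials of the given degree left and right of `split`."""
    left = x < split
    return (split,
            np.polyfit(x[left], y[left], degree),
            np.polyfit(x[~left], y[~left], degree))

def predict_piecewise(model, x):
    """Evaluate the polynomial that belongs to the piece each point falls in."""
    split, left_coef, right_coef = model
    return np.where(x < split,
                    np.polyval(left_coef, x),
                    np.polyval(right_coef, x))

def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Toy target with a kink at zero.
x = np.linspace(-1.0, 1.0, 201)
y = np.abs(x)

piecewise = fit_piecewise(x, y, split=0.0, degree=1)   # two linear pieces
global_coef = np.polyfit(x, y, 6)                      # one degree-6 polynomial

err_piecewise = rmse(y, predict_piecewise(piecewise, x))
err_global = rmse(y, np.polyval(global_coef, x))
print(err_piecewise, err_global)
```

Two linear pieces reproduce this target essentially exactly, while the single degree-6 polynomial can only approximate the kink.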
The contribution of the thesis to the field of machine learning is a new algorithm
for inducing regression models based on polynomial equations. The algorithm is
carefully designed by analyzing and comparing the performance of different methods for
generating and evaluating candidate equations. The algorithm also extends the scope of
polynomial regression to piecewise and multi-target regression models that can also serve
well for solving classification tasks following the classification via regression approach.
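The classification via regression approach with a multi-target polynomial model can be illustrated as follows: each class becomes one 0/1 regression target, a polynomial is fitted to all targets at once, and the class with the highest predicted value is returned. This is a minimal sketch, not the thesis implementation: it uses a full polynomial of fixed degree fitted by ordinary least squares, and the function names and toy data are assumptions.

```python
import numpy as np
from itertools import combinations_with_replacement

def polynomial_features(X, degree):
    """All monomials of the input variables up to `degree`, constant included."""
    n_samples, n_vars = X.shape
    columns = [np.ones(n_samples)]
    for d in range(1, degree + 1):
        for idx in combinations_with_replacement(range(n_vars), d):
            columns.append(np.prod(X[:, list(idx)], axis=1))
    return np.column_stack(columns)

def fit_classifier(X, y, degree=2):
    """Fit one polynomial per class indicator (a multi-target regression)."""
    classes = np.unique(y)
    Y = (y[:, None] == classes[None, :]).astype(float)  # one 0/1 target per class
    Phi = polynomial_features(X, degree)
    coef, *_ = np.linalg.lstsq(Phi, Y, rcond=None)
    return classes, coef

def predict(X, classes, coef, degree=2):
    """Predict the class whose polynomial gives the largest value."""
    scores = polynomial_features(X, degree) @ coef
    return classes[np.argmax(scores, axis=1)]

# Toy XOR-like problem: the class depends on the sign of x0 * x1.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(400, 2))
y = (X[:, 0] * X[:, 1] > 0).astype(int)

classes, coef = fit_classifier(X, y, degree=2)
accuracy = float(np.mean(predict(X, classes, coef, degree=2) == y))
print(accuracy)
```

The toy problem is not linearly separable, so the degree-2 product term is what makes the regression targets usable for classification here.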