
Jožef Stefan
Postgraduate School

Jamova 39
SI-1000 Ljubljana

Phone: +386 1 477 31 00
Fax: +386 1 477 31 10


Course Description

Knowledge Discovery in Environmental Data


Ecotechnologies, third-level study programme


prof. dr. Sašo Džeroski


The course introduces students to the field of knowledge discovery from environmental data. Students will gain basic knowledge of data analysis with machine learning methods and an overview of some widely used methods. They will become familiar with case studies that used machine learning to analyse environmental data. Through exercises, they will get acquainted with some machine learning software packages.

The students will acquire an understanding of the process of knowledge discovery and will be able to apply some machine learning methods to environmental data.


1. Introduction to knowledge discovery and machine learning methods (learning decision and regression trees; learning rules; Bayesian classification; nearest-neighbor method; equation discovery)

2. Classes of environmental problems for which machine learning can be used (population dynamics and habitat modeling)

3. Case studies of using machine learning to analyse environmental data (aquatic ecosystems, agriculture, forestry, environmental epidemiology and toxicology, forecasting natural disasters; for example, modelling algal growth in the Lagoon of Venice and Lake Bled, gene flow for GMOs, bear habitat, predicting biodegradability, predicting earthquakes, fires, and floods)

4. Exercises in applying selected machine learning methods to environmental data (demonstration/exercises with machine learning software packages)

Course literature:

A. Fielding, editor. Machine Learning Methods for Ecological Applications. Kluwer, Boston, MA, 1999.

S. Dzeroski. Data mining in a nutshell. In S. Dzeroski, N. Lavrac, editors, Relational Data Mining, pages 3-27. Springer, Berlin, 2001.

S. Dzeroski. [KDD Applications in] Environmental sciences. In W. Klösgen and J. M. Zytkow, editors, Handbook of Data Mining and Knowledge Discovery, pages 817-830. Oxford University Press, 2002.

Significant publications and references:

• S. Dzeroski and L. Todorovski, eds., Computational Discovery of Scientific Knowledge. Springer, Berlin, 2007.

• Todorovski, L., and Dzeroski, S. Integrating Domain Knowledge in Equation Discovery. In S. Dzeroski and L. Todorovski, eds., Computational Discovery of Scientific Knowledge, pages 69-97. Springer, Berlin, 2007.

• N. Atanasova, F. Recknagel, Lj. Todorovski, S. Dzeroski, and B. Kompare. Computational Assemblage of Ordinary Differential Equations for Chlorophyll-a Using a Lake Process Equation Library and Measured Data of Lake Kasumigaura. In F. Recknagel, editor, Ecological informatics : scope, techniques, and applications, 2nd ed., pages 409-427. Springer, Berlin, New York, 2006.

• Todorovski, L., and Dzeroski, S. Integrating knowledge-driven and data-driven approaches to modeling. Ecological Modelling, 194 : 3-13, 2006.

• D. Demsar, S. Dzeroski, T. Larsen, J. Struyf, J. Axelsen, M. Bruns-Pedersen, and P. Henning Krogh. Using multi-objective classification to model communities of soil microarthropods. Ecological Modelling, vol. 191 : 131-143, 2006.

• M. Jurc, M. Perko, S. Dzeroski, D. Demsar, and B. Hrasovec. Spruce bark beetles (Ips typographus, Pityogenes chalcographus, Col.: Scolytidae) in the Dinaric mountain forests of Slovenia : monitoring and modeling. Ecological Modelling, vol. 194 : 219-226, 2006.

• A. Kobler, S. Dzeroski, and I. Keramitsoglou. Habitat mapping using machine learning-extended kernel-based reclassification of an Ikonos satellite image. Ecological Modelling, vol. 191 (1) : 83-95, 2006.

• N. Atanasova, Lj. Todorovski, S. Dzeroski, and B. Kompare. Constructing a library of domain knowledge for automated modelling of aquatic ecosystems. Ecological Modelling, vol. 194 : 14-36, 2006.

• N. Atanasova, Lj. Todorovski, S. Dzeroski, S. Remec-Rekar, F. Recknagel, and B. Kompare. Automated modelling of a food web in lake Bled using measured data and a library of domain knowledge. Ecological Modelling, vol. 194 : 37-48, 2006.

• S. Dzeroski and L. Todorovski. Learning population dynamics models from data and domain knowledge. Ecological Modelling, 170: 129-140, 2003.

• S. Dzeroski and D. Drumm. Using regression trees to identify the habitat preference of the sea cucumber (Holothuria leucospilota) on Rarotonga, Cook Islands. Ecological Modelling, 170: 219-226, 2003.


Seminar and oral exam.

Students' obligations:

Seminar and oral exam.