Jožef Stefan
Postgraduate School

Jamova 39
SI-1000 Ljubljana

Phone: +386 1 477 31 00
Fax: +386 1 477 31 10


Course Description

Computational Scientific Discovery and e-Science


Information and Communication Technologies, second-level study programme


prof. dr. Sašo Džeroski


The goal of the course is to familiarize the student with the field of computational scientific discovery, i.e., with computational approaches to automating or supporting crucial aspects of scientific discovery.
Basic concepts will be covered first, including the scientific method and the elements of scientific behavior, which include scientific knowledge structures, as well as scientific activities that generate and manipulate these structures. The history of the development of the field will be presented and its relation to recent developments such as mining scientific data will be discussed. A range of techniques will be covered, as well as their applications in environmental sciences (ecology) and life sciences (bioinformatics).

The students will acquire a basic understanding of scientific knowledge structures and activities, as well as computer methods to support their automation.


1) The scientific method
Scientific knowledge structures, scientific activities/processes.

2) Computational scientific discovery
Introduction, history of the field's development, basic methods, e.g., equation discovery, discovering networks, discovering pathways, inductive process modeling.

3) Mining scientific data
Specific requirements for mining scientific data vs. data mining in business, finance, and retail.

4) Applications in Environmental Sciences
Habitat modeling, modeling population dynamics.

5) Applications in Life Sciences
Applications in bioinformatics, biomedicine, and systems biology, e.g., predicting gene function, discovering metabolic and regulation pathways.

6) Introduction to e-Science
The Grid, workflows, semantic web/Grid, scientific ontologies.

Course literature:

• Džeroski, S., and Todorovski, L. (eds.). Computational Discovery of Scientific Knowledge. Springer, Berlin, 2007.
• Shrager, J., and Langley, P. (eds.). Computational Models of Scientific Discovery and Theory Formation. Morgan Kaufmann, San Mateo, CA, 1990.
• Langley, P., Simon, H.A., Bradshaw, G.L., and Zytkow, J. Scientific Discovery: Computational Explorations of the Creative Processes. MIT Press, Cambridge, MA, 1987.

Significant publications and references:

• E. Ikonomovska, J. Gama, and S. Džeroski. Online tree-based ensembles and option trees for regression on evolving data streams. Neurocomputing 150, 458-470, 2015.
• P. Panov, L. Soldatova, and S. Džeroski. Ontology of core data mining entities. Data Mining and Knowledge Discovery 28 (5-6), 1222-1265, 2014.
• D. Kocev, C. Vens, J. Struyf, and S. Džeroski. Tree ensembles for predicting structured outputs. Pattern Recognition 46 (3), 817-833, 2013.
• D. Čerepnalkoski, K. Taškova, L. Todorovski, N. Atanasova, and S. Džeroski. The influence of parameter fitting methods on model structure selection in automated modeling of aquatic ecosystems. Ecological Modelling 245, 136-165, 2012.
• G. Madjarov, D. Kocev, D. Gjorgjevikj, and S. Džeroski. An extensive experimental comparison of methods for multi-label learning. Pattern Recognition 45 (9), 3084-3104, 2012.


Seminar and oral exam (100%)

Student obligations:

Seminar and oral exam