Doctoral dissertation

Explainable machine learning techniques for applications in life sciences

Author(s): Martin Marzidovšek (Author), Vid Podpečan (Supervisor), Patricija Mozetič (Co-Supervisor)

Thesis defense date: 24.10.2024

Organization: MPŠ - Mednarodna podiplomska šola Jožefa Stefana

PID: 20.500.12556/ReVIS-13690

Abstract

As ecological, agricultural, and biological disciplines face mounting challenges like biodiversity
loss, food chain disruption, and climate change, leveraging machine learning (ML)
to process complex and heterogeneous data becomes increasingly vital. This dissertation
explores the potential of ML in combination with explainability approaches for enhancing
research in life sciences, specifically focusing on ecology. The research aims to advance
ML applications in ecosystem monitoring, ecological modelling, and predictive analytics,
emphasising the necessity for explainable artificial intelligence (xAI) to balance predictive
performance with model transparency.
The dissertation underscores the potential for xAI and interpretable ML models. While
advanced ML models excel in handling non-linear relationships and complex systems, their
"black box" nature often limits their scientific utility. This work advocates for the integration
of xAI methods to elucidate the inner workings of these models, thereby enhancing
their applicability and acceptance in scientific research.
The contributions of the dissertation include:
• Modelling diarrhetic shellfish poisoning events: An xAI approach predicting the toxicity
of mussels due to harmful algal blooms in the Adriatic Sea. Utilising a 28-year
dataset, the study highlights the importance of data pre-processing and demonstrates
that Random Forest models, coupled with explainability methods, provide critical
insights into marine ecosystem dynamics and can serve as cost-effective early warning
systems.
• Unsupervised ML in Agriculture: Cluster analysis was applied to identify barriers
and incentives for the use of decision support systems (DSS) in integrated pest management.
The study revealed distinct groups among farmers and advisors, identifying
common barriers and incentives, which can inform future research to enhance DSS
adoption.
• Phytoplankton Identification and Quantification: A computer vision system that automates
the identification, size estimation, and biovolume calculation of phytoplankton
species. Using transfer learning, the system processes samples more efficiently
than manual methods, providing a more accurate assessment of marine ecosystems.
Visual explanations are applied to increase the trustworthiness of, and confidence in,
the ML-based solution.
• BEFANA Software: An open-source software tool that facilitates network analysis
and ML applications in ecological networks. It enables ecologists to visualise networks
interactively, quantify network topologies, test hypotheses, and embed experimental
data, thereby enriching the analysis and modelling of ecosystem dynamics.
The dissertation contributes to improving ML methodologies in life sciences research.
The principles of open, reproducible science are upheld through open access to software
code, data, and publications associated with this research.
