Research Projects

I advise MSc theses at the University of Helsinki (primarily at the Department of Digital Humanities; other departments are also possible) and research projects, previously through Aalto CS-E4875 and CS-E4000. Most projects concern machine learning and deep learning techniques, natural language processing, and their applications in healthcare and social good. Deep learning methods include recent neural architectures and learning paradigms such as multitask learning and federated learning. Projects focus on multilingual NLP (e.g., multilingual instruction tuning of large language models and machine translation), healthcare (e.g., clinical text analytics, electronic health records, and AI-assisted diagnosis), and social good (e.g., mental disorder detection from social media, detection of toxic or abusive text and cyberbullying, and sentiment analysis).

Available Topics

Topic 1: AI for Social Good

Background: AI for social good (AI4SG) is a research field that uses AI to tackle important social, environmental, and public health challenges. This topic concentrates on the social and public health challenges. Specifically, it develops deep learning for proactive social care, such as early detection of mental illness (e.g., anxiety, depression, and suicidal ideation) expressed in social media, which can raise early alerts for effective prevention. Abusive text detection is another possible task under this topic: it classifies social text that may contain toxic content such as hate speech, cyberbullying, aggressiveness, and offensiveness.

Prerequisites

The topics share similar prerequisites, including but not limited to:

  1. Good knowledge of deep learning;
  2. Programming skills with deep learning frameworks (e.g., PyTorch);
  3. Experience with LaTeX typesetting and Linux servers.

Past Projects

MSc Thesis Advising

  • Ya Gao (MSc, Aalto University, now PhD candidate at Aalto University)
    Joint entity and relation extraction via contrastive learning on knowledge-augmented graph embeddings, 2023.

  • Tuulia Denti (MSc, Aalto University, jointly with HUS, now Data Analyst at HUS)
    Natural Language Processing with Topic Models for Clinical Texts of Prostate Cancer Patients, 2022.

  • Wei Sun (MSc, Aalto University, jointly with HUS, now PhD candidate at KU Leuven, Belgium)
    Extracting Medical Entities from Radiology Reports with Ontology-based Distant Supervision, 2022.

BSc Thesis Supervision

Previous Projects

  • Risk adjustment for health plan payment (2019 Winter)
  • Deep learning for cyberbullying detection (2020 Summer)
  • Pretrained language models for diagnosis code prediction (2020 Summer)
  • Federated learning (2020 Fall)
  • Depression detection from social content (2021 Spring)
  • Biomedical text classification (2022 Spring)

Published Project Reports

The following project reports were published in scientific venues after revision of the original reports.