Multimodality research is an emerging discipline that studies the way humans communicate using intentional combinations of expressive resources. Recently, there have been increased calls for supporting empirical research on multimodal communication by creating large, systematically annotated multimodal corpora. However, creating such corpora is time-consuming and expensive, which is why multimodal corpora remain small.
In this research project, we explore the use of paid crowdsourcing for creating multimodal corpora. Crowdsourcing is a technique that involves breaking complex tasks into piecemeal work, which is then distributed to a large pool of workers on online platforms. Crowdsourcing is frequently used for creating datasets in artificial intelligence research, but its use raises several ethical issues.
This project is funded by a three-year grant from the University of Helsinki Research Funds. The project has also been supported by the Helsinki Institute for Social Sciences and Humanities and through a research grant from Toloka.
Develop crowdsourcing tasks that are motivated by theories of multimodal communication
Promote ethically responsible use of crowdsourcing in multimodality research and digital humanities
Create large multimodal corpora with rich and reliable annotations using crowdsourcing
Develop a tool and a framework for ethically responsible crowdsourcing
Illustrate how crowdsourced multimodal corpora can be used to support empirical research on multimodality
Tuomo Hiippala (principal investigator)
Rosa Suviranta (doctoral researcher)
Helmiina Hotti (research assistant, 2022)
Jonas Haverinen (research assistant, 2021)
For an up-to-date list of publications and other research outputs, see the University of Helsinki research portal.
Haverinen, Jonas (2022) Written labels in elementary school science diagrams: linguistic patterns and discourse relations.
Hotti, Helmiina (2023) TBD.