I am a tenure track Associate Professor in English Language and Digital Humanities (2023–) in the Department of Languages at the University of Helsinki, where I have worked since 2018. I also hold a Title of Docent in Multimodality Research and Digital Methods at the University of Turku.
I am a member of the Digital Geography Lab, where I worked as a post-doctoral researcher for the year 2017, and affiliated with the Helsinki Institute of Sustainability Science and Helsinki Institute of Urban and Regional Studies.
In 2015–2016, I worked as a post-doctoral researcher at the Centre for Applied Language Studies at the University of Jyväskylä.
I hold a PhD in English philology from the University of Helsinki (2014), supervised by Eija Ventola.
I do research on multimodal communication, that is, how multiple forms of expression interact and co-operate with each other in different communicative situations.
To exemplify, spoken language, gestures, posture and gaze are constantly coordinated in face-to-face interaction, while magazines, newspapers, websites and other page-based texts organize written language, photographs, illustrations, diagrams, information graphics and other modes of expression into coherent layouts.
In both cases, communication builds on appropriate combinations of different forms of expression. Theories of multimodality attempt to explain how such appropriate combinations are formed and how they become understandable in context.
I am particularly interested in how theories of multimodality can inform research on artificial intelligence, and conversely, how artificial intelligence can support empirical research on multimodal communication. In addition, I am also interested in applications of natural language processing and computer vision in the humanities and beyond.
You can find out more about my current projects, doctoral and post-doctoral researchers working under my supervision, recent publications and teaching resources below.
For a comprehensive and up-to-date list of my publications, please see my profile on the University of Helsinki research portal.
If you have questions or comments about my work, feel free to reach out to me! The same applies to prospective MA students (University of Helsinki) wishing to write their thesis on multimodality or a related topic.
This project studies microtask crowdsourcing as a part of the academic research infrastructure. In this context, microtask crowdsourcing refers to distributing small tasks that require human intelligence to masses of non-expert workers on online platforms.
We examine the role of crowdsourcing, focusing especially on the role of platforms, and seek to rethink the role of crowdsourcing as a form of participatory research governed by fair and transparent algorithms. The project is funded by a four-year grant from Kone Foundation.
For more information, see the project page.
This project develops crowdsourcing methods for empirical multimodality research. Crowdsourcing is a technique that involves breaking complex tasks into piecemeal work and distributing this effort to a large pool of workers on online platforms.
We develop annotation tasks that are motivated by theories of multimodality, which are then distributed to non-expert workers. The resulting descriptions are then converted into systematically annotated corpora using statistical and computational methods.
This project is funded by a three-year grant from the University of Helsinki Research Funds, which is used to support the work of one doctoral researcher.
For more information, see the project page.
This project studies which languages are used in the Helsinki Metropolitan Area and the users of different languages move around the region by combining official register data and social media data.
The project, which is conducted in collaboration with other members of the Digital Geography Lab, combines methods from natural language processing and geoinformatics with measurements developed in ecology and information sciences. The quantitative and spatial analyses are complemented by ethnographic approaches.
The research project is funded by a three-year project grant from the Emil Aaltonen Foundation, which is used to support the work of one doctoral researcher and one post-doctoral researcher.
For more information, see the project page.
We developed a new annotation schema for describing how diagrams in primary school science textbooks combine multiple modes of expression and applied the schema to 1000 diagrams in an existing dataset of diagrams.
The new corpus, AI2D-RST, is based on AI2D Diagrams dataset, which has been developed at the Allen Institute for Artificial Intelligence for tasks such as automatic diagram understanding and visual question answering.
AI2D-RST is intended to support research on the computational processing of diagrams and their multimodal structure. The discourse relations in diagrams are described using Rhetorical Structure Theory.
My long-term interest is to move research on multimodality towards a more empirically-responsible direction by establishing a closer bond between theories and data. As a part of this effort, I have studied how computational techniques from the fields of computer vision and natural language processing could be used to describe multimodal communication at scale.
This work has been supported by City of Helsinki Urban Facts (Helsingin kaupungin tietokeskus) and the Finnish Cultural Foundation (Suomen Kulttuurirahasto).
I currently supervise the following doctoral and post-doctoral researchers:
Laura Savolainen (post-doctoral researcher in CROWDINFRA, 6/2023–)
Rosa Suviranta (doctoral researcher in CROWDSRC, 8/2021–)
Tuomas Väisänen, co-supervised with Tuuli Toivonen (main supervisor) and Olle Järv (doctoral researcher in MAPHEL, 8/2019–)
Tatu Leppämäki, co-supervised with Tuuli Toivonen (main supervisor) and Johanna Eklund
My former supervisees are listed below:
Hanna-Mari Pienimäki (post-doctoral researcher in MAPHEL, 4/2021–3/2023)
Prospective doctoral researchers: I do not reply to mass e-mails about doctoral positions. If you wish to pursue a PhD under my supervision, you need to first convince me that I am the right supervisor for you. You also need to tell me how you plan to finance your doctoral research. Note that I do not currently have the possibility to support PhD students financially. Any funded positions will be announced on this website.
Prospective post-doctoral researchers: If you are interested in pursuing a post-doctoral research project under my supervision, please get in touch. Note that I do not currently have funding for hiring post-doctoral researchers, but I am happy to support applications for post-doctoral fellowships e.g. from Marie Skłodowska-Curie Actions, if our research interests are aligned.
This article demonstrates how unsupervised machine learning algorithms can be used to discover patterns that characterise particular types of diagrams in the AI2D-RST corpus.
In addition, the article proposes a novel method for quantifying information about the spatial organisation of different expressive resources.
This system demonstration paper introduces a tool for fair and reproducible microtask crowdsourcing on the Toloka platform. The tool allows building complex crowdsourcing pipelines using YAML configuration files.
In addition to increasing transparency by using explicit configuration files that may be shared in academic publications, the tool incorporates other mechanisms that mitigate ethical issues associated with crowdsourcing.
The development of the tool has been supported by the Helsinki Institute for Social Sciences and Humanities.
This article, co-authored with John A. Bateman (Bremen University), discusses diagrams from a multimodal perspective, drawing on the AI2D-RST corpus for data.
We propose that diagrams may be treated as a full-blown mode of communication, which allows integrating diverse expressive resources such as natural language and various forms of depiction, and provides mechanisms that support their interpretation.
The article won the award for best paper at the 13th International Conference on the Theory and Application of Diagrams.
In this article, we study the diversity of languages in the Helsinki Metropolitan Area by combining social media and register data. Using measurements of diversity developed in ecology and information sciences, we show that the two data sources provide radically different views into the diversity of languages. Our analyses also demonstrate that linguistic diversity is a spatio-temporal phenomenon.
By combining statistical and computational methods, we show that particular diagram types are characterised by specific layout patterns. Our analyses also reveal the diversity of visual expressive resources used in the diagrams.
This article discusses the prospects and challenges of combining multimodality theory with distant viewing, a recent framework proposed in the field of digital humanities, advocates the use of computational methods to enable large-scale analysis of visual and multimodal materials.
I argue that multimodality theory is well-positioned to support this effort by providing descriptive schemas that impose structure on the materials under analysis.
In this article, we report on a study of human–nature interactions in Finnish national parks, which we examined by applying computer vision to Flickr photographs. We propose an approach named semantic clustering for detecting activities, which combines both computer vision and clustering algorithms. We then compared the activities among domestic and foreign visitors.
This publication introduces a multimodal corpus that contains 1000 primary school science diagrams. The corpus builds on the crowdsourced annotations available in the Allen Institute for Artificial Intelligence Diagrams Dataset (AI2D) to add multiple layers of expert annotations for diagram type, logical structure and relations between elements, as proposed in this publication.
We studied the language choices of Twitter users in Finland by detecting languages in their tweet histories and estimating their home locations at the levels of regions and municipalities based on geotagged tweets.
The analyses revealed a multilingual platform: the vast majority of Twitter users in Finland use more than one language to communicate on the platform.
This chapter, published open-access in a volume edited by Helen Kennedy and Martin Engebretsen, presents a multimodal perspective on data visualization, building on the approach presented in our recent textbook on multimodality.
This article provides a thorough overview of how social media data can be used for various purposes in the field of conservation science.
My contribution involved writing the sections on computer vision and natural language processing, and illustrating their use in examples.
This article proposes a set of techniques that combine computer vision, natural language processing and machine learning for tracking illegal wildlife trade on social media platform.
This chapter, published in a handbook edited by Geoff Thompson, Wendy L. Bowcher, Lise Fontaine and David Schöntal, considers the relationship between the currently dominant statistical paradigm in natural language processing and systemic-functional linguistics, while also touching upon issues of multimodality.
This article, co-authored with Anna Hausmann, Henrikki Tenkanen and Tuuli Toivonen, examines the distribution of languages in Instagram posts geotagged to the Senate Square, Helsinki, over a period of four and half years.
The results showed that the languages and their respective proportions change over time, but English remains the dominant language, as the language gains users from many other linguistic groups. Finns, for example, post in English half the time.
This book is intended to provide a foundational introduction to multimodality, that is, how multiple modes of communication work together in different texts and situations.
We adopt a problem-based approach, showing how the foundation provided by the textbook can be used to derive appropriate methods for every situation in which multimodality needs to be of concern.
We illustrate the application of these methods using a large number of case studies dealing with different multimodal phenomena.
This conference paper, co-authored with research assistant Serafina Orekhova, outlines our plan for improving the AI2D Diagrams dataset as a part of the project.
In addition to sketching out some of the rhetorical relations necessary for describing diagrams, the paper reports on a preliminary study on inter-annotator agreement in applying these relations to diagrams in the AI2D dataset.
We edited a special section for Discourse, Context & Media with Chiao-I Tseng (Bremen University), which brings together two concepts, media and genre.
These concepts are frequently invoked across a wide range of disciplines broadly concerned with discourse (and multimodality), but rarely brought into productive dialogue with each other.
The special section contains invited contributions from leading researchers, whose articles offer different perspectives to these key concepts.
This conference paper presents a system for detecting military vehicles in social media images, which I developed with non-governmental (i.e. citizen journalists) and intergovernmental organizations in mind, as such organizations may have limited resources available for monitoring social media in conflict zones. The system draws on recent advances in applying deep neural networks to computer vision tasks, while also making extensive use of openly available libraries, models and data.
This review article provides an overview of the research conducted within the Genre and Multimodality framework, which has been used to describe documents and other multimodal artefacts over the last 15 years. In addition to reviewing the previous work, the article introduces the central theoretical concepts of the framework – medium, mode and genre – and their application to the study of multimodality.
In this article, which grew out of the additional research conducted for my book, I examined how digital longform journalism combines written language, news photography, short videos and other modes of communication using a corpus of 12 longform articles published between 2012 and 2013.
This article discusses how different specialists contribute to document design, presenting a case study focusing on the production of Finnair's corporate annual reports.
By interviewing a project manager and a graphic designer and contrasting the findings from the interview with a multimodal analysis of the annual report, I attempt to trace how their contributions shape the design of the document.
Tämä datankuvausartikkeli esittelee väitöskirjaani varten kootun multimodaalisen korpuksen, joka sisältää 58 aukeamaa Helsingin kaupungin vuosina 1967–2008 julkaisemista englanninkielisistä matkailuesitteistä.
This article describes the multimodal corpus that I compiled for my doctoral dissertation. The corpus contains 58 double-pages from English-language tourist brochures published by the city of Helsinki between 1967 and 2008.
This conference paper presents the annotator I've developed for automating the description of multimodal documents using the Genre and Multimodality model. The annotator leverages open source libraries such as OpenCV, NLTK and Tesseract for generating the annotation. The work on the annotator continues: feel free to get in touch if you wish to contribute to its development.
Learning to read academic texts is a key skill required by any student or researcher. In this chapter, I aimed to show how multimodal analysis can be used to reveal different types of structures typically found in academic texts. The chapter appeared in a volume edited by Arlene Archer (University of Cape Town) and Esther Breuer (University of Cologne).
In this article, I examined how bilingual documents use layout to signal the reader that their contents are semantically equivalent. Recently, I've been interested in computer vision and machine learning: I applied some of these techniques to sketch a method for studying layout symmetry in large data sets. The article was published in a volume edited by Janina Wildfeuer (Bremen University).
This book, which is largely based on my doctoral dissertation, develops a framework for studying how language, images and other modes of communication work together in page-based documents. In addition to revising and rewriting each chapter, two new chapters have been added, which focus on the page as a unit of analysis in a multimodal document and show how the framework can be applied to digital media.
In addition to conducting an extensive, empirical study of the Helsinki tourist brochures published between 1967 and 2008, I examine how digital longform journalism, exemplified by publications such as The New York Times' "Snow Fall", combines photography, short videos, cinematic transitions and written language.
Drawing on the research conducted for my doctoral dissertation, I also contributed a chapter to a handbook edited by Sigrid Norris (Auckland University of Technology) and Carmen Daniela Maier (Aarhus University), which outlines how the concept of genre may be used to describe and compare different documents.
The central argument of the chapter is that in order for the concept of genre to be useful, the concept needs to be tied to multimodal structures identified in the document.
The chapter also shows how the visualisation tools developed for my dissertation can reveal different types of structure commonly found in multimodal documents.
In my doctoral dissertation, I developed a framework for studying how language, image and other communicative resources work together in page-based documents. To test the framework, I collected a set of tourist brochures published between 1967 and 2008, which advertised the city of Helsinki.
I compiled the brochures into a multimodal corpus, describing their content, layout, rhetorical organisation and appearance using the Genre and Multimodality model.
I then investigated the corpus for patterns that characterised the tourist brochures as a document genre. I also explored how these patterns have changed over time as a result of new production technologies.
In 2011, I gave a two-minute lightning talk about my research at the InterFace 2011 digital humanities conference. On the basis of my presentation, I was invited to contribute to a special section of Literary & Linguistic Computing. This paper develops some ideas about layout and how it guides the interpretation of its contents. I later explored these issues in greater detail in my doctoral dissertation.
This article emerged from a side project during my doctoral research. After reading several articles on visual perception, while simultaneously annotating a multimodal corpus containing a rich description of document structure, I wanted to find a way to interface the corpus with an eye-tracker. The article proposes a way of doing so and outlines some research questions that could be investigated using such a combination.
My first publication ever, a chapter in a volume edited by Wendy Bowcher (Sun Yat-Sen University). The chapter discusses localisation, that is, how documents are translated into another language and adapted to the target culture. I argue that the process of localisation must pay equal attention to visual communication and multimodality.
Lyhyt johdanto tekstilingvistiikkaan kevään 2015 luentokurssilta.
Aineisto, joka esittelee tekstilingvistiikan keskeisiä käsitteitä, on lisensoitu Creative Commons Nimeä - Ei Kaupallinen 4.0 Kansainvälinen -lisenssillä.
Voit siis vapaasti ladata ja jakaa aineistoa, sekä muokata sitä tarpeen mukaan, kunhan mainitset aineiston alkuperän.