Dr. Tuomo Hiippala


I am Assistant Professor of English Language and Digital Humanities (tenure track) in the Department of Languages at the University of Helsinki, where I have worked since 2018.

I am also a member of the Digital Geography Lab, where I worked as a post-doctoral researcher for the year 2017, and affiliated with the Helsinki Institute of Sustainability Science and Helsinki In­sti­tute of Urban and Regional Stud­ies. I am a member of Young Academy Finland (2018–2022).

In 2015–2016, I worked as a post-doctoral researcher at the Centre for Applied Language Studies at the University of Jyväskylä.

I hold a PhD in English philology from the University of Helsinki (2014), supervised by Eija Ventola.

Photo by Veikko Somerpuro


I do research on multimodality, that is, how multiple modes of expression interact and co-operate with each other in different communicative situations.

To exemplify, spoken language, gestures, posture and gaze are constantly coordinated in face-to-face interaction, while magazines, newspapers, websites and other page-based texts organize written language, photographs, illustrations, diagrams, information graphics and other modes of expression into coherent layouts.

In both cases, communication builds on appropriate combinations of different modes of expression. Theories of multimodality attempt to explain how such appropriate combinations are formed and how they become understandable in context.

I am particularly interested in how theories of multimodality can inform research on artificial intelligence, and conversely, how artificial intelligence can support empirical research on multimodality. In addition, I am also interested in applications of natural language processing and computer vision in the humanities and beyond.

You can find out more about my current research, recent publications and teaching resources below.

For an up-to-date list of my publications, please see my profile on the University of Helsinki research portal.

Get in touch

If you have questions or comments about my work, feel free to reach out to me! The same applies to prospective MA students (University of Helsinki) wishing to write their thesis on multimodality or graduates interested in doing a PhD under my supervision.

Note that I do not currently have the possibility to offer financial support for PhD students. Any funded positions will be announced on this website.

You can reach me via e-mail at tuomo.hiippala@helsinki.fi and find me on GitHub, ResearchGate and Twitter. My ORCID is 0000-0002-8504-9422.


Mapping the linguistic landscape of the Helsinki Metropolitan Area


A map showing the number of unique languages per a 250m grid cell in the Helsinki Metropolitan Area.

This project seeks to describe the languages used in the Helsinki Metropolitan Area and how the language users move about using official register and social media data.

The project, which is conducted in collaboration with other members of the Digital Geography Lab, combines methods from natural language processing and geoinformatics with measurements developed in ecology and information sciences. The results of quantitative analyses are validated using ethnographic methods.

The research project is funded by a three-year project grant from the Emil Aaltonen Foundation, which is used to support the work of one doctoral researcher and one post-doctoral researcher.

AI2D-RST: A multimodal corpus of 1000 primary school science diagrams


An example of a diagram from the AI2D dataset

We developed a new annotation schema for describing how diagrams in primary school science textbooks combine multiple modes of expression and applied the schema to 1000 diagrams in an existing dataset of diagrams.

The new dataset, AI2D-RST, is based on AI2D Diagrams, which has been developed at the Allen Institute for Artificial Intelligence for tasks such as automatic diagram understanding and visual question answering.

AI2D-RST is intended to support research on the computational processing of diagrams and their multimodal structure. The discourse relations in diagrams are described using Rhetorical Structure Theory.

Empirical methods for multimodality research


An example of showing the application of SIFT, a computer vision algorithm, to an image of the Lutheran Cathedral in Helsinki.

My long-term interest is to move research on multimodality towards a more empirically-responsible direction by establishing a closer bond between theories and data. As a part of this effort, I have studied how computational techniques from the fields of computer vision and natural language processing could be used to describe multimodal communication at scale.

Some publications emerging from this work can be found here, here and here.

This work has been supported by City of Helsinki Urban Facts (Helsingin kaupungin tietokeskus) and the Finnish Cultural Foundation (Suomen Kulttuurirahasto).


A multimodal perspective on data visualization

Data Visualization in Society / Amsterdam University Press / 2020 / DOI

This chapter, published open-access in a volume edited by Helen Kennedy and Martin Engebretsen, presents a multimodal perspective on data visualization, building on the approach presented in our recent textbook on multimodality.

Social media data for conservation science: a methodological overview

Co-authored with Tuuli Toivonen, Vuokko Heikinheimo, Christoph Fink, Anna Hausmann, Olle Järv, Henrikki Tenkanen and Enrico Di Minin / Biological Conservation / 2019 / DOI GITHUB

A photograph of elephants crossing the road in a national park in Africa, which have been segmented from the image by a computer vision algorithm.

This article provides a thorough overview of how social media data can be used for various purposes in the field of conservation science.

My contribution involved writing the sections on computer vision and natural language processing, and illustrating their use in examples.

A framework for investigating illegal wildlife trade on social media with machine learning

Co-authored with Enrico Di Minin, Christoph Fink and Henrikki Tenkanen / Conservation Biology / 2019 / DOI

This article proposes a set of techniques that combine computer vision, natural language processing and machine learning for tracking illegal wildlife trade on social media platform.

This article was preceded by a call to action published in Nature Ecology and Evolution. The HELICS Lab continues this work.

Systemic-Functional Linguistics and Computation: New Directions, New Challenges

Co-authored with John Bateman, Daniel McDonald, Daniel Couto-Vale and Eugeniu Costetchi / Cambridge Handbook of Systemic Functional Linguistics / Cambridge University Press / 2019 / PREPRINT DOI

This chapter, published in a handbook edited by Geoff Thompson, Wendy L. Bowcher, Lise Fontaine and David Schöntal, considers the relationship between the currently dominant statistical paradigm in natural language processing and systemic-functional linguistics, while also touching upon issues of multimodality.

Exploring the linguistic landscape of geotagged social media content in urban environments

Co-authored with Anna Hausmann, Henrikki Tenkanen and Tuuli Toivonen / Digital Scholarship in the Humanities / 2019 / PREPRINT DOI GITHUB

This article, co-authored with Anna Hausmann, Henrikki Tenkanen and Tuuli Toivonen, examines the distribution of languages in Instagram posts geotagged to the Senate Square, Helsinki, over a period of four and half years.

The results showed that the languages and their respective proportions change over time, but English remains the dominant language, as the language gains users from many other linguistic groups. Finns, for example, post in English half the time.

Multimodality: Foundations, Research and Analysis - A Problem-Oriented Introduction

Co-authored with John Bateman and Janina Wildfeuer / De Gruyter / 2017 / PUBLISHER

We co-authored a problem-based introduction to multimodal analysis with John Bateman (Bremen University) and Janina Wildfeuer (University of Groeningen).

This book is intended to provide a foundational introduction to multimodality, that is, how multiple modes of communication work together in different texts and situations.

We adopt a problem-based approach, showing how the foundation provided by the textbook can be used to derive appropriate methods for every situation in which multimodality needs to be of concern.

We illustrate the application of these methods using a large number of case studies dealing with different multimodal phenomena.

Enhancing the AI2 Diagrams dataset using Rhetorical Structure Theory

Co-authored with Serafina Orekhova / Proceedings of the 11th International Conference on Language Resources and Evaluation, Miyazaki / 2018 / PDF GITHUB

Example of a diagram from the AI2D dataset

This conference paper, co-authored with research assistant Serafina Orekhova, outlines our plan for improving the AI2D Diagrams dataset as a part of the project.

In addition to sketching out some of the rhetorical relations necessary for describing diagrams, the paper reports on a preliminary study on inter-annotator agreement in applying these relations to diagrams in the AI2D dataset.

Media evolution and genre expectations

Co-edited with Chiao-I Tseng / Discourse, Context & Media / 2017 / JOURNAL

We edited a special section for Discourse, Context & Media with Chiao-I Tseng (Bremen University), which brings together two concepts, media and genre.

These concepts are frequently invoked across a wide range of disciplines broadly concerned with discourse (and multimodality), but rarely brought into productive dialogue with each other.

The special section contains invited contributions from leading researchers, whose articles offer different perspectives to these key concepts.

Recognizing military vehicles in social media images using deep learning

Proceedings of the 2017 IEEE International Conference on Intelligence and Security Informatics, Beijing / 2017 / PDF GITHUB

Screenshot of the tool

This conference paper presents a system for detecting military vehicles in social media images, which I developed with non-governmental (i.e. citizen journalists) and intergovernmental organizations in mind, as such organizations may have limited resources available for monitoring social media in conflict zones. The system draws on recent advances in applying deep neural networks to computer vision tasks, while also making extensive use of openly available libraries, models and data.

An overview of research within the Genre and Multimodality framework

Discourse, Context & Media / 2017 / PDF DOI

This review article provides an overview of the research conducted within the Genre and Multimodality framework, which has been used to describe documents and other multimodal artefacts over the last 15 years. In addition to reviewing the previous work, the article introduces the central theoretical concepts of the framework – medium, mode and genre – and their application to the study of multimodality.

The multimodality of digital longform journalism

Digital Journalism / 2017 / PDF DOI CORPUS

Screenshot © The New York Times

In this article, which grew out of the additional research conducted for my book, I examined how digital longform journalism combines written language, news photography, short videos and other modes of communication using a corpus of 12 longform articles published between 2012 and 2013.

Individual and collaborative semiotic work in document design

Hermes – Journal of Language and Communication in Business / 2016 / PDF PUBLISHER

Screenshot of a diagram

This article discusses how different specialists contribute to document design, presenting a case study focusing on the production of Finnair's corporate annual reports.

By interviewing a project manager and a graphic designer and contrasting the findings from the interview with a multimodal analysis of the annual report, I attempt to trace how their contributions shape the design of the document.

Helsingin kaupungin matkailuesitteiden multimodaalinen korpus [A multimodal corpus of tourist brochures advertising Helsinki, Finland]

Terra / 2016 / PDF CORPUS

Screenshot of a figure

Tämä datankuvausartikkeli esittelee väitöskirjaani varten kootun multimodaalisen korpuksen, joka sisältää 58 aukeamaa Helsingin kaupungin vuosina 1967–2008 julkaisemista englanninkielisistä matkailuesitteistä.

This article describes the multimodal corpus that I compiled for my doctoral dissertation. The corpus contains 58 double-pages from English-language tourist brochures published by the city of Helsinki between 1967 and 2008.

Semi-automated annotation of page-based documents within the Genre and Multimodality framework

Proceedings of the 10th Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities at ACL 2016, Berlin / 2016 / PDF PUBLISHER GITHUB

Screenshot of the annotator

This conference paper presents the annotator I've developed for automating the description of multimodal documents using the Genre and Multimodality model. The annotator leverages open source libraries such as OpenCV, NLTK and Tesseract for generating the annotation. The work on the annotator continues: feel free to get in touch if you wish to contribute to its development.

Aspects of multimodality in higher education monographs

Multimodality in Higher Education / Brill / 2016 / PDF DOI

Learning to read academic texts is a key skill required by any student or researcher. In this chapter, I aimed to show how multimodal analysis can be used to reveal different types of structures typically found in academic texts. The chapter appeared in a volume edited by Arlene Archer (University of Cape Town) and Esther Breuer (University of Cologne).

Combining computer vision and multimodal analysis: a case study of layout symmetry in bilingual in-flight magazines

Building Bridges for Multimodal Research: International Perspectives on Theories and Practices of Multimodal Analysis / Peter Lang / 2015 / PDF

In this article, I examined how bilingual documents use layout to signal the reader that their contents are semantically equivalent. Recently, I've been interested in computer vision and machine learning: I applied some of these techniques to sketch a method for studying layout symmetry in large data sets. The article was published in a volume edited by Janina Wildfeuer (Bremen University).

The Structure of Multimodal Documents: An Empirical Approach


This book, which is largely based on my doctoral dissertation, develops a framework for studying how language, images and other modes of communication work together in page-based documents. In addition to revising and rewriting each chapter, two new chapters have been added, which focus on the page as a unit of analysis in a multimodal document and show how the framework can be applied to digital media.

In addition to conducting an extensive, empirical study of the Helsinki tourist brochures published between 1967 and 2008, I examine how digital longform journalism, exemplified by publications such as The New York Times' "Snow Fall", combines photography, short videos, cinematic transitions and written language.

Multimodal genre analysis

Interactions, Images, and Texts: A Reader in Multimodality / De Gruyter Mouton / 2014 / PDF DOI

Drawing on the research conducted for my doctoral dissertation, I also contributed a chapter to a handbook edited by Sigrid Norris (Auckland University of Technology) and Carmen Daniela Maier (Aarhus University), which outlines how the concept of genre may be used to describe and compare different documents.

The central argument of the chapter is that in order for the concept of genre to be useful, the concept needs to be tied to multimodal structures identified in the document.

The chapter also shows how the visualisation tools developed for my dissertation can reveal different types of structure commonly found in multimodal documents.

Modelling the structure of a multimodal artefact

University of Helsinki / 2013 / PDF URN CORPUS

In my doctoral dissertation, I developed a framework for studying how language, image and other communicative resources work together in page-based documents. To test the framework, I collected a set of tourist brochures published between 1967 and 2008, which advertised the city of Helsinki.

I compiled the brochures into a multimodal corpus, describing their content, layout, rhetorical organisation and appearance using the Genre and Multimodality model.

I then investigated the corpus for patterns that characterised the tourist brochures as a document genre. I also explored how these patterns have changed over time as a result of new production technologies.

The interface between rhetoric and layout in multimodal artefacts

Literary & Linguistic Computing / 2013 / PDF DOI

In 2011, I gave a two-minute lightning talk about my research at the InterFace 2011 digital humanities conference. On the basis of my presentation, I was invited to contribute to a special section of Literary & Linguistic Computing. This paper develops some ideas about layout and how it guides the interpretation of its contents. I later explored these issues in greater detail in my doctoral dissertation.

Reading paths and visual perception in multimodal research, psychology and brain sciences

Journal of Pragmatics / 2012 / PDF DOI

This article emerged from a side project during my doctoral research. After reading several articles on visual perception, while simultaneously annotating a multimodal corpus containing a rich description of document structure, I wanted to find a way to interface the corpus with an eye-tracker. The article proposes a way of doing so and outlines some research questions that could be investigated using such a combination.

The localisation of advertising print media as a multimodal process

Multimodal Texts from Around the World: Linguistic and Cultural Insights / Palgrave Macmillan / 2012 / PDF DOI

My first publication ever, a chapter in a volume edited by Wendy Bowcher (Sun Yat-Sen University). The chapter discusses localisation, that is, how documents are translated into another language and adapted to the target culture. I argue that the process of localisation must pay equal attention to visual communication and multimodality.


Tekstilingvistiikkaa kääntäjille

2015 / PDF

Lyhyt johdanto tekstilingvistiikkaan kevään 2015 luentokurssilta.

Aineisto, joka esittelee tekstilingvistiikan keskeisiä käsitteitä, on lisensoitu Creative Commons Nimeä - Ei Kaupallinen 4.0 Kansainvälinen -lisenssillä.

Voit siis vapaasti ladata ja jakaa aineistoa, sekä muokata sitä tarpeen mukaan, kunhan mainitset aineiston alkuperän.