Máster Universitario Erasmus Mundus en Ciencia de Datos Lingüísticos / Erasmus Mundus Master in Linguistic Data Science 2025–2026




Introduction

Linguistic data science lies at the intersection of several fields: computer science, mathematics and statistics, and linguistics. While (traditional) data science relies primarily on the statistical dimension of data, linguistic data science also requires mechanisms for explicit representation of language data, in order to address not only "plain" text but also the linguistic features of the data (grammatical categories, word senses, etc.). This helps address complex questions for which a purely statistical approach is insufficient. 

This master's degree responds to the need for training at this intersection of disciplines. It combines teaching in a new field of study at leading higher education institutions, with student mobility as an essential feature of the program.

Students will explore the specificities of linguistic data in both small and big data contexts. They will study both implicit representations of linguistic information (distributional semantics, embeddings, latent semantic analysis, etc.) and explicit representations of linguistic data (terminologies, dictionaries, ontologies, annotated corpora, etc.). They will also be trained in machine learning and deep learning techniques for analyzing and processing lexical and textual data, as well as in the Semantic Web and linguistic linked data.

You can find more details on the master's website: https://emlds.fcsh.unl.pt/


Why take this degree?

Given the rapid adoption of Artificial Intelligence (AI) techniques in general, and Natural Language Processing (NLP) in particular, studying linguistic data science means positioning yourself at the heart of this revolution. It means joining those who are designing the future of human-machine communication, with linguistic sensitivity and technical rigor. Current AI models must not only process words but also handle syntactic, semantic, and pragmatic structures. This requires experts capable of representing language computationally and tackling complex problems such as semantic ambiguity, automatic text generation in specific contexts, machine translation, and overcoming language barriers on the Web.


Recommended profile

Two student profiles are expected:

1) graduates in scientific and technical fields, with skills in programming, mathematics, and statistics; and

2) graduates in humanities and language sciences, with skills in linguistics and social sciences.


Career opportunities

The program will equip students with skills that complement and go beyond their original training, emphasizing interdisciplinarity and encouraging collaborative activities and practices with students from diverse backgrounds. 

Graduates will have the following characteristics and skills:

  • Both theoretical and computational thinking about the nature of linguistic data. 
  • Analysis of problems involving natural language text processing and linguistic data, proposing creative and innovative solutions that can be used for technology transfer. 
  • Mastery of machine learning and deep learning techniques for use in the analysis and processing of lexical and textual data, as well as in the Semantic Web and linguistic linked data. 
  • Adaptability to change, with the capacity to apply new and advanced technologies with initiative and an entrepreneurial spirit in multidisciplinary and multicultural environments.
  • Ability to learn independently to maintain and improve acquired skills, continuously developing in the practice of the profession. 

This profile will allow graduates to transfer their knowledge of linguistic data science to practical applications in industry or academic research, and to join teams in companies, public administrations, or universities.