Natural Language Processing

We at the Chair for Natural Language Processing (Computer Science XII) try to make machines understand human language! In fact, we try to make them understand very many different human languages. We primarily focus on written text (after all, speech can always be transcribed to text). Methodologically, the group focuses on deep learning and representation learning methods for semantic modeling of natural language (that is, precise modeling of the meaning of natural language statements and text documents), with a special focus on multilingual representation learning and cross-language transfer of models for concrete NLP tasks.

Driven by deep learning advances, NLP has lately seen substantial progress, primarily due to the technical ability to (pre)train ever larger neural models on ever more text. Such progress can be exclusive, as its benefits are beyond reach for most of the world’s population (e.g., speakers of low-resource languages, or anyone who lacks the computational resources needed to train or use these models). Moreover, training ever larger language models based on complex neural architectures (for example, the popular Transformer) has a large carbon footprint, and such models tend to encode a wide range of negative societal stereotypes and biases (e.g., sexism, racism). At WüNLP we specifically address these challenges and aim to democratize state-of-the-art language technology. To this end, we pursue three research threads that we hope will lead to equitable, societally fair, and sustainable language technology: (i) sustainable, modular, and sample-efficient NLP models, (ii) fair and ethical (i.e., unbiased) NLP, and (iii) truly multilingual NLP, with a special focus on low-resource languages.

Text data is all around us. Besides our core methodological NLP work, we also pursue interdisciplinary projects in which we apply cutting-edge NLP methods to interesting problems from other disciplines, most prominently in the area of Computational Social Science (so far most often in collaboration with political scientists).

Our Chair has international prominence and visibility. We regularly publish our research results at the very competitive top-tier NLP conferences (ACL, EMNLP, NAACL, EACL). Further, Prof. Glavaš served as an Editor-in-Chief for the ACL Rolling Review, the centralized reviewing service of the Association for Computational Linguistics. We have established numerous research collaborations, most prominently with the Language Technology Group of the University of Cambridge, CIS at LMU München, and UKP at TU Darmstadt.

News

Open position: Postdoc to work on aligning LLMs

We are looking for a postdoc to join our group! The position is tied to the project EQUIFAIR (Equitably Fair and Trustworthy Language Technology), funded by the Alcatel-Lucent Stiftung, and focuses on the alignment of large language models (LLMs): hallucinations and societal biases, to be addressed in a (massively) multilingual context and in an explainable/interpretable manner.

Two papers accepted at ACL 2023

WüNLP will have two papers in the Main Conference Program of the 61st Annual Meeting of the Association for Computational Linguistics (ACL’23), which is the primary publication venue in NLP.

Three papers accepted at EMNLP 2022

WüNLP will have three papers in the Main Conference Program of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2022), one of the most prestigious venues in NLP.

Hubland Nord, Building 50
