South East Asia (SEA) is a region of immense cultural and linguistic diversity, a melting pot of
religions and languages that hosts over 1200 languages.
In addition, multilingualism (i.e., speaking more than one language or dialect) is widely practiced on a
daily basis.
This tutorial will present an overview of language issues in the SEA region, link multilingualism and
sociolinguistics with historical and societal perspectives, and provide a summary of the existing datasets
for CL research, NLP systems, and evaluation benchmarks.
Our goal is to inform the AACL'23 audience about challenges and opportunities for NLP research in SEA,
taking the linguistic diversity in the region and multilingualism among the users and communities into
account while providing an overview of current NLP research on the languages spoken in the area.
Through this overview, we will highlight the research gaps to be tackled in the future.
author = {Aji, Alham Fikri and Forde, Jessica Zosa and Loo, Alyssa Marie and Sutawika, Lintang and Wang, Skyler and Winata, Genta Indra and Yong, Zheng-Xin and Zhang, Ruochen and Do\u{g}ru\"{o}z, A. Seza and Tan, Yin Lin and Cruz, Jan Christian Blaise},
title = {Current Status of NLP in South East Asia with Insights from Multilingualism and Language Diversity},
booktitle = {Proceedings of the 13th International Joint Conference on Natural Language Processing and the 3rd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics},
month = {November},
year = {2023},
address = {Nusa Dua, Bali},
publisher = {Association for Computational Linguistics},
pages = {8--13},
url = {}