Current Status of NLP in South East Asia
with Insights from Multilingualism and Language Diversity


AACL 2023 Tutorial
*Indicates Equal Contribution

αMBZUAI βBrown University γEleutherAI δUC Berkeley εMeta AI
ηBloomberg κGhent University λStanford University
πNational University of Singapore φSamsung R&D Institute Philippines

Overview

South East Asia (SEA) is a region of immense cultural and linguistic diversity: a melting pot of cultures, religions, and languages, and home to over 1,200 languages. In addition, multilingualism (i.e., speaking more than one language or dialect) is widely practiced on a daily basis.

This tutorial will present an overview of language issues in the SEA region, link multilingualism and computational sociolinguistics with historical and societal perspectives, and summarize the existing datasets for computational linguistics (CL) research, NLP systems, and evaluation benchmarks.

Our goal is to inform the AACL'23 audience about the challenges and opportunities for NLP research in SEA, taking into account the region's linguistic diversity and the multilingualism of its users and communities. By surveying current NLP research on the languages spoken in the area, we will also highlight the research gaps to be tackled in the future.

BibTeX


@inproceedings{aji-etal-2023-current,
  author    = {Aji, Alham Fikri and Forde, Jessica Zosa and Loo, Alyssa Marie and Sutawika, Lintang and Wang, Skyler and Winata, Genta Indra and Yong, Zheng-Xin and Zhang, Ruochen and Do\u{g}ru\"{o}z, A. Seza and Tan, Yin Lin and Cruz, Jan Christian Blaise},
  title     = {Current Status of NLP in South East Asia with Insights from Multilingualism and Language Diversity},
  booktitle = {Proceedings of the 13th International Joint Conference on Natural Language Processing and the 3rd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics},
  month     = {November},
  year      = {2023},
  address   = {Nusa Dua, Bali},
  publisher = {Association for Computational Linguistics},
  pages     = {8--13},
  url       = {https://aclanthology.org/2023.ijcnlp-tutorials.2}
}