I²R, A*STAR (2023 - 2025)

Position

Scientist (04/2023 - 04/2025), Institute for Infocomm Research (I²R), A*STAR, Singapore.

Tech Lead, MERaLiON Team - National Multimodal LLM Programme (NMLP), S$70 million grant from NRF

Research & Projects

Summary

I started my time at I²R, A*STAR focusing on dialogue summarization research. When the MERaLiON team was later formed to spearhead Singapore's national LLM project, I joined the group, honestly one of the sharpest teams at the institute, packed with top-tier PhDs and engineers. I led evaluation and data preparation for the AudioLLM workstream, focusing on making large models work across the diverse languages of Southeast Asia. It was a fast-paced journey that I wrapped up in April 2025.

Research Topics

Dialogue Summarization

  • How can multi-turn dialogues be summarized effectively while preserving key information?
  • What techniques can improve coherence and factual consistency in dialogue summaries?

Making LLMs hear - AudioLLM

  • What techniques can be used to effectively integrate audio processing capabilities into existing LLM architectures?
  • What is the most efficient approach for achieving seamless cross-modality integration?
  • What benchmarks can be designed to accurately evaluate the real-world performance of AudioLLMs?

Videos

MERaLiON Introduction

Introduction to the MERaLiON project.

MERaLiON Demo

Demo of MERaLiON AudioLLM capabilities.

Publications

Awards

  • Best Paper Award ($300)
    SUMEval Workshop, COLING 2025
  • Best Paper Award ($200)
    C3NLP Workshop, ACL 2024

Talks

Academic Services

  • Publication Chair: EMNLP 2023
  • Local Organizing Team: EMNLP 2023
  • Area Chair: ACL ARR (2024-2025)
  • Editor: APSIPA Transactions on Signal and Information Processing
  • Reviewer: ACL, EMNLP, NAACL, ICASSP, IEEE TASLP

Students

  • Pham The Binh Minh, Undergraduate Research Intern, NTU, Singapore
    2025-01 - 2025-05
    Topic: Multimodal AudioLLMs.
  • Yiming Gao, Undergraduate Research Intern, NTU, Singapore
    2025-01 - 2025-05
    Topic: Instruction-following capability for multimodal large language models.
    Publication: AACL 2025
  • Tey Xue Cong, A*STAR Scholar Intern, Ngee Ann Polytechnic, Singapore
    Main Supervisor: Xunlong Zou
    2025-02 - 2025-04
    Topic: Multilingual speech data collection and processing.
  • Jayden Lum, A*STAR Scholar Intern, Ngee Ann Polytechnic, Singapore
    Main Supervisor: Xunlong Zou
    2025-02 - 2025-04
    Topic: Multilingual speech data collection and processing.
  • Yanchao Li, ACIS PhD Scholar, NTU, Singapore
    Main Supervisor: Nancy F. Chen
    2024-01 - 2025-04
    Topic: Long video understanding.
  • Ziyi Xu, Research Intern, NUS, Singapore
    Main Supervisor: Sun Shuo
    2024-07 - 2024-12
    Topic: Multimodal alignment data collection and filtering.
  • Ayrton San Joaquin, Research Associate, DesCarte@CREATE, Singapore
    2023-09 - 2024-08
    Topic: Efficient training of large language models through gradient estimation.
    Publication: EMNLP 2024 Findings
  • Anh Thuc Nguyen, Research Intern, UNC Chapel Hill, USA
    2024-01 - 2024-05
    Topic: Question generation for the MERaLiON project and evaluation dataset creation.