Apr 2023 – Apr 2025

Scientist · Tech Lead, MERaLiON Team

Institute for Infocomm Research (I²R), A*STAR · Singapore

Research Focus

I started at I²R, A*STAR working on dialogue summarization research. When the MERaLiON team was later formed to spearhead Singapore's national LLM project, I joined it and took the lead on evaluation and data preparation for the AudioLLM workstream, focusing on making large models work across the diverse languages of Southeast Asia. It was one of the sharpest teams at the institute, staffed with strong PhD researchers and engineers, and the work moved fast. I wrapped up this role in 2025.

I served as Tech Lead of the MERaLiON Team under the National Multimodal LLM Programme (NMLP), a S$70 million grant from the National Research Foundation (NRF), Singapore.

Active Topics

  • Dialogue Summarization
    • How to effectively summarize multi-turn dialogues while preserving key information?
    • What techniques can improve coherence and factual consistency in dialogue summaries?
  • Making LLMs hear — AudioLLM
    • What techniques can be used to effectively integrate audio processing capabilities into existing LLM architectures?
    • What is the most efficient approach for achieving seamless cross-modality integration?
    • What benchmarks can be designed to accurately evaluate the real-world performance of AudioLLMs?

Students Supervised

  • Pham The Binh Minh — Undergraduate Research Intern, NTU, Singapore (2025-01 – 2025-05). Multimodal AudioLLMs.
  • Yiming Gao — Undergraduate Research Intern, NTU, Singapore (2025-01 – 2025-05). Instruction following capability for multimodal large language models. (AACL 2025)
  • Tey Xue Cong — A*STAR Scholar Intern, Ngee Ann Polytechnic, Singapore (2025-02 – 2025-04). Supervisor: Xunlong Zou. Multilingual speech data collection and processing.
  • Jayden Lum — A*STAR Scholar Intern, Ngee Ann Polytechnic, Singapore (2025-02 – 2025-04). Supervisor: Xunlong Zou. Multilingual speech data collection and processing.
  • Yanchao Li — ACIS PhD Scholar, NTU, Singapore (2024-01 – 2025-04). Supervisor: Nancy F. Chen. Long video understanding.
  • Ziyi Xu — Research Intern, NUS, Singapore (2024-07 – 2024-12). Supervisor: Sun Shuo. Multimodal alignment data collection and filtering.
  • Ayrton San Joaquin — Research Associate, DesCarte@CREATE, Singapore (2023-09 – 2024-08). Efficient training of large language models through gradient estimation. (EMNLP 2024 Findings)
  • Anh Thuc Nguyen — Research Intern, UNC Chapel Hill, USA (2024-01 – 2024-05). Question generation for MERaLiON project and evaluation dataset creation.

Academic Services

  • Publication Chair: EMNLP 2023
  • Local Organizing Team: EMNLP 2023
  • Area Chair: ACL ARR (2024 – 2025)
  • Editor: APSIPA Transactions on Signal and Information Processing
  • Reviewer: ACL, EMNLP, NAACL, ICASSP, IEEE TASLP

Awards

  • Best Paper Award ($300) — SUMEval Workshop, COLING 2025
  • Best Paper Award ($200) — C3NLP Workshop, ACL 2024

Talks

  • 2025.03 — Lorong AI, Singapore. Evaluation on Audio-LLMs and Beyond. Slides