About.

CV ↗

Bin Wang is an AI Research Scientist at Apodex, based in Singapore. He received his Ph.D. from the University of Southern California (USC) in 2021, advised by Prof. C.-C. Jay Kuo. He earned his B.Eng. from the University of Electronic Science and Technology of China (UESTC) in 2017.

His research has evolved along a continuous thread: representation learning for words, sentences, and knowledge graphs during his Ph.D.; dialogue summarization and audio-language models as a Research Fellow at the National University of Singapore (NUS) and as a Scientist at the Institute for Infocomm Research (I²R), A*STAR; and most recently agentic reasoning models at Apodex. He has published over 60 papers in venues including ACL, EMNLP, NAACL, IEEE TASLP, IEEE TNNLS, and ACM KDD, and has received multiple best paper and outstanding paper awards.

He grew up in Jining, Shandong, and studied in Chengdu, Hong Kong, Oshawa, and Los Angeles before settling in Singapore, where he became a permanent resident in 2024. His days and evenings go to Apodex; off-hours tend to involve weights, books, maintaining a couple of open-source resource hubs (audio AI and Southeast-Asian language data), and a quiet obsession with keeping life simple and low-entropy.

Research

Agentic AI & Reasoning

Open-source agent research models and frameworks that scale along model size, context length, and interactive depth. The goal is to equip general-purpose reasoning models with robust tool use, long-horizon planning, and reproducible evaluation.

MiroFlow · MiroThinker · MiroMind-M1 · Survey: 100 Days After DeepSeek-R1

Audio & Multimodal LLMs

At I²R, A*STAR, led the evaluation and data workstreams for MERaLiON AudioLLM under Singapore's National Multimodal LLM Programme, focusing on instruction-following, cross-modality alignment, and paralinguistic understanding across English, Mandarin, Malay, Tamil, and Singlish.

MERaLiON-AudioLLM · AudioBench · MoWE-Audio · IFEval-Audio

Multilingual & Cultural LLMs

Research on cross-lingual knowledge alignment, cultural reasoning, and data curation for Southeast Asian languages, developed through close collaboration with regional academic and industry partners.

SeaEval · SEACrowd · CRAFT · CrossIn

Experience

Work

05/2025 – Present
AI Research Scientist · Apodex, Singapore
Research on agentic AI and reasoning models.
04/2023 – 04/2025
Scientist · Tech Lead (Evaluation & Data), MERaLiON Team · I²R, A*STAR, Singapore
Tech Lead (Evaluation & Data) for the MERaLiON AudioLLM team under Singapore's National Multimodal LLM Programme, focusing on Southeast Asian audio-language models.
09/2021 – 03/2023
Research Fellow · National University of Singapore (NUS)
Dialogue summarization, sentence representation learning, and embedding evaluation. Advisor: Prof. Haizhou Li.
05/2020 – 08/2020
Research Internship · JD AI Research, Mountain View, USA
Inductive learning for commonsense knowledge graph completion, in collaboration with Stanford (Jure Leskovec group).
07/2016 – 10/2016
Research Internship · Ontario Tech University, Canada
3D point-cloud hand-gesture recognition with Kinect for human-robot interaction. Mitacs Globalink awardee.

Education

08/2017 – 05/2021
Ph.D. · University of Southern California (USC), Los Angeles, USA
Thesis on word, sentence, and knowledge graph representation learning. Advisor: Prof. C.-C. Jay Kuo.
09/2013 – 07/2017
B.Eng. · University of Electronic Science and Technology of China (UESTC)
Undergraduate National Scholarship (2015, 2016); Excellent Graduate of Sichuan Province (2017).
08/2015 – 12/2015
Exchange Student · City University of Hong Kong (CityU HK)
One-semester exchange program in Hong Kong during undergraduate studies.

Recognition

Awards

APSIPA Sadaoki Furui Prize Paper Award (2024)
APSIPA Sadaoki Furui Prize Paper Award (2022)
Best Paper Award, SUMEval Workshop at COLING 2025
Best Paper Award, C3NLP Workshop at ACL 2024
USC Graduate Student Government Research Travel Grant (2019)
Excellent Graduate of Sichuan Province (2017)
Undergraduate National Scholarship, China (2015, 2016)
Mitacs Globalink Research Internship (2016)

Academic Services

Publication Chair, EMNLP 2023
Area Chair, ACL ARR (2024–2025)
Editorial Board, APSIPA Transactions on Signal and Information Processing (2023–2025)
Session Chair, IJCNN 2021
Reviewer: Nature Human Behaviour (2022), IEEE/ACM TASLP, ACL, EMNLP, NAACL, ICASSP, ICME

Students Supervised

Mentored 10+ research students and interns at I²R (A*STAR) and NUS on audio LLMs, instruction tuning, and long-video understanding; many have moved on to Ph.D. programs or industry roles.

Publications

60+ papers · 2500+ citations

MiroFlow: Towards High-Performance and Robust Open-Source Agent Framework for General Deep Research Tasks · arXiv 2026
MiroThinker: Pushing the Performance Boundaries of Open-Source Research Agents via Model, Context, and Interactive Scaling · Technical Report 2025
MiroMind-M1: An Open-Source Advancement in Mathematical Reasoning via Context-Aware Multi-Stage Policy Optimization · Technical Report 2025
MERaLiON-AudioLLM: Bridging Audio and Language with Large Language Models · ACL 2025
AudioBench: A Universal Benchmark for Audio Large Language Models · NAACL 2025
CoinMath: Harnessing the Power of Coding Instruction for Math LLMs · Findings of ACL 2025
Resilience of Large Language Models for Noisy Instructions · Findings of EMNLP 2024
SeaEval for Multilingual Foundation Models: From Cross-Lingual Alignment to Cultural Reasoning · NAACL 2024
Knowledge Graph Embedding: An Overview · APSIPA Transactions on Signal and Information Processing, 2024
An Overview on Language Models: Recent Developments and Outlook · APSIPA Transactions on Signal and Information Processing, 2023
Compounding Geometric Operations for Knowledge Graph Completion · ACL 2023
Analyzing and Evaluating Faithfulness in Dialogue Summarization · EMNLP 2022
Just Rank: Rethinking Evaluation with Word and Sentence Similarities · ACL 2022
Graph Representation Learning: A Survey · APSIPA Transactions on Signal and Information Processing, 2020
SBERT-WK: A Sentence Embedding Method by Dissecting BERT-based Word Models · IEEE/ACM TASLP 2020
Evaluating Word Embedding Models: Methods and Experimental Results · APSIPA Transactions on Signal and Information Processing, 2019

Full list on Google Scholar ↗

Contact

bwang28c@gmail.com