Projects

This page summarizes my key research projects and contributions.

Current Projects @ MiroMind (2025-present)

MiroFlow & MiroThinker

Open-source agentic frameworks and advanced AI models for research and applications.

  • MiroFlow: Leading open-source agentic framework

  • MiroThinker: Advanced reasoning and thinking framework

  • Focus: Agent platforms, reasoning systems, and practical AI applications

MiroMind-M1 Reasoning Models

Advanced reasoning models developed at MiroMind for complex problem-solving tasks.

  • MiroMind-M1: State-of-the-art reasoning capabilities

  • Applications: Mathematical reasoning, logical inference, and multi-step problem solving

Audio Multimodal Projects @ I2R, A*STAR (2023-2025)

MERaLiON AudioLLM

Flagship national AI project developing audio-based large language models for Southeast Asian languages.

  • MERaLiON Website: First audio LLM designed for Singlish and regional languages

  • Models: Released models and datasets

  • Impact: Advancing multilingual and multicultural AI capabilities

AudioBench Evaluation Framework

Comprehensive benchmark and evaluation toolkit for Audio Large Language Models.

  • Code Repository: Open-source evaluation framework

  • Leaderboard: Live evaluation results

  • Paper: Accepted at NAACL 2025

  • Contribution: Universal benchmark for AudioLLM evaluation

Multilingual & Cross-Cultural AI (2021-2025)

SeaEval Multilingual Evaluation

Comprehensive evaluation framework for multilingual foundation models in Southeast Asian contexts.

  • Project Website: Evaluation suite and resources

  • Code: Open-source evaluation toolkit

  • Focus: Cross-lingual alignment and cultural reasoning capabilities

SEACrowd Data Initiative

Multilingual multimodal data hub for Southeast Asian languages and cultures.

  • Paper: Comprehensive data collection and standardization

  • Impact: Enabling research in low-resource Southeast Asian languages

Representation Learning & Knowledge Graphs (2017-2023)

Sentence and Word Embedding Research

Advanced techniques for learning semantic representations from text.

  • SBERT-WK: Sentence embedding via BERT-based word model dissection

  • EvalRank: Rethinking evaluation with word and sentence similarities

  • Applications: Information retrieval, semantic search, and text understanding

Knowledge Graph Completion

Inductive learning approaches for commonsense knowledge graph completion.

  • InductivE: Inductive learning framework for knowledge graph completion

  • Focus: Commonsense reasoning and knowledge representation learning