Publications

(2026). Ψ-Bench: Evaluating Persona-Sensitive Influencing in Persuasive Dialogues. Under review at EMNLP ’26.
(2025). Which LLM Multi-Agent Protocol to Choose? In ICML ’26.
(2025). ModelingAgent: Bridging LLMs and Mathematical Modeling for Real-World Challenges. In EMNLP ’25 Findings.
(2025). MultiAgentBench: Evaluating the Collaboration and Competition of LLM Agents. In ACL ’25.
(2024). EscapeBench: Towards Advancing Creative Intelligence of Language Model Agents. In ACL ’25.