Benchmark for evaluating persona-sensitive influencing in persuasive dialogues; under review at EMNLP 2026.
May 15, 2026

Benchmark suite and analysis framework for evaluating collaboration and competition among multiple LLM agents; accepted to the ACL 2025 main conference.
Feb 15, 2025

Benchmark and agent framework for evaluating creative reasoning in room-escape environments; ACL 2025 main conference paper.
Dec 15, 2024