Best Resource Paper Award at EMNLP 2024

Mir Tafseer Nayeem and Davood Rafiei won the Best Resource Paper Award at EMNLP 2024.

14 November 2024

Mir Tafseer Nayeem and Davood Rafiei won the Best Resource Paper Award at EMNLP 2024 for their paper “KidLM: Advancing Language Models for Children – Early Insights and Future Directions”.

This paper, one of two recipients of the Best Resource Paper Award, lays the groundwork for developing child-specific language models by highlighting the critical role of high-quality pre-training data. The authors present a novel user-centric data collection pipeline for gathering and validating a corpus tailored to children, including content written for them and, in some cases, by them. They also introduce Stratified Masking, a new training objective that dynamically adjusts masking probabilities based on domain-specific child language data, enabling models to prioritize vocabulary and concepts better suited to children’s linguistic needs.
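
For readers curious about what varying the masking probability by word group might look like in practice, here is a minimal sketch in Python. It is not the authors' implementation: the three strata, the word lists, the masking rates, and the function names are all hypothetical and chosen only for illustration. The idea shown is simply that each token is assigned to a stratum, and the chance of masking it depends on that stratum rather than on one uniform rate.

```python
import random

# Hypothetical word lists and masking rates, chosen only for illustration.
# They are NOT taken from the KidLM paper.
STOPWORDS = {"the", "a", "is", "on", "and"}
CHILD_VOCAB = {"cat", "dog", "play", "happy"}
MASK_RATES = {"stopword": 0.10, "child_vocab": 0.20, "other": 0.30}
MASK_TOKEN = "[MASK]"


def stratum(token):
    """Assign a token to one of the illustrative strata."""
    word = token.lower()
    if word in STOPWORDS:
        return "stopword"
    if word in CHILD_VOCAB:
        return "child_vocab"
    return "other"


def stratified_mask(tokens, rng, rates=MASK_RATES):
    """Mask each token with a probability that depends on its stratum.

    Returns (masked_tokens, labels). labels[i] holds the original token
    where a mask was applied and None elsewhere.
    """
    masked, labels = [], []
    for token in tokens:
        if rng.random() < rates[stratum(token)]:
            masked.append(MASK_TOKEN)
            labels.append(token)
        else:
            masked.append(token)
            labels.append(None)
    return masked, labels


if __name__ == "__main__":
    sentence = "The happy dog and the cat play on a trampoline".split()
    masked, labels = stratified_mask(sentence, random.Random(0))
    print(masked)
    print(labels)
```

In a real masked language modeling setup, the masked positions would be the ones the model is trained to predict, so raising the rate for a stratum makes the model practice those words more often. Which strata to use and how to set their rates are design decisions described in the paper itself.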


Read the full paper for more details.