Paper link: A Report on the First Native Language Identification Shared Task (Tetreault, Blanchard & Cahill, 2013)
Paper Reading – First Native Language Identification (NLI) Shared Task
This post summarizes the first shared task on Native Language Identification (NLI): predicting a writer’s native language (L1) from essays written in a learned language (here, English). The shared task standardized data, tasks, and evaluation to enable meaningful comparison across 29 participating teams, and it remains a foundational benchmark for educational NLP and authorship profiling.
Why this matters
NLI supports targeted feedback for language learners (different L1s show distinct error tendencies) and contributes to authorship profiling. Before this effort, research relied on small, inconsistent corpora (often ICLE, the International Corpus of Learner English), making results hard to compare. The shared task addressed this by providing a large, balanced corpus and uniform evaluation.