Readers can have different goals with respect to the text they are reading. Can these goals be decoded from the pattern of their eye movements over the text? In this work, we examine for the first time whether it is possible to decode two types of reading goals that are common in daily life: information seeking and ordinary reading. Using large scale eye-tracking data, we apply to this task a wide range of state-of-the-art models for eye movements and text that cover different architectural and data representation strategies, and further introduce a new model ensemble. We systematically evaluate these models at three levels of generalization: new textual item, new participant, and the combination of both. We find that eye movements contain highly valuable signals for this task. We further perform an error analysis which builds on prior empirical findings on differences between ordinary reading and information seeking and leverages rich textual annotations. This analysis reveals key properties of textual items and participant eye movements that contribute to the difficulty of the task.
CoNLL
The Effect of Surprisal on Reading Times in Information Seeking and Repeated Reading
Keren Klein,
Yoav Meiri,
Omer Shubi,
and Yevgeni Berzak
In Proceedings of the 28th Conference on Computational Natural Language Learning
2024
The effect of surprisal on processing difficulty has been a central topic of investigation in psycholinguistics. Here, we use eyetracking data to examine three language processing regimes that are common in daily life but have not been addressed with respect to this question: information seeking, repeated processing, and the combination of the two. Using standard regime-agnostic surprisal estimates, we find that surprisal theory's prediction of a linear effect of surprisal on processing times extends to these regimes. However, when using surprisal estimates from regime-specific contexts that match the contexts and tasks given to humans, we find that in information seeking, such estimates do not improve the predictive power for processing times compared to standard surprisals. Further, regime-specific contexts yield near zero surprisal estimates with no predictive power for processing times in repeated reading. These findings point to misalignments of task and memory representations between humans and current language models, and question the extent to which such models can be used for estimating cognitively relevant quantities. We further discuss theoretical challenges posed by these results.
EMNLP
Fine-Grained Prediction of Reading Comprehension from Eye Movements
Omer Shubi,
Yoav Meiri,
Cfir Hadar,
and Yevgeni Berzak
In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
2024
Can human reading comprehension be assessed from eye movements in reading? In this work, we address this longstanding question using large-scale eyetracking data. We focus on a cardinal and largely unaddressed variant of this question: predicting reading comprehension of a single participant for a single question from their eye movements over a single paragraph. We tackle this task using a battery of recent models from the literature, and three new multimodal language models. We evaluate the models in two different reading regimes: ordinary reading and information seeking, and examine their generalization to new textual items, new participants, and the combination of both. The evaluations suggest that the task is highly challenging, and highlight the importance of benchmarking against a strong text-only baseline. While in some cases eye movements provide improvements over such a baseline, they tend to be small. This could be due to limitations of current modelling approaches, limitations of the data, or because eye movement behavior does not sufficiently pertain to fine-grained aspects of reading comprehension processes. Our study provides an infrastructure for making further progress on this question.
From cooking recipes to novels and scientific papers, we often read the same text more than once. How do our eye movements in repeated reading differ from first reading? In this work, we examine this question at scale with L1 English readers via standard eye-movement measures and their sensitivity to linguistic word properties. We analyze consecutive and non-consecutive repeated reading, in ordinary and information-seeking reading regimes. We find sharp and robust reading facilitation effects in repeated reading, and characterize their modulation by the reading regime, the presence of intervening textual material, and the relevance of the information to the task across the two readings. Finally, we examine individual differences in repeated reading effects and find that their magnitude interacts with reading speed, but not with reading proficiency. Our work extends prior findings, providing a detailed empirical picture of repeated reading which could inform future models of eye movements in reading.
CogSci
Eye Movements in Information-Seeking Reading
Omer Shubi,
and Yevgeni Berzak
In Proceedings of the Annual Meeting of the Cognitive Science Society
2023
In this work, we use question answering as a general framework for studying how eye movements in reading reflect the reader’s goals, how they are pursued, and the extent to which they are achieved. We leverage fine-grained annotations of task-critical textual information to perform a detailed comparison of eye movements in information-seeking and ordinary reading regimes. We further examine how eye movements during information seeking relate to question answering behavior. We find that reading times, saccade patterns and sensitivity to the linguistic properties of the text are all strongly and systematically conditioned on the reading task, and further interact with question answering behavior. The observed reading patterns are consistent with a rational account of cognitive resource allocation during task-based reading.
OPMI
Eye Movement Traces of Linguistic Knowledge in Native and Non-Native Reading
Eye movements in reading offer a rich, detailed picture of how language understanding unfolds in real time. Decades of research have demonstrated the sensitivity and quantitative functional form of how readers’ eye movements are influenced by the linguistic characteristics of the words being read and their relationship with context. However, most of this work has examined only reading by native (L1) speakers, even though much of the world’s population is multilingual, and non-native (L2) reading is a ubiquitous everyday activity. Here we present an analysis of eye movements in reading in a dataset containing a large and linguistically diverse sample of English L2 readers, including a quantitative characterization of the shape of the relationship between linguistic word properties and eye movements, and how this relationship relates to the reader’s independently measured L2 proficiency.
Our key result is that while many of the same qualitative effects are found in L2 readers as in L1 readers, we also find a “lexicon-context tradeoff” that is sensitive to a reader’s L2 proficiency. L2 readers’ eye movements are generally less sensitive to a word’s relationship with its context and more sensitive to the word’s intrinsic properties. However, the most proficient L2 readers’ eye movements approach an L1 pattern. This tradeoff supports an experience-dependent account of the speed and efficiency with which context-driven expectations can be deployed in L2 language processing, with a proficiency driven gradual shift away from lexicon-dependent processing and towards contextual processing.
EMNLP
The Aligned Multimodal Movie Treebank: An Audio, Video, Dependency-Parse Treebank
Adam Yaari,
Jan DeWitt,
Henry Hu,
Bennett Stankovits,
Sue Felshin,
Yevgeni Berzak,
Helena Aparicio,
Boris Katz,
Ignacio Cases,
and Andrei Barbu
In Proceedings of the Conference on Empirical Methods in Natural Language Processing
2022
Treebanks have traditionally included only text and were derived from written sources such as newspapers or the web. We introduce the Aligned Multimodal Movie Treebank (AMMT), an English language treebank derived from dialog in Hollywood movies which includes transcriptions of the audiovisual streams with word-level alignment, as well as part of speech tags and dependency parses in the Universal Dependencies (UD) formalism. AMMT consists of 31,264 sentences and 218,090 words, making it the third-largest UD English treebank and the only multimodal treebank in UD. We find that parsers on this dataset often have difficulty with conversational speech, and that they rely heavily on punctuation, which is frequently unavailable from speech recognizers. To help with the web-based annotation effort, we also introduce the Efficient Audio Alignment Annotator (EAAA), a companion tool that enables annotators to significantly speed up their annotation processes.
OPMI
CELER: A 365-Participant Corpus of Eye Movements in L1 and L2 English Reading
Yevgeni Berzak,
Chie Nakamura,
Amelia Smith,
Emily Weng,
Boris Katz,
Suzanne Flynn,
and Roger Levy
We present CELER (Corpus of Eye Movements in L1 and L2 English Reading), a broad coverage eye-tracking corpus for English. CELER comprises over 320,000 words, and eye-tracking data from 365 participants. Sixty-nine participants are L1 (first language) speakers, and 296 are L2 (second language) speakers from a wide range of English proficiency levels and five different native language backgrounds. As such, CELER has an order of magnitude more L2 participants than any currently available eye movements dataset with L2 readers. Each participant in CELER reads 156 newswire sentences from the Wall Street Journal (WSJ), in a new experimental design where half of the sentences are shared across participants and half are unique to each participant. We provide analyses that compare L1 and L2 participants with respect to standard reading time measures, as well as the effects of frequency, surprisal, and word length on reading times. These analyses validate the corpus and demonstrate some of its strengths. We envision CELER to enable new types of research on language processing and acquisition, and to facilitate interactions between psycholinguistics and natural language processing (NLP).
CogSci
Eye Movement Traces of Linguistic Knowledge
Yevgeni Berzak,
and Roger Levy
In Proceedings of the Annual Meeting of the Cognitive Science Society
2021
This study examines how linguistic knowledge is manifested in eye movements in reading, focusing on the effect of two key word properties, frequency and surprisal, on three progressively longer standard fixation measures: First Fixation, Gaze Duration and Total Fixation. Comparing English L1 speakers to a large and linguistically diverse group of English L2 speakers, we obtain the following results. 1) Word property effects on reading times are larger in L2 than in L1. 2) Differences between L1 and L2 speakers are substantially larger in the response to frequency than to surprisal. 3) The functional form of the relation between fixation times and frequency and surprisal in L2 is superlinear. 4) In L2 speakers, proficiency modulates frequency effects as a U-shaped function. We discuss the implications of these results for theories of language processing and acquisition, as well as for the general interpretation of frequency and surprisal effects in reading.
CoNLL
Predicting Text Readability from Scrolling Interactions
Sian Gooding,
Yevgeni Berzak,
Tony Mak,
and Matt Sharifi
In Proceedings of the 25th Conference on Computational Natural Language Learning
2021
Judging the readability of text has many important applications, for instance when performing text simplification or when sourcing reading material for language learners. In this paper, we present a 518 participant study which investigates how scrolling behaviour relates to the readability of a text. We make our dataset publicly available and show that (1) there are statistically significant differences in the way readers interact with text depending on the text level, (2) such measures can be used to predict the readability of text, and (3) the background of a reader impacts their reading interactions and the factors contributing to text difficulty.
CoNLL
Bridging Information-Seeking Human Gaze and Machine Reading Comprehension
Jonathan Malmaud,
Roger Levy,
and Yevgeni Berzak
In Proceedings of the 24th Conference on Computational Natural Language Learning
2020
In this work, we analyze how human gaze during reading comprehension is conditioned on the given reading comprehension question, and whether this signal can be beneficial for machine reading comprehension. To this end, we collect a new eye-tracking dataset with a large number of participants engaging in a multiple choice reading comprehension task. Our analysis of this data reveals increased fixation times over parts of the text that are most relevant for answering the question. Motivated by this finding, we propose making automated reading comprehension more human-like by mimicking human information-seeking reading behavior during reading comprehension. We demonstrate that this approach leads to performance gains on multiple choice question answering in English for a state-of-the-art reading comprehension model.
CoNLL
Classifying Syntactic Errors in Learner Language
Leshem Choshen,
Dmitry Nikolaev,
Yevgeni Berzak,
and Omri Abend
In Proceedings of the 24th Conference on Computational Natural Language Learning
2020
We present a method for classifying syntactic errors in learner language, namely errors whose correction alters the morphosyntactic structure of a sentence. The methodology builds on the established Universal Dependencies syntactic representation scheme, and provides complementary information to other error-classification systems. Unlike existing error classification methods, our method is applicable across languages, which we showcase by producing a detailed picture of syntactic errors in learner English and learner Russian. We further demonstrate the utility of the methodology for analyzing the outputs of leading Grammatical Error Correction (GEC) systems.
ACL
STARC: Structured Annotations for Reading Comprehension
Yevgeni Berzak,
Jonathan Malmaud,
and Roger Levy
In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
2020
We present STARC (Structured Annotations for Reading Comprehension), a new annotation framework for assessing reading comprehension with multiple choice questions. Our framework introduces a principled structure for the answer choices and ties them to textual span annotations. The framework is implemented in OneStopQA, a new high-quality dataset for evaluation and analysis of reading comprehension in English. We use this dataset to demonstrate that STARC can be leveraged for a key new application for the development of SAT-like reading comprehension materials: automatic annotation quality probing via span ablation experiments. We further show that it enables in-depth analyses and comparisons between machine and human reading comprehension behavior, including error distributions and guessing ability. Our experiments also reveal that the standard multiple choice dataset in NLP, RACE, is limited in its ability to measure reading comprehension. 47% of its questions can be guessed by machines without accessing the passage, and 18% are unanimously judged by humans as not having a unique correct answer. OneStopQA provides an alternative test set for reading comprehension which alleviates these shortcomings and has a substantially higher human ceiling performance.
CL
Modeling Language Variation and Universals: A Survey on Typological Linguistics for Natural Language Processing
Edoardo Maria Ponti,
Helen O’Horan,
Yevgeni Berzak,
Ivan Vulić,
Roi Reichart,
Thierry Poibeau,
Ekaterina Shutova,
and Anna Korhonen
Linguistic typology aims to capture structural and semantic variation across the world’s languages. A large-scale typology could provide excellent guidance for multilingual Natural Language Processing (NLP), particularly for languages that lack human-labeled resources. We present an extensive literature survey on the use of typological information in the development of NLP techniques. Our survey demonstrates that to date, the use of information in existing typological databases has resulted in consistent but modest improvements in system performance. We show that this is due to both intrinsic limitations of databases (in terms of coverage and feature granularity) and under-utilization of the typological features included in them. We advocate for a new approach that adapts the broad and discrete nature of typological categories to the contextual and continuous nature of machine learning algorithms used in contemporary NLP. In particular, we suggest that such an approach could be facilitated by recent developments in data-driven induction of typological knowledge.
NAACL
Assessing Language Proficiency from Eye Movements in Reading
Yevgeni Berzak,
Boris Katz,
and Roger Levy
In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
2018
We present a novel approach for determining learners’ second language proficiency which utilizes behavioral traces of eye movements during reading. Our approach provides stand-alone eyetracking based English proficiency scores which reflect the extent to which the learner’s gaze patterns in reading are similar to those of native English speakers. We show that our scores correlate strongly with standardized English proficiency tests. We also demonstrate that gaze information can be used to accurately predict the outcomes of such tests. Our approach yields the strongest performance when the test taker is presented with a suite of sentences for which we have eyetracking data from other readers. However, it remains effective even using eyetracking with sentences for which eye movement data have not been previously collected. By deriving proficiency as an automatic byproduct of eye movements during ordinary reading, our approach offers a potentially valuable new tool for second language proficiency assessment. More broadly, our results open the door to future methods for inferring reader characteristics from the behavioral traces of reading.
EMNLP
Grounding Language Acquisition by Training Semantic Parsers Using Captioned Videos
Candace Ross,
Andrei Barbu,
Yevgeni Berzak,
Battushig Myanganbayar,
and Boris Katz
In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
2018
We develop a semantic parser that is trained in a grounded setting using pairs of videos captioned with sentences. This setting is both data-efficient, requiring little annotation, and similar to the experience of children where they observe their environment and listen to speakers. The semantic parser recovers the meaning of English sentences despite not having access to any annotated sentences. It does so despite the ambiguity inherent in vision where a sentence may refer to any combination of objects, object properties, relations or actions taken by any agent in a video. For this task, we collected a new dataset for grounded language acquisition. Learning a grounded semantic parser—turning sentences into logical forms using captioned videos—can significantly expand the range of data that parsers can be trained on, lower the effort of training a semantic parser, and ultimately lead to a better understanding of child language acquisition.
MIT
Second Language Learning from a Multilingual Perspective
How do people learn a second language? In this thesis, we study this question through an examination of cross-linguistic transfer: the role of a speaker’s native language in the acquisition, representation, usage and processing of a second language. We present a computational framework that enables studying transfer in a unified fashion across language production and language comprehension. Our framework supports bidirectional inference between linguistic characteristics of speakers’ native languages, and the way they use and process a new language. We leverage this inference ability to demonstrate the systematic nature of cross-linguistic transfer, and to uncover some of its key linguistic and cognitive manifestations. We instantiate our framework in language production by relating syntactic usage patterns and grammatical errors in English as a Second Language (ESL) to typological properties of the native language, showing its utility for automated typology learning and prediction of second language grammatical errors. We then introduce eye tracking during reading as a methodology for studying cross-linguistic transfer in second language comprehension. Using this methodology, we demonstrate that learners’ native language can be predicted from their eye movements while reading free-form second language text. Further, we show that language processing during second language comprehension is intimately related to linguistic characteristics of the reader’s first language. Finally, we introduce the Treebank of Learner English (TLE), the first syntactically annotated corpus of learner English. The TLE is annotated with Universal Dependencies (UD), a framework geared towards multilingual language analysis, and will support linguistic and computational research on learner language.
Taken together, our results highlight the importance of multilingual approaches to the scientific study of second language acquisition, and to Natural Language Processing (NLP) applications for non-native language.
ACL
Predicting Native Language from Gaze
Yevgeni Berzak,
Chie Nakamura,
Suzanne Flynn,
and Boris Katz
In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics
2017
A fundamental question in language learning concerns the role of a speaker’s first language in second language acquisition. We present a novel methodology for studying this question: analysis of eye-movement patterns in second language reading of free-form text. Using this methodology, we demonstrate for the first time that the native language of English learners can be predicted from their gaze fixations when reading English. We provide analysis of classifier uncertainty and learned features, which indicates that differences in English reading are likely to be rooted in linguistic divergences across native languages. The presented framework complements production studies and offers new ground for advancing research on multilingualism.
COLING
Survey on the Use of Typological Information in Natural Language Processing
Helen O’Horan,
Yevgeni Berzak,
Ivan Vulić,
Roi Reichart,
and Anna Korhonen
In Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers
2016
In recent years linguistic typology, which classifies the world’s languages according to their functional and structural properties, has been widely used to support multilingual NLP. While the growing importance of typological information in supporting multilingual tasks has been recognised, no systematic survey of existing typological resources and their use in NLP has been published. This paper provides such a survey as well as discussion which we hope will both inform and inspire future work in the area.
EMNLP
Anchoring and agreement in syntactic annotations
Yevgeni Berzak,
Yan Huang,
Andrei Barbu,
Anna Korhonen,
and Boris Katz
In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing
2016
We present a study on two key characteristics of human syntactic annotations: anchoring and agreement. Anchoring is a well known cognitive bias in human decision making, where judgments are drawn towards pre-existing values. We study the influence of anchoring on a standard approach to creation of syntactic resources where syntactic annotations are obtained via human editing of tagger and parser output. Our experiments demonstrate a clear anchoring effect and reveal unwanted consequences, including overestimation of parsing performance and lower quality of annotations in comparison with human-based annotations. Using sentences from the Penn Treebank WSJ, we also report systematically obtained inter-annotator agreement estimates for English dependency parsing. Our agreement results control for parser bias, and are consequential in that they are on par with state of the art parsing performance for English newswire. We discuss the impact of our findings on strategies for future annotation efforts and parser evaluations.
ACL
Universal Dependencies for Learner English
Yevgeni Berzak,
Jessica Kenney,
Carolyn Spadine,
Jing Xian Wang,
Lucia Lam,
Keiko Sophie Mori,
Sebastian Garza,
and Boris Katz
In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics
2016
We introduce the Treebank of Learner English (TLE), the first publicly available syntactic treebank for English as a Second Language (ESL). The TLE provides manually annotated POS tags and Universal Dependency (UD) trees for 5,124 sentences from the Cambridge First Certificate in English (FCE) corpus. The UD annotations are tied to a pre-existing error annotation of the FCE, whereby full syntactic analyses are provided for both the original and error corrected versions of each sentence. Furthermore, we delineate ESL annotation guidelines that allow for consistent syntactic treatment of ungrammatical English. Finally, we benchmark POS tagging and dependency parsing performance on the TLE dataset and measure the effect of grammatical errors on parsing accuracy. We envision the treebank to support a wide range of linguistic and computational research on second language acquisition as well as automatic processing of ungrammatical language. The treebank is available at universaldependencies.org. The annotation manual used in this project and a graphical query engine are available at esltreebank.org.
EMNLP
Do You See What I Mean? Visual Resolution of Linguistic Ambiguities
Yevgeni Berzak,
Andrei Barbu,
Daniel Harari,
Boris Katz,
and Shimon Ullman
In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing
2015
Understanding language goes hand in hand with the ability to integrate complex contextual information obtained via perception. In this work, we present a novel task for grounded language understanding: disambiguating a sentence given a visual scene which depicts one of the possible interpretations of that sentence. To this end, we introduce a new multimodal corpus containing ambiguous sentences, representing a wide range of syntactic, semantic and discourse ambiguities, coupled with videos that visualize the different interpretations for each sentence. We address this task by extending a vision model which determines if a sentence is depicted by a video. We demonstrate how such a model can be adjusted to recognize different interpretations of the same underlying sentence, allowing sentences to be disambiguated in a unified fashion across the different ambiguity types.
CoNLL
Contrastive Analysis with Predictive Power: Typology Driven Estimation of Grammatical Error Distributions in ESL
Yevgeni Berzak,
Roi Reichart,
and Boris Katz
In Proceedings of the Nineteenth Conference on Computational Natural Language Learning
2015
This work examines the impact of cross-linguistic transfer on grammatical errors in English as a Second Language (ESL) texts. Using a computational framework that formalizes the theory of Contrastive Analysis (CA), we demonstrate that language specific error distributions in ESL writing can be predicted from the typological properties of the native language and their relation to the typology of English. Our typology-driven model makes it possible to obtain accurate estimates of such distributions without access to any ESL data for the target languages. Furthermore, we present a strategy for adjusting our method to low-resource languages that lack typological documentation, using a bootstrapping approach which approximates native language typology from ESL texts. Finally, we show that our framework is instrumental for linguistic inquiry seeking to identify first language factors that contribute to a wide range of difficulties in second language acquisition.
CoNLL
Reconstructing Native Language Typology from Foreign Language Usage
Yevgeni Berzak,
Roi Reichart,
and Boris Katz
In Proceedings of the Eighteenth Conference on Computational Natural Language Learning
2014
Linguists and psychologists have long been studying cross-linguistic transfer, the influence of native language properties on linguistic performance in a foreign language. In this work we provide empirical evidence for this process in the form of a strong correlation between language similarities derived from structural features in English as a Second Language (ESL) texts and equivalent similarities obtained from the typological features of the native languages. We leverage this finding to recover native language typological similarity structure directly from ESL text, and perform prediction of typological features in an unsupervised fashion with respect to the target languages. Our method achieves 72.2% accuracy on the typology prediction task, a result that is highly competitive with equivalent methods that rely on typological resources.