Program for Wednesday, February 5th

09:00-17:00 Optional Activities (Mentorship, Networking, Community-Building)

17:00-18:00 Session 6: Features of Reading

17:00

Peter Dixon and Marisa Bortolussi

Detecting Unreliable Narration

PRESENTER: Peter Dixon

ABSTRACT. With some frequency, first-person narratives are presented by an unreliable narrator whose views and descriptions of the story world are suspect in some way. For example, an insane narrator may be hallucinating, or a child narrator may be naïve about adult relationships. In order to process such stories, the reader must be able to make inferences about the story world that are not supported by the narrator’s words. The central theoretical issues concern how such unreliability is detected and how the unreliability affects the reader’s understanding of the story world. As a step towards addressing these issues, we developed a questionnaire to assess the perceived reliability of the narrator across the various ways in which the narrator might be unreliable and the different aspects of the story world that might not be described reliably. We then used the questionnaire to investigate how unreliability is used in “You Should Have Seen the Mess” by Muriel Spark. According to our analysis, unreliability in this story is signalled by a violation of Gricean conventions. We tested this analysis by creating a second version of the story in which the violations were minimized. Our manipulation had the expected effect on the perceived objectivity of the narrator with respect to other characters in the story, but it had no effect on readers' interpretation of other aspects of the narrative. Thus, inferences about unreliability may be fairly circumscribed.

17:20

Jakob Åsberg Johnels, Carmela Miniscalco and Maria Larsson

Text comprehension in adults with cognitive and communicative disabilities

PRESENTER: Jakob Åsberg Johnels

ABSTRACT. Little is known about text comprehension in adults with cognitive and communicative disabilities (CCD), such as intellectual disability, traumatic brain injury or autism. The aims of the current study were to 1) explore the ability profile of narrative text comprehension in adults with CCD and 2) explore concurrent psycholinguistic predictors of different facets of their text comprehension. Sixty-eight adults with CCD (and at least some reading ability of single words) were recruited from daily activity centers in Sweden. A comparison group of typically developing primary school children, matched with the adult CCD group on written word reading ability, was also recruited. In addition to tests of word reading, oral sentence comprehension and proxy ratings of certain reading practices in daily life, all participants were assessed on a narrative text comprehension task with specific sub-scores for inferential and explicit text comprehension. Group comparisons revealed lower comprehension scores for inferential text meaning in the adults with CCD, while the groups did not differ on explicit text comprehension. Regression analyses within the CCD group showed that explicit text comprehension was mainly predicted by oral sentence comprehension scores, whereas inferential text comprehension was predicted jointly by several factors. Results thus suggest that more complex interpretative/inferential text comprehension processes are particularly difficult for adults with CCD and that these abilities are complex also in terms of their underlying psycholinguistic and cognitive bases. Implications for the lifelong learning opportunities of people with CCD are discussed.

17:40

Lexi Elara and Beth Phillips

Assessment of Elementary Students' Detection of Text Structure Inconsistencies

PRESENTER: Lexi Elara

ABSTRACT. Purpose: Some elementary school children struggle with acquiring higher order language skills (e.g., comprehension monitoring) to properly support comprehension. The purpose of this study is to assess elementary students’ comprehension monitoring through the detection of inconsistencies in text structure elements across assessment items. The goal is to determine if child performance varies between consistent and inconsistent passages, and if these differences vary with physical responses during listening or verbal responses after listening to text passages. Method: Children in kindergarten, first, and second grade listened to 36 short, researcher-developed passages that did (inconsistent) or did not (consistent) have word-level inconsistencies in text structure elements. As the children listened to the passages, they reported inconsistencies by raising a question mark sign (sign response). When the passage ended, they replied to the question “Did that story make sense?” (verbal response). A 2 (consistency: consistent vs. inconsistent) x 2 (response type: sign vs. verbal) repeated measures ANOVA will be used to assess child differences in comprehension monitoring across these variables. Results: Initial analyses suggest differences in child performance between consistent (M=15.84, SD=2.83) and inconsistent (M=10.03, SD=4.50) passages (F(1,599.2)=44.567, p<.001). Differences between sign responses (M=25.38, SD=5.01) and verbal responses (M=26.37, SD=5.42) are not significant. However, the interaction between the variables approaches significance (F(1,51.2)=3.81, p=0.05). Conclusion: Differences in recognition of inconsistencies across consistency type may suggest the need for the degree of inconsistency to extend beyond the word-level for this young population. Also, children may perform similarly on assessments during and after listening to passages.

18:00-18:40 Session 7: Misinformation and Trust

Location: Alpine-Balsam

18:00

Catherine McGrath, Jason Braasch, Julie DiLeo, Laura Allen and Erica Kessler

The contributions of prior topic beliefs and general reading comprehension skill to evaluating and writing about contradictory claims about childhood vaccines

PRESENTER: Catherine McGrath

ABSTRACT. The current study investigated the contributions of prior accurate and inaccurate beliefs and reading comprehension ability to writing about a controversial topic. Fifty-eight undergraduates read four authentic texts: two conveyed accurate information that vaccines are helpful, and two conveyed inaccurate information that vaccines are unnecessary in modern times. After reading and completing a distractor task, students wrote a 25-minute timed essay about the extent to which childhood vaccinations should be required by the government. Students also separately completed individual differences measures: a measure of prior topic beliefs (ratings of endorsement of accurate and inaccurate beliefs about childhood vaccinations) and a reading comprehension assessment, which assessed their ability to read passages and answer multiple-choice questions that required inferential reasoning. Essays were scored for inclusion of statements that explicitly evaluated the accuracy of claims included in essays and the tagging of ideas to their respective sources. Preliminary results suggested that endorsement of accurate beliefs before reading supported evaluating the accuracy of texts’ claims included in essays. By contrast, the more readers endorsed inaccurate beliefs before reading, the fewer evaluative statements they provided about the accuracy of ideas in essays. Above and beyond these effects, general reading comprehension skill was a significant positive predictor of evaluating claims for accuracy. However, neither students’ accurate beliefs, inaccurate beliefs, nor reading comprehension skill appeared to support tagging ideas to their respective sources in written essays. Future analyses will code for additional aspects in students’ essays (e.g., from which texts the ideas originated).

18:20

Victoria Johnson and Panayiota Kendeou

The Role of Trust in Science/Scientists on Belief in Information and Knowledge Revision: A Theoretical Framework

PRESENTER: Victoria Johnson

ABSTRACT. The current information ecosystem, contaminated by misinformation, highlights the devastating consequences of lacking trust in science/scientists on belief in scientific information as well as associated attitudes and behaviors. However, little work has systematically considered the separate influences of trust in science, trust in groups of scientists, and trust in individual scientists (three source manifestations of trust) on learning. In this presentation, we outline a theoretical framework of the influence of these three manifestations of trust in science/scientists on belief in information and knowledge revision. By integrating manifestations of trust in science/scientists into existing theoretical frameworks of belief in information and knowledge revision, we hope to advance our understanding of the relations among these core constructs. Such an investigation also allows for future identification of the conditions under which different manifestations of science/scientists should be communicating crucial scientific information and thus how to frame science communication to best foster revision of misconceptions.

18:40-19:00 Session 8: Winter Text Business Meeting: Discussing WT&D 2026 and Beyond

All are invited to discuss the conference and what we can do next year to make it even better!

Location: Alpine-Balsam

19:00-21:00 Session 9: Closing Poster Session and Reception

Location: Foyer, Boulderado

Varun Athilat, Puren Oncel, Aaron Wong, Jason Braasch and Laura Allen

Emotions and Multiple Document Comprehension: Static or Dynamic?

PRESENTER: Varun Athilat

ABSTRACT. Multiple document (MD) studies explore how people integrate and make decisions about controversial issues from varied sources. These complex issues may interact with emotions that impact reading processes. Unfortunately, research on participants’ emotions during MD tasks is limited. When examined, emotions are often treated statically despite emotion literature suggesting that they can fluctuate throughout experimental tasks. This oversight is particularly detrimental in MD studies, where emotionally-driven motivations are crucial for resolving conflicting viewpoints. This study investigated how two dimensions of emotion – valence and arousal – as well as memory for the texts, change when reading conflicting perspectives about childhood vaccines. This research replicates and extends a study by Kessler and colleagues (2021), where participants read eight diverse documents on childhood vaccines and wrote an essay to a friend unsure about vaccinating their child. Additionally, we manipulated participants' initial emotions using a 2 (prompt type: plain text vs. text message screenshot) x 2 (emotionality: low vs. high arousal) within-subjects design, assessing changes in emotions throughout reading via emotion probes.

Participants’ valence during reading generally trended from positive to neutral, whereas arousal stayed mostly static. Essays are currently being coded for concepts from the documents, prior knowledge, and vaccine advocacy. We will provide analyses on the coded essays and how they reflect participants’ memory for the documents alongside how their emotions may have impacted their reading. These results will contribute to the MD literature by providing insights into how participants’ emotions change throughout reading and influence their processing of conflicting perspectives.

Mya Urena, Sam Winer and Caitlin Mills

Agency in Reward Devaluation: Fear of Happiness and Feedback Selection

PRESENTER: Mya Urena

ABSTRACT. Affect plays a crucial role in how individuals process and understand textual information. While previous research has primarily focused on the short-term effects of induced emotions on reading behavior, the influence of chronic mood disorders, such as depression, on text processing and comprehension remains underexplored. Reward Devaluation Theory (RDT) suggests that individuals with depression, particularly those exhibiting fear of happiness (FOH), may devalue or avoid positive stimuli, which could significantly alter their interactions with text. In the current study, participants were given a choice between positive and neutral feedback before completing two sets of 5th-grade-level math problems. Regression analysis revealed that participants with higher FOH scores were significantly more likely to switch from positive to neutral feedback between the two sets (b = 0.091, p = 0.029). This finding was further supported by a replication of prior research, where higher FOH scores were significantly negatively correlated with the selection of positive responses in the Valence Selection Task (VST) administered post-task (rho = -0.30, p = 0.007). The findings suggest that, when given agency over their feedback choices, individuals with depression, particularly those with FOH, may prefer more neutral textual feedback, consistent with the avoidance of positivity posited by RDT. This study also underscores the importance of considering individual differences, particularly in mood and affect, when examining text processing and comprehension. Furthermore, it opens avenues for future research on how mood disorders might shape responses to textual information, with implications for education and mental health interventions.

Caroline J. Wendt, Ehsanul Haque Nirjhar and Theodora Chaspari

Linguistic Analysis of Veteran Job Interviews to Assess Effectiveness in Translating Military Expertise to the Civilian Workforce

PRESENTER: Caroline J. Wendt

ABSTRACT. As hiring processes evolve alongside modern technological systems, so too should intelligent resources to assist underrepresented populations with successful integration into the workforce. For military veterans specifically, job interviews present an initial barrier to participation in civilian labor. The ways in which natural language processing (NLP) can inform how veterans improve their effectiveness in translating military experience to workforce utility are underexplored. We leverage transcript data from a mock interview study between professional interviewers and veteran participants to design NLP experiments to evaluate the degree of explanation in responses to interview questions. We focus on how linguistic and psycholinguistic features and participant-level variability influence classification performance, offering new insights into the mechanics of effective communication. Results indicate high performance when distinguishing between generically long and short responses, demonstrating the robustness of linguistic feature integration. Classifying over- and under-explained responses is less straightforward, reflecting challenges of class imbalance and the limitations of tested approaches in detecting subtle differences in overly verbose or concise communication. Our findings have immediate applications for inclusive assistive technologies in job interview settings, and broader implications for enhancing automated communication assessment tools and refining strategies for training in communication-heavy fields.

Nicholas Duran, Jessica Salerno, Megan Lawrence and Alia Wulff

Linguistic Alignment and Suspicion in 911 Calls

PRESENTER: Nicholas Duran

ABSTRACT. 911 calls offer a complex dataset for studying high-stakes conversational dynamics. These calls, reporting violent crimes, capture raw interactions between distressed callers and 911 operators. The urgency and ambiguity of these situations create rich data for examining how linguistic behaviors shape social judgments. While past research has focused solely on the caller’s language, this exploratory study expands the analysis to include the operator’s language as well. Specifically, using 90 real calls from the Phoenix and Tucson police departments, we investigate how linguistic alignment—across lexical, syntactic, and semantic levels—may influence perceptions of suspicion.

Linguistic alignment, or the degree to which conversational partners reuse each other’s forms, facilitates mutual understanding. Previous research suggests that alignment streamlines communication, reduces cognitive load, and fosters cooperative behavior. In high-stakes settings like 911 calls, alignment may be crucial for quickly establishing trust and clarity, potentially affecting judgments of credibility and suspicion.

Understanding suspicion in this context is critical, as law enforcement may perceive certain calls as deceptive, leading to the pursuit of callers as suspects and potentially contributing to wrongful convictions. Using natural language processing (NLP), we will analyze the temporal dynamics of alignment and assess whether these alignments serve as markers for suspicion.

By integrating caller-operator interactions with experimentally collected suspicion ratings, this project explores how alignment mediates suspicion in emergency reporting, offering insights into forensic linguistics and the role of linguistic alignment in shaping credibility and suspicion in high-pressure, real-world interactions.

Stephen Hutt, Grace D. Jaiyeola, Aaron Wong, Richard Bryck and Caitlin Mills

Beyond One-Size-Fits-All: Analyzing Neurodivergent Learners’ Reading Interactions with Webcam Based Eye Tracking

PRESENTER: Stephen Hutt

ABSTRACT. Neurodivergent students make up a significant portion (approximately 20%) of US students (Baio et al., 2018; Couzens et al., 2015); however, educational outcomes for neurodivergent students trail behind those of their neurotypical peers (Kuriyan et al., 2013). This study explores the interaction between neurodivergence and text comprehension using ecologically valid, low-budget webcam-based eye tracking collected online from 176 participants who self-identified as neurodivergent. The participants read 40 paragraphs, each averaging 46 words, on the psychological mechanisms influencing consumer behavior. By employing a combination of features derived from the eye-tracking data (leveraging existing techniques for low-fidelity eye-tracking data; see Hutt et al., 2023; Hutt & D’Mello, 2022) and textual features such as text difficulty, part-of-speech tagging, and cohesion, we examined how neurodivergence affects reading and text interaction. Additionally, we measured text comprehension and analyzed how varying interaction and gaze patterns with the text influenced comprehension outcomes. Our analysis was further refined by considering the impact of individual neurodivergent diagnoses on the findings. The results demonstrate that a one-size-fits-all approach is inadequate for understanding the nuances of neurodivergent learners' gaze patterns when reading text. Similarly, we note varying eye gaze indicators of comprehension by diagnosis. We conclude by discussing the implications of these findings for future research and practice, emphasizing the need for tailored approaches in educational and psychological studies involving neurodivergent populations. We also discuss potential supports and interactive systems that could leverage the technology discussed here to provide scalable learning supports for neurodivergent learners.

Danielle Shariff and Nia Nixon

A Critical Discourse Analysis: News Media Portrayals of Gaza

PRESENTER: Danielle Shariff

ABSTRACT. In the study of discourse analysis, it is widely held that, in order to fully grasp the purpose of the news media in a sociopolitical context, one must first pay attention to the persuasive effects that news reports have on the public (Van Dijk, 1995). By closely examining the language found in news articles, useful metrics for bias become apparent: diction, framing, passive voice, and selective reporting. Currently, the increased attention on the Gaza Strip has sparked a substantial rise in media coverage on Israel and Palestine. Historically, sympathetic media representation is reserved for Israel while Palestinians are systematically excluded from such sympathies (Chomsky, 2001). While the existing literature on the pervasiveness of this imbalance in U.S. news outlets is exhaustive, there is less research on comparisons made between Western media and those based in the Middle East. Comparing these two media spheres offers insight into potential divergences along the lines of ideology and approach. Furthermore, the ongoing situation in Gaza is continuously reconstructed in the media’s efforts to characterize it. To explore news media bias, we leverage text-as-data methods, such as Natural Language Processing (NLP) tools. Our approach relies on quantitative analyses of article headlines from four news sources: The New York Times, The Washington Post, The Guardian, and Al Jazeera. By engaging in Critical Discourse Analysis (Fairclough, 2013), we utilize a mixed-methods approach to interpret our NLP results and understand how the media exists in the landscape of normative, sociopolitical narratives on Israel and Palestine.

Jiayi Joyce Zhang

Examining the Robustness of Large Language Models by Language Complexity

ABSTRACT. With the advancement of large language models (LLMs), an increasing number of student models have leveraged LLMs to analyze textual artifacts generated by students to understand and evaluate their learning. These student models typically employ pre-trained LLMs to vectorize text inputs into embeddings and then use the embeddings to train models to detect the presence or absence of a construct of interest. However, how reliable and robust are these models at processing language with different levels of complexity? In learning contexts, where students may have different language backgrounds and varying levels of writing skill, it is critical to examine the robustness of such models to ensure that they work equally well for text with varying levels of language complexity. Indeed, a limited number of research studies show that the use of language can affect the performance of LLMs. As such, in the current study, we examined the robustness of several LLM-based student models that detect student self-regulated learning (SRL) in math problem-solving. Specifically, we compared how the performance of these models varies between texts with high and low lexical, syntactic, and semantic complexity, measured by three linguistic measures.

Scott Hinze

Collective Language: A Replication of Collective Narcissism using LLMs

ABSTRACT. Collective memories are mental representations shared within groups. These representations are shaped by conversations, texts, films, and cultural artifacts. This can lead to biases such as “collective narcissism” where individuals overestimate their group's role in historical narratives. For example, citizens of allied countries overestimated each of their own countries’ contributions to the allied victory in WW2, estimating a total contribution of 309% (Roediger et al., 2019). I explored whether chatbots based on large language models (ChatGPT 4.o and Claude) could demonstrate the same collective narcissism. Beyond possible cultural biases within the training data (Santurkar et al., 2023), LLMs can also be steered to take different perspectives or personas (Yuan et al., 2024), which may simulate collective memory differences based on group membership. To test this, I provided both neutral and steered prompts based on Roediger et al. (2019). For neutral prompts I asked the chatbots to provide the percent contribution of each of 8 allied countries to the allied victory in WW2. For steered prompts, I asked for the same estimate, but first asked the chatbot to take the perspective of an average citizen from that country. To account for variability, I requested 100 responses for each estimate. Steered prompts were similar to human data, and significantly increased contribution estimates for all 8 countries relative to neutral prompts. Total contribution scores were only slightly overestimated from neutral prompts (M = 116.56% for ChatGPT, 99.51% for Claude) but displayed near-human overestimation when steered (M = 256.29% for ChatGPT, 419.47% for Claude). These data suggest that LLMs may be able to simulate differences in collective memory representations.

Emily Doherty, Melissa McLain, Chad Tossell, Richard E. Niemeyer and Leanne Hirshfield

Modeling Susceptibility to Cognitive Attacks in Operational Teams

PRESENTER: Emily Doherty

ABSTRACT. Information-based threats (e.g., ‘cognitive attacks’) aim to disrupt vital cognitive processes including reasoning and decision-making. These threats include the presence of false information (‘misinformation’) and, more specifically, false information intended to cause harm (‘disinformation’). Cognitive attacks can prove catastrophic to the dynamics and performance of highly operational teams where the stakes are high, such as in military and spaceflight teams. In this proposed study, we will examine the verbal and written communications during simulated operational missions to 1) model team-level states of susceptibility to information-based threats and subsequently 2) develop a conversational agent to monitor these states and intervene accordingly. Our research will leverage automated discourse classification methods, including models of communication influence and conversational uptake, to identify patterns of susceptibility in real time. We will also evaluate the coherence of discourse between team members as well as the semantics of their language, including the use of words indicative of suspicion. This analysis will not only enhance the understanding of team-level susceptibility to cognitive attack, but also inform the development of robust intervention strategies. The proposed conversational agent will use insights gained from the discourse analysis to provide timely and context-dependent support to operational teams. It will monitor communication patterns, identify states of heightened vulnerability, and suggest actions via a Large Language Model (LLM) to mitigate their impact. Ultimately, our study aims to model states of team-level susceptibility via communication patterns and to develop an assistive agent to prevent falling victim to mis/disinformation.

Langdon Holmes

Dependency Bigrams as a Measure of Collocation Use in Learner Writing

ABSTRACT. Collocations are a difficult aspect of language for learners to acquire and use (Nesselhauf, 2003). Recent advances in automatic dependency parsing have encouraged researchers to consider dependency parsing as a means to extract collocations from a text automatically (Bhalla & Klimcikova, 2019; Paquot, 2019; Kyle & Eguchi, 2021). However, it is not clear how much overlap exists between dependency bigrams and collocations, nor how well these constructs are summarized by an averaged measure of association strength. This study attempts to address these issues. Dependency bigrams were extracted from the OANC (Ide, 2009), then from 48 learner writing samples. Three expert raters also tallied collocations in these texts and coded them as either literal or non-literal. Dependency bigram tallies were more strongly correlated with holistic scores of lexical proficiency. Expert raters extracted more non-literal collocations, even though the total number of combinations extracted by expert raters was lower. Furthermore, a random forest classifier was developed to predict learner proficiency from dependency bigrams in the ICNALE corpus (Ishikawa, 2013). In the case of one ICNALE prompt regarding a nationwide ban on smoking, the presence of a single collocation, “second hand [smoke],” emerged as a powerful predictor of proficiency, used by 28% of higher proficiency learners compared to 11% of lower proficiency learners. These findings suggest that dependency bigrams are a useful means of extracting collocations from reference corpora and learner writing. Additionally, writing context and prompt effects have a major influence on which collocations are most indicative of learner proficiency.