A Test of the Relation between Wh-Acceptability and Syntactic Surprisals in Recurrent Neural Networks
ABSTRACT. Introduction. There is a long-standing and heated debate about what gives rise to island effects, the generalization that speakers cannot form sentences that violate island conditions. Scholars in the Chomskyan generative tradition (Chomsky, 1964; Hu, 2019; Huang, 1982; Ross, 1967) hold that island effects are a syntactic constraint, whereas a growing number of researchers (Phillips et al., 2005) argue that island effects are shaped by human sentence processing. This study focuses on Chinese wh-island effects and uses experimental syntax techniques to explore the origins of island effects. We conducted one experiment, taking wh-acceptability and surprisals in Recurrent Neural Networks (henceforth RNNs) as the main experimental measures. Surprisal in RNNs typically refers to the level of “surprise” or uncertainty when an event (such as a word or symbol in an input sequence) occurs, relative to what was expected. The experiment is a wh-island rating task in which participants read and rated sentences, and the level of surprise associated with those sentences was assessed.
Methods. In this experiment, the surprisal of a word or token is calculated from its probability given the preceding word, that is:
S(w_i) = -\log_2 p(w_i \mid w_{i-1})
Here w_i is the i-th word in a sentence and w_{i-1} is the word that immediately precedes it; p(w_i \mid w_{i-1}) is the probability of w_i occurring given w_{i-1}, and -\log_2 is the negative base-2 logarithm, so less probable words receive higher surprisal. With this formula, the surprisal of each word in a sentence can be calculated. A total of 100 participants with no disabilities and no background in linguistics were recruited. The experiment adopted a 2×2×2 factorial design (islandhood, matrix wh-phrase, and dependency), in which participants read sentences from the crossed conditions and rated the acceptability of each with respect to its wh-phrase. Crossing islandhood and dependency yields the 2×2 design illustrated with example stimuli in Table 1. The difference in acceptability between a and d involves an additional factor beyond dependency distance and syntactic structure. This factor is neither the dependency distance nor the syntactic structure itself, but the result of the interaction between the two, specifically the effect of long-distance dependencies that cross island structures. It is defined as the island effect, i.e., (1).
(1) Island Effect = (a - d) - ((a - b) + (a - c))
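As a minimal sketch of how the score in (1) could be computed, the Python functions below take the mean acceptability ratings for conditions a to d. The function names and the example values are hypothetical and are not taken from the study. The second function is the same quantity regrouped as a difference of differences, since (a - d) - ((a - b) + (a - c)) simplifies to b + c - a - d.

```python
def island_effect(a: float, b: float, c: float, d: float) -> float:
    """Island effect exactly as written in (1)."""
    return (a - d) - ((a - b) + (a - c))


def island_effect_regrouped(a: float, b: float, c: float, d: float) -> float:
    """Algebraically equivalent form: (b - d) - (a - c) = b + c - a - d."""
    return (b - d) - (a - c)


# Hypothetical mean ratings for conditions a-d (illustration only).
example = {"a": 1.0, "b": 0.5, "c": 0.4, "d": -1.0}
score = island_effect(**example)
print(f"island effect = {score:.2f}")
```

A score of zero would mean that the drop from a to d is fully explained by the two single-factor costs (a - b) and (a - c); a positive score indicates an additional penalty in condition d.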
For Chinese, which is a Wh-in-situ language, we can form corresponding long-distance and short-distance dependency relations by controlling the position where the wh-phrase appears.
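The surprisal formula given in Methods conditions each word on the preceding word only. The sketch below illustrates that computation under a strong simplifying assumption: it uses a toy bigram model with maximum-likelihood estimates, not the RNN or the large language models used in the study. The corpus, the tokens, and the function names are hypothetical placeholders.

```python
import math
from collections import Counter


def train_bigram_counts(corpus):
    """Count contexts and bigrams in a list of tokenized sentences."""
    context_counts = Counter()
    bigram_counts = Counter()
    for sentence in corpus:
        for prev, cur in zip(sentence, sentence[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, cur)] += 1
    return context_counts, bigram_counts


def surprisals(sentence, context_counts, bigram_counts):
    """Return (word, surprisal in bits) for every word after the first token."""
    result = []
    for prev, cur in zip(sentence, sentence[1:]):
        count = bigram_counts[(prev, cur)]
        if count == 0:
            # Unseen bigram: probability 0 under MLE, so surprisal is infinite.
            result.append((cur, math.inf))
        else:
            p = count / context_counts[prev]
            result.append((cur, -math.log2(p)))
    return result


# Toy corpus with a sentence-start marker (illustration only).
corpus = [["<s>", "who", "left"], ["<s>", "she", "left"]]
ctx, bi = train_bigram_counts(corpus)
for word, bits in surprisals(["<s>", "who", "left"], ctx, bi):
    print(word, bits)
```

A neural language model would replace the count-based estimate of p with its own next-token probability, usually conditioned on the whole preceding context rather than on one word.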
Results. The collected ratings were transformed into z-scores, and both the argument who and the adjunct why show strong island effects (Figure 1). Additionally, the stimuli were fed into large language models (ChatGPT-4 and Gemma 2) to calculate surprisal values, and the differences are presented in Figure 2. Wh-acceptability and surprisal values are significantly negatively correlated.
Discussion and Conclusion. The results indicate that 1) both types of wh-phrases in Chinese move at LF, where they are constrained by wh-island effects, and that 2) although there is a significant negative relation between wh-acceptability and surprisal values, there is no evidence that they influence each other or that humans process sentences linearly. On balance, wh-island effects are still best regarded as a syntactic constraint.
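The z-score transformation mentioned in Results can be sketched as follows. The abstract does not say how the ratings were standardized; this example assumes the common practice of standardizing within each participant using the sample standard deviation, and the data and names are hypothetical.

```python
from statistics import mean, stdev


def z_score_by_participant(ratings):
    """Standardize raw ratings within each participant.

    ratings: dict mapping participant id -> list of raw ratings
             (at least two ratings per participant).
    """
    standardized = {}
    for participant, values in ratings.items():
        m = mean(values)
        sd = stdev(values)  # sample standard deviation (an assumption)
        if sd == 0:
            # A participant who gave identical ratings carries no contrast.
            standardized[participant] = [0.0 for _ in values]
        else:
            standardized[participant] = [(v - m) / sd for v in values]
    return standardized


# Hypothetical raw ratings on a 7-point scale (illustration only).
raw = {"p01": [1, 2, 3], "p02": [7, 7, 7]}
print(z_score_by_participant(raw))
```

Standardizing within participants removes individual differences in how the rating scale is used, so that condition means are comparable across raters.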
Selected References. Chomsky, N. (1964). Current issues in linguistic theory. The Hague: Mouton & Co. // Hu, J. (2019). Prominence and locality in grammar: The syntax and semantics of wh-questions and reflexives. Routledge. // Huang, C.-T. (1982). Logical relations in Chinese and the theory of grammar. PhD dissertation, MIT. // Phillips, C., Kazanina, N., & Abada, S. (2005). ERP effects of the processing of syntactic long-distance dependencies. Cognitive Brain Research. // Ross, J. (1967). Constraints on variables in syntax. PhD dissertation, MIT.
What’s in a verb class? Towards an evaluation of Manner/Result diagnostics
ABSTRACT. One of the most deeply researched ontological distinctions in lexical semantics is Manner/Result Complementarity: the claim that a given verb lexicalizes the manner or result of an action, but not both. While this work has generated a wealth of empirical data, different researchers rely on different diagnostics, implemented in different ways. The empirical picture is therefore unclear, making it difficult to apply the same set of considerations when extending the investigation to languages other than English or to additional verb classes (e.g. verbs of throwing or cooking; Beavers & Koontz-Garboden 2020). Our preliminary experimental investigation evaluates the robustness of some diagnostics proposed in the literature, allowing us to better understand what it is that they are probing and what makes a verb class a verb class.
ABSTRACT. This paper reexamines the debate surrounding the vP phase. Drawing on arguments against a DP-intervention account, combined with a novel approach to Object Shift, we argue in favor of vP being a clause-intermediate phase.
ABSTRACT. In Azerbaijani morphological causatives, the grammatical function (GF) of the causee depends on the base verb's transitivity. With intransitive predicates, the causee is an object (OBJ) and is marked with the accusative (ACC); with transitive predicates, the causee is an OBJθ and is marked with the dative (DAT). In causativized ditransitives, the causee cannot normally take ACC or DAT, as these are already assigned to the theme and recipient of the base verb.
Some analyses suggest that the causee can be marked with an instrumental postposition in these cases, but native speakers tend to interpret such PPs as instruments, not causees. This suggests that instrumental PPs in Azerbaijani are adjuncts, and although a coerced causee reading is possible, it is not the default.
In causativized ditransitives, the causee can be expressed in two ways: (1) unexpressed, with all arguments of the base verb overt, or (2) realized in the DAT, with the base verb’s recipient omitted. The dative in these constructions is ambiguous, indicating either a causee or a recipient, but not both.
This paper analyzes causee optionality using the framework of Lexical-Functional Grammar (LFG), which accounts for argument structure through a-structure. Following previous work on optional arguments, we argue that unexpressed causees are represented in a-structure, even if omitted from c- or f-structure. The causee may fuse with either the agent or the theme of the embedded predicate, depending on the verb’s transitivity.
We conclude that the causee’s optional realization and its fusion with the highest or affected argument can be modeled within LFG, using Kibort’s mapping principles. This analysis extends to double causatives and unexpressed DAT causees, with further details to be provided in the full paper.
Apparent Syntactic Complexity in Korean Elderly Speech: Reassessing Predicate Ratios through Null Form Reconstruction
ABSTRACT. 1. Introduction
This study investigates how aging affects syntactic complexity in Korean speech, focusing on the use of predicates and null arguments. While aging is often associated with cognitive and linguistic decline, recent research suggests that older speakers may develop adaptive strategies to maintain effective communication. In languages like Korean—where null forms and ellipsis are common—surface-level simplicity may conceal underlying structural richness. To examine this, we analyze spontaneous speech from a large elderly speaker corpus and reassess a widely used syntactic metric: predicate ratio per utterance. By including reconstructed null forms, we explore whether high predicate ratios in elderly speech reflect true syntactic complexity or result from covert structural compression.
2. Background
Much of the prior literature on aging and language has emphasized lexical retrieval difficulties and processing speed declines, but less attention has been paid to syntactic change, particularly in typologically rich, non-Indo-European languages. Korean presents a compelling case due to its widespread ellipsis, argument drop, and agglutinative structure. In particular, null pronouns (pro-drop) and omitted arguments allow speakers to produce shorter utterances while retaining meaningful syntactic relations. In corpus studies, syntactic complexity is often measured by proxy metrics such as the number of predicates per sentence. However, these metrics can be skewed in Korean if null elements are not accounted for.
This study builds on earlier experimental findings about working memory decline in aging (Harris, eds., 2016; Hardy et al., 2020), integrating them with corpus-based insights to investigate whether older Korean speakers genuinely produce more complex syntax or whether they rely more on context-dependent, pragmatically enriched forms. The use of large-scale spoken data allows us to observe naturally occurring language, avoiding the artificial constraints of laboratory tasks.
3. Data and Analysis
We analyze a large spoken corpus comprising approximately 550 hours of prompted spontaneous responses from 1,439 elderly Korean speakers aged 60 to 95 (Ok and Kim, 2024). The dataset includes over 250,000 utterances annotated in units called ecels, which segment postpositions, case markers, and grammatical particles according to Korean orthographic conventions. The dataset captures speech from various educational backgrounds, ensuring broad demographic representation.
Our analysis employed the Korean Structure Analyzer (ETRI), examining four main metrics:
• (1) Average number of ecels per Intent-Based Unit (IBU),
• (2) Average number of ecels per clause,
• (3) Predicate ratio per IBU (number of predicates / total ecels),
• (4) Null pronoun frequency per IBU.
To address limitations in surface-based predicate measures, we conducted a manual reconstruction of null arguments in a subset of 2,100 IBUs sampled from three age groups. Predicate ratios were then recalculated including these reconstructed forms to better reflect underlying syntactic structure.
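The contrast between the surface-based predicate ratio and the recalculated one can be sketched as below. The abstract does not spell out the recalculation, so this example assumes that each reconstructed null argument adds one unit to the denominator; the class, field names, and numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class IBU:
    """One Intent-Based Unit with hypothetical annotation counts."""
    overt_ecels: int              # ecels actually pronounced
    predicates: int               # predicates in the unit
    reconstructed_nulls: int = 0  # manually reconstructed null arguments


def surface_predicate_ratio(ibu: IBU) -> float:
    """Metric (3): number of predicates / total overt ecels."""
    return ibu.predicates / ibu.overt_ecels


def reconstructed_predicate_ratio(ibu: IBU) -> float:
    """Same ratio with reconstructed null forms counted in the denominator
    (an assumption about how the recalculation was done)."""
    return ibu.predicates / (ibu.overt_ecels + ibu.reconstructed_nulls)


# Two hypothetical units with the same structure; the second drops two arguments.
full = IBU(overt_ecels=12, predicates=3)
elliptical = IBU(overt_ecels=10, predicates=3, reconstructed_nulls=2)

print(surface_predicate_ratio(full), surface_predicate_ratio(elliptical))
print(reconstructed_predicate_ratio(full), reconstructed_predicate_ratio(elliptical))
```

In this toy case the elliptical unit has the higher surface ratio only because its denominator is smaller; once the dropped arguments are restored, the two units have the same ratio.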
4. Results and Discussion
4.1 Quantitative Simplification with Age, Elaboration with Education
The average number of ecels per IBU decreased with age—60s (15.00), 70s (14.02), 80s+ (13.42)—suggesting syntactic compression (Welch’s F(2, 61,746) = 331.784, p < .001). Conversely, IBU length increased with education—elementary (13.49), middle school (13.59), high school (14.56), college (15.89) (Welch’s F(3, 133,921) = 586.958, p < .001). Clause length patterns mirrored these findings. These results align with cognitive reserve theory, where education supports maintenance of linguistic elaboration in aging.
4.2 Apparent Complexity: Predicate Ratio Patterns
Surprisingly, predicate ratios (based on overt speech) increased with age—.282 (60s), .305 (70s), .314 (80s+)—and decreased with education—.317 (elementary) to .270 (college) (both p < .001). If taken at face value, this would imply that older or less-educated speakers produce more syntactically complex utterances—contradicting cognitive aging expectations.
4.3 The Role of Null Pronouns
To resolve this contradiction, we examined null pronoun use. The frequency of null forms increased with age: .160 (60s), .172 (70s), .186 (80s+) (F(2, 2097) = 15.828, p < .01). It decreased with education, from .191 (elementary) to .155 (college) (F(3, 2096) = 18.739, p < .001). When predicate ratios were recalculated to include null forms, the differences across both age groups and education groups were reduced. Specifically, the F-value for age-related differences decreased by approximately 14.4% (from 35.761 to 30.596), and the decrease was more pronounced for education-related differences, at 18.8% (from 32.945 to 26.750):
• Age: F(2, 2097) = 30.596, p < .01 (vs. 35.761, p < .01 without nulls)
• Education: F(3, 2096) = 26.750, p < .01 (vs. 32.945, p < .01 without nulls)
Thus, high predicate ratios in elderly speech stem from elliptical structures rather than from increased syntactic embedding or argument complexity.
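The percentage reductions reported in 4.3 follow directly from the F-values before and after null forms are included. A short check, using only the four F-values given above (the function name is illustrative):

```python
def percent_reduction(before: float, after: float) -> float:
    """Relative decrease from `before` to `after`, in percent."""
    return (before - after) / before * 100


# F-values reported in section 4.3 (without nulls -> with nulls).
age_reduction = percent_reduction(35.761, 30.596)
education_reduction = percent_reduction(32.945, 26.750)

print(f"Age: {age_reduction:.1f}%")
print(f"Education: {education_reduction:.1f}%")
```

Both F-values remain large after the recalculation, so the group differences are reduced rather than eliminated.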
4.4 Interpretation and Broader Implications
These findings suggest that syntactic ability, as measured through predicate density, remains relatively stable across age and education when covert forms are considered. Older and less-educated speakers are not necessarily using more complex syntax but instead relying more on discourse-driven economy: omitting arguments that can be inferred through shared knowledge. This strategy may compensate for cognitive load while maintaining communicative effectiveness.
Methodologically, the study highlights the limitations of surface-based syntactic measures in corpus research, especially in pro-drop languages like Korean. Predicate ratios that fail to account for null elements may overestimate complexity in speaker groups who rely more heavily on contextual cues.
From an applied perspective, these insights are crucial for AI and speech technology development. Elderly speakers' greater use of ellipsis and pragmatic dependency poses challenges for automatic parsing and understanding. Systems trained on written or overt speech may misinterpret elliptical expressions, particularly in underrepresented populations.