APCLC 2020: ASIA PACIFIC CORPUS LINGUISTICS CONFERENCE 2020
PROGRAM FOR FRIDAY, FEBRUARY 14TH

09:00-10:40 Session 23A
Location: Choi Young Hall
09:00
Exploring corpus use in teaching evaluative language in experimental research articles to postgraduates in an EFL context in China: A pilot study

ABSTRACT. This pilot study trialed some of the materials to be used in a project that employs a whole-class pre-/post-test design with embedded multiple cases to explore feasible ways of using corpora and corpus tools to teach evaluative language in experimental research articles (ERAs) to Chinese EFL postgraduates.

The participants were 32 postgraduates majoring in computer sciences and digital information processing in China. Before the workshop, they submitted introductions of research articles and completed an online questionnaire about their background. During the workshop, they learned how to express different levels of certainty through modal verbs, modal-like expressions and signals of confidence using the online Corpus of Journal Articles 2014 in the first 2-hour session and how to maintain evaluative consistency by looking into collocates and semantic prosody using AntConc and the corpus data in the second 2-hour session. After the workshop, they submitted new introductions and completed the post-intervention questionnaire about their attitudes towards corpus materials. Three participants were selected for individual interviews before and after the workshop so as to build up in-depth data of representative cases. The results of their drafts showed participants improved their writing performance after the workshop. The results of the questionnaires revealed they had overall positive attitudes towards corpus use even though they thought the materials were difficult. The interview data further suggested there existed individual differences in their knowledge about academic writing, writing processes and difficulties in corpus use. This calls for more well-designed training sessions of corpus use which will be considered in the main study.

09:25
Research on the Lexical Analysis of a Science Classroom Discourse Corpus: Focusing on Comparison with the Science Textbook

ABSTRACT. This paper aims to study the lexical distribution of various qualities by analyzing a classroom discourse corpus of transcribed elementary and secondary science classes, using a science textbook corpus as a comparison. The transcription corpus shows the otherwise hidden scene of the science classroom as it is, and contains general information on the teacher variables that influence the quality of the science class. This is a corpus-based register study that compares classroom speech with a textbook corpus. At the broadest level, a written-to-spoken comparison is made. At a narrower level, experimental and general classes are compared. Finally, among the various spoken behaviors, general dialogue is compared with language in action. Hyland (2002) compared textbooks with research papers. In a research paper, one expert reports and verifies his or her work to other experts; the author therefore mobilizes various discourse qualities and presents the argument very carefully. In contrast, textbooks do not show such careful discourse qualities, because professional authors are teaching non-professional students. Biber (2006) conducted a similar study. With reference to these studies, this study compares usage and lexical distribution patterns at various levels: high-frequency words by register, vocabulary that appears only in a specific area of use, and lexical distribution patterns in detailed registers of use.

09:50
Complex structures in aviation English: are they present in radiotelephony communications?

ABSTRACT. Pilots and air traffic controllers (ATCOs) are required to hold a minimum English proficiency level for international operations, assessed through a scale developed by the International Civil Aviation Organization (ICAO, 2010). One of the parameters of the scale states that a professional should have control of basic structures and, at a higher proficiency level, of complex structures. A glossary is offered to exemplify the language, and it resembles features of written language more than of spoken language. A corpus-driven investigation found incongruencies between the corpus and the basic structure glossary. Such an approach was hindered in studies of complex structures, as these are of low frequency by nature. In this paper, we aim to compare the complex structure glossary against a spoken corpus. We investigated the complex structure items suggested by ICAO within a 110,737-word corpus of radio communications between pilots and ATCOs in abnormal situations. We adopted Sketch Engine particularly because of its ease of access to the part-of-speech tagger, which is necessary for the extraction of less frequent items. The comparison to the glossary shows low equivalence with the radiotelephony corpus: only 19% of structures were found; however, they carry pragmatic functions. To illustrate, the glossary includes the use of all conditionals, whereas the corpus demonstrates that “if” is only used in “if you can”. We conclude by stressing the importance of incorporating such analysis into the aviation English curriculum, material design and class planning, and by advocating a teaching of grammar more closely related to spoken language, which is the focus of aviation English.

10:15
Taking Authorial Stance: The Case of Novice vs Expert Engineering Writers

ABSTRACT. Research in scholarly publication has revealed that successful scientific writers employ various strategies to mitigate, modulate or enhance their knowledge claims when publishing their research findings. This paper reports a comparative study of stance taking by Malaysian scientific writers and their international counterparts. Specifically, we made a three-way comparison between novice Malaysian writers, expert Malaysian writers and expert international writers. Analyses of stance markers were derived from a corpus of 216 published research articles and 20 unpublished research articles, totaling approximately 1 million words. We compared patterns of evaluation and engagement devices (cf. Hyland, 2010; Salager-Meyer, 1994) in all three sub-corpora. Our findings revealed that both Malaysian writers and their international counterparts used similar stance-taking resources such as hedges, boosters and plural pronouns to indicate propositional information. They also employed similar resources such as directives when engaging their readers. However, Malaysian expert writers used a limited range of variation compared to their international colleagues. Novice writers, on the other hand, consistently showed a lack of strategies and low variability when asserting claims and engaging readers. They were also found to hedge the least. Malaysian writers, both novice and expert, used boosters significantly less than international writers. The differences found between novice and expert writers, as well as between Malaysian writers and their international counterparts, point towards the complexity of stance-taking and stance marking in research writing. We will show that linguistic devices for marking attitudinal commitments towards propositions remain challenging and elusive to EAL writers, even when they are experienced researchers in the field.

09:00-10:40 Session 23B
Chair:
09:00
A Synergetic Approach to the Relationship between the Chinese Syllabic Structures and Chinese Tones

ABSTRACT. The syllable, as a crucial linguistic unit, is interrelated with other linguistic properties and units; in particular, it is believed that the syllable is the bearer of the tone. As the first attempt to find the principal relevance of the syllable, Zörnig/Altmann (1993) outlined a synergetic control cycle focusing on four selected properties in English: the phoneme inventory, vocabulary size, restrictions regarding the phoneme distribution and the syllable length. Unlike languages with a phonographic writing system, such as English, Chinese uses a logographic writing system. This study tries to find whether there is any distinctiveness in Chinese syllables. Based on the above synergetic control cycle, the present study adopts a corpus-based approach, using a 150-thousand-word balanced corpus of modern Chinese in different registers, to explore quantitative properties of Chinese syllables (syllables are noted as sequences of vowels “V” and consonants “C”) in different registers and to find the synergetic relationship between Chinese syllabic structures and the 5 Chinese tones. The results reveal that Chinese is a typical logographic language with only 4 syllable types (namely CV, CVC, VC, and V) in different tones, which can be satisfactorily captured by the exponential function. There are only 3 syllable lengths, and the coefficient of determination shows a very good fit to the Menzerathian relation. The results reveal the relationship between Chinese syllabic structure and tones, and the findings should be useful for the synergetic study of Chinese syllables.

09:25
Analysis of the Intonation Patterns of Pragmatic Use of Discourse

ABSTRACT. This study aims to investigate the intonation patterns of the pragmatic use of discourse markers in the Buckeye Corpus. In colloquial speech, especially everyday conversation, which is often not syntactically perfect, speakers use a variety of discourse markers whose meaning depends on the context underlying the text rather than on what the words themselves literally convey. Considering that the intonation formed during discourse is systematic and plays an important role in many illocutionary acts, the discourse markers collected from the Buckeye Corpus were categorized by pragmatic classification, and the frequency of each discourse marker and its intonation pattern, such as pitch values and contours, were analyzed. The pragmatic use of discourse markers for hedging and face was relatively frequent, and systematic intonation patterns were also found in these discourse markers. Since few studies on the relationship between pragmatics and intonation have been carried out, owing to the lack of a systematic analysis methodology, this study is expected to contribute to research on pragmatics and intonation based on spoken corpora.

09:50
Investigating phrase-frames and their functions in academic speeches

ABSTRACT. The concept of phrase-frames (p-frames) has received increasing attention in corpus studies in recent years. P-frames are formulaic sets of identical words with one variable slot (e.g. if you look at *, if you want to *). Investigating p-frames in a corpus can help us better understand patterns and variations of phrases in certain discourses or registers. The existing literature on p-frames mostly focuses on written discourse; p-frames in spoken data remain unexplored. The current study aims to identify p-frames and analyze their functions in a specific genre, academic speech. A 13-million speech corpus is compiled. The corpus consists of transcripts of academic speeches in six different disciplines, i.e. Science, Engineering, Humanity, Social Science, Business, and Medicine. KfNgram, a linguistic program that can generate lists of p-frames, is used to extract five- and six-word p-frames from the corpus. Extracted p-frames are manually filtered and then analyzed by their functions. A general p-frame list of academic speech is thus provided. The functions identified and the p-frame list can be of pedagogical use for academic speech training. The present study also hopes to contribute its findings to the current body of p-frame knowledge and research.
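
As a rough illustration of what p-frame extraction involves, the sketch below groups n-grams that differ in exactly one position and keeps the frames that have more than one filler. It is not KfNgram and does not reproduce the authors' procedure; the function name `extract_pframes`, the `min_variants` threshold and the toy sentence are assumptions made for this example only.

```python
from collections import Counter, defaultdict


def extract_pframes(tokens, n=5, min_variants=2):
    """Group n-grams that differ in exactly one slot into p-frames.

    Returns a dict mapping each frame (a tuple with "*" in the open slot)
    to a Counter of the words that fill the slot.
    Hypothetical helper for illustration, not part of any published tool.
    """
    # Count every contiguous n-gram in the token list.
    ngrams = Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    frames = defaultdict(Counter)
    for gram, freq in ngrams.items():
        # Open each position in turn and record the word that filled it.
        for slot in range(n):
            frame = gram[:slot] + ("*",) + gram[slot + 1:]
            frames[frame][gram[slot]] += freq

    # A frame is only interesting if at least `min_variants` different
    # words occur in its slot.
    return {f: fillers for f, fillers in frames.items() if len(fillers) >= min_variants}


# Toy input (invented, not from the corpus described above).
tokens = "if you look at this and if you look at that".split()
for frame, fillers in extract_pframes(tokens, n=5).items():
    print(" ".join(frame), dict(fillers))
```

On real transcripts one would typically also apply a minimum frequency threshold and, as the abstract notes, filter the resulting list manually before any functional analysis.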

10:15
Differences in Speech Act Recognition and Pragmatic Awareness Between Academic and Technical-Vocational and Livelihood Tracks Senior High School Students

ABSTRACT. This study investigated the current pragmatic awareness level of Senior High School (SHS) students and compared pragmatic awareness and speech act recognition between the Academic and the Technical-Vocational and Livelihood tracks of a selected public secondary school. This research aimed to assess SHS students’ pragmatic competence as well as the difficulties they encountered in comprehending speech acts and the importance of linguistic features and contextual knowledge in distinguishing types of speech acts. The Pragmatic Listening Comprehension Task adapted from Garcia (2004), which includes audio clips of scenarios of requesting, offering, suggesting and correcting in the school context, was used as the instrument. Results showed that (a) speech act recognition and pragmatic awareness are dependent on the track to which the students belong, (b) naturally occurring conversations in the school context are commonly conventional direct speech acts that are easily recognized by the students, and (c) the linguistic features that facilitate speech act recognition include an explicit agent and recipient, the presence of lexical signals such as please, and action verbs.

09:00-10:40 Session 23C
Location: IBK Hall
09:00
Authorship Attribution of Jin Yong’s Martial Arts Fiction

ABSTRACT. Louis Cha, better known by his pen name, Jin Yong, was a famous Chinese wuxia novelist. The aim of this study is to provide evidence for the claim that there were possibly ghostwriters in Jin Yong’s martial arts fiction. The paper constructs a corpus of the 15 most iconic works of Jin Yong, with each chapter stored as an independent file. Dialect features and unique word sequences are statistically analyzed as the fingerprints, or idiolect, of the author. The findings reveal that although Jin Yong’s works, with revisions from 1990 to 2006, differ from the original serialized versions initially published in installments in Hong Kong newspapers, most often in Ming Pao, there are still some traces of inconsistency in the author’s style and wording. Given the rumor that Ni Kuang wrote some segments of Jin’s novels when Jin was on holiday in Europe, the paper argues that there are still some invisible fingerprints of the ghostwriter Ni Kuang, or possibly others, in Jin’s revised fiction. Considering the author’s birthplace and educational background, the dialect features in the corpus serve as the best examples. One case in point is that dialect structures such as 看他不起,打他不过 or 敌他不过 and their equivalent standard structures 看不起他,打不过他 or 敌不过他 are used randomly across the chapters. The present paper discusses the potential of annotated corpora in authorship attribution studies.

09:25
Aftermaths of Franco Moretti: “Short” Distant Reading in Korean Literature

ABSTRACT. Ever since Moretti proposed a new approach called “distant reading” for macroscopic literary analysis, there have been heated debates about its methodological status in tandem with close reading. In particular, as it converged with an emergent technological drive called “digital humanities,” misperceptions have intensified. Some researchers have erroneously claimed that distant reading is all about scale rather than hermeneutic interpretation and that a bigger scale inevitably blinds us to historical contexts. Others have misunderstood it as serving only a positivistic function of confirming what we already knew. Despite such charges, however, recent scholarship has presented numerous diversifications of the quantitative approach after Moretti. Therefore, this presentation aims to examine how contemporary distant readers have confronted the aforementioned criticism and shifted the methodological gear of the new reading, especially from an analytic tool to one of historiography. In particular, it will survey recently published literature in the field of digital humanities, such as Andrew Piper’s Enumerations (2018), Katherine Bode’s A World of Fiction (2018), and Ted Underwood’s Distant Horizons (2019), and trace how each version of distant reading narrates literary history differently. In doing so, I will contemplate ways to apply it to Korean literary culture and history, especially given the unfavorable infrastructure for pursuing digital humanities. Then, I will propose and explain what I call “short” distant reading, instead of forgoing quantitative approaches to literary studies.

09:50
Fashion in fact and fiction: Using corpus linguistics to connect 19th-century British novels with 21st-century discourses

ABSTRACT. This research aims to investigate connections between classic literature and 21st-century discourses, as an attempt to find a place for English literary studies in Thailand’s contemporary EFL higher education. A corpus of 19th-century British novels was compiled and then compared via Wmatrix (Rayson 2008) with two general present-day English corpora: British English 2006 and American English 2006. Through the comparison, key semantic domains in the fiction corpus were extracted, and that of ‘Clothes and personal belongings’ was chosen as the focus of the study, not only because it is key but also because it is indicated as a significant theme in 19th-century literature by a statistical topic-modelling approach (Jockers and Mimno 2013). Concordance lines of such high-frequency words in this semantic field as 'clothes', 'dress' and 'hat' were analysed. It is found that their co-occurrence patterns not only contribute locally to characterization and fictional world creation in the novels but can also be linked to the broader socio-cultural subjects often discussed in the critical interpretation of Victorian novels, including gender stereotypes, social class distinction and materialism. Interestingly, such ideological meanings also emerge in concordance lines of similar words in the present-day general English corpora, albeit in different phraseological patterns. It is therefore suggested that the narratives of 19th-century British novels can provide a study topic, i.e. social meanings of clothing and their linguistic expressions, that is relatable to 21st-century society and that encourages integration of studies in English language, literature and persistent global issues.

10:15
A Corpus-based study of tense use in late-19th to mid-20th century East Asian literature

ABSTRACT. Pre-modern usage of Korean language was notably simplistic in its notion of tenses. For instance, the first Korean translation of Pilgrim’s Progress, one of the very first translated works of literature in Korea is almost exclusively narrated in the present tense, despite the original being mostly narrated in the past tense as per the tradition in storytelling in the English language. In this paper, we argue that the introduction of the Western linguistic style in the early 20th century played a major role in the development of the tenses as they are used in the modern Korean language, while the Chinese influence played a part in the simplicity of the tenses in the works of traditional Korean literature. In the first part of the paper, we will examine the traditional Chinese texts and early 20th-century Chinese translations of Western literature and compare them with Korean texts in order to establish the impact of the Chinese language as a major factor in the early simplicity of the tenses in the Korean language. In the second part, we will construct a corpus database of the Korean translations of English literature published between the late-19th century and the mid-20th century, and then engage in a comparative study of the works published in each decade using the accumulated corpus data in order to elaborate how the change in the usage of tenses in the Korean language corresponds to the introduction of the Western literary tradition.

09:00-10:40 Session 23D
09:00
The language interference between Vietnamese and Chinese of Vietnamese brides’ community in Taiwan

ABSTRACT. In recent years, it has become more and more common for Vietnamese people to immigrate to Taiwan by marrying Taiwanese citizens. According to the research, in 2018 the population of Vietnamese brides in Taiwan exceeded 100,000. An interesting finding from the research is that these new immigrants speak both Vietnamese and Chinese when making small talk with each other. A mixed language of Vietnamese and Chinese has thus been created by these new immigrants; it gives rise to language interference and has become a kind of Vietnamese variant. This Vietnamese variant is spoken by the new immigrant community in Taiwan in daily conversation, and it is clearly influenced by the vocabulary and grammar of Chinese, their second language. The purpose of this research is to understand this phenomenon of language interference by collecting and analyzing articles written in traditional ways or posted on social media by these Vietnamese brides. The articles are used for communicating, discussing and chatting, and some come from guidebooks on how to live in Taiwan that have been translated from Chinese into Vietnamese for the new immigrant community. This paper focuses on the following issues: the cause of language interference in the Vietnamese brides’ community in Taiwan, the process and methods of interference between Vietnamese and Chinese in this community, the features of this Vietnamese variant, and its effects in the community.

09:25
Constructing Chinese national identity: discourse on nation and national identity in Chinese language textbooks from Mainland China and Hong Kong

ABSTRACT. The education system in mainland China is said to have been highly politicized since the establishment of the PRC, while Hong Kong’s education system tends to be depoliticized. On the one hand, Chinese Language Education in mainland China is centralized under the “One Guide – One Text” policy. On the other hand, the issue of identity conflict in Hong Kong has become white-hot. Chinese Language Education, as an important vehicle for fostering patriotic or national education, is therefore a matter of concern. This research aims to discover the submerged dominant ideologies, with a focus on the cognitive structuring of national identity in Chinese language textbooks. Eleven textbooks from mainland China and 11 textbooks from Hong Kong will be examined using a corpus-assisted Critical Discourse Analysis framework, a mixed-method approach that analyses the textbook corpus with corpus linguistic techniques. The study reveals that the Chinese language textbooks published in mainland China and the Hong Kong SAR differ in their strategies for constructing students’ national identity. First, a large part of the learning content in textbooks from mainland China involves topics pertaining to Confucian beliefs, whereas textbooks from Hong Kong forge a Chinese identity by relying on a large amount of classical Chinese literature. Second, the national images of China built in textbooks from mainland China emphasize the achievements of the Chinese government, while textbooks from Hong Kong centre on the history of China and Chinese moral values. Lastly, the results show that text selection in textbooks from mainland China is more ideology-oriented.

09:50
The Emotion of Anger in the Chinese Online Discussion Forum Corpus

ABSTRACT. Croft (1993: 64) said that “[t]here are two processes involved in possessing a mental state (and changing a mental state).” For Chinese, there are markers that indicate the direction of anger: dui4 ‘towards’, rang4 ‘cause/make’, etc. (cf. Cheung & Larson, 2006). These markers, however, mainly work for conventional terms such as sheng1qi4 and fa1huo3 ‘be.angry’. When it comes to our target term, nu4 ‘angry/anger’, a direct expression of anger, we found some slightly different patterns.

Nu4 is an equivalent of ‘anger’ but stronger. 7,464 instances of nu4 were collected from the PTT corpus; PTT is a Bulletin Board System (BBS) in Taiwan that contains conversation-like discussion with emotion expressed directly.

As for results, we found that a majority (65.80%) of the instances were still used to mean ‘be.angry/anger’. Yet the uses of nu4 are often accompanied by a follow-up impulsive consequence (nu4 shui4jiao4 ‘angry-sleep’, nu4 chu2zhi2 ‘angry-to.top.up.money’). These unconventional uses of nu4 (termed ‘resultative’) constituted almost 70.88%. Next, nu4 works much like an emoticon (21.15%). In this function, it is often accompanied by exclamation marks or brackets. The remaining instances were idioms (10.08%) and some deleted ones (2.97%), including movie titles, proper nouns, and unidentifiable uses.

From this study, we found that the emotion of anger in the online discussion forum differs from conventional metaphors such as ANGER IS HEAT identified in past studies (cf. Yu, 1998; Chen, 2010). Its morphological constructions also differ from how it is normally used in texts.

09:00-10:40 Session 23E
Chair:
Location: Helinox Hall
09:00
Life in the Shadows: Loss and Posthuman Bildung in Kazuo Ishiguro’s Never Let Me Go

ABSTRACT. The Bildungsroman has been understood as a genre about socialization. Coming-of-age tends to be seen as a process of social integration and assimilation. Through the process of Bildung, one becomes a member of a society or nation. Drawing upon the insights of critical discourse on the Bildungsroman, this paper examines how Kazuo Ishiguro’s novel Never Let Me Go (2005) adopts and re-envisions the genre in order to invite critical reflection upon the social and ethical implications of human cloning as a form of biotechnology that introduces cloned forms of life into society. Focusing on the narrative style of the novel’s narrator, Kathy H., a clone figure who grows up to serve as a carer and donor, I am particularly interested in the way the process of growth in Never Let Me Go is deeply informed by feelings of loss and social marginalization. More specifically, this paper will closely examine how the language Kathy uses to recollect and describe her childhood as a student at Hailsham aims to recuperate a sense of communal belonging that she increasingly loses once she leaves the place to become a carer.

09:20
Comparative and Adaptative Studies of a Written Text and a Video Text: Text analysis and Video analysis of Never Let Me Go

ABSTRACT. Kazuo Ishiguro, the Nobel Prize-winning British author, published Never Let Me Go in 2005. Time magazine named it the best novel of 2005 and one of the 100 best English-language novels. This well-known novel portrays the stories of clones who donate their organs to humans, which eventually results in their deaths. Based on this original story of biotechnology and ethics, Mark Romanek directed a film adaptation in 2010. In this research, I will compare how these two different media, with their different texts, deliver the basic story. I will analyze the written text of the novel using text analytics and then choose keywords pertaining to biotechnologies and life ethics. These keywords will be selected through text analysis centering on the keyness of the novel, that is, its major themes. To compare with this analysis of the written text, I will analyze the screenplay and the video text with synaesthetic signs and then choose the significant scenes connected to the keywords identified from the original novel. Through this comparison, I aim to develop a theoretical perspective on, or approach to, the visual adaptation of a written text.

09:40
Evaluating the Impact of Human Genomics in English Literature

ABSTRACT. Since the 19th century, genomic technologies including cloning, eugenics, and mutations have appeared in many literary and cultural works owing to advances in science and technology. In particular, cloning, 'the artificial creation of a human being', has been an interesting topic for novels and films. The novel Never Let Me Go (2005) is a dystopian science fiction work by British author Kazuo Ishiguro, and the film adaptation was released in 2010. It is a dystopian tale about a society in the 1950s that created human clones in order to erase disease and extend the human lifespan past 100 years. According to the scientific history of cloning, it was impossible to clone a human being in the 1950s. In reality, therapeutic cloning is a commonly discussed type of human cloning aimed at conquering disease. As of November 2019, therapeutic cloning, which would involve cloning cells from a human, is an active area of medical research but has not entered medical practice anywhere in the world. The current study evaluates how science fiction works are oversimplified and unrealistic, and what bioethical issues they raise.

10:00
Understanding of viruses: gene therapy in movies

ABSTRACT. Although the science of virology has evolved roughly in parallel with art, e.g., cinema, since the 1990s, the relationship between art and science remains inconsistent. This is particularly important in our time, because the public's perceptions and, accordingly, their reactions are significantly influenced by their view of scientific truth as presented by the media. Generally, virological subjects that involve concepts difficult to understand, or the specific nomenclature used in science, can easily alienate nonspecialists, although viruses seem to be a specialty that can offer cinema the required suspense. Apart from occasional biographies of virologists and retellings of stories about great viral epidemics of the past, most films focus on the dangers presented by outbreaks of unknown viral agents that originate from acts of bioterrorism, from laboratory accidents, or even from space. Memories of great epidemics and continuously available information on new epidemics and dangerous viruses have embedded in the public a sense of awe about viral infection, or about viruses themselves, which is a prerequisite for cinematic success. In this brief presentation, gene therapy biotechnology using genetically modified viruses in films will be reviewed, and the various trends in gene therapy technology in films will be discussed.

10:20
A Study on the Use of Biotechnology in English Literature

ABSTRACT. Literary and cultural arts have long interacted with various technologies in worlds both real and imagined. In particular, literature has served as a medium for predicting future society and for introducing to the general public complex science and technology that are otherwise difficult to understand. For instance, a group of Russian scientists, influenced by the transhumanism of the Russian philosopher and literary scholar Nikolai Fyodorov, could concentrate on developing bio-artificial organ transplantation and position Russia as a pioneer in the global artificial organ industry. This example shows how literature can serve as both a starting point and a catalyst for creative inspiration. This study attempts to compile a corpus of English literature accumulated over the last hundred years and examines the ways that biotechnology has been described in English literature.

10:50-12:30 Session 24A
Location: Choi Young Hall
10:50
Linguistic features of Thailand’s university English entrance exams

ABSTRACT. The General Aptitude Test (GAT) has served as a gate-keeping exam for students in Thailand for the past 10 years. This study aims to analyze this test through a corpus-based approach. In a keyword analysis, the 10 English entrance tests from the past 10 years are used as a target corpus against the Key BNC as a benchmark corpus. The keyword list enabled us to understand the content and style of the tests. The findings demonstrate that Thailand’s English entrance exam focuses on certain themes, including science-oriented themes (e.g. greenhouse, scientist and DDT), health-related themes (e.g. obesity and ill), and technology-related themes (e.g. Facebook, Internet and computer), and that the dominant linguistic style of the tests features negation (e.g. isn’t, hasn’t and wouldn’t). This study offers an insight into the nature of the Thai entrance exams, which can benefit students preparing for the exam, educators and test designers.
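
To make the idea of a keyword analysis concrete, the sketch below ranks words in a target corpus by keyness against a reference corpus. The abstract does not say which keyness statistic was used; log-likelihood is assumed here only because it is a common choice in corpus tools. The function names and the toy frequency counts are invented for this example and are not data from the study.

```python
import math
from collections import Counter


def log_likelihood(freq_target, freq_ref, size_target, size_ref):
    """Log-likelihood keyness score for one word (assumed statistic)."""
    total = freq_target + freq_ref
    # Expected frequencies if the word were equally common in both corpora.
    expected_target = size_target * total / (size_target + size_ref)
    expected_ref = size_ref * total / (size_target + size_ref)
    score = 0.0
    if freq_target > 0:
        score += freq_target * math.log(freq_target / expected_target)
    if freq_ref > 0:
        score += freq_ref * math.log(freq_ref / expected_ref)
    return 2 * score


def keywords(target_counts, ref_counts, top=10):
    """Return the words overused in the target corpus, highest keyness first."""
    size_t = sum(target_counts.values())
    size_r = sum(ref_counts.values())
    scored = []
    for word, f_t in target_counts.items():
        f_r = ref_counts.get(word, 0)
        # Keep only positive keywords: relatively more frequent in the target.
        if f_t / size_t > f_r / size_r:
            scored.append((word, log_likelihood(f_t, f_r, size_t, size_r)))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top]


# Invented toy counts, for illustration only.
target = Counter({"obesity": 5, "the": 50})
reference = Counter({"obesity": 1, "the": 1000})
print(keywords(target, reference))
```

A word that is equally frequent, relative to corpus size, in both corpora scores zero, so function words such as "the" drop out and content words typical of the target texts rise to the top.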

11:15
A Study on Assessment of Korean Learners Corpus Using Text Mining

ABSTRACT. The purpose of this study is to assess the writing of Korean learners using text mining. To verify the validity of the text-mining-based writing assessment, we compare it with the assessment scores given by Korean teachers. The indicators of writing ability measured in this study are vocabulary expression, composition and adequation. First, vocabulary expression ability measures how many different vocabulary items are used in a composition. Vocabulary composition ability measures the structural complexity of the writing, that is, how complex and diverse the connections between vocabulary items are. Vocabulary adequation ability measures how well vocabulary is placed in the writing: it captures the consistency and appropriateness of the content through the closeness between the words appearing in the writing. Next, the expression, composition and adequation scores are each quantified and added up. In this process, the quantified values were min-max normalized to unify the weights and prevent the influence from being concentrated on specific indicators. Finally, to secure the reliability of the text-mining-based assessment, we analyzed the concordance rate between that assessment and the assessment by Korean teachers. This study is meaningful in that it attempts an automatic, mining-based assessment of learners’ writing.
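
The normalization and summation step can be illustrated with a minimal sketch. It assumes that each indicator is scaled to the range 0 to 1 across all compositions before the three values are added with equal weight; the function names and the sample scores are hypothetical, not the authors' data.

```python
def min_max(values):
    """Scale a list of raw scores to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All compositions scored the same, so the indicator carries no information.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def combined_scores(expression, composition, adequation):
    """Sum the three normalized indicators for each composition (equal weights)."""
    columns = [min_max(expression), min_max(composition), min_max(adequation)]
    return [sum(per_text) for per_text in zip(*columns)]

# Hypothetical raw indicator values for three learner compositions.
expression = [10, 20, 30]   # e.g. number of distinct vocabulary items
composition = [1, 2, 3]     # e.g. a structural complexity value
adequation = [5, 7, 9]      # e.g. a word-closeness value
print(combined_scores(expression, composition, adequation))
```

Without the scaling step, an indicator measured on a large numeric range (such as a raw vocabulary count) would dominate the sum.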

11:40
Using the Deep Learning Techniques for Understanding the nativelikeness of Korean EFL Learners

ABSTRACT. Building upon state-of-the-art deep learning techniques, the present study classifies texts written by Korean EFL learners and English native speakers and thereby demonstrates how the two types of texts differ from each other. To this end, the current work makes use of the Yonsei English Learner Corpus (YELC) and the Gacheon Learner Corpus (GLC) as the L2 data, and the Corpus of Contemporary American English (COCA) as the L1 data. Utilizing sentence classification methods, the current work implements a system to differentiate the two types of texts, the accuracy of which is about 94%. This indicates that the deep learning-based system is capable of identifying the well-formedness and felicity of the texts written by Korean EFL learners. Nonetheless, the system-based judgments do not overlap with human judgments, largely because the deep learning model focuses exclusively on sequences of words. The present study provides a further analysis to see how the two types of judgments differ with respect to grammatical errors (e.g., word order, voice, etc.) and felicity errors (e.g., semantic prosody, the position of adverbs, etc.).

10:50-12:30 Session 24B
10:50
Pilu Membiru and Kunto Aji : a Lexical analysis on Netizen's Reaction in Youtube

ABSTRACT. YouTube is currently one of the most popular social media platforms in Indonesia (Katadata.co.id, 2019). According to the We Are Social survey, YouTube was the most widely used social media platform in 2019, mainly for playing music, with 88% of all social media users. One of its accounts belongs to Kunto Aji, an Indonesian artist and musician, who has uploaded many of his musical works there. In 2018, Kunto Aji released his second album (Mantra-mantra). One of the songs on the album, entitled Pilu Membiru, has made its listeners feel the positive energy he conveys through the song. This study therefore aims to analyze the reaction of netizens to Kunto Aji’s song Pilu Membiru, which was uploaded to his YouTube account. The analysis shows that Pilu Membiru has succeeded in capturing the hearts of netizens: there are positive comments in which netizens describe being lost in memories they have lived through.

11:15
Large-scale Corpus-based Approaches to Speech Rhythm

ABSTRACT. Speech rhythm is defined as patterned recurrences of events in time. The paper provides a bird’s-eye review of earlier attempts at rhythm metrics. Earlier approaches to speech rhythm were devoted to finding periodicity, and their failure to find periodicity in the speech signal has led to the conceptual shift that speech is not isochronous and that the rhythm of a language is the product of its linguistic structure, such as syllable structure, vowel reduction, and accent-induced lengthening. Quantitative and stochastic approaches to linguistic rhythm have accompanied this conceptual shift, and various rhythm metrics such as ΔC, %V, and PVI have been proposed. The paper utilizes large-scale speech corpora to evaluate what endeavor needs to be made in order to find rhythmic constancy. The study aims to include arrays of phonetic and linguistic features. The inclusion of this multitude of features is a novel approach to speech rhythm, in contrast to previous studies that have mainly focused on durational features. By including features extracted directly from speech signals (e.g., pitch, intensity, and spectral features) in addition to consonantal and vocalic durations, as well as features inferred from the linguistic structure, such as stress location and parts of speech, this corpus-based study will enhance our approaches to and understanding of speech rhythm.
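
The durational metrics named above have simple definitions, sketched here under a few assumptions: %V is the percentage of utterance duration that is vocalic, ΔC is the standard deviation of consonantal interval durations (the population form is used here), and PVI is shown in its normalized variant (nPVI). The interval durations are invented for illustration.

```python
import statistics

def percent_v(vocalic, consonantal):
    """%V: share of total duration taken up by vocalic intervals, in percent."""
    return 100 * sum(vocalic) / (sum(vocalic) + sum(consonantal))

def delta_c(consonantal):
    """Delta-C: standard deviation of consonantal interval durations."""
    return statistics.pstdev(consonantal)

def npvi(durations):
    """Normalized Pairwise Variability Index over successive intervals."""
    if len(durations) < 2:
        raise ValueError("nPVI needs at least two intervals")
    terms = [abs(a - b) / ((a + b) / 2) for a, b in zip(durations, durations[1:])]
    return 100 * sum(terms) / len(terms)

# Hypothetical interval durations in milliseconds for one utterance.
vocalic = [100, 100]
consonantal = [50, 150]
print(percent_v(vocalic, consonantal), delta_c(consonantal), npvi(vocalic))
```

All three metrics use durations only, which is the limitation the abstract sets out to address by adding pitch, intensity, spectral and structural features.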

11:40
An analysis on usage of loanwords in Korean and Japanese newspaper

ABSTRACT. Recently, there has been an increase in the number of loanwords, and they have become a part of our everyday lives. Loanwords are pervasive in a variety of areas such as science, technology, and economics. The effects of this increase in loanwords, especially those that came from English, are also visible in the field of language education. It is therefore important to examine loanwords and to understand the unique features of our own language. In this regard, the aim of this research is to investigate the usage of loanwords in newspapers and to explain the differences by comparing Korean and Japanese, which have many grammatical, morphological and syntactic points in common. To conduct this study, we first build a newspaper-based corpus of Korean and Japanese by extracting sentences that include loanwords. Newspapers were selected as research data because they use loanwords that are in widespread use. Second, we clarify the common points and the differences in the usage of loanwords with respect to frequency and word class (noun, verb, etc.). The word class of loanwords in the extracted sentences is determined by morphological analysis with MeCab in Python. Third, we investigate the reasons for the differences in word class between Korean and Japanese, such as grammar and the perception of native speakers. This paper is expected to contribute to understanding the current situation of loanwords in each language. Furthermore, this study is fundamental research that demonstrates how we adopt and use loanwords and helps predict future trends in loanwords.

10:50-12:30 Session 24C
Location: IBK Hall
10:50
Exploring L2 learners’ use of communicative strategies in the Corpus of Hong Kong Spoken English: Implications for teaching English as a lingua franca

ABSTRACT. Since the era of globalisation, the use of English in international communication has been the focus of the research paradigm in English as a lingua franca (ELF), which sees L2 speakers as the vast majority, particularly in professional and business discourse. ELF interactions place greater emphasis on communicative functions (than on language forms) to ensure mutual understanding, and the use of communicative strategies (CSs) plays an important role in this process. This study seeks to investigate the use of CSs in a self-compiled spoken corpus, which has recorded Hong Kong secondary and university students’ 8-minute semi-authentic (academic) group interactions (n=457), resembling the format of Hong Kong’s public speaking examination. This sample includes L2 learners of different English proficiency levels. In total, around 22 hours and 5 minutes of interaction transcriptions were collected for the CS analysis, which follows Björkman’s (2014) framework to investigate Hong Kong students’ use of CSs in academic discourse. Our findings suggest that the participants mainly used self-initiated strategies to enhance explicitness (e.g., repetition, paraphrasing) but used relatively few other-initiated strategies (e.g., confirmation checks, clarification requests, co-creation of the message) that are crucial for mutual support in ELF communication. Furthermore, students with a lower English proficiency level tended to use a limited variety of CSs and to rely on some pre-taught formulaic expressions during the discussion. These findings reveal the (mis)alignment in CS use between Hong Kong L2 learners and ELF speakers and their ways of learning CSs, and thus inform the teaching of CSs for international communication.

11:15
Development of Speech fluency by task repetition Over a Long Period of Time: effect of Story Retelling

ABSTRACT. The purpose of this study is to examine changes in fluency features made by Iranian high-intermediate EFL learners over one semester of an Oral Reproduction of Short Story course at the University of Mazandaran. Particular fluency features, including speech rate, number of syllables, pauses, false starts, repetition, self-correction, interruption and scaffolding, were measured (Ellis, R. and G. Barkhuizen 2005. Analyzing Learner Language. Oxford: Oxford University Press). To this end, a correlational design was applied in the following manner. In the first session, an Oxford Placement Test (OPT) was given to the participants. In the remaining sessions, the participants performed the oral narrative task. Participants were asked to retell a story three times in each session for three different listeners. There were three types of stories: 1) stories that the participants had written themselves, 2) stories that the participants adopted from Persian authors, and 3) stories that the participants adopted from English. Once the data were organized into files, a fluency analysis was performed on the speaking data, using transcriptions of the students’ recorded voices, to analyze the measures of oral fluency and to see how far the students’ speaking moved from dysfluency towards fluency. The analysis of each session was compared with that of the next session to examine the long-term effects. The analysis of the data shows that most of the fluency features mentioned above developed through task repetition over a long period, but the amount of change differed across features.

11:40
CONTENT ANALYSIS ON VIDEOGRAPHED DEBATE ACTIVITIES OF GRADE 9 MERANAW STUDENTS AMONG SELECTED PRIVATE AND PUBLIC HIGH SCHOOLS IN MARAWI CITY

ABSTRACT. This study was conducted to make a video content analysis of debate activities of Grade 9 Meranaw students in selected private and public schools in Marawi City. It attempted to answer the following questions: (1) What are the errors in debate activity among Grade 9 Meranaw students in the aspects of phonology, morphology, syntax and semantics? (2) Which categories of error are dominant in each of these aspects? (3) What generalizations can be formulated from the findings of the study? The study used a quali-quanti approach. The source of data was the recorded and transcribed speeches of the respondents during the debate tournament. The errors they committed were listed and grouped together, and their occurrences were then tallied and recorded in order to determine their frequencies. After the analysis and interpretation of the data, the findings were as follows: (1) In phonology, the five Grade 9 Meranaw speakers had difficulty in pronouncing the sounds /θ/, /ð/ and /æ/. (2) In morphology, they frequently committed errors in pluralization, derivation (of both nouns and adverbs), and inflection (i.e. verb, noun, and tense). (3) Specifically, the highest number of errors was in inflection. (4) In syntax, they committed errors in prepositions, pronouns, diction, verbs and articles. (5) Specifically, the highest number of errors was in prepositions. (6) In semantics, the speakers frequently committed errors of unclear meaning, incomprehensible sentences, unclear ideas, and inappropriate word choice. (7) Specifically, the students committed the most errors in inappropriate word choice.

12:05
Communicative Activities in Pakistani Intermediate EFL Textbooks: A Corpus-driven Analysis

ABSTRACT. Textbooks are a very important tool in language learning. In the present era of technology, the use of language corpora and dedicated software tools has attracted considerable attention from English language teachers throughout the world, and Pakistan is no exception. This paper aims to investigate the communicative activities (tasks) in Pakistani intermediate EFL textbooks through corpus-driven analysis. It analyzes the usage, grammatical patterns and collocation patterns of verbs used in the intermediate textbooks in Pakistan. The researcher has compiled a corpus based on EFL textbooks used as a language teaching tool at intermediate level in Pakistan. These textbooks have been designed by the Punjab Textbook Board, Lahore (Pakistan). The number of textbooks designed for intermediate level is 4. The corpus tool LancsBox with GraphColl (version 1.0.0) has been used to find out the various properties of English verbs used in the textbooks. The researcher has selected 25 verbs randomly from these textbooks. The corpus-driven analysis explores the extent to which the communicative and verbal tasks given in Pakistani intermediate EFL textbooks are inadequate. The findings of the study underscore that the major reason for the poor communicative proficiency among Pakistani EFL learners is the lack of authentic and reliable language learning tasks in a wide variety of situations. The study has found different grammatical patterns and collocation patterns in which these verbs have been used. This study also describes the properties and meanings of these verbs.

10:50-12:30 Session 24D
10:50
Integrating collocation and colligation: A cross-linguistic study of modal adverbs of certainty using Urdu and English corpora

ABSTRACT. This study presents a cross-linguistic analysis of modal adverbs in Urdu and English, conducted by examining collocation and colligation patterns in comparable corpora. The comparable corpora are, for Urdu, a corpus of Urdu accessed from the Lindat/Clarin repository (Jawaid et al. 2014) and, for English, the BNC (XML Edition) accessed via Lancaster University’s CQPweb server. Urdu-English parallel corpora (part of EMILLE, Baker et al., 2003) that have been aligned and modified (Jawaid et al., 2011) are utilized for the selection of modal adverbs. The three most frequent English modal adverbs, certainly, definitely, and of course, and the corresponding modal adverbs in Urdu, zarūr, yaqīnān, and bilāṣubha, were identified in the parallel corpora and selected using the software LancsBox. For the analysis, only the comparable corpora are used, due to the paucity of examples of modal adverbs in the parallel corpora. Based on Sinclair’s model of Extended Lexical Units (ELU), we integrate collocation and colligation analysis of the comparable corpora to answer contrastively: (i) what are the most frequent word forms that modal adverbs collocate with, (ii) what particular grammatical categories modal adverbs co-occur with, and (iii) what meanings are conveyed through particular collocations in Urdu and English. The analysis will help us understand the similarities and differences in the use of modal adverbs of certainty in the two languages and provide us with a framework for analysing corpora contrastively using collocation and colligation analysis.

11:15
The Important Role of Indonesia in Saving Local Languages Through Communicative Approach Theory

ABSTRACT. Local languages function as tools to develop the ability to reason, communicate and express thoughts or feelings, and to preserve national culture. In addition, a local language is a form of self-identity in the era of globalization, which allows it to filter out foreign cultures that enter Indonesia. At present, however, the use of local languages has declined among students because many foreign cultures have entered Indonesia. To overcome this, there is a need for local language learning efforts that use a communicative approach. There is also a need to increase local language learning through formal education, in accordance with the context of good language use, so that students will be able to apply local languages well in communication.

11:40
Locative phrase in Sakizaya revisited: from a corpus-based approach

ABSTRACT. The word order of Sakizaya is predicate-initial, and the order of the other core phrases is determined by the semantic roles, agent focus vs. non-agent focus. The nominative phrase precedes the genitive phrase in agent focus, and the genitive phrase precedes the nominative phrase in non-agent focus (Shen 2008, 2016). The locative phrase, on the other hand, is an adjunct, free to appear in any position. Nevertheless, the locative phrase is not completely free in Sakizaya. This study adopts a corpus-based approach and suggests that the locative phrase in Sakizaya shows a skewed distribution. A corpus of more than 3000 sentences is established by collecting data from an online dictionary of Sakizaya. The data consist of two categories, agent focus and non-agent focus. The data are divided into sentences with two phrases [XP, XP] and sentences with three phrases [XP, XP, XP]. The sentences are scrutinized and the positions of the locative phrases are examined. The results show that there are 167 tokens of the locative phrase in two-phrase sentences and 120 tokens in three-phrase sentences. The distribution is skewed in two-phrase sentences, in which the locative phrase is only attested after the nominative case. In three-phrase sentences, locative phrases in the final position outnumber those in the middle position (73 vs. 48). The distribution suggests that the locative phrase in Sakizaya is not positionally restricted, as it appears in the middle or final position. Nevertheless, the distribution reveals that the locative phrase might be constrained by the semantic roles. Locative phrases are associated proportionally more often with agent focus than with non-agent focus (276 vs. 11).

10:50-12:30 Session 24E
Location: Helinox Hall
10:50
Corpus-based Analysis of Phrasal verbs in North Korean English Textbooks

ABSTRACT. Previous studies have endeavored to describe and analyze North Korean English textbooks from various perspectives such as structure, ideology-based content, and vocabulary. For a better understanding of North Korean English vocabulary education, this poster aims to analyze phrasal verbs in North Korean English textbooks. Phrasal verbs play an important role in all areas of communicative competence and can contribute to language learning success as well (Littlemore & Low, 2006). Examining phrasal verbs in textbooks could therefore provide different perspectives on North Korean English vocabulary education. In learning phrasal verbs, frequency plays an important role and also affects acquisition, processing, and vocabulary use (Schmitt, 2010). On this rationale, the 150 most frequently used phrasal verbs from Liu (2010) and Liu & Myers (2018) should be considered as a starting point. From this perspective, comparing Liu’s (2011) list with the phrasal verbs frequently used in North Korean textbooks shows how well the textbooks are designed for learning phrasal verbs that EFL learners can apply in the real world. Comparing them with Liu & Myers’s (2018) register-based list likewise shows how realistically the textbooks reflect actual usage. In conclusion, this poster reveals that North Korean English textbooks fail to provide a well-balanced distribution of the meanings of phrasal verbs. This suggests that North Korean defectors in South Korea need English education.

11:15
Synonym Differentiation of Korean Adjectives Based on Lexical Co-occurrence

ABSTRACT. This paper aims to analyze and describe how to identify and differentiate the lexical information of Korean adjective synonyms on the basis of their collocations, semantic preference, colligation, and semantic prosody, as observed in empirical language data in corpora. In order to extract co-occurrences, we use a large corpus of 130 million words of modern Korean, which consists of newspapers, magazines, literature, informational text, and textbooks and contains text from the 1900s to the 1990s. In the case of adjectives, it is necessary to analyze the co-occurrences, which differ depending on inflectional forms. By analyzing the preceding element of an adjective, we can identify the modifier or the noun in the given position and identify the common and distinct points of the preceding element. Based on these co-occurrences, semantic preferences can be identified. ‘Semantic preference’ refers to the preference for a particular class of meaning in the vocabulary that constitutes a collocation. We also analyze the ‘semantic prosody’, which refers to the speaker's positive or negative attitude as identified in co-occurrences. In addition, by investigating the similarities and differences in colligation, other distinguishing information about synonyms can be described. One example is to analyze the colligation with the grammatical category of negation and examine the tendency towards negation. We expect that the lexical information on adjective synonyms will be used for developing dictionaries and teaching vocabulary.

11:40
Temporal Expression System and Variation Patterns in Interlanguage of Korean Language Learners

ABSTRACT. In this study, I survey the temporal expression system and variation patterns in the interlanguage found in a Korean learner corpus. By analyzing the relationship between markers of temporal expression and the semantic functions observed in the learner corpus, I highlight the system of temporal expression in interlanguage. In addition, I explore the systematicity of interlanguage by examining the variations based on linguistic context. I ask four research questions and answer them using function analysis, obligatory occasion analysis, and frequency analysis of the learners’ interlanguage. The first research question asks what kind of function temporal expression markers serve at each learning stage. To answer this, I performed form-function analysis. The second research question asks how tense and aspect functions are realized through temporal expression markers at different learning stages in Korean language learners’ interlanguage. Here, I performed function-form analysis. The third research question asks how the rate and type of variation of temporal expression markers appear in Korean language learners’ interlanguage. I used obligatory occasion analysis and frequency analysis. The fourth research question asks how the variation type of temporal expression markers changes with linguistic context and learning stage in Korean language learners’ interlanguage. I examined the linguistic contexts that affect learners in their choice of variants. This research is meaningful in that it reveals the system of interlanguage temporal expressions and its variation patterns, and offers a new perspective and method for analyzing learner language through a multilateral analysis of the temporal expressions seen in Korean language learners’ interlanguage.

12:05
Constructing and Analyzing a Korean Hotel and Hospitality Corpus to Develop Korean Teaching Materials for Specific Purposes

ABSTRACT. This study aims at building a Korean hotel and hospitality corpus and analyzing it with various methodologies to develop Korean teaching materials for tourism professionals and college majors. As the number of outbound Koreans increases, the need for systematic training in hospitality Korean in overseas universities has been raised in the literature. However, few textbooks in the field are available. Drawing on existing published teaching materials, the researcher constructed the corpus to understand patterns of language use. To build the corpus, twenty hotel and hospitality foreign language teaching materials were selected. They consisted of thirteen volumes for hospitality English, four for Chinese, two for Japanese and one for Korean, nine of which were published in Japan, nine in Korea and two in the UK. Each dialogue was manually translated and modified into authentic Korean expressions from the perspective of an expert who has worked in the tourism industry for about fifteen years. The absolute size of the corpus is not large, but considering the limitations of data collection and the near impossibility of legitimately recording on-site conversations, it is meaningful in showing another possibility for producing educational materials. High-frequency lexical and grammatical expressions were extracted using KMAT, a morphological analyzer and tagger for Korean. The patterns of language use in the hotel and hospitality industry were also identified. These quantitative results are expected to contribute to developing optimized Korean teaching materials for specific-purposes learners abroad.

13:30-15:10 Session 25A
13:30
Revisit Firth’s Linguistic Theory and Its Influence on Corpus Linguistics

ABSTRACT. We usually trace the origin of corpus linguistics to Firthian linguistic theory, especially the assertion of linguistic monism, the theory of contextual meaning, collocation, etc., emphasizing their profound influence on the development of corpus linguistics while ignoring their evolution. By revisiting the works of Firth, we sought to interpret and re-evaluate Firthian linguistic theory, some key notions in particular, and compared Firth’s original ideas on them with their counterparts developed in corpus linguistics. The notions under examination included structure, system, context of situation, collocation, colligation, and prosody. It was concluded that corpus linguistics inherited these notions from Firth and endowed them with some new features. For example, for Firth, the notion of collocation was used to account for the meaning of the single word, but in corpus linguistics, collocation defines the meaning of a unit as a whole. Therefore, when we trace the development of corpus linguistics, it would be misleading to claim that Firth established the framework for corpus linguistics and that corpus linguistics has conducted its research within this framework. It would be fair to claim that the development of corpus linguistics has been inspired by Firthian linguistic theory in terms of the ontology of language and the central status of meaning analysis in descriptive linguistics. Corpus linguistics inherited and creatively developed Firthian linguistic views and established a series of its own principles and research methods.

13:55
A corpus-based analysis of linguistic synesthesia in Korean Sign Language

ABSTRACT. In psychology, synesthesia typically arises in the human brain in response to a stimulus from one of the various domains of perception, whereby the features of one modality are transferred to another. It is found in everyday language as well as in poetry. In linguistics, the phenomenon is called linguistic synesthesia. Ullmann (1963) proposed a so-called universal hierarchical distribution running from the lower sensory domains such as touch, taste, and smell to the higher domains such as hearing and vision. The present paper explores some features of linguistic synesthesia in the Korean Sign Language Corpus, an online resource provided by the National Institute of Korean Language. We have two main research questions. First, do the linguistic synesthesia phenomena observed in Korean Sign Language (KSL) conform to the directional mapping route of linguistic synesthesia found in spoken languages? Second, what are the distinctive features of linguistic synesthesia in KSL, compared with other sign languages such as American Sign Language and Chinese Sign Language, as well as with Korean (sweet voice), English (bitter sound), and Chinese as spoken languages? We hypothesize that the mapping directionality observed in sign language data does not differ from that reported in the few previous studies of linguistic synesthesia in spoken languages, owing to the universal perception and cognition system of human beings. We will also examine whether the visual domain is emphasized in both source and target in comparison with linguistic synesthesia in spoken language, which could be directly related to the influence of the visual-spatial/visual-manual modality that sign language intrinsically uses.

14:20
Language use and structure of language knowledge in human memory: A case of word association

ABSTRACT. Our memory is vast: we retain a large amount of item-specific knowledge as well as categorise individual tokens (Brady et al., 2008; Goldberg, 2019). Accumulated experience of language use, stored in human memory, changes cognitive representations of language considerably (Langacker, 1987; Tomasello, 2003). Against this background, this study investigates how individual words become associated in proportion to exposure to language use. We used ∆P (Allan, 1980; Gries, 2013) to measure the association strength between two lexical items, with a view to approximating what remains in readers’ minds as they read samkukci ‘The Three Kingdoms’, a famous novel series in Korea. We initially focused on two main characters (Yupi; Coco) and two verbs (mwut- ‘to ask’; chi- ‘to attack’) for this task. The entire series was POS-tagged, and the information necessary for calculating ∆P was extracted automatically through Java programming. Results (see below) showed that, as the series developed, one verb became a better cue of one character than the other. This finding suggests how connections between lexical items may change as humans are exposed to language use in daily life. Future work will seek to incorporate corpus analysis pertaining to word association, humans’ recognition of the association, and computational modelling of how information is clustered in human cognition.

Volume   ∆P(Yupi|mwut-)   ∆P(Yupi|chi-)   ∆P(Coco|mwut-)   ∆P(Coco|chi-)
1-2      0.072            -0.040          0.006            0.031
1-5      0.041            -0.011          0.004            0.050
1-10     0.024            0.005           0.001            0.061

Note. The closer ∆P(outcome|cue) is to 1, the more likely the cue co-occurs with the outcome.
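
For reference, ∆P(outcome|cue) is the probability of the outcome when the cue is present minus its probability when the cue is absent. The sketch below is in Python rather than the Java used by the authors, and it assumes that co-occurrence is counted per unit of text (for example per sentence); the abstract does not state the unit, and the toy data are invented.

```python
def delta_p(a, b, c, d):
    """Delta-P(outcome|cue) from a 2x2 table.
    a: cue and outcome, b: cue without outcome,
    c: outcome without cue, d: neither."""
    if a + b == 0 or c + d == 0:
        raise ValueError("cue must be present in some units and absent in others")
    return a / (a + b) - c / (c + d)

def delta_p_from_units(units, cue, outcome):
    """Count the 2x2 table over units (each a set of lemmas) and return Delta-P."""
    a = b = c = d = 0
    for unit in units:
        has_cue, has_outcome = cue in unit, outcome in unit
        if has_cue and has_outcome:
            a += 1
        elif has_cue:
            b += 1
        elif has_outcome:
            c += 1
        else:
            d += 1
    return delta_p(a, b, c, d)

# Invented units; a real run would use the POS-tagged novel text.
units = [{"Yupi", "mwut-"}, {"Yupi", "mwut-"}, {"Coco", "chi-"}, {"Coco", "mwut-"}]
print(delta_p_from_units(units, cue="mwut-", outcome="Yupi"))
```

Recomputing the value over growing portions of the text (volumes 1-2, 1-5, 1-10) would give a trajectory of the kind reported in the table.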

13:30-15:10 Session 25B
Location: IBK Hall
13:30
Teaching English with BNClab: An online interactive platform for the analysis of British English

ABSTRACT. This presentation is a tool demonstration that will introduce BNClab, an online interactive corpus platform for the analysis of spoken British English, developed with the aim of bringing corpora and corpus methods into classrooms to teach students about the use of the English language. The focus of the project is on exploring variation in different uses of English depending on situational (register, topic, context) and speaker-related (gender, age, region etc.) variables. Corpus-based learning is a form of ‘discovery learning’ (Hammer & Gunstone, 2015) which contributes to the cognitive, pedagogical and motivational aspects of the learning process (Flowerdew 2015). It has positive effects on learning both vocabulary (Lee et al, 2018) and lexico-grammatical structures (e.g. Hinkel, 2016). BNClab offers access to a large collection of spoken British English, making it easy for teachers and students to work with spoken English. The platform contains two large samples (approx. 5M words each) of informal spoken English taken from major corpora of British English: i) the British National Corpus, representing language from the 1990s, and ii) the British National Corpus 2014, representing usage from the 2010s. The two samples can be compared to observe patterns of language change or can be explored individually. BNClab offers a number of different searches and automatic visualisations of the sociolinguistic and linguistic patterns found in the data to help students interpret the findings. The platform also contains ready-made teaching materials as well as teacher handouts.

13:55
The use of delexical verbs in Korean college students’ English essays focusing on make, take, and get: A corpus-based comparative study

ABSTRACT. With the recent interest in collocation learning, 'delexical verbs' or 'light verbs' have drawn substantial attention in vocabulary education. Delexical verbs are verbs whose own meaning is very weak and whose meaning changes according to the situation or context. The study will investigate the use of delexical verbs in Korean college students’ English persuasive essays compared with that of native English speakers, focusing on make, take, and get. The study will use two corpora, KELC (the Korean English Learner Corpus) and LOCNESS (the Louvain Corpus of Native English Essays). The nouns that co-occur with delexical verbs will be analyzed with the noun list used in the study by Quirk (1985). The meanings of each delexical verb will be based on those in LDOCE (Longman Dictionary of Contemporary English, 5th edition, 2009). The analysis procedures are as follows. First, the study will analyze the meanings of make, take, and get as used by Korean learners and native speakers. If possible, the study will also identify incorrect uses of delexical verbs in the Korean learner corpus in comparison with the native corpus. Second, the frequency of noun phrases used together with make, take, and get will be analyzed. Finally, the study will examine the collocational expressions used with these three verbs. Comparing the use of delexical verbs by learners with that of native speakers is an essential process because relatively untrained EFL learners could use native-speaker corpora as learning material in their composition classes.

14:20
Improving the pedagogical usefulness of Word Lists by adding word stress data

ABSTRACT. Specialized wordlists are influential in English for Specific Purposes (ESP) as teachers’ resources for classroom activities. Most existing lists present word forms with their frequency, usually with extra details on word specificity, e.g. coverage in GSL, AWL, or BNC/COCA, and some classify words as general, academic, semi-technical and technical to help teachers prioritize and choose word forms to teach. However, these existing wordlists do not support English teachers with pronunciation when they present the selected words in classroom instruction, which could be students’ main exposure to specialized English in formal education. This study identifies basic characteristics and stress patterns of 1,146 words from the Engineering Word List (EWL) generated from the Engineering English Corpus (EEC) of 29 textbooks collected from 12 sub-disciplines. The stressed syllable of each word was identified and recorded as a syllable-stress code, e.g. 3-2 for ‘equation’ (3 syllables; stress on the second syllable). The analysis shows that words in EWL have one to five syllables. While the majority of words in EWL are one- and two-syllable words, over 80 percent of multi-syllable words are two- and three-syllable words. The most common stress patterns of words in EWL, categorized by number of syllables, are 2-1, 3-2 and 3-1, 4-2, and 5-4 for two- to five-syllable words, respectively. Sub-lists of EWL organized by syllable-stress code are also provided to help teachers with the pronunciation of engineering words.
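
The syllable-stress coding described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function names are hypothetical, and apart from ‘equation’ (3-2), which the abstract gives, the sample words are illustrative and not claimed to come from the EWL.

```python
from collections import defaultdict


def stress_code(n_syllables: int, stressed: int) -> str:
    """Encode a word as '<number of syllables>-<stressed syllable>', e.g. '3-2'."""
    if not 1 <= stressed <= n_syllables:
        raise ValueError("stressed syllable must lie between 1 and n_syllables")
    return f"{n_syllables}-{stressed}"


def build_sublists(entries):
    """Group (word, n_syllables, stressed) entries into sub-lists keyed by code."""
    sublists = defaultdict(list)
    for word, n_syllables, stressed in entries:
        sublists[stress_code(n_syllables, stressed)].append(word)
    return dict(sublists)


# 'equation' follows the abstract; the other words are illustrative only
SAMPLE = [
    ("equation", 3, 2),
    ("vector", 2, 1),
    ("velocity", 4, 2),
    ("engine", 2, 1),
]

sublists = build_sublists(SAMPLE)
```

Keying the sub-lists by code lets a teacher pull every word that shares a stress pattern in one lookup, which matches the sub-list organization the abstract describes.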

13:30-15:10 Session 25C
13:30
A Corpus-Based Analysis of Dystopian Themes of Kurt Vonnegut and Ray Bradbury

ABSTRACT. This study aims to discuss the different characteristics of the dystopian themes of two American writers, Ray Bradbury and Kurt Vonnegut, through a corpus-based analysis of their short stories. Both authors published works related to political, social, and technological change, and often set up hypothetical situations or create a dystopian landscape. They warn about dangers including war, extreme dependency on machinery, and growing indifference to nature and other human beings. Nevertheless, their approaches to theme are not identical: while Bradbury is known for recurring, connotative motifs and poetic imagery that relies on word variation (Foster 1973), Vonnegut expresses pessimism with glimpses of hope that cause discomfort but carry depth (Hume 1982). The contrast in theme and style is most visible in their vocabulary. In this study, a corpus of 10 Vonnegut short stories and another of 12 Bradbury short stories were constructed, consisting of 43,090 and 46,802 tokens respectively. These corpora consist of two types of stories: works that describe hypothetical situations related to contemporary social issues, and those that depict dystopian societies. Using AntConc, a list of the most frequent words was created for each corpus, and different semantic categories emerged from these two lists: nature/climate, politics, and technology. Moreover, the collocational properties of these keywords revealed the authors’ attitudes towards the main themes. The empirical approach adopted in this study may provide an objective and reliable foundation for literary research and education: it can contribute to an in-depth examination of an often overlooked element of their canons.
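
The study builds its word lists with AntConc. For readers who want to reproduce the general idea in code, the sketch below shows one simple way to produce a frequency list; the lower-casing, the regular-expression tokenizer, and the function name are assumptions of this sketch and will not necessarily match AntConc's token definition or counts.

```python
import re
from collections import Counter


def word_frequencies(text: str) -> Counter:
    """Count lower-cased word tokens (runs of letters and apostrophes)."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tokens)


# Hypothetical snippet, not taken from either author's stories
freqs = word_frequencies("The war and the machine. The war!")
top_two = freqs.most_common(2)
```

Running the same function over each author's corpus and comparing the top-ranked content words is the kind of starting point from which semantic categories such as nature/climate, politics, or technology could be grouped by hand.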

13:55
Personification in the Song "Aamiin Paling Serius" by Nadin Amizah and Sal Priadi

ABSTRACT. Personification is a type of figurative language that gives inanimate things qualities that could be considered human. This research uses a descriptive method. In collecting the data, the writer focused on identifying words, phrases, or nouns. In terms of the content of the lyrics, the results show that the song lyrics can be categorized as personification. The data were collected and then analyzed using the supporting theories of this research. The theory used is the semantic theory of language style; semantic theory is needed to analyze every personification meaning contained in the lyrics of the song by Nadin Amizah and Sal Priadi titled “Aamiin Paling Serius” (“The Most Serious Amen”). From the analysis, the researchers collected 7 figurative expressions of personification with different meanings, which together carry the overall meaning of the song: the most serious bond of love with the most serious amen, as in the title of the song.

14:20
Digital Textuality and the Salient Features of Instagram Poetry

ABSTRACT. Digital literary genres are bound to change or disappear like their dynamic digital platforms (Rettberg, 2016; Trimarco, 2015; Ciccorrico, 2012), so constant preservation of these genres is vital. With this in mind, this paper aims to establish Instagram poetry as an emerging genre of digital literature by determining its salient features. This study is a preliminary effort to recognize and preserve Instagram poetry, and to promote the reading and writing of the genre among EFL and ESL teachers and learners.

To discover the salient features of Instagram poetry, traditional and digital frameworks were used in the analysis. Bhatia’s (1993) genre analysis was used to examine the corpus of the study, together with the theories for digital textuality (Trimarco, 2015) which included De Beaugrande and Dressler's (1981) seven standards of textuality and Peirce's (1958) social semiotic triangle.

The analysis showed that Instagram poems used short and simple words. The meanings of the poems were motivational and sentimental, and these meanings were strongly affected by the aesthetics of their visual styles. Notably, the poems followed no fixed number of lines. The present findings suggest that newer features of Instagram, such as Instagram Stories, should also be analyzed to identify more notable features arising from Instagram poetry’s changing digital environment.

14:45
Stylometrics Analysis for Authorship Attribution: Determining the Writer of “Joseonminsiron”

ABSTRACT. Over the years, academic research on identifying the author of “Joseonminsiron” has led to the creation of two contesting camps within Korean Studies. Scholars like Lee Young-hwa (2003), Lee Kyeong-Don (2005), Ryu Si-hyun (2008), and Cha Nam Hee (2014) maintain that Choe Nam-seon wrote “Joseonminsiron”, while other scholars like Kim Jong-Kyun (1996) and Lim Kyu-Chan (2004) consider it to be the work of Yeom Sang-seop. Identifying the author of “Joseonminsiron” can bring greater clarity and understanding to research on Korea’s modern history in three ways: 1) reevaluating the dynamics of power and resistance in colonized Korea; 2) examining how far the intellectuals’ (specifically Choe’s or Yeom’s) reconstructed nationalism mobilized the spirit of resistance among Koreans in the wake of the Independence Movement; and 3) contributing to the existing research on Korean history and thought by exemplifying how works like “Joseonminsiron” served as a catalyst that propelled national ideology and resistance among the Korean people.

13:30-15:10 Session 25D
Location: Helinox Hall
13:30
A Study on the Discourse Markers of Chinese L1 Korean Learners: Focusing on a Comparison between Spoken and Written Learner Corpora

ABSTRACT. This study aims to describe and analyze the discourse markers used by Chinese learners of Korean and to compare the results between written and spoken data. The research is based on a learner corpus of 697,215 words (spoken corpus: 22,662 words; written corpus: 674,553 words). The results show that: 1) Discourse markers are used more in the spoken corpus than in the written corpus. The frequency of discourse markers per 1,000 words is 0.2 in the written corpus and 180 in the spoken corpus; 2) The types of discourse marker are also more varied in the spoken corpus than in the written corpus. The type frequency of discourse markers is 193 in the spoken corpus and 34 in the written corpus; 3) ‘Ah’(아), ‘Nae’(네), ‘Cham’(참), ‘Ya’(야), ‘Euo-di’(어디), ‘Wa’(와), ‘Geu-rae’(그래), ‘Muo’(뭐), ‘Eng’(응), ‘Ja’(자) are the 10 most frequent markers in the written corpus, and ‘Euo’(어), ‘Umm’(음), ‘Nae’(네), ‘Ah’(아), ‘Geu’(그), ‘Eng’(응), ‘Eou’(어), ‘Umm’(음), ‘Muo’(뭐) are the 10 most frequent markers in the spoken corpus. Thus, it seems reasonable to conclude that Korean discourse markers appear more readily in spoken contexts and can be divided into spoken-oriented and written-oriented discourse markers.
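
The per-1,000-word rates in this abstract are normalized frequencies, which make corpora of very different sizes comparable. The sketch below shows the arithmetic under the usual definition (raw count divided by corpus size, times 1,000). The function names are hypothetical, and the back-calculated counts are only rough estimates because the reported rates are rounded; the abstract does not give raw counts.

```python
def per_thousand(count: float, corpus_size: int) -> float:
    """Normalized frequency: occurrences per 1,000 words."""
    if corpus_size <= 0:
        raise ValueError("corpus_size must be positive")
    return count / corpus_size * 1000


def implied_count(rate_per_thousand: float, corpus_size: int) -> float:
    """Approximate raw count implied by a reported per-1,000-word rate."""
    return rate_per_thousand * corpus_size / 1000


# Corpus sizes and rates as reported in the abstract; counts are estimates only
spoken_estimate = implied_count(180, 22_662)
written_estimate = implied_count(0.2, 674_553)
```

Because the written corpus is roughly thirty times larger than the spoken one, comparing raw counts directly would be misleading; normalizing to a common base is what allows the spoken/written contrast to be stated at all.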

13:55
Longitudinal Study on Acquisition and Development of Connective Endings of Korean Language Learners

ABSTRACT. This study examines the order in which Korean language learners acquire connective endings and the specific aspects of that acquisition. Korean connective endings are grammatical elements that join clauses into a single sentence and play a significant role in learners’ syntactic development. Connective endings carry diverse meanings depending on the semantic relations between clauses. Because several connective endings differ slightly in nuance even within the same semantic category, and because each connective ending is subject to different morphological and syntactic restrictions, Korean language educators must pay careful attention to the order in which they are presented. The corpus of this study consists of longitudinally collected spontaneous speech from 24 learners. The speech was collected periodically, ten times, starting when the subjects began learning Korean; 3 million words of data have been collected. We analyzed the frequency of each connective ending, the number of its users, error rates and error types, identified the order of acquisition of connective endings, and observed aspects of the acquisition. Furthermore, we will compare the order of acquisition with the order of instruction of each connective ending to investigate their relation. This study is important because it is the first to examine the development of connective endings based on longitudinally collected spontaneous speech data from Korean language learners. The results are expected to contribute to a better instructional sequencing of connective endings in Korean language education.

14:20
A study on vocabulary related to “朝鮮人” used in Korean newspapers of the Japanese colonial period

ABSTRACT. This study explored the expressions for “Koreans” that appeared in Korean newspapers during the Japanese colonial period. During this period, Korea and Japan were referred to by several familiar names, such as “朝鮮”, “日本”, “皇國” and “大日本”. At this time, there were also “Korean” as “朝鮮人” and “Korean” as “Chosenjin”. For the research data, the Korean History Database of the National History Compilation Committee and the Korean Newspaper Archives of the National Library of Korea were used. From the newspaper database built under the “Colonial Period of Japan” materials in the Korean History Database, we investigated seven of the nine types, excluding “共立新報” (1905-1909), which fell outside the period, and “朝鮮時報”, which was insufficient in quantity: “東亞日報”, “時代日報”, “中外日報”, “中央日報”, “朝鮮中央日報”, “新韓民報”, and the full set of newspaper scraps held in the Kyujanggak old literature library. From the Korean Newspaper Archives, a database of 82 types of newspapers, we examined the 10 types published from 1910 to 1945. For the four types that overlapped with the Korean History Database, “時代日報”, “中外日報”, “中央日報”, and “朝鮮中央日報”, we investigated the newspapers with the earlier publication date and, if the publication dates were the same, chose the newspapers that had been published until later. As expected, most of the expressions referring to Koreans at that time were “朝鮮人”, but the senses “people with Korean nationality” and “ethnic slur” coexisted in this term. In addition to “朝鮮人”, related vocabulary such as “皇國臣民”, “皇民”, “鮮人”, “요보”, and “半島人” also appeared.

14:45
Statistical Analysis on Linguistic Features Affecting Coreference in Korean

ABSTRACT. The purpose of this study is to investigate the distribution of co-referring expressions in Korean. In particular, based on this empirical study, I propose a statistical analysis for Korean coreference resolution. It is well known that deep learning-based approaches to coreference resolution, which need little linguistic knowledge, have improved the performance of coreference resolvers, but that performance has not yet reached a satisfactory level. This indicates the need to specify linguistic features, which can still play an important role in developing a better resolver. Compared with the broad statistical analyses available for other languages, there are too few empirical studies of Korean coreference to extract useful linguistic features. Using a small coreference-annotated corpus of 120 newswire articles, I analyze several constraints on the choice of anaphor type in terms of heuristics and cognitive concepts such as accessibility and discourse prominence. In addition, the paper identifies the linguistic features affecting coreference by means of a multivariate statistical technique, Correspondence Analysis. I show that several linguistic features, including the salience of antecedents, affect the choice of anaphor type. The implication of the current study is that designing a coreference resolution model with these features will help achieve significant improvements in the model's performance.

13:30-14:20 Session 25E: [Keynote 7] Hong Won Suh. Narrative strategies in early translations of English literature into Korean, late 19th to early 20th century: a corpus-based study

Keynote Session

Location: Grand Ballroom
13:30
Narrative strategies in early translations of English literature into Korean, Chinese and Japanese, late 19th to early 20th century: A corpus-based comparative study

ABSTRACT. TBA

14:20-15:10 Session 26: [Keynote 8] Mark Brenchley. Developing a corpus-embedded architecture: the Cambridge English approach

Keynote Session

Location: Grand Ballroom
14:20
Developing a corpus-embedded architecture: the Cambridge English approach

ABSTRACT. TBA

15:20-16:20 Session 27: [Plenary 6] Michaela Mahlberg. Corpus Linguistics and the digital humanities

Plenary Session

Location: Grand Ballroom
15:20
Corpus linguistics and the digital humanities

ABSTRACT. TBA