CLBib-2017: Second Workshop on Mining Scientific Papers: Computational Linguistics and Bibliometrics

Wuhan, China, October 16-20, 2017

Conference web page	http://www.issi2017.org/
Submission link	https://easychair.org/conferences/?conf=clbib2017
Abstract registration deadline	September 3, 2017
Submission deadline	September 10, 2017
Notification of acceptance	October 8, 2017
Camera-ready papers	October 16, 2017

Topics: citation analysis, information retrieval, natural language processing, bibliometrics

Update: List of accepted papers and workshop programme added (October 13, 2017).

Scope and Motivation

The open access movement in scientific publishing and search engines like Google Scholar have made scientific articles more broadly accessible. During the last decade, the availability of scientific papers in full text has become increasingly widespread thanks to the growing number of publications on online platforms such as arXiv and CiteSeer.

The efforts to provide articles in machine-readable formats and the rise of Open Access publishing have resulted in a number of standardized formats for scientific papers (such as NLM-JATS, TEI, DocBook), full-text datasets for research experiments (PubMed, JSTOR, etc.) and corpora (iSearch, etc.). At the same time, research in the field of Natural Language Processing has provided a number of open source tools for versatile text processing (e.g. NLTK, Mallet, OpenNLP, CoreNLP, Gate, CiteSpace).

Scientific papers are highly structured texts and display specific properties related to their references as well as to their argumentative and rhetorical structure. Recent research in this field has concentrated on the construction of ontologies for citations and scientific articles (e.g. CiTO, LinkedScience) and studies of the distribution of references. However, up to now full-text mining efforts have rarely been used to provide data for bibliometric analyses. While bibliometrics traditionally relies on the analysis of metadata of scientific papers (see e.g. a recent special issue on Combining Bibliometrics and Information Retrieval, Mayr & Scharnhorst, 2015), we will explore the role that full-text processing of scientific papers and linguistic analyses can play. With this workshop we would like to discuss novel approaches and provide insights into scientific writing that can bring new perspectives for understanding both the nature of citations and the nature of scientific articles. The possibility of enriching metadata through the full-text processing of papers offers new fields of application for bibliometric studies.

Working with full text allows us to go beyond metadata used in bibliometrics. Full text offers a new field of investigation, where the major problems arise around the organization and structure of text, the extraction of information and its representation on the level of metadata. Furthermore, the study of contexts around in-text citations offers new perspectives related to the semantic dimension of citations. The analyses of citation contexts and the semantic categorization of publications will allow us to rethink co-citation networks, bibliographic coupling and other bibliometric techniques.

Goals of the workshop

The workshop aims to bring together researchers in bibliometrics and computational linguistics in order to study the ways bibliometrics can benefit from large-scale text analytics and sense mining of scientific papers, thus exploring the interdisciplinarity of Bibliometrics and Natural Language Processing.

The first edition of this workshop, co-located with ISSI 2015, attracted more than 70 participants and six full paper contributions, showing a large interest in these topics in the community. The goal of this second edition of the workshop is to continue to encourage the collaboration between these two domains and to answer questions like: How can we enhance author network analysis and Bibliometrics using data obtained by text analytics? What insights can NLP provide on the structure of scientific writing, on citation networks, and on in-text citation analysis?

See the proceedings of the first edition of the workshop: http://ceur-ws.org/Vol-1384/.

List of accepted papers

Dongxiao Gu, Bo Liu, Isabelle Bichindaritz and Changyong Liang. Temporal Evolution, Research Themes, and Emerging Trends in Case-Based Reasoning Literature
Jie Wang and Chengzhi Zhang. CitationAS: A Summary Generation Tool Based on Clustering of Retrieved Citation Content
Yufang Peng, Dongxiao Gu and Jin Shi. Mining the Potential Collaborative Relationships Based on the Author Keyword Coupling Analysis and Social Network Analysis
Jiangen He and Chaomei Chen. Understanding the Changing Roles of Scientific Publications via Citation Embeddings

Workshop Programme

The workshop will be held on Tuesday, October 17, 2017, in Wuhan, China.

Time	Title	Presenter
14:00-14:30	Introduction to the workshop	Marc Bertin and Iana Atanassova
14:30-15:00	Understanding the Changing Roles of Scientific Publications via Citation Embeddings	Jiangen He and Chaomei Chen
15:00-15:30	CitationAS: A Summary Generation Tool Based on Clustering of Retrieved Citation Content	Jie Wang and Chengzhi Zhang
15:30-16:00	Coffee break
16:00-16:30	Temporal Evolution, Research Themes, and Emerging Trends in Case-Based Reasoning Literature	Dongxiao Gu, Bo Liu, Isabelle Bichindaritz and Changyong Liang
16:30-17:00	Mining the Potential Collaborative Relationships Based on the Author Keyword Coupling Analysis and Social Network Analysis	Yufang Peng, Dongxiao Gu and Jin Shi
17:00-17:10	Summary and outlook

Submission Guidelines

All papers must be original and not simultaneously submitted to another journal or conference.

All submissions must be written in English, be at most 6 pages long, and follow the ISSI 2017 template for full papers. Long papers of up to 12 pages are also accepted.

Submissions require registration as a user in the EasyChair system. Please go to https://easychair.org/conferences/?conf=clbib2017 to register.

All submissions will be reviewed by at least two independent reviewers. Please note that at least one author per paper must register for and attend the workshop to present the work.

Authors of accepted papers will be invited to publish in a special issue of the journal Frontiers in Research Metrics and Analytics.

List of Topics

Linguistic modeling and discourse analysis for scientific texts
User interfaces, text representations and visualizations
Structure of scientific articles (discourse / argumentative / rhetorical / social)
Scientific corpora and paper standards
Act of citations, in-text citations and Content Citation Analysis
Co-citation and bibliographic coupling
Text enhanced bibliographic coupling
Terminology extraction
Text mining and information extraction
Scientific information retrieval
Ontological descriptions of scientific content
Knowledge extraction

The workshop will involve research project reports, system demonstrations and a panel discussion on the perspectives for the development of new text analytics approaches for bibliometrics.

Committees

Program Committee (to be confirmed)

Lee Giles (College of Information Sciences and Technology, Pennsylvania State University, USA)
Yves Gingras (CIRST, Université du Québec à Montréal, Canada)
Vincent Larivière (EBSI, Université de Montréal, Canada)
Stefanie Haustein (EBSI, Université de Montréal, Canada)
Timothy Bowman (EBSI, Université de Montréal, Canada)
Cassidy R. Sugimoto (School of Informatics and Computing, Indiana University, USA)
Sylviane Cardey (Centre Tesnière - CRIT, Université de Bourgogne Franche-Comté, France)
Cherifa Boukacem (Elico, Université Claude Bernard Lyon 1, France)
Guillaume Cabanac (IRIT, Université de Toulouse, France)
Béatrice Milard (Université de Toulouse 2, France)
Ruslan Mitkov (University of Wolverhampton, England)
Constantin Orasan (University of Wolverhampton, England)
Tomi Kauppinen (Aalto University, Finland)
Roman Kern (Know-Center, Austria)
Angelo Di Iorio (Department of Computer Science and Engineering, University of Bologna, Italy)

Organizing committee

Iana Atanassova (Centre Tesnière - CRIT, Université de Bourgogne Franche-Comté, France)
Marc Bertin (Elico, Université Claude Bernard Lyon 1, France)
Philipp Mayr (GESIS - Leibniz Institute for the Social Sciences, Germany)

Related workshops

First Workshop on Mining Scientific Papers: Computational Linguistics and Bibliometrics (CLBib 2015) at ISSI 2015. Workshop proceedings can be found at: http://ceur-ws.org/Vol-1384/.
Bibliometric-enhanced Information Retrieval (BIR) at ECIR 2014-2017 and JCDL 2016.
Joint Workshop on Bibliometric-enhanced Information Retrieval and Natural Language Processing for Digital Libraries (BIRNDL) at JCDL 2016 and SIGIR 2017.

Venue

The ISSI 2017 conference and the workshop will be held in Wuhan, China from 16 to 20 October 2017.

More information can be found at: http://www.issi2017.org/.

Acknowledgements

Part of this research has been funded by the FEDER (Fonds européen de développement régional) and selected by the French-Swiss programme Interreg V: Webso+ project (http://tesniere.univ-fcomte.fr/projet-webso/).

Contact

All questions about submissions should be emailed to iana.atanassova (at) univ-fcomte.fr.