| 10:10 | Density-adaptive Stream Text Clustering for Large Scale Dynamic Topic Modeling PRESENTER: Ming Liu ABSTRACT. The dynamic topic model extends traditional topic models to analyze temporal data, enabling the detection of topic evolution over time. However, most of these models require simultaneous access to all the data, making them unsuitable for processing continuously streaming data. This paper introduces a new dynamic topic modeling method, Density-Adaptive Stream Clustering (DASteamTopic), which is specifically designed for stream data. DASteamTopic combines pre-trained language models with a new stream clustering method that uses micro-clusters to improve memory efficiency and adaptability to streaming data. It also introduces a unique density-adaptive distance function to measure micro-cluster distances, enabling automatic determination of the number of micro-clusters and the detection of clusters with arbitrary shapes. The method has been verified on three datasets: DBLP, Tweet, and New York Times News. Compared with state-of-the-art dynamic topic models, DASteamTopic generates higher-quality topics, with experimental results showing an improvement in topic quality of up to 32%. Additionally, DASteamTopic exhibits less topic drift, ensuring that topics remain more consistent over time. |
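The micro-cluster bookkeeping this abstract describes can be sketched in a few lines. The `MicroCluster` class and the shrinking admission radius below are illustrative assumptions for exposition, not the paper's actual DASteamTopic algorithm:

```python
import math

class MicroCluster:
    """Summary statistics for a group of nearby points (illustrative)."""
    def __init__(self, point):
        self.n = 1                      # number of absorbed points
        self.linear_sum = list(point)   # per-dimension running sum

    @property
    def centroid(self):
        return [s / self.n for s in self.linear_sum]

    def absorb(self, point):
        self.n += 1
        for i, x in enumerate(point):
            self.linear_sum[i] += x

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def stream_cluster(points, base_radius=1.0):
    """Assign each incoming point to the nearest micro-cluster, or open a
    new one; no second pass over the stream is needed.  The admission
    radius shrinks as a cluster grows denser, a toy stand-in for the
    paper's density-adaptive distance."""
    clusters = []
    for p in points:
        best, best_d = None, float("inf")
        for c in clusters:
            d = euclidean(p, c.centroid)
            if d < best_d:
                best, best_d = c, d
        # density-adaptive threshold: denser clusters admit only closer points
        if best is not None and best_d <= base_radius / math.sqrt(best.n):
            best.absorb(p)
        else:
            clusters.append(MicroCluster(p))
    return clusters
```

Because each micro-cluster keeps only a count and a running sum, memory stays bounded by the number of micro-clusters rather than the stream length.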
| 10:30 | An Optimized BERTopic Modeling Framework for Emerging Technology Identification PRESENTER: Man Jiang ABSTRACT. Identifying emerging technologies is inherently challenging due to their weak signals, small scale, fragmented distribution, and cross-domain characteristics. In patent-based analyses, conventional BERTopic models face limitations in domain-specific semantic representation, outlier detection, and automated topic labeling. This study proposes an optimized BERTopic modeling framework for emerging technology identification. The framework integrates patent-domain embeddings, a main–sub hierarchical modeling strategy for reclustering outlier documents, multi-source keyword generation with semantic ranking, and large language model–based topic labeling. Empirical validation using digital health patents from 2015 to 2024 demonstrates that the proposed framework significantly improves semantic coherence, topic coverage, and sensitivity to marginal and cross-domain emerging topics compared with standard BERTopic configurations. The results indicate that the optimized framework enhances the applicability of BERTopic in patent-based emerging technology analysis and provides a reusable methodological reference for technology evolution analysis and forward-looking assessment. |
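The reclustering of outlier documents mentioned above can be illustrated with a minimal sketch, assuming document embeddings are already available; `recluster_outliers` and its `min_sim` cutoff are hypothetical names for illustration, not part of BERTopic or the proposed framework:

```python
import numpy as np

def recluster_outliers(embeddings, labels, min_sim=0.5):
    """Reassign outlier documents (label == -1) to the nearest existing
    topic centroid by cosine similarity; documents that stay below
    ``min_sim`` remain outliers and could feed a sub-level model."""
    labels = np.asarray(labels).copy()
    topics = sorted(set(labels[labels != -1]))
    # unit-normalise rows so dot products are cosine similarities
    norm = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    centroids = np.stack([norm[labels == t].mean(axis=0) for t in topics])
    centroids /= np.linalg.norm(centroids, axis=1, keepdims=True)
    for i in np.where(labels == -1)[0]:
        sims = centroids @ norm[i]
        best = int(np.argmax(sims))
        if sims[best] >= min_sim:
            labels[i] = topics[best]
    return labels
```

The `min_sim` floor is what keeps genuinely novel documents out of existing topics, so they can be reclustered at the sub-level instead of being forced into a poor fit.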
| 10:50 | Beyond Flat Keywords: A Hierarchical and Functional Framework for Fine-Grained Scientific Intelligence Mining PRESENTER: Xiao Zhou ABSTRACT. Current keyword extraction methodologies typically treat documents as flat sequences, failing to capture global hierarchical structures and often suffering from “functional blindness” regarding the specific semantic roles of keywords. To address these limitations, we propose HC-SEKE, a framework that augments a DeBERTa-v3-large backbone with a parallel Mixture-of-Experts (MoE) module and a hierarchical context scoring mechanism. Furthermore, to efficiently categorize extracted keyphrases into five functional dimensions (i.e., Task, Method, Field, Dataset, Metric), we implement a knowledge distillation and Supervised Fine-Tuning (SFT) strategy that transfers the reasoning capabilities of GLM-4.5 into a lightweight Qwen3-0.6B model, thereby ensuring low-latency inference. Empirical results across benchmark datasets (including Inspec and Krapivin) demonstrate that HC-SEKE significantly outperforms state-of-the-art supervised baselines, achieving an F1@10 score of 57.9% on Inspec (a 1.4% improvement over the SEKE baseline). Additionally, qualitative evaluations and case studies validate the functional analyzer's effectiveness, demonstrating that it can precisely categorize keywords into five functional dimensions (e.g., Method, Task) and successfully recall semantically critical long-tail terms that are often missed by statistical methods. |
| 11:10 | Hybrid Metric-Guided Multi-Agent Debate for Keyphrase Extraction from Scientific Literature PRESENTER: Dianyuan Zhang ABSTRACT. Keyphrase Extraction (KPE) is a foundational task for navigating the exponential growth of scientific literature, facilitating essential applications such as information retrieval, document indexing, and text summarization. While Large Language Models (LLMs) have revolutionized zero-shot information extraction, existing unsupervised methods face significant challenges: (1) Hallucination, where models generate linguistically fluent but factually deviant phrases; and (2) Reasoning Stagnation, where a lack of objective self-correction mechanisms causes models to lock into erroneous initial stances, preventing the generation of new insights. To address these limitations, this paper proposes MetricMAD, a Hybrid Metric-Guided Multi-Agent Debate framework specifically designed for keyphrase extraction. We engineer an adversarial "Extractor-Critic" environment that leverages dialectical interaction to refine candidate phrases. To prevent the debate from devolving into shifting consensus without quality improvement, we introduce a novel destructiveness-based hybrid metric as a hard arbitration mechanism. This metric objectively evaluates the contribution of each phrase to the document's semantic integrity, guiding the multi-agent system toward convergence on semantically precise and factually reliable keyphrases. Extensive experiments across six standard benchmarks (Inspec, Krapivin, NUS, SemEval-2010, SemEval-2017, and DUC2001) demonstrate that MetricMAD significantly outperforms strong baselines and standard LLM prompting strategies without requiring annotated training data. These results establish a new state-of-the-art for zero-shot keyphrase extraction and offer a robust methodology for high-fidelity knowledge acquisition from scientific texts. |
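The Extractor-Critic-Arbiter round can be rendered as a toy sketch, assuming a simple bag-of-words proxy for the paper's destructiveness metric (the actual hybrid metric is not specified in the abstract, and all function names here are invented for illustration):

```python
from collections import Counter
import re

def destructiveness(doc_tokens, phrase):
    """Toy proxy for the hybrid metric: the share of the document's token
    mass that disappears when the phrase's tokens are removed."""
    counts = Counter(doc_tokens)
    total = sum(counts.values())
    removed = sum(counts[t] for t in phrase.split())
    return removed / total

def debate_round(document, proposals, top_k=3):
    """One Extractor -> Critic -> Arbiter round.  The critic vetoes
    phrases absent from the document (a hallucination filter); the
    metric then ranks the survivors as a hard arbitration step."""
    tokens = re.findall(r"[a-z]+", document.lower())
    grounded = [p for p in proposals if p.lower() in document.lower()]  # critic
    ranked = sorted(grounded,
                    key=lambda p: destructiveness(tokens, p.lower()),
                    reverse=True)                                       # arbiter
    return ranked[:top_k]
```

The point of the hard arbitration step is that ranking is decided by an objective score rather than by whichever agent argued last, which is what prevents consensus drift.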
| 10:10 | Mapping Technology Landscapes and Optimization Pathways for Age-Friendly Mobile Applications ABSTRACT. This study aims to systematically identify the core technical features of current age-friendly apps and to quantify differences in aging-friendly technologies across app categories through technical mining methods, thereby providing scientific data support for app optimization. Using app store reviews, product descriptions, and interview transcripts of elderly users as research materials, we adopt LDA topic modeling to build a research framework centered on “text data mining-feature extraction”. Results indicate that technical mining successfully identified “large font display,” “voice interaction,” and “simplified navigation” as high-frequency technical features in age-friendly apps. Financial payment apps performed worst in “simplified operation processes,” while news and information apps showed significant deficiencies in “information density control.” Positive mentions of “voice assistance features” and “real-time feedback mechanisms” in elderly user interviews showed a significant positive correlation with the intensity of technical application. This study expands the application scope of technical mining in evaluating age-friendly products, providing crucial decision-making support for enterprise app iterations and for regulatory bodies establishing technical standards. |
| 10:30 | Identification of Technological Innovation Types Based on Multi-source Heterogeneous Information Network Under Policy Orientation ABSTRACT. Against the backdrop of strengthening the strategic layout of science and technology and promoting industrial transformation towards innovation-driven development, policies have become the core basis for defining key areas of technological innovation and allocating innovation resources. However, current research on technological innovation generally suffers from insufficient integration of the policy dimension, lack of multi-source heterogeneous data fusion, and weak forward-looking prediction capabilities. In response to these issues, this study integrates policy documents, patent literature, and scientific papers to construct a policy-oriented multi-source heterogeneous information network (HIN). Furthermore, the HAN-LSTM model is designed to achieve collaborative fusion of multi-dimensional node features and mine potential policy-technology associations through link prediction. A three-dimensional indicator system is then established to identify types of technological innovation. Subsequently, an empirical study is conducted focusing on the field of solar cells. Experimental results demonstrate that the HAN-LSTM model significantly outperforms baseline models in link prediction performance, effectively identifying incremental technological innovations and radical technological innovations under policy orientation, and providing differentiated guidance on innovation directions for different types of enterprises. This paper not only improves the identification method of technological innovation types under policy orientation but also expands the application scenarios of heterogeneous information networks in the field of technological innovation, thereby providing support for promoting the efficient allocation of policy-led science and technology resources and enterprise innovation decision-making. |
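The link-prediction step over a policy-patent-paper network can be illustrated with the classic common-neighbour baseline on a toy graph; this is a stand-in for exposition only, not a sketch of the HAN-LSTM model itself, and the node names are invented:

```python
from collections import defaultdict

def common_neighbor_scores(edges, candidate_pairs):
    """Score candidate links in a network by counting shared neighbours,
    a standard baseline for the link-prediction step: two nodes that
    already touch many of the same nodes are likely to become linked."""
    neigh = defaultdict(set)
    for u, v in edges:
        neigh[u].add(v)
        neigh[v].add(u)
    return {(a, b): len(neigh[a] & neigh[b]) for a, b in candidate_pairs}
```

In a heterogeneous network the neighbours counted here would be typed (policy, patent, paper), which is exactly the structure attention-based models such as HAN exploit instead of treating all edges alike.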
| 10:50 | Empirical Orthogonal Function Based Analysis of Domain Knowledge Collaborative Evolution: Revealing Knowledge Teleconnection Phenomena PRESENTER: Kaiwen Shi ABSTRACT. Against the backdrop of scientific and technological development increasingly relying on knowledge linkage and restructuring, traditional science and technology management paradigms based on proximity assumptions struggle to reveal the cross-disciplinary, non-adjacent knowledge interaction mechanisms underlying major original innovations. They particularly overlook the implicit, long-range, and time-lagged co-evolutionary patterns within knowledge ecosystems. To address this, this study introduces the concept of knowledge teleconnection, aiming to develop a domain-based co-evolutionary analysis method grounded in empirical orthogonal function (EOF) techniques. This approach seeks to detect and interpret synergistic relationships among knowledge units that are semantically distant within disciplinary spaces yet exhibit statistically correlated developmental trajectories. Using artificial intelligence as a case study, this research analyzes literature data from Web of Science spanning 2000–2024. By extracting key spatio-temporal modes through empirical orthogonal function analysis, it identifies distant correlation pairs within the field—such as between deep learning and traditional heuristic algorithms, or large language models and conventional machine learning—and constructs a distant correlation network to reveal AI's knowledge co-evolutionary structure. The findings not only theoretically expand analytical paradigms for knowledge co-evolution, offering new perspectives on understanding complex dynamic behaviors in innovation systems, but also provide actionable data-driven support for forward-looking science and technology management planning, interdisciplinary cultivation, and disruptive innovation anticipation. This advances the paradigm shift from static knowledge structure mapping to dynamic knowledge system comprehension. |
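EOF analysis of a years-by-keywords frequency matrix reduces to an SVD of the anomaly matrix; a minimal sketch, with all variable names assumed for illustration (a teleconnection then shows up as two keywords with large same-sign loadings on the same mode despite being semantically distant):

```python
import numpy as np

def eof_modes(data, n_modes=2):
    """Empirical Orthogonal Function analysis of a (years x keywords)
    frequency matrix via SVD.  Returns the leading spatial patterns
    (loadings per keyword), their time series, and explained variance."""
    anomalies = data - data.mean(axis=0)          # remove each keyword's mean
    u, s, vt = np.linalg.svd(anomalies, full_matrices=False)
    patterns = vt[:n_modes]                       # EOFs: keyword loadings
    pcs = u[:, :n_modes] * s[:n_modes]            # principal-component series
    explained = (s ** 2 / np.sum(s ** 2))[:n_modes]
    return patterns, pcs, explained
```

This is the same decomposition climatology uses to find teleconnections such as El Nino patterns, which is presumably where the study's terminology comes from.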
| 11:10 | A BERT-Driven Approach for Identifying Patent Risks in Export Control Contexts PRESENTER: Xiaoliang Zhang ABSTRACT. Amidst intensifying global technological competition, traditional manual methods for identifying patent-related export control risks are increasingly inadequate. This study proposes an intelligent screening approach based on Natural Language Processing (NLP). Focusing on quantum technology, we constructed a unified semantic representation model by fine-tuning Sentence-BERT with a weakly-supervised contrastive learning strategy, aligning patent texts with U.S. Export Control Classification Number (ECCN) regulations in a shared embedding space. Evaluated on a test set of 200 patents (81 controlled), the model calculates cosine similarity between patent and ECCN embeddings. Through systematic threshold optimization, it achieved a peak F1-score of 0.633 (Recall: 85.2%, Precision: 50.4%) at the optimal threshold. The high recall rate confirms its effectiveness as a preliminary filter to minimize missed controls. This work validates the feasibility of automated patent screening via semantic similarity. While precision requires further improvement, the proposed method offers an efficient, weakly-supervised solution for matching technical and regulatory texts, with significant practical application potential. |
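The threshold-optimization step described here (sweep candidate thresholds over cosine similarities and keep the F1-maximising one) is straightforward to sketch; the helper names are illustrative, and the embeddings would come from the fine-tuned Sentence-BERT model:

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(y * y for y in b)))

def best_threshold(scores, labels):
    """Sweep every observed similarity score as a candidate threshold and
    return the threshold maximising F1 over the labelled test set, as in
    the abstract's threshold-optimization step."""
    best_t, best_f1 = None, -1.0
    for t in sorted(set(scores)):
        preds = [s >= t for s in scores]
        tp = sum(p and y for p, y in zip(preds, labels))
        fp = sum(p and not y for p, y in zip(preds, labels))
        fn = sum((not p) and y for p, y in zip(preds, labels))
        f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t, best_f1
```

For a screening filter one could equally sweep for maximum recall subject to a precision floor, which matches the abstract's emphasis on minimizing missed controls.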