GTM2018: 8TH GLOBAL TECHMINING CONFERENCE
PROGRAM FOR TUESDAY, SEPTEMBER 11TH

09:00-09:30 Session 1: Welcome and Keynote

Welcome

Denise Chiavetta and Alan Porter

Keynote

"Sleeping Beauties Cited by Patents: Is there a Dormitory of Inventions?"

Dr. Ton (A.) F.J. van Raan

Location: Aalmarktzaal
09:35-10:55 Session 2A: Using Topics
Location: Aalmarktzaal
09:35
Visualizing Dispersed Risk Signals for a Specific Emerging Technology: A Novel Approach of Keywords Aggregation across Topics (KAaT)

ABSTRACT. Over the past decades, the rapid growth of emerging technologies (e.g. nanotechnology, biomass energy, synthetic biology and genetic engineering) has had a huge impact on environmental, economic and social systems, and some industrial symbioses have also encountered dramatic or revolutionary changes and new opportunities. Meanwhile, timely visualization of the potential risks to socio-economic systems and ecosystems is becoming a critical issue for public policy, strategic management and other areas of governance. However, studies on risk analysis or risk signals for a specific emerging technology are fragmented and dispersed across many different categories and disciplines. For example, risk analyses of nanotechnology involve social communication, environmental science, toxicology, occupational health and so on. Consequently, a great deal of domain knowledge is required to identify and collect the relevant risk signals for a specific emerging technology. Against this background, this article raises two research questions. Q1: For a specific emerging technology, can we visualize the risk signals and relevant works in a timely and efficient manner, especially in the early period? Q2: How can we evaluate the visualization or analysis results produced in Q1? To address these two questions, a novel approach of keywords aggregation across topics (KAaT) is proposed, and a subsequent evaluation method is also discussed. To verify the validity and completeness of KAaT, an empirical study on synthetic biology is conducted.

09:55
Text Enrichment-Based Enhanced Patent Mining using Clustering Techniques
SPEAKER: C. Okan Sakar

ABSTRACT. To follow rapid developments in a high-tech field and to be a leading firm in a cutting-edge technology, either an R&D-intensive approach (e.g. IBM) or an innovation-efficient approach (e.g. Apple) is essential. There is therefore an increasing effort to analyse text data gathered from diverse sources, such as patent databases, scientific publications and social network platforms, in order to explore promising technology fields and potential applications of cutting-edge technologies. In this study we investigate, to the best of our knowledge for the first time, the usefulness of text enrichment, which has been used successfully in other fields with different enrichment methods, for clustering patent documents. Given the sparsity of the co-occurrence matrix built from the abstracts and titles of the patent documents, we use text enrichment to obtain clusters that better represent the fields within the relevant topic. The knowledge base used to enrich the co-occurrence matrix is the English Wikipedia corpus, which has been used successfully for text enrichment in the literature. The results show that text enrichment improves the evaluation metrics of all clustering algorithms by 6% to 18% compared with the results obtained with the initial matrix. This improvement proved to be closely related to the initial sparsity of the matrix, as the scores increased when the term frequency threshold was raised. We also observe that the resulting clusters have a better distribution of words, which form meaningful cloud-computing-related phrases.

10:15
Technological convergence as antecedent of technological speciation - Applying dynamic topic modelling and patent-laning to the action camera technology

ABSTRACT. In a recent paper, Moehrle and Caferoglu (2017) introduced technological speciation as a source of emerging technologies. Applying a three-step method to mainstream camera technology, the authors identified technological speciation candidates such as the action camera, the dashboard camera and the depth camera. Remarkably, for each speciation candidate they found some technological elements from other knowledge fields. They left open the question of how to trace back those technological elements in order to better understand the specific speciation. In particular, analysts are interested in which knowledge roots characterize the speciation technology, as this may help assess speciation candidates with regard to novelty and complexity. To examine this open question, we focus on the action camera and apply a four-step method to understand which knowledge converged with the mainstream technology knowledge and led to the emergence of the action camera. Applying our method to action camera technology, we observe that the action camera emerges from the convergence of technical knowledge about attachment & mounting, stable lens parts (image stabilization) and wide-angle lenses (image stabilization). Our approach has theoretical as well as managerial implications; for instance, it shows that recombining existing technical knowledge with other technologies can lead to the emergence of new technologies through technological speciation. On the practical side, we provide mainstream technology managers with a method to detect which characteristic knowledge is needed to enter a market niche.

10:35
Evaluation of Enterprise Technology Competitiveness Based on Technology Topics
SPEAKER: Xuemei Yang

ABSTRACT. Technology is the most fundamental part of a company's core competitiveness. Patents, as the most important carrier of technology, have gradually become one of the objects of research on technology competitiveness at home and abroad. At present, the IPC and Derwent classification codes used in patent mining are too general and lack timeliness and scientific rigor. In this regard, this paper uses the LDA model for topic clustering to determine the probability distribution of company-patent texts-topics-feature words. A company-level evaluation system for technology competitiveness is established based on the technology coverage index, the technology specialization index and the technology activity index. 3D printing technology is taken as a practical application. The method is used to study the technology competitiveness of key companies in this field, identifying a company's competitors and determining its R&D strategies. In addition, the paper provides a new perspective for the evaluation of technology competitiveness.

09:35-10:55 Session 2B: Innovation Indicators
09:35
Insights from bibliometric network properties into technology evolution - wind energy example
SPEAKER: Elisa Boelman

ABSTRACT. This paper explores possibilities for analysing the evolution in time of basic properties of bibliometric networks, putting numbers to different network topologies observed in keyword co-occurrence maps for wind energy, to help detect and better understand patterns of technology emergence and advancement. We use the JRC's Tools for Innovation Monitoring (TIM) to retrieve bibliometric data on wind energy from the SCOPUS database, and Gephi for tabulating the network metrics. Author-keyword co-occurrence networks for 'Wind Energy' densify faster than both the more established 'Electric Power' and the more emerging sub-technology 'Wind Turbine Blades'. For 'Wind' author-keywords, changes in network structure and metrics suggest evolution from a relatively sparse to a centre-periphery structure with a visible giant component. The results seem to confirm the potential relevance of bibliometric-network metrics for mapping technology families.
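
As a minimal sketch of the kind of network metric the abstract refers to, the following computes the density of an author-keyword co-occurrence network per year. The keyword lists, the function names and the choice of density as the metric are illustrative assumptions; the abstract itself relies on TIM and Gephi, whose workflows are not reproduced here.

```python
from itertools import combinations


def cooccurrence_edges(records):
    """Return the set of undirected keyword pairs that co-occur in at least one record."""
    edges = set()
    for keywords in records:
        unique = sorted({k.lower() for k in keywords})
        edges.update(combinations(unique, 2))
    return edges


def density(edges):
    """Density of an undirected simple graph: 2m / (n (n - 1)).

    Only keywords that appear in at least one pair are counted as nodes.
    """
    nodes = {k for pair in edges for k in pair}
    n, m = len(nodes), len(edges)
    return 0.0 if n < 2 else 2 * m / (n * (n - 1))


# Hypothetical author-keyword lists grouped by publication year (illustrative only).
records_by_year = {
    2000: [["wind energy", "turbine"], ["grid", "policy"]],
    2010: [["wind energy", "turbine", "grid"], ["turbine", "blade"]],
}
density_by_year = {
    year: density(cooccurrence_edges(records))
    for year, records in records_by_year.items()
}
```

Tracking such a value over successive years is one simple way to put a number on whether a keyword network is densifying.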

09:55
Knowledge Transfer in Industry-University-Research Institute Collaboration Network: A Perspective of Crossing the “Valley of Death”

ABSTRACT. With enterprises participating in innovation activities, Industry-University-Research Institute Collaboration (IURIC) has become an emerging innovation mode. Through this collaboration, it is possible to build a bridge for the transfer of new knowledge and technologies generated within universities or research institutes. This transfer can be a way of crossing the “Valley of Death”, which refers to the often problematic shift from research to product development. From this perspective, we focus on the research question: how can we better identify potential IURIC partners who have the potential for knowledge transfer? Answering this question may contribute to crossing the “Valley of Death”. We address it by applying the analytical method of multilayer complex networks. In this study, the innovation actors (firms, universities and research institutes) and the diverse knowledge are heterogeneous. By displaying the knowledge in a single layer, the links between layers can show the process of knowledge transfer clearly, and the interaction between innovation actors and knowledge can be analyzed through node attributes in multilayer networks. Together, these can help select potential IURIC partners and transferable knowledge. We design a research framework comprising three steps and detail it in the extended abstract.

10:15
How do the innovative technologies spread?

ABSTRACT. Technological innovation influences the economy and society, and it is essential for surviving intense competition and market saturation. Pursuing technological innovation in emerging areas may be particularly effective. Emerging areas contain various topics, and these topics evolve through co-existence, competition or extinction. Understanding the topic dynamics of emerging innovations can therefore contribute to further technological development. In particular, the city can be considered an important unit of innovation. The occurrence of technological innovation can vary by city, since the technological advances of economic agents can generate innovations in regions. Furthermore, spillovers of technological innovation among regions have become more important and have spread in the open innovation system.

How does the global diffusion of innovative ideas occur? What does it imply for technology management? In terms of technology flow, this paper identifies the emerging areas from the entire set of triadic patents. Then Latent Dirichlet Allocation (LDA), a topic modeling technique, is applied to extract the hot topics from the triadic patents and their IPCs in the emerging areas. LDA can represent the fact that a technology can be affiliated with multiple topics. The structure of the spread dynamics is examined using the Susceptible-Infectious-Susceptible (SIS) network model. The results imply that policy efforts to facilitate the global spillover of innovation are important for pursuing technological innovation. The findings also imply that such efforts need to be carried out in a proper manner in order to profit globally from innovation.
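
To illustrate what a Susceptible-Infectious-Susceptible process on a network looks like, here is a small sketch using a discrete-time, individual-based mean-field approximation. The ring of four cities, the rates `beta` and `gamma` and the function names are hypothetical; the abstract does not specify which SIS formulation the authors used.

```python
def sis_step(p, adjacency, beta, gamma):
    """One discrete-time step of an individual-based mean-field SIS approximation.

    p[i] is the probability that node i currently holds the innovation
    ("infectious"); beta is the per-link adoption rate and gamma the rate
    at which a node drops the innovation and becomes susceptible again.
    """
    new_p = []
    for i, p_i in enumerate(p):
        # Probability that node i is not reached by any neighbour in this step.
        escape = 1.0
        for j in adjacency[i]:
            escape *= 1.0 - beta * p[j]
        new_p.append((1.0 - p_i) * (1.0 - escape) + (1.0 - gamma) * p_i)
    return new_p


def simulate_sis(p0, adjacency, beta, gamma, steps):
    """Iterate sis_step for a fixed number of steps and return the final state."""
    p = list(p0)
    for _ in range(steps):
        p = sis_step(p, adjacency, beta, gamma)
    return p


# Hypothetical ring of four cities; only city 0 starts with the innovation.
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}
final = simulate_sis([1.0, 0.0, 0.0, 0.0], ring, beta=0.3, gamma=0.2, steps=50)
```

In a sketch like this, whether the innovation persists or dies out depends on the ratio of `beta` to `gamma` and on the network structure, which is the kind of question a spread-dynamics analysis addresses.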

10:35
The Techno Economic Segments (TES) analytical approach applied to the Artificial Intelligence emerging ecosystems in the perspective of policy support

ABSTRACT. Digital transformation is making it more important for policy makers to better understand technological segments and business ecosystems that are key but can neither be tracked through available official statistics nor mapped onto existing industrial sectors. In the case of Artificial Intelligence (AI), understanding this segment is expected to be of pivotal importance given the deep impact it is expected to have on every aspect of the economy and society. Policy makers can benefit from knowing the value chain of the emerging AI techno-economic segment (TES), understanding the different involvement and roles of stakeholders worldwide, and interpreting evolutionary trends as they develop. To monitor and interpret the characteristics, boundaries and dynamics of the AI TES, the "TES" analytical approach is applied. It aims to identify and analyse techno-economic segments based on tech-mining of a number of data sources (including, for example, patent documents, bibliographic information, other data on innovative activities and funding, as well as specific literature and media, among other sources) and on the application of analytical methods including pattern recognition, topic modelling and knowledge discovery, relying on a graph database. The objective is to describe the AI ecosystems and networks of players, technologies and locations, and finally to extract aspects relevant to a policy-oriented analysis (assessment of strengths and weaknesses, orientation or evaluation of R&D programs, etc.). Multilayer network analysis and the identification of communities make it possible to uncover hidden behaviour patterns or dynamics, while topic modelling helps in understanding emerging specific technology combinations over time.

10:55-11:20 Coffee Break
11:20-12:40 Session 3A: Science-Technology
Location: Aalmarktzaal
11:20
Study on innovation path recognition based on topics association of science and technology
SPEAKER: Haiyun Xu

ABSTRACT. This study focuses on both science and technology through bibliometric analysis, exploring a method for identifying innovation paths based on the association of science and technology topics. The paper is structured as follows. First, we define the connotations, concepts and research status of the scientific innovation path. Second, we discuss the relationship between science and technology based on existing research. Third, in order to recognize the innovation evolution path from an interactive perspective, we reveal the interaction of science and technology by analyzing the relevance among science and technology innovation topics at the micro level. Genetically engineered vaccines (GEV) are selected as our empirical field. Finally, we summarize the contributions and limitations of this study.

11:40
Science-technology interactions: using NPLRs as glue

ABSTRACT. Scientific and scholarly research may result in a new discovery. The nature and impact of such a discovery on the cognitive structure and evolution of science may vary considerably. The impact of discoveries may extend beyond the domain of science and may be a crucial step towards technological applications, innovations and products. Scientific discoveries and their incorporation in technology are often interlinked in complex ways within research and development (R&D) systems. Such interactions may span several years, decades, or even centuries. The complex relations between scientific discoveries and technological developments have been the subject of several studies for decades. Well-known landmark studies include the work of Jewkes et al. (1958) and the Hindsight study (Isenson, 1969). The goal of these studies was not only to identify linkages between scientific discoveries and technological developments but also to find the relevant conditions that play an important role. We use non-patent literature references in patents (NPLRs) to construct an infrastructure for analysing science-technology linkages. One of our goals is to use this infrastructure to detect new and emerging technologies.

12:00
Neural Network-Based Paper-Matching with Relevant Products through Patents
SPEAKER: Seonho Hwang

ABSTRACT. Firms would like to use research publications to interpret research activities from the viewpoint of products of interest, in order to plan future R&D that can lead to successful product launches. That, however, is not easy, because publications are at the knowledge level and products are at the artifact level. We propose a methodology by which research publications can be linked to product fields and examined from the product perspective. For this, we used a classifier implemented as a CNN-based neural network trained on a large set of patent data (122,411 patents) and word vectors generated by a word embedding algorithm. We applied the methodology to Google’s publications in the ‘machine intelligence’ area, as this area has more publications than any other area at Google. We were able to find the major product fields for which machine intelligence is actively utilized at Google and the changes in the distribution of the product fields over time. We also compared the distribution in publications with that in patents to identify the difference between the two domains from the common product perspective. We think the methodology can be extended from one research topic in one firm to multiple topics in multiple firms to construct the research landscape from a product perspective.

12:20
Technology opportunity identification combining SAO semantic analysis and link prediction
SPEAKER: Jia Li

ABSTRACT. Technology opportunity identification has been regarded as a crucial process for companies because of the success of many entrepreneurs who have identified and exploited these opportunities. Most researchers focus on finding emerging technologies using keyword-based analysis (KWA). However, KWA cannot represent how a technology is used and how it interacts with other technologies. Subject-Action-Object (SAO) analysis is therefore generally used to overcome this shortcoming of the keyword-based method. Previous studies, however, either simply analyze the characteristics of the SAO network using degree analysis, centrality analysis or other social network analysis (SNA) indices, which have no predictive power for future links among nodes, or combine unconnected S and O by manual observation, which is time-consuming, laborious and lacking in prediction accuracy. To solve this problem, we build an SAO network from SAO structures extracted from the abstracts of patent documents and use link prediction to predict links between S and O that are currently unlinked. We treat the network as unweighted and undirected and make a preliminary prediction using Common Neighbors (CN), Adamic-Adar (AA) and Resource Allocation (RA). Finally, we use as a case study a patent dataset on 3D printing, which builds objects layer by layer from digital model files. The case study, which measures the likelihood of a connection between two nodes, indicates the feasibility of our method and the accuracy of the link prediction.
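
The three link-prediction indices named in the abstract have standard definitions, which can be sketched as follows for an undirected simple graph. The toy edge list and the helper names are hypothetical and are not taken from the authors' SAO network.

```python
import math
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbours)} from an edge list."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency


def link_scores(adjacency):
    """Score every currently unlinked node pair with CN, AA and RA."""
    scores = {}
    for u, v in combinations(sorted(adjacency), 2):
        if v in adjacency[u]:
            continue  # already linked, nothing to predict
        common = adjacency[u] & adjacency[v]
        scores[(u, v)] = {
            # Common Neighbors: number of shared neighbours.
            "CN": len(common),
            # Adamic-Adar: shared neighbours weighted by 1 / log(degree).
            # A shared neighbour always has degree >= 2, so the log is positive.
            "AA": sum(1.0 / math.log(len(adjacency[z])) for z in common),
            # Resource Allocation: shared neighbours weighted by 1 / degree.
            "RA": sum(1.0 / len(adjacency[z]) for z in common),
        }
    return scores


# Hypothetical subject/object terms linked by extracted SAO structures.
edges = [
    ("laser", "powder"), ("laser", "resin"), ("nozzle", "filament"),
    ("nozzle", "resin"), ("printer", "laser"), ("printer", "nozzle"),
]
scores = link_scores(build_adjacency(edges))
ranked = sorted(scores, key=lambda pair: scores[pair]["RA"], reverse=True)
```

Ranking the unlinked pairs by one of these scores gives a candidate list of subject-object combinations to inspect; how the authors threshold or evaluate such a list is not stated in the abstract.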

11:20-12:40 Session 3B: Methods in Technology Management
11:20
Exploring Barriers to Interdisciplinary Research (IDR)

ABSTRACT. Interdisciplinary research (IDR), i.e. research that builds on a set of theories, data, and methods that are not available within a single discipline, is conceived as capable of generating novel knowledge to address complex societal problems. Nonetheless, our understanding of which barriers hinder researchers from undertaking IDR is somewhat limited. To address this gap, we surveyed three groups of stakeholders in the UK Higher Education (HE) system: a sample of 16,625 researchers that we identified on the basis of Web of Science publication records from 2013 to 2015; a sample of 1,080 research managers that we identified by examining the websites of 15 UK HE institutions; and a sample of 962 managers in research funding organisations that we identified from funding calls released from May 2015 to May 2016. We received 2,183 responses from researchers, 367 from managers in HE institutions, and 94 from managers in research funding organisations. The survey analysis revealed barriers to IDR in relation to (i) collaboration, (ii) career, (iii) evaluation, and (iv) funding. For example, respondents were particularly concerned with the need for more time and resources for IDR to enable researchers to identify partners and to develop shared languages. Recruitment and promotion criteria were also reported to hinder IDR efforts considerably, since IDR is often perceived as less rigorous than more established lines of research. Publishing the outcomes of IDR efforts in leading disciplinary journals was also found to be more challenging, as was obtaining funding for IDR proposals.

11:40
A new method for Monitoring Competitors’ Innovation Activities. Creating Competitive Patent Maps Based on Semantic Anchor Points

ABSTRACT. The movements of competitors, their innovative endeavors and the targets of their efforts provide necessary information for a company’s business intelligence. Observing competitors’ publications, websites or existing products can answer some questions, but offers future-oriented insights only to a limited degree. In contrast, patenting behavior is a future-oriented indicator of competitive innovation activities and can thus be used as proxy data for monitoring processes (Peeters and de la Potterie, 2006). The early availability of patents and their structure underline this advantage. Usually, the search for and analysis of competitors’ patents are done manually. This certainly yields qualitative information, but quantitative information could be used to measure and visualize what competitors really do. For this purpose, we adapt and further develop an approach based on a method introduced by Moehrle and Passing (2016) and Passing (2017) for the analysis of technological convergence. The primary idea behind this approach is to use semantic analyses to take the unstructured data of patents into account. To introduce this approach for monitoring competitors and their innovation activities, we make four design decisions. Beginning with the operationalization of the competitive environment in design decision 1, we develop semantic anchor points in design decision 2. In design decision 3, we measure semantic similarities between selected patents and the semantic anchor points. In design decision 4, we analyze the data in different ways and show the competitive landscape of the analyzed companies.

12:00
Exploring TF-IDF encoding for comprehensive industry partners’ selection
SPEAKER: Yuri Campbell

ABSTRACT. We propose a novel method to support public research organizations in their search for industry partners, which differs from existing approaches in its range. While most approaches mine information only from companies with R&D and/or patent filing activities, our approach is able to consider a vast spectrum of potential partners by using widely available data, such as text content and firm attributes from commercial databases. We show that the TF-IDF information encoding technique, together with an indirect technological fit estimation, can reliably identify promising industry partners from a wide range of companies. We test the performance of the approach using data on cooperations of over sixty research institutes belonging to a large research organization in Europe. Using the proposed indicators in a classifier model turns out to be a powerful tool for predicting cooperation activity, especially when combined with economic indicators such as turnover, age and the number of managers hired by the company.
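
For readers unfamiliar with TF-IDF encoding, the sketch below encodes a few short texts and compares them by cosine similarity. The texts, the whitespace tokenizer and the plain `tf * log(N / df)` weighting are assumptions chosen for brevity; this is not the authors' indirect technological fit estimation, which the abstract does not describe in detail.

```python
import math
from collections import Counter


def tfidf_vectors(documents):
    """Encode each document as a sparse {term: tf * idf} dictionary."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors (0.0 if either is all zeros)."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(sum(w * w for w in b.values()))
    return dot / norm if norm else 0.0


# Hypothetical texts: one research institute profile and two company descriptions.
texts = [
    "laser sintering of metal powder",
    "metal powder supplier for laser sintering machines",
    "organic dairy farm and cheese shop",
]
institute, company_a, company_b = tfidf_vectors(texts)
fit_a = cosine(institute, company_a)
fit_b = cosine(institute, company_b)
```

Because the encoding needs only text, it can in principle be applied to any company with a textual description, which matches the abstract's point about covering firms without R&D or patent records.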

12:20
Identifying Potential R&D Partners: Combination of Technology Complementarity and Absorptive Capacity
SPEAKER: Yali Qiao

ABSTRACT. In the current fiercely competitive environment, firms are compelled to integrate external technology through collaboration to sustain their innovation capability. To achieve the desired innovation, it is essential for firms to identify appropriate potential R&D collaborators. Previous research on identifying potential partners has mostly focused on technology similarity or on enterprises’ acquisition and development ability, but has ignored the importance of technology diversification for innovation. Research on complementary technology, in turn, has neglected the fact that too much heterogeneity hinders enterprises’ absorption and thus lowers innovation performance. To bridge these gaps, we propose a systematic framework that helps enterprises choose appropriate R&D partners by combining technology complementarity with the enterprises’ own absorptive capacity. The framework involves three perspectives. First, we introduce an improved method of measuring technology complementarity to identify potential R&D partners. Second, we propose a framework for evaluating enterprises’ absorptive capacity to help find potential partners whose complementary technology they can absorb. Finally, by exploring the targeted partners’ development strategy and willingness to collaborate, the potential collaboration candidates are located. Compared with single-perspective methods, the proposed framework stresses the importance of both technology complementarity and enterprises’ absorptive capacity, and can help enterprises extend the scope of their invention search and create higher-quality inventions.

12:40-14:10 Lunch Break
13:15-13:55 Session 4: Power Talks
Location: Aalmarktzaal
13:15
DETERMINATION OF TECHNOLOGY FRONTS AND DYNAMICS OF CHANGE IN 3D PRINTING USING TECH MINING AND LDA ANALYSIS

ABSTRACT. 3D printing is one of the technologies that aim at transforming the way manufacturing operations are designed and managed all over the world. This paper presents a new approach for determining the “technology fronts” underlying the development of 3D printing technology, using Latent Dirichlet Allocation (LDA) combined with tech mining techniques to identify and dynamically characterize the main fronts where actual technology solutions are put into practice. The results show that the development of new materials and methods for automatically obtaining and processing 3D printing data are present in all the years analyzed, while Selective Laser Sintering (SLS), transmission & positioning, and printing head positioning mechanics gain relevance in the later years. Forward and backward citation analysis of each topic shows that the transmission & positioning and SLS topics increase in dynamism and relevance, and the results of text mining analysis bear out these trends, showing increasing divergence in the relevance of the main technology concepts each topic deals with, as well as a noticeable emergence of new concepts. IPC analysis shows a significant and almost simultaneous increase in the number of new IPCs per patent in the 3D printing data, printing materials and SLS topics, the latter two possibly pointing to an increasingly relevant area of development of new materials for laser-melting-based 3D printing techniques, as other evidence we are collecting may also suggest. We consider that this work presents an effective method for the identification and dynamic characterization of the “technology problems” underlying a broad technology field.

13:20
Science Map of Nature Index High-quality Research
SPEAKER: Guopeng Li

ABSTRACT. Determining the recent structure of science and its research fronts at an early stage is extremely helpful for scientific decision makers in setting R&D policies and planning future moves. This research aims to create a science map for identifying the latest research structure and fronts of natural science based on the most recent high-quality journals selected by Nature Index (NI). The science map was built in the following steps: detection of research topics by SLM clustering with hybrid similarity (bibliographic coupling and text), visualization of the macro structure of four basic research disciplines as a force-directed graph with the OpenOrd layout, and division into subject groups by K-means and DBSCAN clustering. Once the research structure was discovered, we calculated and analyzed the distribution of diversity and altmetrics scores over topics and topic groups. In contrast to the existing Nature Index statistical indicators, this research compares the outputs, layouts and impact of research fronts by visually overlaying the publications shared among countries on the map.

13:25
Detecting the Landscapes and Hotspots of Scientometrics: A Full-Text Citation Analysis based on Semantic Technology
SPEAKER: Zili Li

ABSTRACT. With the rapid development of information technology, the era of Big Data has arrived. Big Data technology has brought great opportunities for research on technology mining, but the "data dizzy" and "data redundancy" effects it brings cannot be ignored. As one of the basic methods of technology mining, scientometrics faces the same opportunities and challenges. To meet these challenges, an in-depth analysis of scientometrics was conducted. Using the papers published in Scientometrics from 1978 to 2017 and available in the SpringerLink database, a full-text citation analysis based on semantic technology is used to quantitatively assess the basic status, landscapes, hotspots and future development trends of the “Scientometrics” research area. Besides traditional methods such as co-word analysis, main path analysis and sleeping beauty paper recognition, novel methods such as dynamic topic models and word vector models are used, and a three-dimensional visualization technology is proposed. The results show that these methods can provide a dynamic view of the evolution of scientometrics research landscapes, hotspots and trends from various perspectives, which may serve as a guide for future research.

13:30
Bibliometric Analysis of the Semantic Mining Research Status with the Data from Web of Science
SPEAKER: Zhao Zhao

ABSTRACT. Using 2460 papers obtained from the Web of Science database for the period 1991 to 2018 as the research sample, this paper presents a comprehensive bibliometric analysis of the research status, trends and hotspots in the domain of semantic mining. The results indicate that current global semantic mining research is of great value; that knowledge is mainly distributed across computer science, engineering and linguistics; and that international academic communication in the semantic mining field is flourishing and concentrated in three major regions: East Asia, North America and Western Europe. In addition, the research hotspots shown in the keyword co-occurrence map are technology research, represented by text mining; theory research, represented by ontology and semantic networks; and application research, represented by knowledge discovery and information extraction. The current research fronts can be categorized into two layers: model research using deep learning technology for semantic mining, and application research such as applying semantic mining to social media. Finally, we discuss using the logistic curve as a mathematical model to predict the future number of papers; the model indicates that the field is still in the growth stage at present and that we need to seize the golden age of the next five years.
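
A logistic growth curve of the kind mentioned in the abstract can be written as N(t) = K / (1 + exp(-r (t - t0))), where K is the saturation level, r the growth rate and t0 the inflection year. The sketch below fits r and t0 by least squares on the linearized form, assuming K is given; the synthetic counts and all parameter values are hypothetical and are not the paper's data or fitting procedure.

```python
import math


def logistic(t, K, r, t0):
    """Cumulative count predicted by a logistic curve with ceiling K."""
    return K / (1.0 + math.exp(-r * (t - t0)))


def fit_logistic_given_K(years, counts, K):
    """Fit r and t0 for a known K via ln(K/N - 1) = -r*t + r*t0.

    Requires 0 < N < K for every observed cumulative count N.
    """
    ys = [math.log(K / n - 1.0) for n in counts]
    mean_t = sum(years) / len(years)
    mean_y = sum(ys) / len(ys)
    slope = (
        sum((t - mean_t) * (y - mean_y) for t, y in zip(years, ys))
        / sum((t - mean_t) ** 2 for t in years)
    )
    intercept = mean_y - slope * mean_t
    r = -slope
    t0 = intercept / r
    return r, t0


# Synthetic cumulative paper counts generated from assumed parameters (illustrative only).
K_ASSUMED = 5000.0
years = list(range(2000, 2019))
counts = [logistic(t, K_ASSUMED, 0.3, 2020.0) for t in years]

r_hat, t0_hat = fit_logistic_given_K(years, counts, K_ASSUMED)
forecast_2023 = logistic(2023, K_ASSUMED, r_hat, t0_hat)
```

If the last observed year lies before the fitted inflection year t0, the curve is still in its accelerating phase, which is one way to read a statement that a field is in the growth stage.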

13:35
ANALYZING THEORETICAL ROOTS OF EMERGENCE WITH AN EVOLUTIONARY PERSPECTIVE

ABSTRACT. Emergence in technology has been the subject of many research studies since knowledge came to be accepted as the engine of economic growth. However, even though there is a growing number of publications in the literature and many different phrases are used, the concept remains ambiguous. This study aims to trace discussions of emergence back to their origins and to follow the evolution of the concept in order to compare its usage in the technological context. To this end, the literatures of the philosophy of science, complexity and economics are reviewed qualitatively with respect to the concept of emergence. A bibliometric study is then performed to strengthen the qualitative arguments and to identify studies of emergence in technology for comparison. Based on the findings, it is asserted that technological emergence should be handled by describing the predictive aspects of emergence, finding scientific creativity networks and distinguishing qualitative synergy in scientific knowledge production networks.

13:40
Technology Complementarity Measurement on Enterprise Level Based on Technical Topics
SPEAKER: Yujia Hou

ABSTRACT. With rapid economic growth and constant changes in social needs, the complexity of new products is constantly increasing. In some emerging industries, technological innovation is markedly cumulative, and complementary technologies often need to be introduced to develop new products and services. Such complementarity can promote major technological innovations, and it matters for enterprises seeking technically complementary collaborators. However, research on technology complementarity is relatively weak, especially regarding quantitative measurement of the complementarity of patent technologies at the enterprise level, which is essential for enterprises selecting collaborators. Moreover, studies that quantitatively measure the complementarity of patent technologies are mainly based on IPC classification codes and patent citations, which suffer from problems such as a lack of accuracy and timeliness. To overcome these shortcomings, we start from the contents of patent texts and, based on technical topics, propose a framework to measure technology complementarity at the enterprise level. A case study measuring technology complementarity between enterprises in 3D printing demonstrates the reliability of our method, and the results indicate its practical value for obtaining more accurate results.

13:45
Bibliometric evaluation of space earth science in countries along the Belt and Road
SPEAKER: Fan Yang

ABSTRACT. Strengthening international cooperation between China and the countries along the Belt and Road (B&R) routes in Earth observation science, technology and applications will significantly improve the comprehensive capability of national spatial information systems and thus promote the common development of all related countries. This study analyses the research performance of the countries along the B&R routes from 2000 to 2016 in the field of space earth science, using bibliometric statistics and visualization analysis (bibliographic data for the major space earth science missions are derived from the Web of Science database). The work assesses publication scale, academic influence, dominant disciplines, and research hotspots in major countries and core regions, in the hope of contributing to a comprehensive understanding of national research strength, disciplines and development potential in space earth science along the B&R routes. The work also discusses the characteristics and evolution of research cooperation between major countries and core regions, including international cooperation and cooperation among the countries and regions along the B&R routes. B&R regions and countries differ significantly in cooperative research. China is far from the center of the B&R international cooperation network. The cooperation strength of B&R countries with China is significantly lower than with other space powers, suggesting that there is much room and opportunity for cooperation between China and B&R countries in space earth science.

13:50
Exploration of a Science-technology Relationship Index and its Measurement Algorithm
SPEAKER: Yan Qi

ABSTRACT. This paper focuses on measuring the linkage between science and technology from a bidirectional, content-based perspective, which differs from existing studies. The ‘Science linkage’ and ‘Technology linkage’ indexes based on citation analysis address only a one-sided association between science and technology, and co-word analysis is too detailed to be easy to operate. Therefore, we settle on the mesoscopic level of the ‘Topic’. We believe that the accumulated relevance between the topics of a set of papers and a set of patents reflects, to a certain degree, the relationship between the science and technology represented by these two sets. A topic model (e.g., LDA or PLDA) is used to generate the topics of the paper set and the patent set, and the research topics common to the two sets are then identified in order to calculate our new index, STL, which measures the bilateral relationship between science and technology along the content dimension. We designed the simplest algorithm and carried out an empirical test in the hepatitis C virus (HCV) research field. Finally, we summarize the contributions and limitations of this study.

14:10-15:30 Session 5A: Technology Evolution
Location: Aalmarktzaal
14:10
Patenting in Post-Secondary Institutions
SPEAKER: Marc Neville

ABSTRACT. Canada is a worldwide leader in terms of academic research and has one of the most-educated workforces in the world. A recently released report from the Canadian House of Commons Standing Committee on Industry, Science and Technology entitled Intellectual Property and Technology Transfer: Promoting Best Practices provided policy recommendations on topics that included commercializing academic intellectual property (IP). In anticipation of the future demand for analysis around academic patenting activities in Canada, the Canadian Intellectual Property Office (CIPO) decided to create a repository of patents held by post-secondary institutions (PSIs) and their associated inventors (professors, post-docs, graduate students, etc.). The main challenge with creating this repository is that not all Canadian PSIs follow the same policy around IP protection and the association between patent and PSI can be missed.

The objective of this project is to create an academic patent data repository. This repository will be an easily manageable dataset of patents associated with PSIs that will be used to identify and analyze their patenting activities. This exercise, using web scraping and data matching techniques, will serve as an opportunity for CIPO to document techniques and best practices for data matching with Python scripts that will be useful in future projects. It will also create a repository that will be used to undertake an extensive analysis of patenting by PSIs in Canada and to create useful indicators around patent data and PSIs.

14:30
A patent citation-based perspective to explore the technology life cycle
SPEAKER: Ying Huang

ABSTRACT. The ability to analyze and monitor the history and current stage of a particular technology is a critical asset to gain competitive advantage and to identify promising opportunities. Technology often presents different development tracks; therefore, it is necessary to consider the technology life cycle when creating a distinct R&D strategy plan. The technology life cycle comprises a pattern of dynamic characteristics pertaining to technology, in which its innovative and economic outcomes change over time. Nowadays, more and more researchers tend to introduce multiple indicators to measure the technology life cycle. Though such statistical indicators offer a convenient way to make a quick sense of the technological stage, they ignore the technology nature of internal knowledge flow and knowledge overflow. In other words, such traditional indicator-based methods cannot explain the dynamic mechanism of technology evolution and fail to determine inner representation. In this paper, we hold the view that the process of technology evolution can be interpreted through the evolution of patent citation behavior.

14:50
Technology Evolution Analysis Based on SPO using patent documents: a Case Study of Induced Pluripotent Stem Cells
SPEAKER: Chunjiang Liu

ABSTRACT. SPO predications consist of a subject argument (noun phrase), an object argument (noun phrase), and the relation that binds them (verb phrase). They can represent science and technology (S&T) information in more detail yet in a simple manner, and have been widely applied in Knowledge Discovery in Biomedical Literature (KDiBL). In this work, SPO predications are extracted from the literature and cleaned, and the technology is described by these predications. Young et al. proposed a method for drawing a technology evolution map of keywords by calculating the distributions of keywords over document cluster groups. This paper follows Young’s research, using SPO predications instead of keywords. Induced Pluripotent Stem Cell (IPSC) patent documents are selected as a case study.

15:10
R&D trend analysis based on patent mining: an integrated use of patent application and invalidation data
SPEAKER: Xiaotong Han

ABSTRACT. To formulate suitable R&D strategies, enterprises should know the R&D trends of their competitors and of the industry they are involved in. Patent legal status, which previous studies have ignored, plays a significant role in analyzing R&D trends more accurately. We propose an approach to analyzing R&D trends across a whole industry based on both patent application and invalidation data, with the aim of advising enterprises on R&D strategies. First, an LDA topic model is adapted to identify the technology topics of each patent. Second, two measures are constructed to evaluate the application and invalidation levels of each technology topic. Third, a two-dimensional portfolio map is used to show scatter plots of the technology topics. According to the values of the application and invalidation measures, the map is divided into four areas, three of which represent the emerging, updating and declining stages of a technology topic. Finally, the approach is applied to the Electronic Information Industry of China.
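
The portfolio map described in this abstract can be sketched as a simple quadrant rule. The abstract does not say which area corresponds to which stage or how the cut-offs are chosen, so the mapping below (high application with low invalidation as emerging, high on both as updating, low application with high invalidation as declining) and the threshold parameters are assumptions made purely for illustration.

```python
def classify_topic(application, invalidation, app_cut, inv_cut):
    """Place one technology topic in a quadrant of the portfolio map.
    The quadrant-to-stage mapping is an ASSUMPTION, not taken from the paper;
    app_cut and inv_cut are hypothetical cut-offs (e.g. the mean or median)."""
    high_app = application >= app_cut
    high_inv = invalidation >= inv_cut
    if high_app and not high_inv:
        return "emerging"
    if high_app and high_inv:
        return "updating"
    if high_inv:
        return "declining"
    return "unclassified"  # the fourth area, which the abstract leaves unnamed

# Hypothetical topic scores: (application level, invalidation level).
demo_topics = {"topic A": (0.9, 0.1), "topic B": (0.8, 0.7), "topic C": (0.2, 0.6)}
demo_stages = {name: classify_topic(a, i, 0.5, 0.5)
               for name, (a, i) in demo_topics.items()}
```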

14:10-15:30 Session 5B: Novel Data
14:10
Social Media Mining for Ideation by using Classification Methods
SPEAKER: Sercan Ozcan

ABSTRACT. Ideation is the first and most crucial step of almost any innovation process. It starts with either identifying a problem or creating a need in a product or service development process. Among external ideation resources, open innovation, co-creation and crowdsourcing are popular concepts and approaches in which companies interact with consumers, inventors, and other organisations to enhance their innovation capability. This paper aims to mine Twitter data to explore trends and retrieve ideas for different purposes such as product development, technology, and sustainability-oriented considerations. The main approach of this study is to classify tweets as an “idea” or “not an idea”. The retrieved ideas provide insights into the expectations, problems or needs of consumers and organisations. The results also illustrate the reactions of consumers to technological developments. Various supervised and unsupervised classification algorithms are used to classify tweets according to whether they contain an idea, and the algorithms are compared on various validation metrics. The results demonstrate that our method, based on text mining and classification, can extract ideas from consumers and is an effective way to show technological trends. In addition, this study illustrates the conditions under which semi-supervised or unsupervised classification methods work best. The quality and accuracy of the results increase when the data is retrieved from a combination of hashtags and the classification methods are optimised for these specific hashtags. This study can benefit companies and entrepreneurs that would like to identify ideas for their innovation and product development activities.

14:30
Examining Consumer Oriented Innovations: A Crowdfunding Text Mining Approach
SPEAKER: Sercan Ozcan

ABSTRACT. The majority of text mining analyses use patents (Ozcan and Islam, 2017; Wang et al, 2018), publications (Rafols et al, 2014; Li, Porter and Suominen, 2017; Ebrahim and Bong, 2018) and, more recently, social media (He, Zha and Li, 2013; Zhuravleva, Bot and Hilton, 2016; Mehrazar et al, 2018) as data sources.

Recently, radical products that consumers find desirable have emerged, such as GoPro, Pebble, Oculus Rift, and The Dash (Schroter, 2014). Although consumers and individual inventors have always been in the picture when it comes to innovation and the new product development process (Franke and Shah, 2003; Lettl, 2005; Poetz and Schreier, 2012; Ende, Frederiksen and Prencipe, 2015), crowdfunding platforms are data sources with a concentration of consumer-oriented projects. Previous research on product- and consumer-oriented activities was more closed and confined within firms, but the shift to an open innovation approach (Garbarino and Mason, 2016; Cui and Wu, 2017) needs to be looked into by the text mining community in order to uncover insights into developing and commercializing more sustainable and desirable products.

The aim of this study is to examine crowdfunding investment by consumers in order to show desirable products and emerging linkages using text mining methods. A custom crawler was developed to download web content and extract key data points from a crowdfunding platform. A dataset of approximately 1,500 projects from the United Kingdom, covering 2013 to 2018, was analysed.

14:50
An Exploration on the Frontier of Energy Industries: A Perspective of Scientific-Innovation

ABSTRACT. It is of great significance, both theoretically and practically, to study the frontier of energy industries. Such a study relates to the establishment of a modern energy system characterized by cleanness, low carbon, safety and efficiency; the refinement of the energy structure; the acceleration of the energy revolution; and the promotion of social and economic transformation and development in any given country. Previous studies on detecting industrial frontiers were mostly conducted by analyzing data on scientific papers or patents, whereas in this research we adopt a brand-new perspective by employing statistical data on projects fostered by the Scientific User Facility Program of the US Department of Energy (US DOE) to explore the frontiers of energy industries. Using such data to explore the frontier technologies of energy industries is more valid and effective on the one hand, and newer and more focused on the other. From a theoretical perspective, this is an attempt based on the theory and conceptions of “Scientific Innovations”. From a methodological perspective, this is an attempt at Altmetrics, and it would serve as an important complement to, and development of, the data resources and methodology of Scientific Statistics. The emerging industry fostering program, based on innovative concepts and ideas, has greatly shortened the path from science to technology and then to industry. The practice of the US Department of Energy on the basis of “Scientific Innovations” provides a good example for the innovation and development of emerging industries in China.

15:10
Mapping Research Funding in 2D and 3D by t-SNE
SPEAKER: Ting Chen

ABSTRACT. With the number of research awards funded each year growing exponentially, a visualization tool for exploring funding hotspots and gaps is becoming indispensable. However, revealing the landscape of funding is a very challenging task. One researcher created a map of funding using a tree map [1]. Others used pLSA or paragraph vectors to extract relationships between all pairs of awards and then created network maps with a force-directed layout [2,3]. The network map is the most widely used visualization method in bibliometrics, but because the relationship between awards is a non-sparse distance matrix, a threshold must be set manually for the visualization. Moreover, when high-dimensional text features are converted into pairwise relationships, some information in the high-dimensional space is lost. This paper creates a new way to sort and view research funding by mapping high-dimensional representations of awards into 2D and 3D space with the nonlinear dimensionality reduction technique t-SNE. Data on 4669 NSF awards from 2008 to 2017 were downloaded from the Information and Intelligent Systems department for this research.

15:30-16:00 Coffee Break
16:00-16:45 Session 6: TechMining Panel Discussion

Tech Mining’s Contribution to Understanding “Science-to-Technology”

Panelists:
Alan Porter
Daniele Rotolo
Jos Winnink
Serhat Burmaoglu
Ton (A.F.J.) van Raan

 

Location: Aalmarktzaal
16:45-17:00 Session 7: TechMining for Global Good Award and Closing

TechMining for Global Good Award

Dr. Bruna Fonseca accepting for Dr. Carlos Morel

National Institute of Science and Technology for Innovation in Diseases of Neglected Populations (INCT-IDPN)

Centre for Technological Development in Health (CDTS)

Oswaldo Cruz Foundation (Fiocruz)

Closing

Denise Chiavetta and Alan Porter

Location: Aalmarktzaal
17:00-19:00 Session 8: Poster Session
17:00
Social Innovation as a research field: worldwide landscape of leading actors and research themes

ABSTRACT. The concept of Social Innovation has become increasingly relevant in recent years, influencing policy makers and researchers in both developed and developing countries. Although different definitions of the concept can be found in the literature, social innovation is often associated with the process of providing effective solutions aimed at meeting the needs of social groups and the community, focusing on the well-being of individuals and the collectivity. In general, the different definitions of social innovation are linked to the purposes of the various issues related to research in this area. An analysis of the literature shows that the concept encompasses a great variety of topics such as social change processes, business strategies, organizational management, social entrepreneurship, new products and services, governance, and training and capacity building, among others. In addition, social innovations are also related to specific areas and disciplines such as sustainable production and consumption, education, psychology and design. Based on a bibliometric analysis approach, this study addresses the recent development of social innovation research in order to identify how the concept has evolved over the years from a thematic point of view, including the actors and the main aspects driving scientific progress on the issue.

17:00
Performance Comparison For Multi Class Classification Intrusion Detection In SCADA Systems Using Apache Spark
SPEAKER: Raogo Kabore

ABSTRACT. SCADA (Supervisory Control And Data Acquisition) systems are industrial control systems that allow the monitoring and control of large industrial systems. These systems are increasingly subject to cyber attacks because of their interconnection with corporate networks and the Internet. In this work we compare the performance of a SCADA-specific intrusion detection system built with Apache Spark, using Decision Tree, Naïve Bayes, Random Forest and Multilayer Perceptron approaches. Our comparison criteria are recall, specificity, precision, training time and detection time. The dataset is obtained from a Modbus control system that monitors a water storage tank system. It contains normal tuples as well as different categories of attack tuples. Our intrusion detection framework is a Hadoop cluster using Hive and the Spark ML library. The experimental results show that the Decision Tree classifier has a very good detection rate (recall of 100%) for all tuple categories except Denial-of-Service (recall of 0). Decision Tree also has fairly good training and detection times (7.84 s and 0.23 s respectively). Random Forest also has a good detection rate for all classes apart from DoS, but it has only a 60% detection rate for the DoS class and longer training and detection times. Naïve Bayes and Multilayer Perceptron give poor classification results overall, but Naïve Bayes is very fast at training (2.96 s) and detecting (0.14 s). Multilayer Perceptron, on the other hand, takes a long time to train (155.51 s) but is very fast in the prediction phase (0.16 s).
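
The recall, specificity and precision criteria named in this abstract can be computed per class from a multi-class confusion matrix using a one-vs-rest view. The sketch below is a minimal illustration in plain Python, not the authors' Spark code; the function name, class labels and counts are hypothetical.

```python
def per_class_metrics(confusion, labels):
    """Compute one-vs-rest recall, specificity and precision for each class.
    confusion[i][j] is the number of tuples whose true class is labels[i]
    and whose predicted class is labels[j]."""
    total = sum(sum(row) for row in confusion)
    metrics = {}
    for k, label in enumerate(labels):
        tp = confusion[k][k]                         # class k predicted as k
        fn = sum(confusion[k]) - tp                  # class k predicted as other
        fp = sum(row[k] for row in confusion) - tp   # other predicted as k
        tn = total - tp - fn - fp                    # everything else
        metrics[label] = {
            "recall": tp / (tp + fn) if tp + fn else 0.0,
            "specificity": tn / (tn + fp) if tn + fp else 0.0,
            "precision": tp / (tp + fp) if tp + fp else 0.0,
        }
    return metrics

# Hypothetical counts for a two-class toy problem (not the paper's data).
demo_labels = ["normal", "dos"]
demo_confusion = [[8, 2],
                  [1, 9]]
demo_metrics = per_class_metrics(demo_confusion, demo_labels)
```

Reporting the metrics per class rather than as one aggregate makes a weak class visible, which is how a single category such as DoS can stand out against otherwise high detection rates.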

17:00
EVALUATING TECHNOLOGICAL EMERGENCE FOR STRATEGIC DECISION MAKING: A HYBRID MODEL PROPOSAL

ABSTRACT. Identifying, tracking and conceptualizing emerging ideas are significant issues in the literature. Accordingly, many studies have sought to conceptualize (Alexander, Chase, Newman, Porter, & Roessner, 2012; Rotolo, Hicks, & Martin, 2015; Small, Boyack, & Klavans, 2014) and model (C. M. Chen, 2006) technological emergence. However, the literature has reached no consensus on either the concept or the model. Although the search for technical or technological emergence has been a very popular subject and many models have been applied, the conclusions of such searches can generally be described as expert-dominated. After large volumes of scientific data are analyzed, the extracted topics are evaluated verbally by experts on the basis of their previous experience and then interpreted for future consequences. However, few studies focus on the expert decision-making process. This study aims to propose a scientometrics-based fuzzy Multi-Criteria Decision Making model for evaluating emerging topics multi-dimensionally. A fuzzy approach is expected to account for the ambiguity of expert decisions, and applying a decision-making process is expected to yield a compromise solution. The nanoenergy field is used for illustration. Finally, based on aggregated expert opinions, an evaluation diamond is generated to demonstrate five aspects of technological emergence at once. The emergent terms in the diamond are then interpreted for strategic extension.

17:00
Bibliometric Analysis of Solar System Exploration Missions
SPEAKER: Lin Han

ABSTRACT. The international trends of solar system exploration research are depicted and analyzed using bibliometrics and text mining based on solar system exploration missions. The overall situation and development trends of robotic solar system exploration missions and their scientific outputs are analysed and estimated in particular. The analysis of solar system exploration missions shows that Mars and the Moon will continue to be top-priority exploration targets, while asteroids, the giant planets and their satellites, and Venus are also attractive targets. The boom in solar system exploration since the 1990s is reflected in scientific papers. Bibliometric analysis based on solar system exploration missions shows that the volume of papers increased rapidly from the 1990s and has remained relatively steady since 2008. The United States is in the absolute leading position: its volume of papers, ESI papers and citations is far ahead of other countries. France and Germany are in the second tier, performing strongly on the above three measures. China started late but has made considerable progress in the volume of papers in recent years. Most countries engage in extensive international cooperation. Among the top 10 countries, only the United States, Japan and China have published more independent research papers than international cooperation papers. The proportion of international cooperation papers for France, Germany, Britain, the Netherlands and Spain is over 70%. At the institutional level, the performance of U.S. research institutions is outstanding, with NASA, Caltech, and the University of California being the top three.