CCKS2016: CHINA CONFERENCE ON KNOWLEDGE GRAPH AND SEMANTIC COMPUTING (全国知识图谱与语义计算大会)
PROGRAM FOR WEDNESDAY, SEPTEMBER 21ST

08:45-09:00 Session 7: 开幕式(Welcome)

Welcome address by 李生, President of the Chinese Information Processing Society of China

Welcome address by General Chair 孙乐

Report on conference organization by Program Committee Chair 陈华钧

Location: 1号楼三层银杏大厅(Building No. 1, Gingko Hall)
09:00-09:45 Session 8: 特邀报告(Keynote)
Location: 1号楼三层银杏大厅(Building No. 1, Gingko Hall)
09:00
Using Semantic Technology to Tackle Industry’s Data Variety Challenge
SPEAKER: Ian Horrocks

ABSTRACT. Big Data technologies have made significant progress in addressing problems related to the volume and velocity of data, but they are less effective at dealing with data variety and heterogeneity; this so-called “variety challenge” is the main barrier to effective data access in many industry applications. Semantic Technologies offer a potential solution to the variety challenge, and in the Ontology Based Data Access (OBDA) approach they do so in a way that layers on top of existing infrastructure and exploits its scalability. In this talk I will explain the OBDA approach, and show how it is being used to address the variety challenge in two large companies: Siemens and Statoil. I will also highlight some of the problems and limitations of OBDA, discuss how these can be mitigated, and present some recent research that shows how semantic data access can go beyond what is possible with OBDA.
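
To make the OBDA idea above concrete, here is a minimal, purely illustrative sketch (all class names, table names and mappings are hypothetical, not Siemens' or Statoil's actual schemas): an ontology-level query is first rewritten with subclass axioms and then unfolded into SQL over the legacy tables, so the existing database infrastructure does the heavy lifting.

```python
# Illustrative OBDA sketch: rewrite a query over an ontology class using subclass
# axioms, then unfold it into SQL over existing relational tables via mappings.
# Class names, table names and mappings are hypothetical.

SUBCLASSES = {"Equipment": ["Pump", "Compressor"]}   # toy ontology axioms

MAPPINGS = {                                         # ontology class -> SQL over legacy tables
    "Pump": "SELECT id FROM pump_registry",
    "Compressor": "SELECT id FROM compressor_registry",
}

def rewrite(cls):
    """Expand a class into itself plus all of its (transitive) subclasses."""
    expanded = [cls]
    for sub in SUBCLASSES.get(cls, []):
        expanded.extend(rewrite(sub))
    return expanded

def unfold_to_sql(cls):
    """Translate the rewritten ontology-level query into one SQL UNION query."""
    parts = [MAPPINGS[c] for c in rewrite(cls) if c in MAPPINGS]
    return "\nUNION\n".join(parts)

print(unfold_to_sql("Equipment"))   # a single SQL query answered by the existing databases
```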

09:45-10:30 Session 9: 特邀报告(Keynote)
Location: 1号楼三层银杏大厅(Building No. 1, Gingko Hall)
09:45
Short Text Understanding
SPEAKER: Haixun Wang

ABSTRACT. Billions of short texts are produced every day, in the form of search queries, ad keywords, tags, tweets, messenger conversations, social network posts, etc. Unlike documents, short texts have some unique characteristics which make them difficult to handle. First, short texts, especially search queries, do not always observe the syntax of a written language. This means traditional NLP techniques, such as syntactic parsing, do not always apply to short texts. Second, short texts contain limited context. The majority of search queries contain fewer than 5 words, and tweets can have no more than 140 characters. For these reasons, short texts give rise to a significant amount of ambiguity, which makes them extremely difficult to handle. On the other hand, many applications, including search engines, ads, automatic question answering, online advertising, recommendation systems, etc., rely on short text understanding. In this talk, I will go over various techniques in knowledge acquisition, representation, and inferencing that have been proposed for text understanding, and will describe the massive structured and semi-structured data that have been made available in the recent decade that directly or indirectly encode human knowledge, turning the knowledge representation problem into a computational grand challenge with feasible solutions in sight.

10:30-11:00 茶歇(Coffee Break)
11:00-12:00 Session 10A: 学术论文(Paper Session)- Knowledge Representation & Learning
Location: 1号楼三层银杏大厅(Building No. 1, Gingko Hall)
11:00
A Joint Embedding Method for Entity Alignment of Knowledge Bases (Full Paper)
SPEAKER: unknown

ABSTRACT. We propose a model which jointly learns the embeddings of multiple knowledge bases (KBs) in a uniform vector space in order to align entities across KBs. Instead of using content-similarity-based methods, we argue that the structure information of KBs is also important for KB alignment: in cross-lingual or differently encoded settings, the structure information of the two KBs is all we can leverage. We utilize seed entity alignments whose embeddings are constrained to be identical during the joint learning process. We perform experiments on two datasets, including a subset of Freebase comprising 15 thousand selected entities, and a dataset we construct from real-world large-scale KBs, Freebase and DBpedia. The results show that the proposed approach, which only utilizes the structure information of KBs, also works well.
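
A minimal sketch of the joint-embedding idea (not the authors' implementation; the triples and seed alignment below are toy data, and training is plain SGD on ||h + r - t||^2 without negative sampling for brevity): entities in the seed alignment are mapped to one shared identifier, so both KBs are embedded in a single TransE-style space.

```python
import numpy as np

kb1 = [("fb:Paris", "rel:capitalOf", "fb:France")]          # toy triples from KB 1
kb2 = [("db:Paris", "rel:capitalOf", "db:France")]          # toy triples from KB 2
seeds = {"db:Paris": "fb:Paris", "db:France": "fb:France"}  # seed entity alignment

canon = lambda e: seeds.get(e, e)                 # seed-aligned entities share one id
triples = [(canon(h), r, canon(t)) for h, r, t in kb1 + kb2]

entities = {e for h, _, t in triples for e in (h, t)}
relations = {r for _, r, _ in triples}
rng, dim = np.random.default_rng(0), 20
E = {e: rng.normal(size=dim) for e in entities}   # entity embeddings
R = {r: rng.normal(size=dim) for r in relations}  # relation embeddings

lr = 0.01
for _ in range(200):                              # minimise ||h + r - t||^2 over both KBs
    for h, r, t in triples:
        grad = 2 * (E[h] + R[r] - E[t])
        E[h] -= lr * grad
        R[r] -= lr * grad
        E[t] += lr * grad
```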

11:20
A Multi-dimension Weighted Graph-based Path Planning with Avoiding Hotspots (Full Paper)
SPEAKER: unknown

ABSTRACT. With the rapid development of industrialization, vehicles have become an important part of people's lives. However, with a large population, transportation systems around the world are becoming more and more complicated. A core problem of a transportation system is how to avoid hotspots, yet current path planning systems, because they plan paths over a one-dimension weighted graph, cannot always describe hotspots in an exact way. In this paper, we present a graph model based on a multi-dimension weighted graph for path planning that avoids hotspots. Firstly, we extend the one-dimension weighted graph to a multi-dimension weighted graph in which multi-dimension weights characterize more features of transportation. Secondly, we develop a framework equipped with aggregate functions for transforming multi-dimension weighted graphs into one-dimension weighted graphs, in order to reduce path planning over multi-dimension weighted graphs to the shortest path problem over one-dimension weighted graphs. Finally, we implement the proposed framework and evaluate our system on some practical examples. The experiments show that our approach can provide "optimal" paths while avoiding hotspots.
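
A small sketch of the framework described above (the graph, the weights and the aggregate function are all illustrative assumptions): each edge carries a multi-dimensional weight such as (distance, congestion), an aggregate function collapses it into a single cost that penalises hotspots, and an ordinary shortest-path algorithm then runs on the resulting one-dimension weighted graph.

```python
import heapq

edges = {                                      # toy multi-dimension weighted road graph
    "A": [("B", (2.0, 0.1)), ("C", (1.0, 0.9))],
    "B": [("D", (2.0, 0.2))],
    "C": [("D", (1.0, 0.8))],
    "D": [],
}

def aggregate(weight, alpha=5.0):
    """Collapse (distance, congestion) into one scalar cost; hotspots cost more."""
    distance, congestion = weight
    return distance * (1.0 + alpha * congestion)

def shortest_path(src, dst):
    """Plain Dijkstra over the aggregated one-dimensional weights."""
    queue, visited = [(0.0, src, [src])], set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == dst:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for nxt, w in edges[node]:
            heapq.heappush(queue, (cost + aggregate(w), nxt, path + [nxt]))
    return float("inf"), []

print(shortest_path("A", "D"))   # picks A -> B -> D, avoiding the congested A -> C -> D route
```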

11:40
Position Paper: The Unreliability of Language - A Common Issue for Knowledge Engineering and Buddhism (Short Paper)
SPEAKER: unknown

ABSTRACT. The core of knowledge engineering is to apply different kinds of formal languages (or models) to represent and manage human languages (or knowledge). However, according to the studies of Kurt Gödel and Ludwig Wittgenstein, both formal languages and human languages are unreliable. This finding inherently influences the development of artificial intelligence and knowledge engineering. On the other hand, this finding, i.e., the unreliability of languages, was discussed much earlier by Gautama Buddha, the founder of Buddhism. In this paper, we discuss the issue of the unreliability of language by bridging the perspectives of Gödel, Wittgenstein and Gautama. Based on the discussion, we further give some philosophical thoughts from the perspective of knowledge engineering.

11:50
Construction of Domain Ontology for Engineering Equipment Maintenance Support (Short Paper)
SPEAKER: unknown

ABSTRACT. To address the problems in the domain of engineering equipment maintenance support, such as too many knowledge points, broad scope, complex relationships, and difficulty in sharing and reuse, this paper defines the categories and professional field of engineering equipment maintenance support, analyzes the knowledge sources, and extracts eight core concepts: case, product, function, damage, environment, phenomenon, disposal and resource. It then forms a concept hierarchy model, analyzes the data properties and object properties of the core concepts, and constructs the engineering equipment maintenance domain ontology with Protégé 4.3, which lays a solid foundation for the knowledge base and for engineering equipment maintenance application ontologies.

11:00-12:00 Session 10B: 海报与演示(Posters & Demos)
Location: 1号楼三层银杏大厅(Building No. 1, Gingko Hall)
12:00-13:30 午餐(Lunch Break) 5号楼大堂二层赏园餐厅(Building No. 5, Shangyuan Hall)
13:30-15:10 Session 11A: 学术论文(Paper Session)- Knowledge Representation & Learning
Location: 1号楼二层第五会议室(Building No. 1, No. 5 Meeting Room)
13:30
基于表示学习的开放域中文知识推理 (Open-Domain Chinese Knowledge Inference Based on Representation Learning) (Full Paper)
SPEAKER: Tianwen Jiang

ABSTRACT. Knowledge bases are usually organized as networks in which each node represents an entity and each edge represents a relation between entities. To exploit the knowledge in such network-structured knowledge bases, specially designed graph algorithms of high complexity are usually required, but these algorithms are not well suited to knowledge inference; in particular, as the scale of a knowledge base keeps growing, inference over the network structure can hardly meet real-time requirements. This paper studies knowledge inference with knowledge representation learning based on the TransE model, covering inference of both the relation indicator and the tail entity of entity-relation triples. The experiments on relation indicator inference achieve good results, and the inference procedure requires no complex algorithm design, involving only simple vector operations. In addition, we improve the cost function of the original TransE model to make it better suited to representation learning for open-domain Chinese knowledge bases.
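
For reference, the margin-based ranking objective of the original TransE model that the paper modifies is usually written as follows, where S' denotes corrupted triples with the head or tail replaced, γ is the margin, and d is a distance such as the L1 or L2 norm:

```latex
\mathcal{L} \;=\; \sum_{(h,r,t)\in S}\;\sum_{(h',r,t')\in S'_{(h,r,t)}}
  \bigl[\gamma + d(\mathbf{h}+\mathbf{r},\,\mathbf{t}) - d(\mathbf{h}'+\mathbf{r},\,\mathbf{t}')\bigr]_{+}
```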

13:50
基于概念层次网络的知识表示与本体建模 (Knowledge Representation and Ontology Modeling Based on the Hierarchical Network of Concepts) (Full Paper)
SPEAKER: unknown

ABSTRACT. In natural language processing (NLP), the lack of a unified knowledge representation and the inability to exploit semantic information systematically remain important problems. Solving them requires solving the problem of semantic knowledge representation. Based on the Hierarchical Network of Concepts and taking language-understanding primitives as the foundation and main thread, this paper describes semantic knowledge representation methods at the word, sentence, sentence-group and discourse levels, combining semantic and contextual information. On top of this representation, a multilingual ontology knowledge base is constructed. The representation method of this knowledge base can serve as a basis for knowledge representation theory and can also provide resource support for techniques and systems in related NLP areas.

14:10
基于位置的知识图谱链接预测 (Location-Based Link Prediction for Knowledge Graphs) (Full Paper)
SPEAKER: unknown

ABSTRACT. Link prediction is the basis of knowledge graph completion and analysis. Since location-related entities and relations carry rich location features, this paper proposes a location-based link prediction method for knowledge graphs. The method first classifies relations by analyzing the semantic features of entities and relations, and then proposes a method for mining the location features and rules of location-based entities and relations. The mined location features and rules are then used to constrain the prediction results of entity and relation embedding methods to obtain the final results. Experiments on the WikiData, FB and WN data sets show that the method performs well on link prediction for location-based relations and entities.

14:30
Knowledge Representation Learning based on Chinese Knowledge Graph in Vegetable Domain (Full Paper)
SPEAKER: unknown

ABSTRACT. A knowledge graph is stored as a graph in which each node represents an entity and each edge represents a relation between entities. Due to the high complexity of graph algorithms and severe data sparsity, it is very important for the research and application of knowledge graphs to achieve effective representations of entities and relations on the basis of knowledge graph construction. In this paper, we use vegetable entries from Baidu encyclopedia and HDwiki as data sources to study knowledge representation learning models on top of a constructed vegetable knowledge graph. We adopt the TransE model to represent vegetable triples, embedding the entities and relations into a continuous low-dimensional vector space. Then, to handle the complex 1-N, N-1 and N-N attribute relations, we propose the PTA model, which constructs relation paths by combining attribute relations with hyponymy relations and embeds the relation paths into the vector space as well. The results show that, without taking relation classification into account, the link prediction results of the PTA model are better than those of the TransE model, and its overall Hits@10 value is higher than that of TransE.
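
For readers unfamiliar with the Hits@10 metric reported above, here is a minimal evaluation sketch (the TransE-style scoring is illustrative, not the paper's code; E and R stand for the learned entity and relation embedding dictionaries): every candidate tail entity is scored, and a test triple counts as a hit if its true tail ranks within the top 10.

```python
import numpy as np

def hits_at_10(test_triples, E, R, entities):
    """Fraction of test triples whose true tail entity ranks in the top 10."""
    hits = 0
    for h, r, t in test_triples:
        scores = {e: np.linalg.norm(E[h] + R[r] - E[e]) for e in entities}  # lower = better
        ranked = sorted(entities, key=lambda e: scores[e])
        if ranked.index(t) < 10:
            hits += 1
    return hits / len(test_triples)
```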

14:50
Space Projection and Relation Path based Representation Learning for Construction of Geography Knowledge Graph (Full Paper)
SPEAKER: unknown

ABSTRACT. Human-like intelligence has developed rapidly, and it benefits from complete knowledge graphs, especially primary-education knowledge graphs such as the geography knowledge graph. A traditional knowledge graph is represented as network knowledge, which has high computational complexity and cannot effectively measure or exploit the semantic associations between entities. This paper puts forward a new knowledge representation learning algorithm, PTransW (Path-based TransE Considering Relation Type by Weight). It combines space projection with the semantic information of relation paths, and further considers the semantic information of relation types. The experimental results on the FB15K and GEOGRAPHY data sets show that the PTransW model greatly improves the ability to deal with complex relations in knowledge graphs. For small data sets, training the low-complexity TransE and TransR models is good enough; however, the PTransE and PTransW models, which utilize the semantic information of relation paths and reverse relations, perform better in relation prediction than the TransE and TransR models.

13:30-15:10 Session 11B: 学术论文(Paper Session)- Knowledge Graph Construction and Information Extraction
Location: 5号楼金缘厅(Building No.5, Jingyuan Hall)
13:30
Boosting to Build a Large-scale Cross-lingual Ontology (Full Paper)
SPEAKER: unknown

ABSTRACT. Global knowledge sharing makes large-scale multi-lingual knowledge bases an extremely valuable resource in the Big Data era. However, current mainstream Wikipedia-based multi-lingual ontologies still face the following problems: the scarcity of non-English knowledge, noise in the multi-lingual ontology schema relations, and the limited coverage of cross-lingual "owl:sameAs" relations. Building a cross-lingual ontology based on other large-scale heterogeneous online wikis is a promising solution to those problems. In this paper, we propose a cross-lingual boosting approach to iteratively reinforce the performance of ontology building and instance matching. Experiments on English Wikipedia and Hudong Baike output an ontology containing over 3,520,000 English instances and 800,000 Chinese instances. The F1-measure of Chinese "instanceOf" relation prediction improves by up to 32%. Finally, over 150,000 cross-lingual instance "owl:sameAs" relations are constructed.

13:50
Large Scale Semantic Relation Discovery: Toward Establishing the Missing Link between Wikipedia and Semantic Network (Full Paper)
SPEAKER: unknown

ABSTRACT. Wikipedia has been the largest knowledge repository on the web. However, most of the semantic knowledge in Wikipedia is documented in natural language, which is mostly only human readable and incomprehensible for computer processing. To establish the missing link from Wikipedia to a semantic network, this paper proposes a relation discovery method, which can: 1) discover and characterize a large collection of relations from Wikipedia by exploiting relation pattern regularity, relation distribution regularity and relation instance redundancy; and 2) annotate the hyperlinks between Wikipedia articles with the discovered semantic relations. Finally, we discover 14,299 relations, 105,661 relation patterns and 5,214,175 relation instances from Wikipedia, which will be a valuable resource for many NLP and AI tasks.

14:10
Biomedical Event Trigger Detection Based on Hybrid Methods Integrating Word Embeddings (Full Paper)
SPEAKER: unknown

ABSTRACT. Trigger detection, as the preceding task, is of great importance in biomedical event extraction. So far, most state-of-the-art systems have been based on single classifiers, and one-hot word encodings are unable to represent semantic information. In this paper, we utilize hybrid methods integrating word embeddings to obtain higher performance. In the hybrid methods, multiple single classifiers are first constructed based on rich manual features, including dependency and syntactic parsing results. Multiple prediction results are then integrated by set operations, voting and stacking. Hybrid methods can take advantage of the differences among classifiers and make up for their deficiencies, and thus improve performance. Word embeddings are learnt from large-scale unlabeled texts and integrated as unsupervised features with the other rich features based on dependency parse graphs, so that much more semantic information can be represented. Experimental results show our method outperforms the state-of-the-art systems.
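
As an illustration of the voting part of the hybrid strategy (a scikit-learn sketch with random stand-in features, not the authors' feature set): several single classifiers are trained on the same feature vectors, into which word-embedding features would simply be concatenated, and their predictions are combined by soft voting.

```python
import numpy as np
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X = np.random.rand(200, 50)          # stand-in for manual features + word embeddings
y = np.random.randint(0, 2, 200)     # stand-in trigger / non-trigger labels

ensemble = VotingClassifier(
    estimators=[("lr", LogisticRegression(max_iter=1000)),
                ("svm", SVC(probability=True)),
                ("dt", DecisionTreeClassifier())],
    voting="soft",                    # average predicted probabilities across classifiers
)
ensemble.fit(X, y)
print(ensemble.predict(X[:5]))
```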

14:30
GRU-RNN based Question Answering over Knowledge Base (Full Paper)
SPEAKER: unknown

ABSTRACT. Building systems that can answer questions in natural language is one of the most important natural language processing applications. Recently, the rise of large-scale open-domain knowledge bases has provided a new possible approach. Some existing systems conduct question answering relying on hand-crafted features and rules, while other works try to extract features with popular neural networks. In this paper, we adopt a recurrent neural network to understand questions and find the corresponding answer entities in knowledge bases, based on word embeddings and knowledge base embeddings. Question-answer pairs are used to train our multi-step system. We evaluate our system on FREEBASE and WEBQUESTIONS. The experimental results show that our system achieves comparable performance to the baseline method with a more straightforward structure.
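
A minimal PyTorch sketch of a GRU-based question encoder of the kind described above (the hyper-parameters and the downstream scoring step are illustrative assumptions, not the authors' configuration): the final hidden state serves as the question vector to be matched against knowledge base embeddings.

```python
import torch
import torch.nn as nn

class QuestionEncoder(nn.Module):
    def __init__(self, vocab_size, emb_dim=100, hidden_dim=200):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.gru = nn.GRU(emb_dim, hidden_dim, batch_first=True)

    def forward(self, token_ids):
        _, h = self.gru(self.embed(token_ids))   # final hidden state = question vector
        return h.squeeze(0)

encoder = QuestionEncoder(vocab_size=5000)
question = torch.randint(0, 5000, (1, 8))        # a toy 8-token question
q_vec = encoder(question)                        # compare against KB entity embeddings
print(q_vec.shape)                               # torch.Size([1, 200])
```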

14:50
An Initial Ingredient Analysis of Drugs Approved by China Food and Drug Administration (Short Paper)
SPEAKER: unknown

ABSTRACT. Drugs are an important part of medicine. Drug knowledge bases that organize and manage drugs have attracted considerable attention and have been widely used in human health care in many countries and regions. There are also a large number of electronic drug knowledge bases publicly available. In China, however, there is hardly any publicly available well-structured drug knowledge base, possibly due to the coexistence of two different types of medicine: Chinese traditional medicine (CTM) and modern medicine. In order to analyse the components of drugs approved by the China Food and Drug Administration (CFDA), we developed a preliminary drug ingredient analysis system. This system collects all drug names from the website of CFDA, obtains their manuals from three medical websites, extracts the ingredients of the drugs, and analyses the distribution of the extracted ingredients. In total, 12,918 out of 19,490 drug manuals were collected. Evaluation on 50 randomly selected drug manuals shows that the system achieves an F-score of 95.46% on ingredient extraction. According to the distribution of the extracted ingredients, we find that sharing of ingredients across drugs is very common, especially in herbal medicine, which may provide a clue for drug safety: taking more than one drug that contains partially the same ingredients may lead to an overdose of those ingredients.

15:00
A Tableau-based Forgetting in ALCQ (Short Paper)
SPEAKER: unknown

ABSTRACT. Forgetting is a useful tool for tailoring ontologies by reducing the number of concepts and roles. The issue of forgetting for general ontologies in more expressive description logics, such as ALCQ and SHIQ, is largely unexplored. In particular, the problems of characterizing forgetting-based reasoning and of computing the result of forgetting are still open. In this paper, we develop a decidable, sound and complete tableau-based algorithm to implement forgetting-based reasoning. Our tableau algorithm can feasibly be extended to forgetting in more expressive ontology languages. Furthermore, we employ the rolling-up technique to compute the result of forgetting from the complete forest obtained after forgetting.

13:30-17:30 Session 11C: 知识图谱竞赛(Shared Tasks)
Location: 1号楼三层银杏大厅(Building No. 1, Gingko Hall)
13:30
知识表示学习与知识获取 (Knowledge Representation Learning and Knowledge Acquisition) (Invited Talk)
SPEAKER: Zhiyuan Liu
14:10
评测竞赛总体报告 (Overall Report on the Shared Task Evaluations)
SPEAKER: Kang Liu
14:40
ICRC-DSEDL: 基于知识图谱的影视领域实体发现与链接系统 (A Knowledge Graph based Entity Discovery and Linking System for the Movie and TV Domain) (Task1)
SPEAKER: 李昊迪
15:00
Coffee Break
15:30
TEDL: A System for CCKS2016 Domain-Specific Entity Discovery and Linking Task (Task1)
SPEAKER: Feng Zhang
15:50
Knowledge Base Completion via Rule-Enhanced Relational Learning (Task2)
SPEAKER: Shu Guo
16:10
Knowledge Graph Embedding for Link Prediction and Triplet Classification (Task2)
SPEAKER: Shijia E
16:30
基于平均互信息量和知识图谱的产品预测 (Product Prediction Based on Average Mutual Information and Knowledge Graphs) (Task3)
SPEAKER: 邹震
16:50
Product Prediction with Deep Neural Networks (Task3)
SPEAKER: Shijia E
15:00-15:30 茶歇(Coffee Break)
15:30-17:30 Session 12A: 学术论文(Paper Session)- Knowledge Graph Construction & Information Extraction
Location: 1号楼二层第五会议室(Building No. 1, No. 5 Meeting Room)
15:30
基于混合模型的电子产品属性值识别 (Attribute Value Recognition for Electronic Products Based on a Hybrid Model) (Full Paper)
SPEAKER: unknown

ABSTRACT. To handle the wide variety of electronic products and their diverse attribute values, this paper proposes a hybrid-model approach to attribute value recognition for electronic products. Based on their characteristics, attributes are divided into general attributes and product-specific attributes. For the former, which follow clear regularities, a rule-based method is used; for the latter, where products differ considerably from one another, a two-stage method is adopted: a conditional random field model in the boundary detection stage and a support vector machine model in the category classification stage. Experiments show that, for general attributes, the rule-based method not only reduces the manual annotation workload but also improves the recognition results; for product-specific attributes, boundary post-processing on top of boundary detection further improves the detection results. The final hybrid model combines the advantages of rules, boundary post-processing, CRF and SVM, reaching an F-score of 0.9417 while greatly improving training efficiency.

15:50
基于字信息学习词汇分布的实体上位关系识别 (Entity Hypernym Relation Recognition with Word Representations Learned from Character Information) (Full Paper)
SPEAKER: Tianwen Jiang

ABSTRACT. For the task of entity hypernym relation recognition, this paper uses a character-based word embedding model to learn word vector representations and, on top of them, hypernym relation vector representations. The approach performs well on hypernym relation recognition and largely alleviates the out-of-vocabulary problem. The character-based embedding model can first produce a vector for almost any word; hypernym relation vectors are then learned from hypernym-hyponym word pairs in the corpus and clustered, and a hypernym projection matrix is learned for each cluster. Finally, the projection matrices are used to decide whether a hypernym relation holds. Experimental results show that, even on data sets with many out-of-vocabulary words, hypernym relation classification still achieves nearly 80% accuracy, which is good enough for practical use.
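
A small numpy sketch of the projection-matrix step described above (the vectors are random stand-ins for the character-based word embeddings, and the threshold is arbitrary): for one cluster of hypernym-hyponym pairs, a matrix M is fit by least squares so that hypo_vec @ M approximates hyper_vec, and a new pair is accepted if the projected hyponym lands close enough to the candidate hypernym.

```python
import numpy as np

dim, rng = 50, np.random.default_rng(1)
pairs = [(rng.normal(size=dim), rng.normal(size=dim)) for _ in range(30)]  # (hyponym, hypernym)

X = np.stack([hypo for hypo, _ in pairs])      # hyponym vectors, one per row
Y = np.stack([hyper for _, hyper in pairs])    # hypernym vectors
M, *_ = np.linalg.lstsq(X, Y, rcond=None)      # least-squares fit of X @ M ≈ Y

def is_hypernym(hypo_vec, hyper_vec, threshold=5.0):
    """Accept the pair if the projected hyponym is close to the candidate hypernym."""
    return np.linalg.norm(hypo_vec @ M - hyper_vec) < threshold
```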

16:10
基于句式模板的人物关系三元组判定方法研究 (A Sentence-Pattern-Template Based Method for Validating Person Relation Triples) (Full Paper)
SPEAKER: Zhao Jiapeng

ABSTRACT. Person relation triples (subject, predicate, object) extracted from massive unstructured text play an important role in building person knowledge graphs, in knowledge representation and in person relation inference. To address the low accuracy of triples extracted from unstructured text, this paper proposes a supervised method for deciding whether an extracted triple is correct. The method first requires a knowledge base of person attributes, from which, together with a training corpus, a sentence-pattern template decision tree is learned. In the training phase, information extraction is applied to the sentences of the training set to extract triples, which are manually labeled as correct or incorrect; templates are then built hierarchically according to the positions of the triple, pronouns and characters in the sentence, with each level of templates refining the description of the sentence, while the numbers of correct and incorrect triples matched by each template are recorded. In the test phase, the correctness of a triple extracted from a sentence is decided according to the numbers of correct and incorrect matches at each template level. Results on the test data show that the training time, test time and F1 score (76.6%) of the proposed method are all better than those of a typical feature-engineering-based machine learning method (75.7%). Finally, using the decision of the template tree as an additional feature improves the feature-engineering method to an F1 of 77.5%. In addition, the method offers extensibility that traditional methods lack and provides guidance for constructing training sets.

16:30
DRTE:面向基础教育的术语抽取方法 (DRTE: A Term Extraction Method for Basic Education) (Full Paper)
SPEAKER: Siliang Li

ABSTRACT. Term extraction is a fundamental task of automatically extracting domain terms from unstructured text, and it plays an important role in Chinese word segmentation, information extraction and knowledge base construction. Current term extraction methods rely heavily on word statistics. However, terms in basic-education subjects have a pronounced long-tail distribution, which makes it hard for statistics-based methods to extract the terms at the tail. Taking the characteristics of basic-education subjects into account, we propose DRTE, a term extraction method that mainly exploits term definitions and term relation mining, combined with word formation rules and boundary detection. We extract terms from junior and senior high school mathematics textbooks. Experimental results show that our approach reaches a precision of 90.4% with a higher recall (78.0%) than manual term annotation, and can effectively perform automatic term extraction for Chinese basic education.

16:50
Mining RDF Data for OWL 2 RL Axioms (Short Paper)
SPEAKER: unknown

ABSTRACT. The huge amounts of linked data available on the web are a valuable resource for the development of semantic applications. However, these applications often meet the challenges posed by flawed or incomplete schemas, which can lead to the loss of meaningful facts. Association rule mining, as a successful way to discover implicit knowledge in RDF data, has been applied to learn many types of axioms. In this paper, we first make use of a statistical approach based on association rule mining to enrich OWL ontologies. We then propose some improvements to this approach. Finally, we assess the quality of the automatically acquired axioms through evaluations on DBpedia datasets.
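
To illustrate how association rule mining can propose schema axioms of the kind discussed above (a toy sketch with made-up instances and thresholds, not the paper's pipeline): if instances typed with class A are almost always also typed with class B, and the co-occurrence is frequent enough, the rule suggests the axiom A rdfs:subClassOf B.

```python
from collections import defaultdict
from itertools import permutations

rdf_type_triples = [("s1", "City"), ("s1", "Place"),       # toy rdf:type assertions
                    ("s2", "City"), ("s2", "Place"),
                    ("s3", "Place")]

types_of = defaultdict(set)
for inst, cls in rdf_type_triples:
    types_of[inst].add(cls)

support = defaultdict(int)        # how often classes a and b co-occur on one instance
count = defaultdict(int)          # how often class a occurs at all
for classes in types_of.values():
    for c in classes:
        count[c] += 1
    for a, b in permutations(classes, 2):
        support[(a, b)] += 1

MIN_SUPPORT, MIN_CONFIDENCE = 2, 0.9
for (a, b), sup in support.items():
    confidence = sup / count[a]
    if sup >= MIN_SUPPORT and confidence >= MIN_CONFIDENCE:
        print(f"{a} rdfs:subClassOf {b}  (support={sup}, confidence={confidence:.2f})")
```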

17:00
A Mixed Method for Building the Uyghur and Chinese Domain Ontology (Short Paper)
SPEAKER: unknown

ABSTRACT. With the increasing demand for multilingual semantic query on the World Wide Web, research on multilingual ontologies has gradually become a hot topic. However, studies of multilingual ontologies for professional fields are relatively rare, and most of the existing ones concern the public domain. This paper describes and designs a mixed method for building a new multilingual ontology. Using this mixed method, we construct a Uyghur and Chinese bilingual ontology for the field of university management, aligning and mapping the concepts and relations between the ontologies of the different languages and then merging them into one multilingual ontology. Finally, we preliminarily realize semantic query over the multilingual ontology using SPARQL, which will provide basic support for cross-lingual information retrieval in minority languages from the perspective of the professional field.

15:30-17:30 Session 12B: 学术论文(Paper Session)- Linked Data & Knowledge-based Systems
Location: 5号楼金缘厅(Building No.5, Jingyuan Hall)
15:30
Link Prediction via Mining Markov Logic Formulas to Improve Social Recommendation (Full Paper)
SPEAKER: unknown

ABSTRACT. Social networks have become a main way to obtain information in recent years, but the huge amount of information prevents people from finding what they are really interested in. Social recommendation systems are introduced to solve this problem and bring a new challenge of predicting people's preferences. From a graph view, social recommendation can be viewed as a link prediction task on the social graph; therefore, link prediction techniques can be applied to social recommendation. In this paper, we propose a novel approach that brings logic formulas into social recommendation systems and improves the accuracy of recommendations. The approach is made up of two parts: (1) it treats the whole social network, with its various attributes, as a semantic network and finds frequent structures as logic formulas via random graph algorithms; (2) it builds a Markov Logic Network to model the logic formulas, attaches weights to each of them to measure their contributions, and learns the weights discriminatively from training data. In addition, the weighted formulas can be viewed as the reason why a person should accept a specific recommendation, and presenting them to users may increase the probability that the recommendation is accepted. We carry out several experiments to explore and analyze the effects of various factors of our method on the recommendation results, and compare the final method with baselines.
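
For reference, the standard probability distribution defined by a Markov Logic Network, which underlies the weight learning described above (n_i(x) is the number of true groundings of formula i in world x, w_i its learned weight, and Z the partition function):

```latex
P(X = x) \;=\; \frac{1}{Z}\,\exp\!\Bigl(\sum_{i} w_i\, n_i(x)\Bigr)
```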

15:50
Graph-based Jointly Modeling Entity Detection and Linking in Domain-Specific Area (Full Paper)
SPEAKER: unknown

ABSTRACT. The current state-of-the-art Entity Detection and Linking (EDL) systems are geared towards general corpora and cannot be applied directly and effectively to specific domains, because texts in domain-specific areas are often noisy and contain phrases with ambiguous meanings that traditional EDL methods easily recognize as entity mentions but that should not actually be linked to real entities (i.e., False Entity Mentions (FEM)). Moreover, in most of the current EDL literature, ED (Entity Detection) and EL (Entity Linking) are frequently treated as equally important but separate problems and are typically performed in a pipeline architecture, without considering the mutual dependency between the two tasks. To rigorously address the domain-specific EDL problem, we propose an iterative graph-based algorithm that jointly models the ED and EL tasks in a specific domain by capturing the local dependency of mention-to-entity and the global interdependency of entity-to-entity. We extensively evaluate the proposed algorithm on a data set of real-world movie comments, and the experimental results show that it significantly outperforms the state-of-the-art baselines, achieving an 82.7% F1 score for ED and 89.0% linking accuracy for EL.

16:10
LD2LD: Integrating, Enriching and Republishing Library Data as Linked Data (Full Paper)
SPEAKER: unknown

ABSTRACT. The development of digital libraries increases the need to integrate, enrich and republish library data as linked data. Linked library data could provide higher quality and more tailored services for researchers as well as for the public. However, even though there are many data sets containing metadata about publications and researchers, it is cumbersome to integrate and analyze them, since the collection is still a manual process and the sources are not connected to each other upfront. In this paper, we present an approach for integrating, enriching and republishing library data as linked data. In particular, we first adopt duplication detection and disambiguation techniques to reconcile researcher data, and then we connect researcher data with publication data such as papers, patents and monographs using entity linking methods. After that, we use simple reasoning to predict missing values and enrich the library data with external data. Finally, we republish the integrated and predicted values as linked data.

16:30
Object Clustering in Linked Data using Centrality (Full Paper)
SPEAKER: unknown

ABSTRACT. The volume of linked data is growing continuously. Large-scale linked data, such as DBpedia, is becoming a challenge for many Semantic Web tasks. While clustering of graphs has been deeply researched in network science and machine learning, little research has been carried out on clustering in linked data. To identify the meta-structure of large-scale linked data, the scalability of clustering must be considered. In this paper, we propose a scalable approach to centrality-based clustering, which works on a model of an Object Graph derived from the RDF graph. The centrality of objects is calculated as an indicator for clustering. Both relational and linguistic closeness among objects are considered in clustering to produce coherent clusters.
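
A small sketch of the centrality-based idea, assuming the networkx library (the toy graph, the choice of PageRank as the centrality measure, and the seed-assignment rule are all illustrative assumptions, not the paper's algorithm): the most central objects act as cluster seeds, and the remaining objects are attached to the nearest seed.

```python
import networkx as nx

G = nx.Graph()                                           # toy object graph
G.add_edges_from([("Berlin", "Germany"), ("Munich", "Germany"),
                  ("Paris", "France"), ("Lyon", "France"),
                  ("Germany", "Europe"), ("France", "Europe")])

centrality = nx.pagerank(G)                              # centrality score per object
seeds = sorted(centrality, key=centrality.get, reverse=True)[:2]

clusters = {seed: {seed} for seed in seeds}
for node in G.nodes:
    if node in seeds:
        continue
    nearest = min(seeds, key=lambda s: nx.shortest_path_length(G, node, s))
    clusters[nearest].add(node)                          # attach object to its nearest seed
print(clusters)
```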

16:50
Research on Knowledge Fusion Connotation and Process Model (Full Paper)
SPEAKER: unknown

ABSTRACT. The emergence of big data brings diversified structures and the constant growth of knowledge. The objective of knowledge fusion (KF) research is to integrate, discover and exploit valuable knowledge from distributed, heterogeneous and autonomous knowledge sources, which is the necessary prerequisite for, and an effective approach to, implementing knowledge services. In order to put KF into practice, this paper first discusses the connotations of KF by analyzing the relations and differences among related notions, i.e., knowledge fusion, knowledge integration, information fusion and data fusion. Then, based on ontology-based knowledge representation methods, this paper investigates several KF implementation patterns and provides two types of dimensional KF process models oriented to the demands of knowledge services.

17:10
E-SKB: A Semantic Knowledge Base for Emergency (Short Paper)
SPEAKER: unknown

ABSTRACT. Although the number of knowledge bases in Linked Open Data has grown explosively, there are few knowledge bases about emergencies, an important issue in the area of social management. In this paper, we introduce a semantic knowledge base of emergencies, extracted from an authoritative website. According to the characteristics of the website, a framework is proposed to convert the web data into RDF. In order to help researchers acquire more knowledge, we follow the publishing rules of Linked Open Data: not only using URIs to label the objects in the semantic knowledge base, but also providing links to DBpedia. Finally, we employ Sesame to store and publish the semantic knowledge base, and develop a query interface to retrieve the knowledge base with SPARQL.
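
As a usage illustration (assuming the SPARQLWrapper Python library; the endpoint URL and the query are hypothetical, not the actual E-SKB interface), such a knowledge base could be queried like this:

```python
from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("http://example.org/eskb/sparql")   # hypothetical endpoint URL
sparql.setQuery("""
    PREFIX owl: <http://www.w3.org/2002/07/owl#>
    SELECT ?event ?same WHERE {
        ?event ?p ?o .
        OPTIONAL { ?event owl:sameAs ?same }    # link to DBpedia, when present
    } LIMIT 10
""")
sparql.setReturnFormat(JSON)
for row in sparql.query().convert()["results"]["bindings"]:
    print(row["event"]["value"])
```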

18:30-20:30 晚宴(Banquet)
Location: 5号楼大堂二层赏园餐厅(Building No. 5, Shangyuan Hall)