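• 川村 隆浩
    Corporate Research & Development Center, Toshiba Corporation (currently with the Information Analysis Office, Japan Science and Technology Agency)
  • 大須賀 昭彦
    Graduate School of Information Systems, The University of Electro-Communications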

Bibliographic Information

Alternative Title
  • ~ テキスト情報のLOD化に向けたWeb APIの開発 ~
  • -- Development of Web API for Triplification of Text Information --

Abstract

Linked Open Data (LOD) has recently been attracting attention as a vast distributed knowledge base on the Web. Accordingly, semi-structured data such as tables and hierarchical data in several domains have been triplified into LOD. In the research community, however, triplification of unstructured data such as text and sensor data is being actively studied as the next target. We therefore developed a Web API that mainly extracts triples from text data, which is useful for the triplification of text data. We define two steps for text triplification. The first step is the extraction of phrases corresponding to a triple <subject, verb, object>, a location, and a time from a natural language sentence; the second is the conversion of the extracted phrases to existing (or new) resources and properties in the LOD. In this paper, we first describe the service specification corresponding to the first step, its technical background, and an evaluation of the current extraction accuracy, and then introduce some use cases of the service. Although the service adopts a novel combination of a restrictive method using ontology-based rules and an example-based machine learning method using conditional random fields, which are based on probability distributions, its main contribution is practical: it mashes up several natural language processing techniques into a text triplification service and is deployed as a Web API freely available for public use, so that non-experts can easily use it.
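
The two steps described in the abstract can be illustrated with a minimal Python sketch. It is not the service's actual API or output format: the `ExtractedPhrases` container, the `to_uri` and `to_ntriples` helpers, the `http://example.org/` namespace, and the event-node modelling are all assumptions made for illustration. The sketch takes the phrases of step 1 as given (the extraction itself is not reproduced here) and, for step 2, simply mints new URIs instead of looking up existing LOD resources.

```python
from dataclasses import dataclass
from typing import List, Optional
from urllib.parse import quote

# Hypothetical namespace; not taken from the paper or the service.
BASE = "http://example.org/"


@dataclass
class ExtractedPhrases:
    """Assumed shape of the step 1 output for one sentence."""
    subject: str
    verb: str
    obj: str
    location: Optional[str] = None
    time: Optional[str] = None


def to_uri(kind: str, phrase: str) -> str:
    """Mint a new URI for a phrase (a real step 2 would first search for an existing resource)."""
    local = quote(phrase.strip().replace(" ", "_"))
    return f"<{BASE}{kind}/{local}>"


def to_ntriples(p: ExtractedPhrases, event_id: str) -> List[str]:
    """Serialize the phrases as N-Triples-style lines hanging off one event node."""
    event = f"<{BASE}event/{quote(event_id)}>"
    lines = [
        f"{event} <{BASE}prop/subject> {to_uri('resource', p.subject)} .",
        f"{event} <{BASE}prop/verb> {to_uri('prop', p.verb)} .",
        f"{event} <{BASE}prop/object> {to_uri('resource', p.obj)} .",
    ]
    if p.location:
        lines.append(f"{event} <{BASE}prop/location> {to_uri('resource', p.location)} .")
    if p.time:
        # Escape backslashes and quotes so the time stays a valid string literal.
        literal = p.time.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'{event} <{BASE}prop/time> "{literal}" .')
    return lines


if __name__ == "__main__":
    phrases = ExtractedPhrases("Alice", "visited", "Tokyo Tower",
                               location="Tokyo", time="2014-05-01")
    for line in to_ntriples(phrases, "e1"):
        print(line)
```

Grouping the phrases under a single event node is one possible design choice; it keeps location and time attached to the same statement as the subject, verb, and object without requiring RDF reification.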
