Transactions of the Japanese Society for Artificial Intelligence
Online ISSN : 1346-8030
Print ISSN : 1346-0714
ISSN-L : 1346-0714
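Original Paper
Non-Linear Similarity Learning for Compositionality
Masashi Tsubaki (椿 真史), Masashi Shimbo (新保 仁), Yuji Matsumoto (松本 裕治)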

2016, Volume 31, Issue 2, p. O-FA2_1-10

Abstract

The notion of semantic similarity between text data (e.g., words, phrases, sentences, and documents) plays an important role in natural language processing (NLP) applications such as information retrieval, classification, and extraction. Recently, word vector spaces built with distributional and distributed models have become popular. Although word vectors provide good similarity measures between words, computing phrasal and sentential similarities from the composition of individual words remains a difficult problem. To address this problem, we focus on representing and learning the semantic similarity of sentences in a space that has higher representational power than the underlying word vector space. In this paper, we propose a new method of non-linear similarity learning for compositionality. With this method, word representations are learned through the similarity learning of sentences in a high-dimensional space with implicit kernel functions, and we can obtain new word representations inexpensively, without explicit computation of sentence vectors in the high-dimensional space. Note that our approach differs from deep learning approaches such as recursive neural networks (RNNs) and long short-term memory (LSTM). Our aim is to design a word representation learning method that combines the embedding of sentence structures in a low-dimensional space (i.e., neural networks) with non-linear similarity learning for sentence semantics in a high-dimensional space (i.e., kernel methods). On the task of predicting the semantic similarity of two sentences (SemEval 2014, Task 1), our method outperforms linear baselines, feature engineering approaches, and RNNs, and achieves results competitive with various LSTM models.
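
The idea of measuring sentence similarity in a high-dimensional space without ever building the high-dimensional sentence vectors can be illustrated with a generic kernel-trick sketch. This is not the model from the paper: the additive composition, the degree-2 polynomial kernel, the toy embeddings, and every function name below are assumptions chosen only for illustration.

```python
from itertools import product

# Toy 3-dimensional word vectors (made-up values, for illustration only).
toy_embeddings = {
    "a": [0.1, 0.0, 0.2],
    "dog": [0.9, 0.3, 0.1],
    "cat": [0.8, 0.4, 0.0],
    "runs": [0.2, 0.7, 0.5],
    "sleeps": [0.1, 0.6, 0.9],
}

def compose(words):
    """Additive composition (an assumption): sum the word vectors."""
    vectors = [toy_embeddings[w] for w in words]
    return [sum(column) for column in zip(*vectors)]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def poly_kernel(x, y, degree=2):
    """Implicit similarity: computed entirely in the low-dimensional space."""
    return dot(x, y) ** degree

def explicit_features(x, degree=2):
    """Explicit feature map of the same kernel: all ordered products of
    `degree` coordinates, i.e. len(x) ** degree dimensions."""
    features = []
    for indices in product(range(len(x)), repeat=degree):
        value = 1.0
        for i in indices:
            value *= x[i]
        features.append(value)
    return features

s1 = compose(["a", "dog", "runs"])
s2 = compose(["a", "cat", "sleeps"])

implicit = poly_kernel(s1, s2)
explicit = dot(explicit_features(s1), explicit_features(s2))

# Both routes give the same similarity; only the second one materializes
# the 9-dimensional sentence vectors.
print(abs(implicit - explicit) < 1e-9)
```

The explicit feature space grows as the input dimension raised to the kernel degree, while the kernel evaluation costs a single low-dimensional dot product, which is the kind of saving the abstract refers to when it says explicit sentence vectors are not computed.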

© 2016 JSAI (The Japanese Society for Artificial Intelligence)