Using the Ship-Gram Model for Japanese Keyword Extraction Based on News Reports

Complexity 2021:1-9 (2021)
  Copy   BIBTEX

Abstract

In this paper, we conduct an in-depth study of Japanese keyword extraction from news reports, train external computer document word sets from text preprocessing into word vectors using the Ship-gram model in the deep learning tool Word2Vec, and calculate the cosine distance between word vectors. In this paper, the sliding window in TextRank is designed to connect internal document information to improve the in-text semantic coherence. The main idea is to use not only the statistical and structural features of words but also the semantic features of words extracted through word-embedding techniques, i.e., multifeature fusion, to obtain the importance weights of words themselves and the attraction weights between words and then iteratively calculate the final weight of each word through the graph model algorithm to determine the extracted keywords. To verify the performance of the algorithm, extensive simulation experimental studies were conducted on three different types of datasets. The experimental results show that the proposed keyword extraction algorithm can improve the performance by a maximum of 6.45% and 20.36% compared with the existing word frequency statistics and graph model methods, respectively; MF-Rank can achieve a maximum performance improvement of 1.76% compared with PW-TF.

Links

PhilArchive



    Upload a copy of this work     Papers currently archived: 91,322

External links

Setup an account with your affiliations in order to access resources via your University's proxy server

Through your library

Similar books and articles

Analytics

Added to PP
2021-04-17

Downloads
5 (#1,505,296)

6 months
2 (#1,263,261)

Historical graph of downloads
How can I increase my downloads?

Citations of this work

No citations found.

Add more citations

References found in this work

No references found.

Add more references