Using the Ship-Gram Model for Japanese Keyword Extraction Based on News Reports

Miao Teng

Complexity 2021:1-9 (2021)

Abstract

This paper studies Japanese keyword extraction from news reports. After text preprocessing, word sets from an external document collection are trained into word vectors with the Skip-gram model of the deep learning tool Word2Vec, and the cosine distance between word vectors is calculated. The sliding window in TextRank is designed to connect information inside the document and improve in-text semantic coherence. The main idea is multifeature fusion: the method uses not only the statistical and structural features of words but also their semantic features, extracted through word-embedding techniques, to obtain an importance weight for each word and an attraction weight between words. A graph model algorithm then iteratively calculates the final weight of each word, which determines the extracted keywords. To verify the performance of the algorithm, extensive simulation experiments were conducted on three different types of datasets. The results show that the proposed keyword extraction algorithm improves performance by up to 6.45% and 20.36% over existing word frequency statistics and graph model methods, respectively; MF-Rank achieves a maximum performance improvement of 1.76% over PW-TF.
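
The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not give the fusion formula, so the edge weight used here (co-occurrence count times one plus cosine similarity) is an assumption, as are all function names, the toy tokens, and the hand-written two-dimensional vectors that stand in for trained Skip-gram embeddings.

```python
import math
from collections import defaultdict


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def build_graph(tokens, vectors, window=3):
    """Link distinct words that fall inside the same sliding window.

    A window of size `window` covers the current token and the next
    `window - 1` tokens. The edge weight is an assumed fusion rule:
    co-occurrence count * (1 + non-negative cosine similarity).
    """
    cooc = defaultdict(float)
    for i, w in enumerate(tokens):
        for j in range(i + 1, min(i + window, len(tokens))):
            if tokens[j] != w:
                cooc[frozenset((w, tokens[j]))] += 1.0
    graph = defaultdict(dict)
    for pair, count in cooc.items():
        a, b = tuple(pair)
        sim = cosine_similarity(vectors[a], vectors[b])
        weight = count * (1.0 + max(sim, 0.0))
        graph[a][b] = weight
        graph[b][a] = weight
    return graph


def rank_words(graph, damping=0.85, iterations=50):
    """Weighted TextRank iteration over an undirected word graph."""
    scores = {w: 1.0 for w in graph}
    for _ in range(iterations):
        new_scores = {}
        for w in graph:
            total = 0.0
            for nb, weight in graph[w].items():
                # Each neighbour shares its score in proportion to edge weight.
                total += weight / sum(graph[nb].values()) * scores[nb]
            new_scores[w] = (1.0 - damping) + damping * total
        scores = new_scores
    return scores


def extract_keywords(tokens, vectors, top_k=3, window=3):
    """Return the `top_k` highest-scoring words."""
    scores = rank_words(build_graph(tokens, vectors, window))
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


# Toy, already-segmented tokens and made-up 2-d vectors (illustration only).
TOKENS = ["地震", "津波", "警報", "地震", "被害", "津波", "警報", "政府"]
VECTORS = {
    "地震": [0.9, 0.1], "津波": [0.8, 0.3], "警報": [0.6, 0.5],
    "被害": [0.7, 0.2], "政府": [0.1, 0.9],
}
print(extract_keywords(TOKENS, VECTORS))
```

In practice the vectors would come from a Word2Vec model trained in Skip-gram mode on a segmented Japanese corpus, and the statistical and structural word features mentioned in the abstract would enter the node and edge weights as well.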

DOI

10.1155/2021/9965843

Similar books and articles

Web News Data Extraction Technology Based on Text Keywords. Kun Zhang - 2021 - Complexity 2021:1-11.

Model and Simulation of Maximum Entropy Phrase Reordering of English Text in Language Learning Machine. Weifang Wu - 2020 - Complexity 2020:1-9.

Detecting Pronunciation Errors in Spoken English Tests Based on Multifeature Fusion Algorithm. Yinping Wang - 2021 - Complexity 2021:1-11.

SynoExtractor: A Novel Pipeline for Arabic Synonym Extraction Using Word2Vec Word Embeddings. Rawan N. Al-Matham & Hend S. Al-Khalifa - 2021 - Complexity 2021:1-13.

A Graph Convolutional Network-Based Sensitive Information Detection Algorithm. Ying Liu, Chao-Yu Yang & Jie Yang - 2021 - Complexity 2021:1-8.

Word Extraction and Character Segmentation from Text Lines of Unconstrained Handwritten Bangla Document Images. Mita Nasipuri, Mahantapas Kundu, Subhadip Basu, Nibaran Das, Samir Malakar & Ram Sarkar - 2011 - Journal of Intelligent Systems 20 (3):227-260.

Synthetic Network and Search Filter Algorithm in English Oral Duplicate Correction Map. Xiaojun Chen - 2021 - Complexity 2021:1-12.

Design and Implementation of English Intelligent Communication Platform Based on Similarity Algorithm. Yujie Chai - 2021 - Complexity 2021:1-10.

A Novel Chinese Entity Relationship Extraction Method Based on the Bidirectional Maximum Entropy Markov Model. Chengyao Lv, Deng Pan, Yaxiong Li, Jianxin Li & Zong Wang - 2021 - Complexity 2021:1-8.

Deep Learning- and Word Embedding-Based Heterogeneous Classifier Ensembles for Text Classification. Zeynep H. Kilimci & Selim Akyokus - 2018 - Complexity 2018:1-10.

Collaborative Filtering Recommendation Algorithm for MOOC Resources Based on Deep Learning. Lili Wu - 2021 - Complexity 2021:1-11.

Extracting indices from Japanese legal documents. Tho Thi Ngoc Le, Kiyoaki Shirai, Minh Le Nguyen & Akira Shimazu - 2015 - Artificial Intelligence and Law 23 (4):315-344.

Deep Belief Network-Based Multifeature Fusion Music Classification Algorithm and Simulation. Tianzhuo Gong - 2021 - Complexity 2021:1-10.

Commodity Image Classification Based on Improved Bag-of-Visual-Words Model. Huadong Sun, Xu Zhang, Xiaowei Han, Xuesong Jin & Zhijie Zhao - 2021 - Complexity 2021:1-10.

Log Posterior Approach in Learning Rules Generated using N-Gram based Edit distance for Keyword Search. M. Priya & R. Kalpana - 2018 - Journal of Intelligent Systems 27 (4):555-563.

Analytics

Added to PP
2021-04-17

Downloads
5 (#1,505,296)

6 months
2 (#1,263,261)
