Arabic Word Segmentation With Long Short-Term Memory Neural Networks and Word Embedding
In this paper, we propose an Arabic word segmentation technique based on a bi-directional long short-term memory deep neural network. This paper addresses the two tasks of word segmentation only and word segmentation for nine cases of the rewrite.
Abdulrahman Almuhareb +2 more
doaj +3 more sources
CWSXLNet: A Sentiment Analysis Model Based on Chinese Word Segmentation Information Enhancement
This paper proposes a method for improving the XLNet model to address the shortcomings of the segmentation algorithm in processing Chinese text, such as long sub-word lengths, long word lists, and incomplete word list coverage.
Shiqian Guo +4 more
doaj +1 more source
Do Chinese readers follow the national standard rules for word segmentation during reading?
We conducted a preliminary study to examine whether Chinese readers' spontaneous word segmentation processing is consistent with the national standard rules of word segmentation based on the Contemporary Chinese language word segmentation specification ...
Ping-Ping Liu +3 more
doaj +1 more source
Detecting “protein words” through unsupervised word segmentation
Unsupervised word segmentation methods were applied to analyze protein sequences. Protein sequences, such as “MTMDKSELVQKA…,” were used as input to these methods. Segmented protein word sequences, such as “MTM DKSE LVQKA,” were then obtained.
Liang, Wang, KaiYong, Zhao
openaire +2 more sources
An Algorithm Rapidly Segmenting Chinese Sentences into Individual Words
This paper proposes an improved Trie tree structure. The tree node records the position information of the characters participating in the word formation, and the child node uses the hash search mechanism.
Xiong Zhibin
doaj +1 more source
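For orientation, the general idea of a trie whose child nodes are looked up by hashing can be sketched as follows. This is a minimal illustration, not the algorithm from the paper above: it assumes a plain dictionary-backed trie combined with forward maximum matching, and it does not reproduce the position information the paper stores in each node. The names `TrieNode`, `build_trie`, and `segment`, and the small lexicon, are hypothetical.

```python
class TrieNode:
    """One trie node; children are stored in a dict, i.e. found by hash lookup."""
    __slots__ = ("children", "is_word")

    def __init__(self):
        self.children = {}    # character -> TrieNode
        self.is_word = False  # True if the path from the root spells a lexicon word


def build_trie(words):
    """Insert every lexicon word into a fresh trie and return its root."""
    root = TrieNode()
    for word in words:
        node = root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True
    return root


def segment(text, root):
    """Forward maximum matching: at each position take the longest lexicon word.

    Characters that start no lexicon word are emitted as single-character tokens.
    """
    tokens = []
    i = 0
    while i < len(text):
        node = root
        longest = 0
        j = i
        # Walk down the trie as far as the text allows, remembering the
        # longest complete word seen on the way.
        while j < len(text) and text[j] in node.children:
            node = node.children[text[j]]
            j += 1
            if node.is_word:
                longest = j - i
        step = longest or 1
        tokens.append(text[i:i + step])
        i += step
    return tokens


# Hypothetical toy lexicon, for illustration only.
lexicon = ["研究", "研究生", "生命", "命", "起源"]
trie = build_trie(lexicon)
print(segment("研究生命起源", trie))
```

Because the matcher is greedy, it prefers "研究生" over "研究" at the start of the sample sentence, which is the classic ambiguity that forward maximum matching cannot resolve by itself.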
Reduplication facilitates early word segmentation
This study explores the possibility that early word segmentation is aided by infants’ tendency to segment words with repeated syllables (‘reduplication’). Twenty-four nine-month-olds were familiarized with passages containing one novel reduplicated word and one novel non-reduplicated word.
Skarabela, Barbora, Ota, Mitsuhiko
openaire +3 more sources
Hybrid Feature Fusion Learning Towards Chinese Chemical Literature Word Segmentation
The rapid increase in the volume of chemical science literature has brought challenges to researchers in search and data analysis. For much of the chemical scientific literature, extracting information from text and using knowledge is the focus of research ...
Xiang Li +4 more
doaj +1 more source
Synthetic Word Parsing Improves Chinese Word Segmentation
We present a novel solution to improve the performance of Chinese word segmentation (CWS) using a synthetic word parser. The parser analyses the internal structure of words, and attempts to convert out-of-vocabulary words (OOVs) into in-vocabulary fine-grained sub-words.
Fei Cheng, Kevin Duh, Yuji Matsumoto
openaire +1 more source
Domain-Specific Chinese Word Segmentation Based on Bi-Directional Long-Short Term Memory Model
Most current word segmentation methods are rule-based or traditional machine learning methods. Universal word segmentation tools do not work well in fields such as metallurgy. Domain-specific Chinese word segmentation is rarely studied.
Dangguo Shao +6 more
doaj +1 more source
Feature Extraction and Analysis of Natural Language Processing for Deep Learning English Language
NLP (Natural Language Processing) is a technology that enables computers to understand human languages. Deep-level grammatical and semantic analysis usually uses words as the basic unit, and word segmentation is usually the primary task of NLP.
Dongyang Wang, Junli Su, Hongbin Yu
doaj +1 more source

