Synthetic Word Parsing Improves Chinese Word Segmentation
We present a novel solution to improve the performance of Chinese word segmentation (CWS) using a synthetic word parser. The parser analyses the internal structure of words, and attempts to convert out-of-vocabulary words (OOVs) into in-vocabulary fine-grained sub-words.
Fei Cheng, Kevin Duh, Yuji Matsumoto
Arabic Word Segmentation With Long Short-Term Memory Neural Networks and Word Embedding
In this paper, we propose an Arabic word segmentation technique based on a bi-directional long short-term memory deep neural network. The paper addresses two tasks: word segmentation alone, and word segmentation for nine cases of the rewrite.
Abdulrahman Almuhareb +2 more
Feature Extraction and Analysis of Natural Language Processing for Deep Learning English Language
NLP (Natural Language Processing) is a technology that enables computers to understand human languages. Deep-level grammatical and semantic analysis usually uses words as the basic unit, and word segmentation is usually the primary task of NLP.
Dongyang Wang, Junli Su, Hongbin Yu
An efficient, font independent word and character segmentation algorithm for printed Arabic text
Character segmentation is a necessary stage, and the most critical one, in an Arabic OCR system. It has attracted the interest of a wide range of researchers. However, the cursive nature of Arabic script poses extra challenges that need further investigation.
Aziz Qaroush +5 more
Domain-Specific Chinese Word Segmentation Based on Bi-Directional Long-Short Term Memory Model
Most current word segmentation methods are rule-based or traditional machine learning methods. Universal word segmentation tools do not work well in fields such as metallurgy. Domain-specific Chinese word segmentation is rarely studied.
Dangguo Shao +6 more
Segmenting Chinese Texts into Words for Semantic Network Analysis
Unlike most languages, written Chinese has no spaces between words. Word segmentation must be performed before semantic network analysis can be conducted.
James A. Danowski
A Domain Feature Word Vector Description Method for Military Texts
To handle the large number of named entities and the deeply domain-specific feature words in military text information, this paper proposes a vector description method for domain feature words. It compresses the vector space through the optimization of word ...
QIN Jie, CAO Lei, PENG Hui, LAI Jun
A Graph-based Model for Joint Chinese Word Segmentation and Dependency Parsing
Chinese word segmentation and dependency parsing are two fundamental tasks for Chinese natural language processing. Dependency parsing is defined at the word level.
Hang Yan, Xipeng Qiu, Xuanjing Huang
In natural language, the phenomenon of polysemy is widespread, which makes it very difficult for machines to process natural language. Word sense disambiguation is a key issue in the field of natural language processing.
Lei Wang, Qun Ai
Word-based largest chunks for Agreement Groups processing: Cross-linguistic observations
The present study reports results from a series of computer experiments seeking to combine word-based Largest Chunk (LCh) segmentation and Agreement Groups (AG) sequence processing.
László Drienkó

