An efficient, font independent word and character segmentation algorithm for printed Arabic text
Character segmentation is a necessary stage, and the most critical one, in an Arabic OCR system. It has attracted the interest of a wide range of researchers. However, the nature of the Arabic cursive script poses extra challenges that need further investigation.
Aziz Qaroush +5 more
doaj +1 more source
Segmenting Chinese Texts into Words for Semantic Network Analysis [PDF]
Unlike most languages, written Chinese has no spaces between words. Word segmentation must be performed before semantic network analysis can be conducted.
James A. Danowski
doaj +1 more source
A Domain Feature Word Vector Description Method for Military Texts [PDF]
According to the large number of named entities and deep domain of feature words in military text information, this paper proposes a vector description method for domain feature words. It compresses the vector space through the optimization of word ...
QIN Jie, CAO Lei, PENG Hui, LAI Jun
doaj +1 more source
Speculation detection for Chinese clinical notes: Impacts of word segmentation and embedding models. [PDF]
Zhang S +5 more
europepmc +2 more sources
Integrating Prosodic and Lexical Cues for Automatic Topic Segmentation [PDF]
We present a probabilistic model that uses both prosodic and lexical cues for the automatic segmentation of speech into topically coherent units. We propose two methods for combining lexical and prosodic information using hidden Markov models and ...
Andreas Stolcke +6 more
core +6 more sources
Modelling function words improves unsupervised word segmentation [PDF]
Inspired by experimental psychological findings suggesting that function words play a special role in word learning, we make a simple modification to an Adaptor Grammar based Bayesian word segmentation model to allow it to learn sequences of monosyllabic “function words” at the beginnings and endings of collocations of (possibly multi-syllabic) words ...
Mark Johnson +3 more
openaire +1 more source
A Graph-based Model for Joint Chinese Word Segmentation and Dependency Parsing
Chinese word segmentation and dependency parsing are two fundamental tasks for Chinese natural language processing. Dependency parsing is defined at the word level.
Hang Yan, Xipeng Qiu, Xuanjing Huang
doaj +1 more source
In natural language, the phenomenon of polysemy is widespread, which makes it very difficult for machines to process natural language. Word sense disambiguation is a key issue in the field of natural language processing.
Lei Wang, Qun Ai
doaj +1 more source
Word-based largest chunks for Agreement Groups processing: Cross-linguistic observations
The present study reports results from a series of computer experiments seeking to combine word-based Largest Chunk (LCh) segmentation and Agreement Groups (AG) sequence processing.
László Drienkó
doaj +1 more source
Systran's Chinese word segmentation [PDF]
SYSTRAN's Chinese word segmentation is one important component of its Chinese-English machine translation system. The Chinese word segmentation module uses a rule-based approach, based on a large dictionary and fine-grained linguistic rules. It works on general-purpose texts from different Chinese-speaking regions, with comparable performance.
Jin Yang, Jean Senellart, Remi Zajac
openaire +1 more source

