Multi-Grained Chinese Word Segmentation [PDF]
Traditionally, word segmentation (WS) adopts the single-grained formalism, where a sentence corresponds to a single word sequence. However, Sproat et al. (1997) show that the inter-native-speaker consistency ratio over Chinese word boundaries is only 76%, indicating single-grained WS (SWS) imposes unnecessary challenges on both manual annotation and ...
Chen Gong +3 more
openaire +1 more source
Rethinking Chinese word segmentation [PDF]
Huang, C.R. +3 more
openaire +2 more sources
This article analyzes the effect of word-word spacing in written Chinese on advanced non-native speakers as they read and process Mandarin texts.
Ken Chen +3 more
doaj +1 more source
Bidirectional Gated Recurrent Unit Neural Network for Chinese Address Element Segmentation
Chinese address element segmentation is a basic and key step in geocoding technology, and the segmentation results directly affect the accuracy and certainty of geocoding.
Pengpeng Li +6 more
doaj +1 more source
Domain-Specific Chinese Word Segmentation Based on Bi-Directional Long-Short Term Memory Model
Most current word segmentation methods are rule-based or rely on traditional machine learning. General-purpose word segmentation tools do not work well in specialized fields such as metallurgy. Domain-specific Chinese word segmentation is rarely studied.
Dangguo Shao +6 more
doaj +1 more source
Integrated approaches to prosodic word prediction for Chinese TTS [PDF]
We focus on integrated prosodic word prediction for Chinese TTS. To avoid the problem of inconsistency between lexical words and prosodic words in Chinese, lexical word segmentation and prosodic word prediction are taken as one process instead of two ...
Fu, G, Luke, KK
core +1 more source
Word-Context Character Embeddings for Chinese Word Segmentation [PDF]
Neural parsers have benefited from automatically labeled data via dependency-context word embeddings. We investigate training character embeddings on a word-based context in a similar way, showing that the simple method improves state-of-the-art neural word segmentation models significantly, beating tri-training baselines for leveraging auto-segmented ...
Hao Zhou +5 more
openaire +1 more source
Ergodic multigram HMM integrating word segmentation and class tagging for Chinese language modeling [PDF]
A novel ergodic multigram hidden Markov model (HMM) is introduced which models sentence production as a doubly stochastic process, in which word classes are first produced according to a first order Markov model, and then single or multi-character words ...
Chan, C, Law, HHC
core +1 more source
Introduction to CKIP Chinese word segmentation system for the first international Chinese Word Segmentation Bakeoff [PDF]
In this paper, we roughly describe the procedures of our segmentation system, including the methods for resolving segmentation ambiguities and identifying unknown words. The CKIP group of Academia Sinica participated in testing on the open and closed tracks of Beijing University (PK) and Hong Kong Cityu (HK).
Wei-Yun Ma, Keh-Jiann Chen
openaire +1 more source
Combining segmenter and chunker for Chinese word segmentation [PDF]
Our proposed method is to use a Hidden Markov Model-based word segmenter and a Support Vector Machine-based chunker for Chinese word segmentation. Firstly, input sentences are analyzed by the Hidden Markov Model-based word segmenter. The word segmenter produces n-best word candidates together with some class information and confidence measures ...
Masayuki Asahara +3 more
openaire +1 more source

