Early Word Segmentation Behind the Mask. [PDF]
Infants have been shown to rely on both auditory and visual cues when processing speech. We investigated the impact of COVID-related changes, in particular of face masks, on early word segmentation abilities. Following up on our previous study demonstrating that, by 4 months, infants already segmented targets presented auditorily at utterance-edge ...
Frota S +4 more
europepmc +4 more sources
Modelling function words improves unsupervised word segmentation [PDF]
Inspired by experimental psychological findings suggesting that function words play a special role in word learning, we make a simple modification to an Adaptor Grammar based Bayesian word segmentation model to allow it to learn sequences of monosyllabic “function words” at the beginnings and endings of collocations of (possibly multi-syllabic) words ...
Mark Johnson +3 more
openalex +2 more sources
Word Segmentation, or Makingsenseofthis
A First Look at Google’s N-Gram Corpus. In this post we will focus on the problem of finding the appropriate word boundaries in strings like “homebuiltairplanes”, as is common in web URLs like www.homebuiltairplanes.com. This is an interesting problem because humans do it so easily, but there is no obvious programmatic solution.
Jeremy Kun
openalex +2 more sources
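One common way to approach the boundary-finding problem described in this entry is to score every possible split with unigram word probabilities and keep the best one via dynamic programming. The sketch below is illustrative only and is not taken from the post: the `COUNTS` table is a hypothetical toy stand-in for counts that would normally come from a large n-gram corpus, and the length-based penalty for unseen words is an assumed heuristic.

```python
import math
from functools import lru_cache

# Hypothetical toy counts (assumption); a real system would load these
# from a large n-gram corpus.
COUNTS = {
    "home": 50, "built": 30, "air": 40, "planes": 20,
    "airplanes": 25, "plane": 30, "a": 100, "me": 60,
}
TOTAL = sum(COUNTS.values())
MAX_WORD_LEN = 12  # assumed upper bound on candidate word length


def word_logprob(word):
    """Log-probability of a word under the toy unigram model."""
    if word in COUNTS:
        return math.log(COUNTS[word] / TOTAL)
    # Unseen words get a penalty that grows with their length (assumed heuristic).
    return -math.log(TOTAL) - len(word) * math.log(10)


def segment(text):
    """Return the highest-scoring split of text into words."""

    @lru_cache(maxsize=None)
    def best(i):
        # Best (score, words) for the suffix text[i:].
        if i == len(text):
            return (0.0, ())
        candidates = []
        for j in range(i + 1, min(len(text), i + MAX_WORD_LEN) + 1):
            word = text[i:j]
            rest_score, rest_words = best(j)
            candidates.append((word_logprob(word) + rest_score, (word,) + rest_words))
        return max(candidates)

    return list(best(0)[1])


print(segment("homebuiltairplanes"))
```

With these toy counts, the split "home built airplanes" outscores "home built air planes" because one frequent word costs less than two. Real corpora change the numbers, not the mechanism.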
Nonparametric Bayesian Semi-supervised Word Segmentation [PDF]
This paper presents a novel hybrid generative/discriminative model of word segmentation based on nonparametric Bayesian methods. Unlike ordinary discriminative word segmentation which relies only on labeled data, our semi-supervised model also leverages huge amounts of unlabeled text to automatically learn new “words”, and further constrains them by
Ryo Fujii, Ryo Domoto, Daichi Mochihashi
doaj +2 more sources
Detecting “protein words” through unsupervised word segmentation [PDF]
Unsupervised word segmentation methods were applied to analyze protein sequences. Protein sequences, such as “MTMDKSELVQKA…,” were used as input to these methods. Segmented protein word sequences, such as “MTM DKSE LVQKA,” were then obtained.
Liang, Wang; KaiYong, Zhao
openaire +2 more sources
An Algorithm Rapidly Segmenting Chinese Sentences into Individual Words [PDF]
This paper proposes an improved Trie tree structure. The tree node records the position information of the characters participating in the word formation, and the child node uses the hash search mechanism.
Xiong Zhibin
doaj +1 more source
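As a rough illustration of the kind of structure this entry refers to, the sketch below builds a plain trie whose nodes keep their children in a hash map (a Python `dict`) and uses it for greedy forward maximum matching. It is not the paper's improved structure: the per-node position information mentioned in the abstract is not reproduced, forward maximum matching is an assumed choice of matching strategy, and the class and function names are hypothetical.

```python
class TrieNode:
    """Trie node whose children are stored in a hash map keyed by character."""

    __slots__ = ("children", "is_word")

    def __init__(self):
        self.children = {}
        self.is_word = False


class Trie:
    def __init__(self, words=()):
        self.root = TrieNode()
        for word in words:
            self.insert(word)

    def insert(self, word):
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True

    def longest_match(self, text, start):
        """Return the end index of the longest dictionary word starting at
        `start`, or `start` itself if no word matches."""
        node = self.root
        end = start
        for i in range(start, len(text)):
            node = node.children.get(text[i])
            if node is None:
                break
            if node.is_word:
                end = i + 1
        return end


def segment(text, trie):
    """Greedy forward maximum matching over the trie."""
    words = []
    i = 0
    while i < len(text):
        end = trie.longest_match(text, i)
        if end == i:
            end = i + 1  # unknown character becomes a single-character token
        words.append(text[i:end])
        i = end
    return words


trie = Trie(["北京", "北京大学", "大学", "学生"])
print(segment("北京大学生", trie))
```

The example also shows the known weakness of greedy matching: the longest prefix "北京大学" is taken first, leaving "生" as a single character even though "北京 / 大学生"-style splits may be preferable.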
Universal Word Segmentation: Implementation and Interpretation [PDF]
Word segmentation is a low-level NLP task that is non-trivial for a considerable number of languages. In this paper, we present a sequence tagging framework and apply it to word segmentation for a wide range of languages with different writing systems and typological characteristics.
Yan Shao +2 more
doaj +5 more sources
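Treating segmentation as sequence tagging means assigning each character a boundary label and reading the words back from the labels. The sketch below shows only that encoding step, assuming the widely used B/I/E/S scheme (begin, inside, end, single); the abstract does not state which tag set the paper uses, and the function names are hypothetical. No tagger model is included.

```python
def words_to_tags(words):
    """Convert a segmented sentence into one B/I/E/S tag per character."""
    tags = []
    for word in words:
        if len(word) == 1:
            tags.append("S")
        else:
            tags.extend(["B"] + ["I"] * (len(word) - 2) + ["E"])
    return tags


def tags_to_words(chars, tags):
    """Rebuild words from characters and their predicted tags."""
    words = []
    current = ""
    for ch, tag in zip(chars, tags):
        current += ch
        if tag in ("E", "S"):
            words.append(current)
            current = ""
    if current:
        # Tolerate a tag sequence that ends mid-word.
        words.append(current)
    return words


gold = ["北京大学", "的", "学生"]
tags = words_to_tags(gold)
print(tags)
print(tags_to_words("".join(gold), tags))
```

A trained tagger would predict the tag sequence from the raw characters; `tags_to_words` then turns the predictions into a segmentation.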
Word-Context Character Embeddings for Chinese Word Segmentation [PDF]
Neural parsers have benefited from automatically labeled data via dependency-context word embeddings. We investigate training character embeddings on a word-based context in a similar way, showing that the simple method improves state-of-the-art neural word segmentation models significantly, beating tri-training baselines for leveraging auto-segmented ...
Hao Zhou +5 more
openalex +2 more sources
Reduplication facilitates early word segmentation [PDF]
This study explores the possibility that early word segmentation is aided by infants’ tendency to segment words with repeated syllables (‘reduplication’). Twenty-four nine-month-olds were familiarized with passages containing one novel reduplicated word and one novel non-reduplicated word.
Skarabela, Barbora; Ota, Mitsuhiko
openaire +3 more sources
Hybrid Feature Fusion Learning Towards Chinese Chemical Literature Word Segmentation
The rapid growth of the chemical science literature has brought challenges to researchers in search and data analysis. For much of the chemical scientific literature, extracting information from text and using that knowledge is the focus of research ...
Xiang Li +4 more
doaj +1 more source

