
Hybrid Feature Fusion Learning Towards Chinese Chemical Literature Word Segmentation

open access: yes
IEEE Access, 2021
The rapid growth of the chemical science literature has brought challenges to researchers in search and data analysis. For much of the chemical scientific literature, extracting information from text and using knowledge is the focus of research ...
Xiang Li   +4 more
doaj   +1 more source

Synthetic Word Parsing Improves Chinese Word Segmentation [PDF]

open access: yes
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers), 2015
We present a novel solution to improve the performance of Chinese word segmentation (CWS) using a synthetic word parser. The parser analyses the internal structure of words, and attempts to convert out-of-vocabulary words (OOVs) into in-vocabulary fine-grained sub-words.
Fei Cheng, Kevin Duh, Yuji Matsumoto
openaire   +1 more source

Arabic Word Segmentation With Long Short-Term Memory Neural Networks and Word Embedding

open access: yes
IEEE Access, 2019
In this paper, we propose an Arabic word segmentation technique based on a bi-directional long short-term memory deep neural network. This paper addresses the two tasks of word segmentation only and word segmentation for nine cases of the rewrite.
Abdulrahman Almuhareb   +2 more
doaj   +1 more source

Feature Extraction and Analysis of Natural Language Processing for Deep Learning English Language

open access: yes
IEEE Access, 2020
NLP (Natural Language Processing) is a technology that enables computers to understand human languages. Deep-level grammatical and semantic analysis usually uses words as the basic unit, and word segmentation is usually the primary task of NLP.
Dongyang Wang, Junli Su, Hongbin Yu
doaj   +1 more source

An efficient, font independent word and character segmentation algorithm for printed Arabic text

open access: yes
Journal of King Saud University: Computer and Information Sciences, 2022
Character segmentation is a necessity and the most critical stage in an Arabic OCR system. It has attracted the interest of a wide range of researchers. However, the nature of the Arabic cursive script poses extra challenges that need further investigation.
Aziz Qaroush   +5 more
doaj   +1 more source

BabyLM’s First Words: Word Segmentation as a Phonological Probing Task [PDF]

open access: gold
Proceedings of the 29th Conference on Computational Natural Language Learning
Language models provide a key framework for studying linguistic theories based on prediction, but phonological analysis using large language models (LLMs) is difficult; there are few phonological benchmarks beyond English and the standard input representation used in LLMs (subwords of graphemes) is not suitable for analyzing the representation of ...
Zébulon Goriely, Paula Buttery
openalex   +3 more sources

Segmenting Chinese Texts into Words for Semantic Network Analysis [PDF]

open access: yes
Journal of Contemporary Eastern Asia, 2017
Unlike most languages, written Chinese has no spaces between words. Word segmentation must be performed before semantic network analysis can be conducted.
James A. Danowski
doaj   +1 more source

Word-Context Character Embeddings for Chinese Word Segmentation [PDF]

open access: gold
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, 2017
Neural parsers have benefited from automatically labeled data via dependency-context word embeddings. We investigate training character embeddings on a word-based context in a similar way, showing that the simple method improves state-of-the-art neural word segmentation models significantly, beating tri-training baselines for leveraging auto-segmented ...
Hao Zhou   +5 more
openalex   +2 more sources

A Graph-based Model for Joint Chinese Word Segmentation and Dependency Parsing

open access: yes
Transactions of the Association for Computational Linguistics, 2020
Chinese word segmentation and dependency parsing are two fundamental tasks for Chinese natural language processing. The dependency parsing is defined at the word-level.
Hang Yan, Xipeng Qiu, Xuanjing Huang
doaj   +1 more source

A Domain Feature Word Vector Description Method for Military Texts [PDF]

open access: yes
Jisuanji gongcheng, 2016
According to the large number of named entities and deep domain of feature words in military text information, this paper proposes a vector description method for domain feature words. It compresses the vector space through the optimization of word ...
QIN Jie, CAO Lei, PENG Hui, LAI Jun
doaj   +1 more source
