Enhancement of Short Text Clustering by Iterative Classification [PDF]
Short text clustering is a challenging task due to the lack of signal contained in such short texts. In this work, we propose iterative classification as a method to boost the clustering quality (e.g., accuracy) of short texts. Given a clustering of short texts obtained using an arbitrary clustering algorithm, iterative classification applies outlier
Rakib M, Zeh N, Jankowska M, Milios E.
europepmc +5 more sources
TextNetTopics Pro, a topic model-based text classification for short text by integration of semantic and document-topic distribution information [PDF]
With the exponential growth in the daily publication of scientific articles, automatic classification and categorization can assist in assigning articles to a predefined category.
Daniel Voskergian+3 more
doaj +2 more sources
Using Topic Modeling Methods for Short-Text Data: A Comparative Analysis [PDF]
With the growth of online social network platforms and applications, large amounts of textual user-generated content are created daily in the form of comments, reviews, and short-text messages.
Rania Albalawi+2 more
doaj +2 more sources
Efficient Long-Text Understanding with Short-Text Models
Transformer-based pretrained language models (LMs) are ubiquitous across natural language understanding, but cannot be applied to long sequences such as stories, scientific articles, and long documents due to their quadratic complexity. While a myriad of
Maor Ivgi, Uri Shaham, Jonathan Berant
doaj +3 more sources
Representation Learning for Short Text Clustering [PDF]
Effective representation learning is critical for short text clustering because short text corpora are sparse, high-dimensional, and noisy. Existing pre-trained models (e.g., Word2vec and BERT) have greatly improved the expressiveness of short text representations, with more condensed, low-dimensional, and continuous features compared to the ...
Yin, Hui+4 more
arxiv +4 more sources
Learning-based short text compression using BERT models [PDF]
Learning-based data compression methods have gained significant attention in recent years. Although these methods achieve higher compression ratios compared to traditional techniques, their slow processing times make them less suitable for compressing ...
Emir Öztürk, Altan Mesut
doaj +3 more sources
LCSTS: A Large Scale Chinese Short Text Summarization Dataset [PDF]
Automatic text summarization is widely regarded as a highly difficult problem, partly because of the lack of large text summarization datasets. Due to the great challenge of constructing large-scale summaries for full text, in this paper we introduce a large-scale Chinese short text summarization dataset constructed from the Chinese ...
Baotian Hu, Qingcai Chen, Fangze Zhu
arxiv +3 more sources
End-to-end Learning for Short Text Expansion [PDF]
Effectively making sense of short texts is a critical task for many real world applications such as search engines, social media services, and recommender systems. The task is particularly challenging as a short text contains very sparse information, often too sparse for a machine learning algorithm to pick up useful signals.
Jian Tang+3 more
arxiv +3 more sources
Topic Modeling over Short Texts by Incorporating Word Embeddings [PDF]
Inferring topics from the overwhelming amount of short texts has become a critical but challenging task for many content analysis applications, such as content characterization, user interest profiling, and emerging topic detection. Existing methods such as probabilistic latent semantic analysis (PLSA) and latent Dirichlet allocation (LDA) cannot solve this prob ...
Jipeng Qiang+3 more
arxiv +3 more sources
Short Modern Winnebago Text with Song [PDF]
Hymes (1981) discusses a number of cases of hitherto overlooked implicit structuring in Amerindian narratives and song texts. His principles of analysis themselves remain largely implicit, but in general the approach seems to be to search for organizing principles which are multiply justified.
Miner, Kenneth L.
doaj +4 more sources