Proceedings of the 23rd international conference on Machine learning - ICML '06, 2006
A family of probabilistic time series models is developed to analyze the time evolution of topics in large document collections. The approach is to use state space models on the natural parameters of the multinomial distributions that represent the topics.
David M. Blei, John D. Lafferty
Neurocomputing, 2012
In this paper the problem of performing external validation of the semantic coherence of topic models is considered. The Fowlkes-Mallows index, a known clustering validation metric, is generalized for the case of overlapping partitions and multi-labeled collections, thus making it suitable for validating topic modeling algorithms.
Eduardo H. Ramírez +3 more
2015 International Conference on Information Technology (ICIT), 2015
Topic Modeling has been a useful tool for finding abstract topics (which are collections of words) governing a collection of documents. Each document is then expressed as a collection of generated topics. The most basic topic model is Latent Dirichlet Allocation (LDA).
Nishma Laitonjam +3 more
Proceedings of the 28th ACM International Conference on Information and Knowledge Management, 2019
Topic modeling has been widely applied in a variety of industrial applications. Training a high-quality model usually requires a massive amount of in-domain data to provide comprehensive co-occurrence information for the model to learn. However, industrial data such as medical or financial records are often proprietary or sensitive, which ...
Di Jiang +6 more
On collocations and topic models
ACM Transactions on Speech and Language Processing, 2013
We investigate the impact of pre-extracting and tokenizing bigram collocations on topic models. Using extensive experiments on four different corpora, we show that incorporating bigram collocations in the document representation creates more parsimonious models and improves topic coherence.
Jey Han Lau +2 more
Proceedings of the 12th ACM/IEEE-CS joint conference on Digital Libraries, 2012
Concept taxonomies such as MeSH, the ACM Computing Classification System, and the NY Times Subject Headings are frequently used to help organize data. They typically consist of a set of concept names organized in a hierarchy. However, these names and structure are often not sufficient to fully capture the intended meaning of a taxonomy node, and ...
Anton Bakalov +3 more
2009 Ninth IEEE International Conference on Data Mining, 2009
In this paper we propose the multirelational topic model (MRTM) for multiple types of link modeling such as citation and coauthor links in document networks. In the citation network, the MRTM models the citation link between each pair of documents as a binary variable conditioned on their topic distributions.
Jia Zeng +3 more
Proceedings of the 25th annual international ACM SIGIR conference on Research and development in information retrieval, 2002
In this paper, we present a method based on document probes to quantify and diagnose topic structure, distinguishing topics as monolithic, structured, or diffuse. The method also yields a structure analysis that can be used directly to optimize filter (classifier) creation.
David A. Evans +2 more
Machine Learning, 2013
Topic models have been used extensively as a tool for corpus exploration, and a cottage industry has developed to tweak topic models to better encode human intuitions or to better model data. However, creating such extensions requires expertise in machine learning unavailable to potential end-users of topic modeling software. In this work, we develop a
Yuening Hu +3 more
2014 IEEE 4th Workshop on Mining Unstructured Data, 2014
Topic modeling has been applied to several areas of software engineering, such as bug localization, feature location, triaging change requests, and traceability link recovery. Many of these approaches combine mining unstructured data, such as bug reports, with topic modeling a snapshot (or release) of source code.
Christopher S. Corley +3 more

