Some of the following articles may not be open access.
Data Mining on Imbalanced Data Sets
2008 International Conference on Advanced Computer Theory and Engineering, 2008
Most existing machine learning algorithms assume that their training sets are well balanced, and implicitly assume that all misclassification errors cost equally. But real-world data is usually imbalanced. The class imbalance problem is pervasive and ubiquitous, causing trouble to a large segment of the data mining ...
Qiong Gu, Zhihua Cai, Li Zhu, Bo Huang
Imbalanced Multi-instance Data
2016
Class imbalance is widely studied in single-instance learning and refers to the situation where the data observations are unevenly distributed among the possible classes. This phenomenon can present itself in MIL as well. Section 9.1 presents a general introduction to the topic of class imbalance, lists the types of solutions to deal with it, and the ...
Francisco Herrera +6 more
Imbalanced Data Preprocessing for Big Data
2020
The negative impact on learning associated with an imbalanced proportion of classes has exploded lately with the exponential growth of “cheap” data. Many real-world problems present a scarce number of instances in one class, whereas in others their cardinality is several factors greater.
Julián Luengo +4 more
IIvotes ensemble for imbalanced data
Intelligent Data Analysis, 2012
In the paper we present IIvotes – a new framework for constructing an ensemble of classifiers from imbalanced data. IIvotes incorporates the SPIDER method for selective data pre-processing into the adaptive Ivotes ensemble. Such an integration is aimed at improving balance between sensitivity and specificity (evaluated by the G-mean measure) for the ...
Błaszczyński, Jerzy +3 more
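The G-mean measure named in the abstract above is commonly defined as the geometric mean of sensitivity and specificity. The following is a minimal sketch under that assumption, not code from the paper; the function name `g_mean` and its `positive` parameter are hypothetical.

```python
import math


def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity and specificity for binary labels.

    Assumes the common definition sqrt(sensitivity * specificity);
    `positive` marks the minority (positive) class label.
    """
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    # Guard against classes with no examples to avoid division by zero.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)
```

A classifier that always predicts the majority class can reach high accuracy on imbalanced data, yet its G-mean is zero because its sensitivity is zero, which is why the measure is popular in this setting.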
Classifying Severely Imbalanced Data
2011
Learning from data with severe class imbalance is difficult. Established solutions include: under-sampling, adjusting classification threshold, and using an ensemble. We examine the performance of combining these solutions to balance the sensitivity and specificity for binary classifications, and to reduce the MSE score for probability estimation.
William Klement +3 more
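Two of the solutions listed in the abstract above, under-sampling and adjusting the classification threshold, can be illustrated generically. The sketch below is an assumption-laden illustration, not the authors' procedure: it uses plain random under-sampling down to the smallest class size, and the names `random_undersample` and `classify_with_threshold` are hypothetical.

```python
import random


def random_undersample(X, y, seed=0):
    """Randomly drop examples so every class has as many as the smallest one."""
    rng = random.Random(seed)
    by_class = {}
    for features, label in zip(X, y):
        by_class.setdefault(label, []).append(features)
    n_min = min(len(rows) for rows in by_class.values())
    X_out, y_out = [], []
    for label, rows in by_class.items():
        # Sample without replacement from each class.
        for features in rng.sample(rows, n_min):
            X_out.append(features)
            y_out.append(label)
    return X_out, y_out


def classify_with_threshold(probabilities, threshold=0.5):
    """Label an example positive (1) when its estimated probability
    of the positive class is at least `threshold`."""
    return [1 if p >= threshold else 0 for p in probabilities]
```

Lowering `threshold` below 0.5 trades specificity for sensitivity, which is the usual direction of adjustment when the positive class is rare.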
2011
An imbalanced training dataset can pose serious problems for many real-world data-mining tasks that conduct supervised learning. In this chapter, we present a kernel-boundary-alignment algorithm, which considers training-data imbalance as prior information to augment SVMs to improve class-prediction accuracy.
Credal Clustering for Imbalanced Data
2021
Traditional evidential clustering tends to build clusters in which the numbers of data points are fairly close to each other. However, it may not be suitable for imbalanced data. This paper proposes a new method, called credal clustering (CClu), to deal with imbalanced data based on the theory of belief functions.
Zhang, Zuowei +4 more
Learning in imbalanced relational data
2008 19th International Conference on Pattern Recognition, 2008
Traditional learning techniques learn from flat data files with the assumption that each class has a similar number of examples. However, the majority of real-world data are stored as relational systems with imbalanced data distribution, where one class of data is over-represented as compared with other classes.
Amal S. Ghanem +2 more
Imbalanced Data and Resampling Techniques
2021
The SPSS Modeler helps us to build statistical models to predict certain variables. Such a variable can indicate, e.g., whether a customer buys a product or whether a patient is sick or healthy. Here we have a binary target variable. So far, we discussed methods to predict these variables based on the assumption that the frequency of each possible value is ...
Tilo Wendler, Sören Gröttrup
Handling Imbalanced Data: A Survey
2017
Nowadays, handling imbalanced data is a major challenge. An imbalanced data set is one in which the instances of one class far outnumber the instances of another, where the majority and minority class or classes are taken as negative and positive, respectively.
Neelam Rout +2 more

