Strong consistency of factorial $$K$$ -means clustering

Terada, Yoshikazu

doi:10.1007/s10463-014-0454-0

Strong consistency of factorial $K$-means clustering

Published: 27 March 2014

Volume 67, pages 335–357, (2015)
Cite this article

Annals of the Institute of Statistical Mathematics Aims and scope Submit manuscript

Yoshikazu Terada¹

405 Accesses
Explore all metrics

Abstract

Factorial $k$-means (FKM) clustering is a method for clustering objects in a low-dimensional subspace. The advantage of this method is that the partition of objects and the low-dimensional subspace reflecting the cluster structure are obtained, simultaneously. In some cases that reduced $k$-means (RKM) clustering does not work well, FKM clustering can discover the cluster structure underlying a lower dimensional subspace. Conditions that ensure the almost sure convergence of the estimator of FKM clustering as the sample size increases unboundedly are derived. The result is proved for a more general model including FKM clustering. Moreover, it is also shown that there exist some cases in which RKM clustering becomes equivalent to FKM clustering as the sample size goes to infinity.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Subscribe and save

Springer+ Basic

$34.99 /Month

Get 10 units per month
Download Article/Chapter or eBook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Buy Now

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Braverman’s Spectrum and Matrix Diagonalization Versus iK-Means: A Unified Framework for Clustering

Multiple Correspondence K-Means: Simultaneous Versus Sequential Approach for Dimension Reduction and Clustering

Clustering and dimension reduction for mixed variables

Article 11 March 2019

Discover the latest articles and news from researchers in related subjects, suggested using machine learning.

References

Akama, Y., Irie, K., Kawamura, A., Uwano, Y. (2010). VC dimensions of principal component analysis. Discrete and Computational Geometry, 44(3), 589–598.
Google Scholar
Arabie, P., Hubert, L. (1994). Cluster Analysis in Marketing Research. In R. P. Bagozzi (Ed.), Advanced methods of marketing research (pp. 160–189). Oxford: Blackwell.
Chang, W. (1983). On using principal components before separating a mixture of two multivariate normal distributions. Applied Statistics, 32(3), 267–275.
Article MATH MathSciNet Google Scholar
De Soete, G., Carroll, J. D. (1994). $K$-means clustering in a low-dimensional Euclidean space. In E. Diday, Y. Lechevallier, M. Schader, P. Bertrand, & B. Burtschy (Eds.), New approaches in classification and data analysis (pp. 212–219). Berlin: Springer.
Dehardt, J. (1971). Generalizations of the Glivenko-Cantelli theorem. Annals of Mathematical Statistics, 42(6), 2050–2055.
Article MATH MathSciNet Google Scholar
Devroye, L., Györfi, L., Lugosi, G. (1996). A probabilistic theory of pattern recognition. New York: Springer.
Peskir, G. (2000). From uniform laws of large numbers to uniform ergodic theorems. Lecture notes series. 66, University of Aarhus, Department of Mathematics.
Pfanzagl, J. (1994). Parametric statistical theory. Berlin: de Gruyter.
Pollard, D. (1981). Strong consistency of $k$-means clustering. Annals of Statistics, 9(1), 135–140.
Article MATH MathSciNet Google Scholar
Pollard, D. (1982). Quantization and the method of $k$-means. IEEE Transactions of Information Theory, 28(2), 199–205.
Article MATH MathSciNet Google Scholar
Terada, Y. (2014). Strong consistency of reduced $k$-means clustering. To appear in Scandinavian Journal of Statistics.
Timmerman, M. E., Ceulemans, E., Kiers, H. A. L., Vichi, M. (2010). Factorial and reduced $K$-means reconsidered. Computational Statistics and Data Analysis, 54(7), 1858–1871.
Google Scholar
Vichi, M., Kiers, H. A. L. (2001). Factorial $k$-means analysis for two-way data. Computational Statistics and Data Analysis, 37(1), 49–64.
Google Scholar

Download references

Acknowledgments

The author wishes to express his thanks to Dr. Michio Yamamoto for his helpful comments and discussions related to Example 2. The author gratefully acknowledges the helpful comments and suggestions of the two anonymous reviewers. Moreover, the author also thank Professor Yutaka Kano for his supervision of this research. This work was supported by Grant-in-Aid for JSPS Fellows Number $24\cdot 2466$.

Author information

Authors and Affiliations

Graduate School of Engineering Science, Osaka University, 1-3 Machikaneyama, Toyonaka, Osaka, 560-0043, Japan
Yoshikazu Terada

Authors

Yoshikazu Terada
View author publications
You can also search for this author inPubMed Google Scholar

Corresponding author

Correspondence to Yoshikazu Terada.

Appendix: Existence of $\Theta ^{\prime }$

Here, we prove the existence of population global optimizers.

Lemma 5

Suppose that $\int \psi (\Vert \varvec{x}\Vert )P(d\varvec{x})<\infty $. There exists $M>0$ such that

$$\begin{aligned} \inf _{A\in \mathcal {O}(p\times q)}\Psi (F^{\prime },A,P) > \inf _{\theta \in \Theta _k^*(M)}\Psi (\theta ,P) \end{aligned}$$

for all $F^{\prime }\in \mathcal {R}_k$ satisfying $F^{\prime }\cap B_q(M)=\emptyset $.

Proof

Conversely, suppose that, for all $M>0$, there exists $F^{\prime }\in \mathcal {R}_k$ such that $F^{\prime }\cap B_q(M)=\emptyset $ and

$$\begin{aligned} \inf _{A\in \mathcal {O}(p\times q)}\Psi (F^{\prime },A,P) \le \inf _{\theta \in \Theta _k^*(M)}\Psi (\theta ,P). \end{aligned}$$

Choose $r>0$ to satisfy that the ball $B_p(r)$ has a positive $P$ measure; that is $P(B_p(r))>0$. Let $M$ be sufficiently large such that $M>r$ and that it satisfies inequality (3). Since $\Vert A^T\varvec{x}-\varvec{f}\Vert \ge \Vert \varvec{f}\Vert -\Vert A^{T}\varvec{x}\Vert >M-r$ for all $\varvec{f}\notin B_{q}(M)$ and all $\varvec{x}\in B_p(r)$, we have

$$\begin{aligned} \int \psi (\Vert \varvec{x}\Vert )P(\varvec{x})&\ge \inf _{\theta \in \Theta _k^*(M)}\Psi (\theta ,P) \ge \inf _{A\in \mathcal {O}(p\times q)}\Psi (F^{\prime },A,P)\\&\ge \inf _{A\in \mathcal {O}(p\times q)}\int \nolimits _{\varvec{x}\in B_p(r)}\min _{\varvec{f}\in F^{\prime }}\psi \big (\big \Vert A^T\varvec{x}-\varvec{f}\big \Vert \big )P(\text {d}\varvec{x})\\&\ge \phi (M-r)P(B_p(r)). \end{aligned}$$

This is a contradiction. $\square $

Lemma 6

Suppose that $\int \psi (\Vert \varvec{x}\Vert )P(d\varvec{x})<\infty $, and for $j=2,3,\dots ,k-1$, $m_j(P) > m_{k}(P)$. There exists $M>0$ such that, for all $F^{\prime }\in \mathcal {R}_k$ satisfying $F^{\prime }\not \subset B_q(5M)$,

$$\begin{aligned} \inf _{A\in \mathcal {O}(p\times q)}\Psi (F^{\prime },A,P) > \inf _{\theta \in \Theta _k^*(5M)}\Psi (\theta ,P). \end{aligned}$$

Proof

Choose $M>0$ to be sufficiently large to satisfy inequalities (3) and (4). Suppose that, for all $M>0$, there exists $F^{\prime }\in \mathcal {R}_k$ satisfying $F^{\prime }\not \subset B_q(5M)$ and

$$\begin{aligned} \inf _{A\in \mathcal {O}(p\times q)}\Psi (F^{\prime },A,P) \le \inf _{\theta \in \Theta _k^*(5M)}\Psi (\theta ,P). \end{aligned}$$

Let $\mathcal {R}_k^{\prime }$ be the set of such $F^{\prime }$ and then

$$\begin{aligned} m_{k}(P)=\inf _{\theta \in \mathcal {R}_k^{\prime }\times \mathcal {O}(p\times q)}\Psi (\theta ,P). \end{aligned}$$

According to Lemma 5, each $F^{\prime }\in \mathcal {R}_k^{\prime }$ includes at least one point on $B_{q}(M)$, say $\varvec{f}_1$. For all $\varvec{x}$ satisfying $\Vert \varvec{x}\Vert <2M$ and all $A\in \mathcal {O}(p\times q)$, we obtain

$$\begin{aligned} \big \Vert A^T\varvec{x}-\varvec{f}\big \Vert >3M \quad \text {for all } \varvec{f}\not \in B_q(5M) \end{aligned}$$

and

$$\begin{aligned} \big \Vert A^T\varvec{x}-\varvec{g}\big \Vert <3M \quad \text {for all } \varvec{g}\in B_q(M). \end{aligned}$$

Thus,

$$\begin{aligned} \int \nolimits _{\Vert \varvec{x}\Vert < 2M} \min _{\varvec{f}\in F^{\prime }}\psi \big (\big \Vert A^T\varvec{x}-\varvec{f}\big \Vert \big )P(d\varvec{x}) = \int \nolimits _{\Vert \varvec{x}\Vert < 2M} \min _{\varvec{f}\in F^{*}}\psi \big (\big \Vert A^T\varvec{x}-\varvec{f}\big \Vert \big )P(d\varvec{x}), \end{aligned}$$

where the set $F^{*}$ is obtained by deleting all points outside $B_q(5M)$ from $F^{\prime }$. Since $ \int _{\Vert \varvec{x}\Vert \ge 2M}\psi (\Vert A^T\varvec{x}-\varvec{f}_1\Vert )P(d\varvec{x}) \le \lambda \int _{\Vert \varvec{x}\Vert \ge 2M}\psi (\Vert \varvec{x}\Vert )P(d\varvec{x}) $, we obtain that

$$\begin{aligned}&\Psi (F_{k}^{\prime },A,P) + \lambda \int \nolimits _{\Vert \varvec{x}\Vert \ge 2M}\psi (\Vert \varvec{x}\Vert )P(\text {d}\varvec{x})\\&\ge \int \nolimits _{\Vert \varvec{x}\Vert < 2M} \min _{\varvec{f}\in F^{*}}\psi \big (\big \Vert A^T\varvec{x}-\varvec{f}\big \Vert \big )P(\text {d}\varvec{x}) + \int \nolimits _{\Vert \varvec{x}\Vert \ge 2M}\psi \big (\big \Vert A^T\varvec{x}-\varvec{f}_1\big \Vert \big )P(\text {d}\varvec{x})\\&\ge \Psi (F^{*},A,P) \ge m_{k-1}(P) \end{aligned}$$

for all $A\in \mathcal {O}(p\times q)$. It follows that $m_k(P)+\epsilon \le m_{k-1}(P)$, which is a contradiction. $\square $

Let us consider $M>0$ to be sufficiently large to satisfy inequalities (3) and (4). Write $\Theta _k:=\mathcal {R}_k^*(5M)\times \mathcal {O}(p\times q)$. Proposition 1 and Corollary 1 can be proved in the same way as Proposition 1 and Corollary 1 in Terada (2014).

Proof of Proposition 1

According to Lemma 6,

$$\begin{aligned} \inf _{\theta \in \Xi _k}\Psi (\theta ,P)=\inf _{\theta \in \Theta _k}\Psi (\theta ,P). \end{aligned}$$

Moreover, for any $\theta \in (\mathcal {R}_k\setminus \mathcal {R}_k^*(5M))\times \mathcal {O}(p\times q)$, $m_k(P)<\Psi (\theta ,P)$. Thus, we only have to prove $\Theta ^{\prime }\ne \emptyset $.

Let $C:=\{\Psi (\theta ,P)\mid \theta \in \Theta _k\}$ and then $m_k(P)=\inf C$. By the definition of the infimum, for all $x > m_k(P)$, there exists $c\in C$ such that $c<x$. By the axiom of choice, we can obtain a sequence $\{c_n\}_{n\in \mathbb {N}}$ such that $c_n \rightarrow m_k(P)$ as $n\rightarrow \infty $. Using the axiom of choice again, we can obtain a sequence $\{\theta _{n}\}_{n\in \mathbb {N}}$ such that $\Psi (\theta _n,P)\rightarrow m_k(P)$ as $n\rightarrow \infty $.

By the compactness of $\Theta _k$, there exists a convergent subsequence of $\{\theta _n\}_{n\in \mathbb {N}}$, say $\{\theta _{n_i}\}_{i\in \mathbb {N}}$. Let $\theta _{*}\in \Theta _k$ denote the limit of subsequence $\{\theta _{n_i}\}_{i\in \mathbb {N}}$, i.e., $\theta _{m_i}\rightarrow \theta _*$ as $i\rightarrow \infty $. Since $\Psi (\cdot ,P)$ is continuous on $\Theta _{k}$, $\Psi (\theta _*,P)=m_k(P)$. Hence, we obtain $\Theta ^{\prime }\ne \emptyset $. $\square $

Proof of Corollary 1

Let $\Theta _\epsilon :=\{\theta _k\in \Theta _k \mid \Psi (\theta _k,P)=m_k(P)\}$. Conversely, suppose that there exists $\epsilon >0$ such that $\inf _{\theta \in \Theta _{\epsilon }}\Psi (\theta ,P)=\inf _{\theta \in \Theta ^{\prime }}\Psi (\theta ,P)$. By the definition of the infimum, there exists a sequence $\{\theta _n\}_{n\in \mathbb {N}}$ on $\Theta _{\epsilon }$ such that $\Psi (\theta _n,P)\rightarrow m_k(P)$ as $n\rightarrow \infty $. By compactness of $\Theta _k$, there exists a convergent subsequence of $\{\theta _n\}_{n\in \mathbb {N}}$, say $\{\theta _{m_i}\}_{i \in \mathbb {N}}$. Let $\theta _{*}\in \Theta _k$ denote the limit of subsequence $\{\theta _{m_i}\}_{i\in \mathbb {N}}$. Since $\theta _{m_i}\rightarrow \theta _*$ as $i\rightarrow \infty $, we have $d(\theta _{m_i},\theta _*)<\epsilon $ for a sufficiently large $i$, which is a contradiction. $\square $

About this article

Cite this article

Terada, Y. Strong consistency of factorial $K$-means clustering. Ann Inst Stat Math 67, 335–357 (2015). https://doi.org/10.1007/s10463-014-0454-0

Download citation

Received: 26 June 2013
Revised: 12 January 2014
Published: 27 March 2014
Issue Date: April 2015
DOI: https://doi.org/10.1007/s10463-014-0454-0

Keywords

Access this article

Log in via an institution

Subscribe and save

Springer+ Basic

$34.99 /Month

Get 10 units per month
Download Article/Chapter or eBook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Buy Now

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Strong consistency of factorial \(K\)-means clustering

Abstract

Access this article

Subscribe and save

Buy Now

Similar content being viewed by others

Braverman’s Spectrum and Matrix Diagonalization Versus iK-Means: A Unified Framework for Clustering

Multiple Correspondence K-Means: Simultaneous Versus Sequential Approach for Dimension Reduction and Clustering

Clustering and dimension reduction for mixed variables

References

Acknowledgments

Author information

Authors and Affiliations

Corresponding author

Appendix: Existence of \(\Theta ^{\prime }\)

Lemma 5

Proof

Lemma 6

Proof

Proof of Proposition 1

Proof of Corollary 1

About this article

Cite this article

Keywords

Subscribe and save

Buy Now

Strong consistency of factorial \(K\)-means clustering

Abstract

Access this article

Subscribe and save

Buy Now

Similar content being viewed by others

Braverman’s Spectrum and Matrix Diagonalization Versus iK-Means: A Unified Framework for Clustering

Multiple Correspondence K-Means: Simultaneous Versus Sequential Approach for Dimension Reduction and Clustering

Clustering and dimension reduction for mixed variables

Explore related subjects

References

Acknowledgments

Author information

Authors and Affiliations

Corresponding author

Appendix: Existence of \(\Theta ^{\prime }\)

Appendix: Existence of \(\Theta ^{\prime }\)

Lemma 5

Proof

Lemma 6

Proof

Proof of Proposition 1

Proof of Corollary 1

About this article

Cite this article

Share this article

Keywords

Subscribe and save

Buy Now