Abstract
Factorial \(k\)-means (FKM) clustering is a method for clustering objects in a low-dimensional subspace. The advantage of this method is that the partition of objects and the low-dimensional subspace reflecting the cluster structure are obtained, simultaneously. In some cases that reduced \(k\)-means (RKM) clustering does not work well, FKM clustering can discover the cluster structure underlying a lower dimensional subspace. Conditions that ensure the almost sure convergence of the estimator of FKM clustering as the sample size increases unboundedly are derived. The result is proved for a more general model including FKM clustering. Moreover, it is also shown that there exist some cases in which RKM clustering becomes equivalent to FKM clustering as the sample size goes to infinity.






Similar content being viewed by others
Explore related subjects
Discover the latest articles and news from researchers in related subjects, suggested using machine learning.References
Akama, Y., Irie, K., Kawamura, A., Uwano, Y. (2010). VC dimensions of principal component analysis. Discrete and Computational Geometry, 44(3), 589–598.
Arabie, P., Hubert, L. (1994). Cluster Analysis in Marketing Research. In R. P. Bagozzi (Ed.), Advanced methods of marketing research (pp. 160–189). Oxford: Blackwell.
Chang, W. (1983). On using principal components before separating a mixture of two multivariate normal distributions. Applied Statistics, 32(3), 267–275.
De Soete, G., Carroll, J. D. (1994). \(K\)-means clustering in a low-dimensional Euclidean space. In E. Diday, Y. Lechevallier, M. Schader, P. Bertrand, & B. Burtschy (Eds.), New approaches in classification and data analysis (pp. 212–219). Berlin: Springer.
Dehardt, J. (1971). Generalizations of the Glivenko-Cantelli theorem. Annals of Mathematical Statistics, 42(6), 2050–2055.
Devroye, L., Györfi, L., Lugosi, G. (1996). A probabilistic theory of pattern recognition. New York: Springer.
Peskir, G. (2000). From uniform laws of large numbers to uniform ergodic theorems. Lecture notes series. 66, University of Aarhus, Department of Mathematics.
Pfanzagl, J. (1994). Parametric statistical theory. Berlin: de Gruyter.
Pollard, D. (1981). Strong consistency of \(k\)-means clustering. Annals of Statistics, 9(1), 135–140.
Pollard, D. (1982). Quantization and the method of \(k\)-means. IEEE Transactions of Information Theory, 28(2), 199–205.
Terada, Y. (2014). Strong consistency of reduced \(k\)-means clustering. To appear in Scandinavian Journal of Statistics.
Timmerman, M. E., Ceulemans, E., Kiers, H. A. L., Vichi, M. (2010). Factorial and reduced \(K\)-means reconsidered. Computational Statistics and Data Analysis, 54(7), 1858–1871.
Vichi, M., Kiers, H. A. L. (2001). Factorial \(k\)-means analysis for two-way data. Computational Statistics and Data Analysis, 37(1), 49–64.
Acknowledgments
The author wishes to express his thanks to Dr. Michio Yamamoto for his helpful comments and discussions related to Example 2. The author gratefully acknowledges the helpful comments and suggestions of the two anonymous reviewers. Moreover, the author also thank Professor Yutaka Kano for his supervision of this research. This work was supported by Grant-in-Aid for JSPS Fellows Number \(24\cdot 2466\).
Author information
Authors and Affiliations
Corresponding author
Appendix: Existence of \(\Theta ^{\prime }\)
Appendix: Existence of \(\Theta ^{\prime }\)
Here, we prove the existence of population global optimizers.
Lemma 5
Suppose that \(\int \psi (\Vert \varvec{x}\Vert )P(d\varvec{x})<\infty \). There exists \(M>0\) such that
for all \(F^{\prime }\in \mathcal {R}_k\) satisfying \(F^{\prime }\cap B_q(M)=\emptyset \).
Proof
Conversely, suppose that, for all \(M>0\), there exists \(F^{\prime }\in \mathcal {R}_k\) such that \(F^{\prime }\cap B_q(M)=\emptyset \) and
Choose \(r>0\) to satisfy that the ball \(B_p(r)\) has a positive \(P\) measure; that is \(P(B_p(r))>0\). Let \(M\) be sufficiently large such that \(M>r\) and that it satisfies inequality (3). Since \(\Vert A^T\varvec{x}-\varvec{f}\Vert \ge \Vert \varvec{f}\Vert -\Vert A^{T}\varvec{x}\Vert >M-r\) for all \(\varvec{f}\notin B_{q}(M)\) and all \(\varvec{x}\in B_p(r)\), we have
This is a contradiction. \(\square \)
Lemma 6
Suppose that \(\int \psi (\Vert \varvec{x}\Vert )P(d\varvec{x})<\infty \), and for \(j=2,3,\dots ,k-1\), \(m_j(P) > m_{k}(P)\). There exists \(M>0\) such that, for all \(F^{\prime }\in \mathcal {R}_k\) satisfying \(F^{\prime }\not \subset B_q(5M)\),
Proof
Choose \(M>0\) to be sufficiently large to satisfy inequalities (3) and (4). Suppose that, for all \(M>0\), there exists \(F^{\prime }\in \mathcal {R}_k\) satisfying \(F^{\prime }\not \subset B_q(5M)\) and
Let \(\mathcal {R}_k^{\prime }\) be the set of such \(F^{\prime }\) and then
According to Lemma 5, each \(F^{\prime }\in \mathcal {R}_k^{\prime }\) includes at least one point on \(B_{q}(M)\), say \(\varvec{f}_1\). For all \(\varvec{x}\) satisfying \(\Vert \varvec{x}\Vert <2M\) and all \(A\in \mathcal {O}(p\times q)\), we obtain
and
Thus,
where the set \(F^{*}\) is obtained by deleting all points outside \(B_q(5M)\) from \(F^{\prime }\). Since \( \int _{\Vert \varvec{x}\Vert \ge 2M}\psi (\Vert A^T\varvec{x}-\varvec{f}_1\Vert )P(d\varvec{x}) \le \lambda \int _{\Vert \varvec{x}\Vert \ge 2M}\psi (\Vert \varvec{x}\Vert )P(d\varvec{x}) \), we obtain that
for all \(A\in \mathcal {O}(p\times q)\). It follows that \(m_k(P)+\epsilon \le m_{k-1}(P)\), which is a contradiction. \(\square \)
Let us consider \(M>0\) to be sufficiently large to satisfy inequalities (3) and (4). Write \(\Theta _k:=\mathcal {R}_k^*(5M)\times \mathcal {O}(p\times q)\). Proposition 1 and Corollary 1 can be proved in the same way as Proposition 1 and Corollary 1 in Terada (2014).
Proof of Proposition 1
According to Lemma 6,
Moreover, for any \(\theta \in (\mathcal {R}_k\setminus \mathcal {R}_k^*(5M))\times \mathcal {O}(p\times q)\), \(m_k(P)<\Psi (\theta ,P)\). Thus, we only have to prove \(\Theta ^{\prime }\ne \emptyset \).
Let \(C:=\{\Psi (\theta ,P)\mid \theta \in \Theta _k\}\) and then \(m_k(P)=\inf C\). By the definition of the infimum, for all \(x > m_k(P)\), there exists \(c\in C\) such that \(c<x\). By the axiom of choice, we can obtain a sequence \(\{c_n\}_{n\in \mathbb {N}}\) such that \(c_n \rightarrow m_k(P)\) as \(n\rightarrow \infty \). Using the axiom of choice again, we can obtain a sequence \(\{\theta _{n}\}_{n\in \mathbb {N}}\) such that \(\Psi (\theta _n,P)\rightarrow m_k(P)\) as \(n\rightarrow \infty \).
By the compactness of \(\Theta _k\), there exists a convergent subsequence of \(\{\theta _n\}_{n\in \mathbb {N}}\), say \(\{\theta _{n_i}\}_{i\in \mathbb {N}}\). Let \(\theta _{*}\in \Theta _k\) denote the limit of subsequence \(\{\theta _{n_i}\}_{i\in \mathbb {N}}\), i.e., \(\theta _{m_i}\rightarrow \theta _*\) as \(i\rightarrow \infty \). Since \(\Psi (\cdot ,P)\) is continuous on \(\Theta _{k}\), \(\Psi (\theta _*,P)=m_k(P)\). Hence, we obtain \(\Theta ^{\prime }\ne \emptyset \). \(\square \)
Proof of Corollary 1
Let \(\Theta _\epsilon :=\{\theta _k\in \Theta _k \mid \Psi (\theta _k,P)=m_k(P)\}\). Conversely, suppose that there exists \(\epsilon >0\) such that \(\inf _{\theta \in \Theta _{\epsilon }}\Psi (\theta ,P)=\inf _{\theta \in \Theta ^{\prime }}\Psi (\theta ,P)\). By the definition of the infimum, there exists a sequence \(\{\theta _n\}_{n\in \mathbb {N}}\) on \(\Theta _{\epsilon }\) such that \(\Psi (\theta _n,P)\rightarrow m_k(P)\) as \(n\rightarrow \infty \). By compactness of \(\Theta _k\), there exists a convergent subsequence of \(\{\theta _n\}_{n\in \mathbb {N}}\), say \(\{\theta _{m_i}\}_{i \in \mathbb {N}}\). Let \(\theta _{*}\in \Theta _k\) denote the limit of subsequence \(\{\theta _{m_i}\}_{i\in \mathbb {N}}\). Since \(\theta _{m_i}\rightarrow \theta _*\) as \(i\rightarrow \infty \), we have \(d(\theta _{m_i},\theta _*)<\epsilon \) for a sufficiently large \(i\), which is a contradiction. \(\square \)
About this article
Cite this article
Terada, Y. Strong consistency of factorial \(K\)-means clustering. Ann Inst Stat Math 67, 335–357 (2015). https://doi.org/10.1007/s10463-014-0454-0
Received:
Revised:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s10463-014-0454-0