A Density-Based Method for the Identification of Non-Disjoint Clusters With Arbitrary and Non-Spherical Shapes

Authors

DOI:

https://doi.org/10.7494/csci.2021.22.2.4002

Keywords:

Overlapping Clustering, Non-disjoint clusters, Density-based Methods, Clusters with Non-Spherical Shapes

Abstract

Overlapping clustering is an important challenge in unsupervised learning because it allows each data object to belong to more than one group. Several methods have been proposed to meet this requirement by building on conventional clustering approaches. Although these methods can detect non-disjoint partitions, they fail when the data contain groups with arbitrary, non-spherical shapes. In this work, we propose a new density-based overlapping clustering method, referred to as OC-DD, which can detect overlapping clusters even when they have complex, non-spherical shapes. The proposed method relies on density and distances to detect dense regions in the data while allowing some data objects to belong to more than one group.
Experiments performed on artificial and real multi-labeled datasets show the effectiveness of the proposed method compared to existing ones.
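
To make the two ingredients named in the abstract concrete (density and distances, plus multi-group membership), the following is a minimal Python sketch. It is not the OC-DD algorithm, whose details are not given on this page. The function name `overlapping_density_clusters` and the parameters `cutoff`, `n_clusters` and `overlap_ratio` are hypothetical, and the nearest-center assignment used here is a simplification that would not by itself recover arbitrary, non-spherical shapes.

```python
import math


def overlapping_density_clusters(points, cutoff, n_clusters, overlap_ratio=1.2):
    """Toy density-and-distance clustering with overlapping assignment.

    Illustrative assumption only; this is NOT the OC-DD method of the paper.
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Local density: number of other points strictly closer than `cutoff`.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < cutoff)
           for i in range(n)]

    # delta: distance to the nearest point of higher density (ties broken by
    # index); the densest point gets its largest distance to any point.
    delta = []
    for i in range(n):
        higher = [dist[i][j] for j in range(n) if (rho[j], -j) > (rho[i], -i)]
        delta.append(min(higher) if higher else max(dist[i]))

    # Centers: points that are both dense and far from any denser point.
    centers = sorted(range(n), key=lambda i: rho[i] * delta[i],
                     reverse=True)[:n_clusters]

    # Overlapping assignment: a point joins its nearest center and every
    # other center that is at most `overlap_ratio` times as far away.
    memberships = []
    for i in range(n):
        d = [dist[i][c] for c in centers]
        nearest = min(d)
        memberships.append({k for k, dk in enumerate(d)
                            if dk <= overlap_ratio * nearest})
    return centers, memberships


# Two small square groups and one point halfway between them (made-up data).
points = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
          (10, 0), (10, 1), (11, 0), (11, 1), (10.5, 0.5),
          (5.5, 0.5)]
centers, memberships = overlapping_density_clusters(points, cutoff=1.0,
                                                    n_clusters=2)
print(centers)
print(memberships[10])
```

In this toy data set the middle point is equidistant from both selected centers, so it receives both cluster indices, while the remaining points keep a single membership. Raising `overlap_ratio` widens the overlap region; setting it to 1.0 gives a disjoint partition except for exact ties.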

Published

2021-04-15

How to Cite

Ben Ncir, C. E. (2021). A Density-Based Method for the Identification of Non-Disjoint Clusters With Arbitrary and Non-Spherical Shapes. Computer Science, 22(2). https://doi.org/10.7494/csci.2021.22.2.4002

Issue

Section

Articles