A Scatter Search Approach for the Minimum Sum-of- Squares Clustering Problem Abstract

A Scatter Search Approach for the Minimum Sum-of-Squares Clustering ProblemJoaquín A. PachecoDepartment of Applied Economics, University of Burgos,Plaza Infanta Elena s/n BURGOS 09001, SPAINTf: +34 947 259021; fx +34 947 258956; email: jpacheco@ubu.esLatest version: September 19, 2003AbstractA metaheuristic procedure based on the Scatter Search approach is proposed for the non-hierarchical clustering problem under the criterion of minimum Sum-of-Squares Clustering. This algorithm incorporates procedures based on different strategies, such as Local Search, GRASP, Tabu Search or Path Relinking. The aim is to obtain quality solutions with short computation times. A series of computational experiments has been performed. The proposed algorithm obtains better results than previously reported methods, especially with small numbers of clusters.Keywords: Clusterization, Metaheuristics, Scatter Search, Local Search, GRASP,Tabu Search, Path Relinking1. IntroductionConsider a set X = {x 1, x 2, ..., x N } of N points in R q and let m be a predetermined positive integer. The Minimum Sum-of-Squares Clustering (MSSC) problem is to find a partition of X into m disjoint subsets (clusters) so that the sum of squared distances from each point to the centroid of its cluster is minimum. Specifically, let P m denote the set of all the partitions of X in m sets, where each partition PA ∈ P m is defined as PA = {C 1,C 2, ..., C m } and where C i denotes each of the clusters that forms PA . Thus, the problem can be expressed as:∑∑=∈∈−m i C x i l PA i l m x x 12 min P ,where the centroid i x is defined asl C x i i x n x i l ∑∈=1, with n i = |C i | .The problem can be written as∑=−N l k l lx x 12min ,where k l is the cluster to which point x l belongs.The design of clusters is a well known exploratory Data Analysis issue called Pattern Recognition. The aim is to find whether a given set of cases X has some structure and,in if so, to display it in the form of a partition. This problem belongs to the area of Non-Hierarchical cluster design, which has many applications in economics, social and natural sciences. It is known to be NP-Hard [5].
《A Scatter Search Approach for the Minimum Sum-of- Squares Clustering Problem Abstract.doc》
将本文的Word文档下载,方便收藏和打印
推荐:
下载文档
热门推荐
相关推荐