Open Access, CC BY 4.0 license. Published by De Gruyter, April 25, 2018.

Automatic Genetic Fuzzy c-Means

  • Khalid Jebari, Abdelaziz Elmoujahid and Aziz Ettouhami

Abstract

Fuzzy c-means is an efficient algorithm that is amply used for data clustering. Nonetheless, when using this algorithm, the designer faces two crucial choices: choosing the optimal number of clusters and initializing the cluster centers. The two choices have a direct impact on the clustering outcome. This paper presents an improved algorithm called automatic genetic fuzzy c-means that evolves the number of clusters and provides the initial centroids. The proposed algorithm uses a genetic algorithm with a new crossover operator, a new mutation operator, and modified tournament selection; further, it defines a new fitness function based on three cluster validity indices. Real data sets are used to demonstrate the effectiveness, in terms of quality, of the proposed algorithm.

1 Introduction

Clustering has been widely applied in various disciplines, including medical sciences [9], computer sciences [44], bioinformatics [20, 34, 48], bankruptcy forecasting [17], astronomy [3], and weather classification [35, 36]. Clustering can be divided into two different types: crisp and fuzzy. The former, which supposes that the classes are clearly separated, is the traditional technique; it assigns each object to only one class [28]. By contrast, the latter does not make assumptions about the separation of the classes. In addition, instead of allocating each object to a unique class, fuzzy methods assign degrees of membership with which the objects belong to the classes [9]. These models therefore generally allow a better description of real data, where the borders between classes are often imprecisely defined.

Fuzzy c-means (FCM) [7] is a dynamic method. Objects can change the degrees of membership during the process of class training. As this technique is iterative, the outcome is sensitive to initialization [5]. As a consequence, the improper selection of initial centroids will generally lead to undesirable clustering results [25]. A simulated annealing algorithm [47], a Tabu search algorithm [2, 39, 52], a genetic algorithm (GA) [4, 12, 23, 26, 37, 53], an ant colony [27, 30, 31, 49], and an artificial bee colony [32, 33] are examples of heuristic methods that have been used over the last two decades to overcome the problem of initialization. An exhaustive review of these algorithms can be found in Ref. [15].

Another drawback is that FCM requires the number of clusters as an input parameter. Viable methods to overcome this limitation can be found in Refs. [4, 11, 12, 23, 30, 38, 45, 46, 53]. In addition, other versions of FCM based on iterative or recursive techniques attempt to remedy this drawback [13, 19]. However, these techniques suffer from high time complexity.

As clustering data sets can be viewed as an optimization problem [14], we propose a novel technique to cluster data based on a GA [24]. A relevant advantage of this algorithm is its ability to deal with local optima by maintaining, diversifying, and comparing several candidate solutions simultaneously. However, the GA challenge is to maintain the balance between exploitation and exploration.

We address the application of a hybrid GA, called automatic genetic FCM (AGFCM), based on gravitational mutation [43], differential crossover [41, 42, 51], modified tournament selection, and the FCM algorithm. We consider in AGFCM the balance between exploitation and exploration of candidate solutions using the new genetic operators mentioned previously.

The performance of AGFCM has been tested on four real data sets from the University of California at Irvine (UCI) repository [10], and the results have been compared with those of other techniques. The rest of this paper is organized as follows. Section 2 presents a brief description of the FCM algorithm. Section 3 describes our proposed algorithm for solving data clustering problems. Experimental results and comparisons with other available methods are discussed in Section 4. Finally, conclusions and future work are highlighted in Section 5.

2 Fuzzy c-Means

Many clustering methods have been introduced in the literature. These can be classified into two categories: hard and fuzzy. In hard clustering algorithms, which are based on classical set theory, each object belongs to exactly one class. In fuzzy clustering algorithms, objects can belong to all classes with different degrees of membership. This is appropriate for real-world data, where boundaries between clusters are not well defined. That is why fuzzy clustering has the advantage of dealing with overlapping clusters.

Let X = {x_1, x_2, ..., x_n} ⊂ ℜ^p be a set of n objects with dimension p. Partitioning X in c clusters can be defined by a matrix U = [u_{ij}] ∈ ℜ^{n×c}, which satisfies the following three conditions:

(1) 0 \le u_{ij} \le 1, \quad 1 \le i \le n \text{ and } 1 \le j \le c,
(2) \sum_{j=1}^{c} u_{ij} = 1, \quad 1 \le i \le n,
(3) 0 < \sum_{i=1}^{n} u_{ij} < n, \quad 1 \le j \le c,

where u_{ij} is the membership degree of x_i for the jth cluster.

The FCM algorithm optimizes the Jm criterion defined by

(4) J_m(U, V) = \sum_{i=1}^{n} \sum_{j=1}^{c} (u_{ij})^m \|x_i - v_j\|^2,

where

  • V = (v_1, v_2, …, v_c) ∈ ℜ^{c×p} and v_j is the jth prototype;

  • m (1<m<∞) is a parameter used to control the level of fuzziness in the resulting clusters;

  • ‖·‖ is a norm used to measure the distance between the jth prototype and the ith data point.

Bezdek showed that FCM always converges to a minimum of Jm under the following conditions [7]:

(5) u_{ik} = \left[ \sum_{j=1}^{c} \left( \frac{\|x_k - v_i\|}{\|x_k - v_j\|} \right)^{2/(m-1)} \right]^{-1}, \quad 1 \le i \le c, \; 1 \le k \le n,
(6) v_i = \frac{\sum_{k=1}^{n} (u_{ik})^m x_k}{\sum_{k=1}^{n} (u_{ik})^m}, \quad \text{with } 1 \le i \le c.

The pseudo-code of the FCM algorithm is given in Algorithm 1.

Algorithm 1:

FCM pseudo-code.

Data: Vector of objects X = (x_1, x_2, ..., x_n)
Result: Prototypes (v_1, v_2, ..., v_c)
Choose:
– number of clusters c, with 1 < c < n
– fuzziness parameter m > 1
– t_max, the maximum number of iterations
– ε, the tolerance threshold
– the norm for the clustering criterion J_m
– the norm for calculating the error E_t = ||V_t − V_{t−1}||
Initialization:
– prototypes V_0
– t ← 0
– E_0 ← +∞
while (E_t > ε and t < t_max) do
 t ← t + 1
 Calculate U_t by using Eq. (5)
 Calculate V_t by using Eq. (6)
 Calculate E_t = ||V_t − V_{t−1}||
end
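
To make the iteration concrete, the following C++ sketch implements one FCM pass, i.e. the membership update of Eq. (5) followed by the prototype update of Eq. (6). It is an illustration only; the container layout and function names (fcmIteration, sqDist) are ours rather than the authors' implementation, and the sketch assumes no data point coincides exactly with a prototype.

#include <cmath>
#include <vector>

// One FCM iteration: update the memberships U with Eq. (5), then the
// prototypes V with Eq. (6).
// data: n x p objects, centers: c x p prototypes, U: n x c memberships, m > 1.
// Assumes no data point coincides exactly with a prototype (avoids 0/0).
static double sqDist(const std::vector<double>& a, const std::vector<double>& b) {
    double s = 0.0;
    for (std::size_t k = 0; k < a.size(); ++k) s += (a[k] - b[k]) * (a[k] - b[k]);
    return s;
}

void fcmIteration(const std::vector<std::vector<double>>& data,
                  std::vector<std::vector<double>>& centers,
                  std::vector<std::vector<double>>& U, double m) {
    const std::size_t n = data.size(), c = centers.size(), p = data[0].size();
    const double expo = 1.0 / (m - 1.0);   // squared distances, so 2/(m-1) becomes 1/(m-1)
    // Eq. (5): u_ik = [ sum_j (||x_k - v_i|| / ||x_k - v_j||)^(2/(m-1)) ]^(-1)
    for (std::size_t k = 0; k < n; ++k)
        for (std::size_t i = 0; i < c; ++i) {
            const double di = sqDist(data[k], centers[i]);
            double sum = 0.0;
            for (std::size_t j = 0; j < c; ++j)
                sum += std::pow(di / sqDist(data[k], centers[j]), expo);
            U[k][i] = 1.0 / sum;
        }
    // Eq. (6): v_i = sum_k (u_ik)^m x_k / sum_k (u_ik)^m
    for (std::size_t i = 0; i < c; ++i) {
        std::vector<double> num(p, 0.0);
        double den = 0.0;
        for (std::size_t k = 0; k < n; ++k) {
            const double w = std::pow(U[k][i], m);
            for (std::size_t d = 0; d < p; ++d) num[d] += w * data[k][d];
            den += w;
        }
        for (std::size_t d = 0; d < p; ++d) centers[i][d] = num[d] / den;
    }
}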

3 Proposed Method

In this section, we describe the AGFCM clustering algorithm. This algorithm uses a GA that utilizes new operators and a new fitness function to evolve the number of clusters and to provide the initial centroids. The results of the GA phase are then used as an input in the FCM algorithm. The pseudo-code of the AGFCM algorithm is given in Algorithm 2. The AGFCM clustering algorithm is introduced in what follows.

Algorithm 2:

General description of AGFCM.

Data: Data set
Result: Best individual: number of clusters; prototypes
Initialization
for each individual i in the population do
Choose ci∈{cmin, ..., cmax};
 Choose c_i objects randomly from the data set;
end
while not termination_condition do
 Fitness evaluation;
 Modified enthusiasm selection MES();
 Differential crossover;
 Gravitational mutation;
end

3.1 Chromosome Representation

To encode a chromosome, we use a real-valued representation. The chromosomes represent the coordinates of the cluster centers. If we consider P_i as the ith candidate solution, P_i = {v_i1, v_i2, …, v_ic_i}, where v_ij = {v_ij1, …, v_ijp} represents the jth cluster center and c_i is the number of clusters. p is the dimensionality of the data set. A chromosome is thus a one-dimensional array of p × c_i real values. As each chromosome P_i has a different c_i, the representation is of variable length.
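
For illustration, one possible in-memory layout of such a variable-length, real-coded chromosome is sketched below in C++. The struct and member names are ours and are only meant to mirror the description above.

#include <cstddef>
#include <vector>

// A chromosome encodes c_i cluster centers, each a point in R^p. Because c_i
// differs between individuals, the encoding has variable length (equivalently,
// a flat array of p * c_i real values).
struct Chromosome {
    std::vector<std::vector<double>> centers;  // centers.size() == c_i, centers[j].size() == p
    double fitness = 0.0;                      // set by the evaluation step (Section 3.3)

    std::size_t numClusters() const { return centers.size(); }

    // Flattened view: the one-dimensional array of p * c_i values mentioned above.
    std::vector<double> flatten() const {
        std::vector<double> flat;
        for (const auto& v : centers)
            flat.insert(flat.end(), v.begin(), v.end());
        return flat;
    }
};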

3.2 Population Initialization

In the AGFCM clustering algorithm, an initial population is randomly generated. For each chromosome, a number of classes, c_i, is randomly chosen between c_min and c_max. Then, c_i points from the data set are randomly chosen to initialize the chromosome. In this paper, c_min is set to 2 and c_max is set to √n [8].
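
A minimal sketch of this initialization step, using the Chromosome structure sketched in Section 3.1, could look as follows; the function name and random-number handling are our choices, and the sketch assumes c_max does not exceed the number of objects in the data set.

#include <algorithm>
#include <random>
#include <vector>

// Randomly initialize one individual: draw c_i in [cmin, cmax], then copy
// c_i distinct, randomly chosen data points as its initial cluster centers.
Chromosome initIndividual(const std::vector<std::vector<double>>& data,
                          int cmin, int cmax, std::mt19937& rng) {
    std::uniform_int_distribution<int> pickC(cmin, cmax);
    const int ci = pickC(rng);

    // Shuffle the object indices and keep the first c_i, so the same object
    // is never picked twice as an initial center.
    std::vector<std::size_t> idx(data.size());
    for (std::size_t k = 0; k < idx.size(); ++k) idx[k] = k;
    std::shuffle(idx.begin(), idx.end(), rng);

    Chromosome ind;
    for (int j = 0; j < ci; ++j)
        ind.centers.push_back(data[idx[j]]);
    return ind;
}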

3.3 Fitness Evaluation

The fitness function of a candidate solution indicates how relevant a solution is. In this paper, the fitness function is based on three well-known clustering validity measures:

  1. Davies-Bouldin (DB) index: The DB index is a function of the ratio of the sum of within-cluster scatter to between-cluster separation [16]. This index is defined as

    (7) \mathrm{DB} = \frac{1}{c} \sum_{i=1}^{c} \max_{j \ne i} \left\{ \frac{S_i + S_j}{d_{ij}} \right\},

    where c is the number of clusters, S_i is the scatter within the ith cluster, and d_{ij} is the distance between cluster centers v_i and v_j. S_i is defined as

    (8) S_i = \frac{1}{|X_i|} \sum_{x \in X_i} \|x - v_i\|^2.

    X_i is the ith cluster and d_{ij} is defined as

    (9) d_{ij} = \|v_i - v_j\|^2.
  2. Xie and Beni (XB) index: Xie and Beni [54] introduced a validity measure and defined it as

    (10) \mathrm{XB} = \frac{\sum_{k=1}^{c} \sum_{j=1}^{n} u_{kj}^2 \|x_j - v_k\|^2}{n \left( \min_{i \ne j} \left\{ \|v_i - v_j\|^2 \right\} \right)},

    where u_{kj} is the membership of the jth point to the kth cluster.

    In Eq. (10), the numerator measures the compactness of the fuzzy partition. The denominator measures the separation between clusters.

  3. Partition entropy (VPE) index: partition entropy is a function of U proposed by Bezdek [6]. VPE is formulated as

    (11) V_{\mathrm{PE}} = -\frac{1}{n} \sum_{i=1}^{n} \sum_{j=1}^{c} \left[ u_{ij} \log_a(u_{ij}) \right],

    where U = [u_{ij}] is the matrix of membership degrees.
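
As a concrete example of how one of these indices can be evaluated, the following C++ routine computes the XB index of Eq. (10) from the data, the prototypes, and the membership matrix. It is a sketch; the function and parameter names are ours.

#include <algorithm>
#include <limits>
#include <vector>

// Xie-Beni index of Eq. (10): compactness of the fuzzy partition divided by
// n times the minimum squared distance between prototypes. Smaller is better.
double xieBeni(const std::vector<std::vector<double>>& data,      // n x p objects
               const std::vector<std::vector<double>>& centers,   // c x p prototypes
               const std::vector<std::vector<double>>& U) {       // n x c memberships
    auto sqDistL = [](const std::vector<double>& a, const std::vector<double>& b) {
        double s = 0.0;
        for (std::size_t d = 0; d < a.size(); ++d) s += (a[d] - b[d]) * (a[d] - b[d]);
        return s;
    };
    const std::size_t n = data.size(), c = centers.size();
    double compactness = 0.0;                              // numerator of Eq. (10)
    for (std::size_t k = 0; k < c; ++k)
        for (std::size_t j = 0; j < n; ++j)
            compactness += U[j][k] * U[j][k] * sqDistL(data[j], centers[k]);
    double minSep = std::numeric_limits<double>::max();    // min_{i != j} ||v_i - v_j||^2
    for (std::size_t i = 0; i < c; ++i)
        for (std::size_t j = i + 1; j < c; ++j)
            minSep = std::min(minSep, sqDistL(centers[i], centers[j]));
    return compactness / (n * minSep);
}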

The fitness function is obtained by summing the three indices outlined above, with a weighting coefficient for each of them. The function has the following form:

(12) f(x) = \sum_{i=1}^{3} w_i f_i(x),

where f_i denotes the ith validity index described above.

In this study, we chose the weighting coefficients as w_1 = w_2 = w_3 = 1/3. As the three validity indices may take values on very different scales, f could be dominated by the index with the largest values. Therefore, each validity index is normalized according to

(13) f_i = \frac{f_i - f_i^{\min}}{f_i^{\max} - f_i^{\min}},

where i ∈ {1, 2, 3}, and f_i^{min} and f_i^{max} are, respectively, the minimum and maximum values of the ith validity index recorded so far in the evolution of the algorithm.
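
The normalization of Eq. (13) and the weighted sum of Eq. (12) can be illustrated with the short C++ sketch below, which keeps the running minima and maxima of each index observed so far. The struct name and the cold-start behavior (all normalized values are 0 on the very first evaluation) are our choices.

#include <algorithm>
#include <array>

// Fitness of Eq. (12): weighted sum of the three validity indices (DB, XB, VPE),
// each min-max normalized with Eq. (13) using the extreme values recorded so far
// during the evolution.
struct FitnessCombiner {
    std::array<double, 3> fmin { 1e300, 1e300, 1e300 };
    std::array<double, 3> fmax { -1e300, -1e300, -1e300 };
    std::array<double, 3> w { 1.0 / 3, 1.0 / 3, 1.0 / 3 };   // w1 = w2 = w3 = 1/3

    double combine(const std::array<double, 3>& f) {          // f = {DB, XB, VPE}
        double total = 0.0;
        for (int i = 0; i < 3; ++i) {
            fmin[i] = std::min(fmin[i], f[i]);
            fmax[i] = std::max(fmax[i], f[i]);
            const double range = fmax[i] - fmin[i];
            const double norm = (range > 0.0) ? (f[i] - fmin[i]) / range : 0.0;   // Eq. (13)
            total += w[i] * norm;                                                 // Eq. (12)
        }
        return total;
    }
};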

3.4 Selection Method

The modified enthusiasm selection (MES) [1, 29] is used as the selection operator. MES is based on the tournament selection method. It is a technique that gives the worst individuals in a population another chance to compete with the best individuals during the evolutionary process. To increase the fitness of those individuals, their old fitness value is multiplied by an enthusiasm coefficient λ. After the enthusiastic individual has been selected, MES restores its raw fitness. In each iteration, MES also preserves the best individual [29]. The pseudo-code of the selection method is given in Algorithm 3. In Algorithm 3, TabS represents an array of the indices of the n individuals in the current population. TabR is an array holding the indices of individuals in random order. TabT represents an array of k−1 individual fitness values, where k is the tournament size. TabW, an array of individual indices, is the result of the selection operation.
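
The following C++ fragment is a simplified sketch of the enthusiasm idea only, not a transcription of Algorithm 3: inside a tournament, competitors other than the current leader are judged on a fitness temporarily scaled by the coefficient λ, so that weak individuals keep a chance of winning; raw fitness is used again outside the tournament, and the best individual is always kept. The function name, the assumption that lower fitness is better, and the way the boost is applied are ours.

#include <algorithm>
#include <random>
#include <vector>

// Simplified sketch of enthusiasm-based tournament selection with elitism.
// Lower fitness is assumed to be better; lambda scales rivals' fitness only
// while a tournament is being decided, so raw fitness is never overwritten.
std::vector<std::size_t> mesSelect(const std::vector<double>& rawFitness,
                                   std::size_t tournamentSize,
                                   double lambda, std::mt19937& rng) {
    const std::size_t n = rawFitness.size();
    std::uniform_int_distribution<std::size_t> pick(0, n - 1);
    std::vector<std::size_t> selected;
    // Elitism: keep the best individual of the current population.
    selected.push_back(static_cast<std::size_t>(
        std::min_element(rawFitness.begin(), rawFitness.end()) - rawFitness.begin()));
    while (selected.size() < n) {
        std::size_t winner = pick(rng);
        double winnerScore = rawFitness[winner];
        for (std::size_t t = 1; t < tournamentSize; ++t) {
            const std::size_t rival = pick(rng);
            const double score = lambda * rawFitness[rival];   // temporary enthusiasm boost
            if (score < winnerScore) { winner = rival; winnerScore = score; }
        }
        selected.push_back(winner);   // outside the tournament, raw fitness is untouched
    }
    return selected;
}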

3.5 Crossover Operator

A crossover operator is a probabilistic process that combines selected parents to create new offspring. The crossover operator implemented in this study is a two-point crossover; it is a two-parent, two-offspring scheme that follows the steps below:

  1. Firstly, two individuals P1 and P2 are randomly selected, and two crossing points are chosen.

  2. The crossover occurs between P1 and P2 using simulated binary crossover and generates two intermediate children, C1 and C2. Note that crossover points are restricted to fall on the same location within each cluster description.

  3. Differential crossover is then applied. It combines two strategies into one, retaining the advantages of both [21, 42, 50]. The first strategy uses the values of the objective function to determine a “good” direction; the second strategy uses the best individual. Notice that introducing information such as the “good” direction and the best individual reduces the exploration capability over the search space. The new children are generated by the differential crossover according to Eqs. (14) and (15) below.

    Algorithm 3:

    MES().

    Data: Array: (TabR(), TabS() )
    Result: Array: (TabW())
    Initialization;
    k←c;
    l←0;
    for i←0 to k do
    Shuffle TabR();
    j←0
    while j<n do
      C1←TabR(j);
      for m←1 to k do
       C2←TabR(j+m);
       if f(C1)<f(C2) then
        C1←C2
       end
       if m>1 then
        aux←m;
        for m←1 to k do
         TabT(m)←f(Ij+m);
         f(Ij+m)←λf(Ij+m);
        end
        m←aux;
       end
      end
      for m←1 to k do
       f(Ij+m)←TabT(m);
      end
      jj+k+1;
      TabW(l)←C1;
      TabW(l+1)←TabS(l);
      ll+2;
    end
    end

    (14) C_1 = P_1 + \lambda (x_{\mathrm{best}} - P_1) + F_1 (C_1 - C_2),

    and

    (15) C_2 = P_2 + \lambda (x_{\mathrm{best}} - P_2) + F_2 (C_1 - C_2).

    λ is used to enhance the crossover when incorporating the current best vector x_best. F_1 and F_2, which control the amplification of the differential variation of the offspring C_1 and C_2, are real constant factors. In this paper, we set λ = F_1 = F_2 = 0.9.
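
A per-coordinate C++ sketch of Eqs. (14) and (15) is given below. It assumes parents and intermediate children of equal length and evaluates the difference C1 − C2 once before either child is overwritten; the handling of chromosomes with different numbers of clusters is not shown, and the function signature is ours.

#include <vector>

// Differential crossover of Eqs. (14)-(15), applied per coordinate: the
// intermediate children C1, C2 obtained from simulated binary crossover are
// pushed towards the current best individual xbest and along their own difference.
void differentialCrossover(const std::vector<double>& P1, const std::vector<double>& P2,
                           const std::vector<double>& xbest,
                           std::vector<double>& C1, std::vector<double>& C2,
                           double lambda = 0.9, double F1 = 0.9, double F2 = 0.9) {
    for (std::size_t k = 0; k < C1.size(); ++k) {
        const double diff = C1[k] - C2[k];                                   // C1 - C2, taken once
        const double c1 = P1[k] + lambda * (xbest[k] - P1[k]) + F1 * diff;   // Eq. (14)
        const double c2 = P2[k] + lambda * (xbest[k] - P2[k]) + F2 * diff;   // Eq. (15)
        C1[k] = c1;
        C2[k] = c2;
    }
}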

3.6 Mutation Operator

The mutation operator diversifies the population and avoids the creation of a set of homogeneous population elements. It should change the solution adequately to leave the attraction basin of the local optimum, but it should also avoid changing the solution too much and destroying already promising structures.

A novel mutation operator named gravitational mutation is used in this paper. The proposed operator is based on the gravitational search algorithm (GSA). GSA, a population-based search algorithm, is based on the law of gravity and interaction between masses [43]. To mutate an individual Pi, Eq. (16) is used:

(16) P_i = P_i + s_i,

where i ∈ {1, ..., N}, N is the population size, and s_i is the next velocity of P_i, computed as

(17) s_i = (\mathrm{rand}_i \times s_i) + a_i.

rand_i is a random number in [0, 1]. a_i is the acceleration of P_i, computed as

(18) a_i = \frac{F_i}{M_i}.

Here, F_i is the total force acting on P_i, and it is calculated as follows:

(19) F_i = \sum_{j=1, j \ne i}^{N} \mathrm{rand}_j \, G \, \frac{M_j M_i}{R_{ij} + \varepsilon} (P_j - P_i),

where rand_j is a random number in [0, 1], G is the gravitational constant, R_{ij} is the Euclidean distance between P_i and P_j, ε is a small constant to avoid division by zero, and M_i is the mass of P_i, calculated as follows:

(20) M_i = \frac{\mathrm{fit}_i - \mathrm{worst}}{\sum_{j=1}^{N} (\mathrm{fit}_j - \mathrm{worst})},

where fit_i is the fitness value of P_i and worst is defined as follows (for a maximization problem):

(21) \mathrm{worst} = \min_j(\mathrm{fit}_j),

and j ∈ {1, …, N}.
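
The following C++ sketch illustrates how Eqs. (16)-(21) can be combined into a gravitational mutation step for individuals flattened to equal-length real vectors. The mass computation follows the maximization form written in Eq. (21); for a minimized fitness the roles of best and worst would be swapped. The schedule of G, the value of ε, and boundary handling are implementation choices not specified in the text.

#include <algorithm>
#include <cmath>
#include <random>
#include <vector>

// Gravitational mutation of Eqs. (16)-(21) for N flattened real-valued individuals.
void gravitationalMutation(std::vector<std::vector<double>>& pop,       // individuals P_i
                           std::vector<std::vector<double>>& velocity,  // velocities s_i
                           const std::vector<double>& fit,              // fitness values fit_i
                           double G, double eps, std::mt19937& rng) {
    const std::size_t N = pop.size(), D = pop[0].size();
    std::uniform_real_distribution<double> rnd(0.0, 1.0);

    // Eqs. (20)-(21): masses from fitness, normalized to sum to 1 (maximization form).
    const double worst = *std::min_element(fit.begin(), fit.end());
    std::vector<double> M(N);
    double sum = 0.0;
    for (std::size_t i = 0; i < N; ++i) { M[i] = fit[i] - worst; sum += M[i]; }
    for (std::size_t i = 0; i < N; ++i) M[i] = (sum > 0.0) ? M[i] / sum : 1.0 / N;

    for (std::size_t i = 0; i < N; ++i) {
        // Eq. (19): total force acting on P_i from every other individual.
        std::vector<double> F(D, 0.0);
        for (std::size_t j = 0; j < N; ++j) {
            if (j == i) continue;
            double R = 0.0;                               // Euclidean distance R_ij
            for (std::size_t d = 0; d < D; ++d)
                R += (pop[j][d] - pop[i][d]) * (pop[j][d] - pop[i][d]);
            R = std::sqrt(R);
            const double coef = rnd(rng) * G * M[j] * M[i] / (R + eps);
            for (std::size_t d = 0; d < D; ++d) F[d] += coef * (pop[j][d] - pop[i][d]);
        }
        const double ri = rnd(rng);                               // rand_i of Eq. (17)
        for (std::size_t d = 0; d < D; ++d) {
            const double a = (M[i] > 0.0) ? F[d] / M[i] : 0.0;    // Eq. (18)
            velocity[i][d] = ri * velocity[i][d] + a;             // Eq. (17)
            pop[i][d] += velocity[i][d];                          // Eq. (16)
        }
    }
}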

4 Experimental Results

To evaluate the relevance of AGFCM, experiments were conducted on four real-world data sets from the UCI Machine Learning Repository [10]: the Iris, Glass, Wine, and Breast Cancer Wisconsin data sets.

  • The Iris data set [22]: This may be one of the most used data sets in clustering problems. It consists of three classes that represent three species of iris plants: Iris setosa, Iris virginica, and Iris versicolor. Fifty observations belong to each class. Each observation consists of four characteristics: length and width of the sepal and petal of the flower [10].

  • Glass data set: It consists of six classes that represent six different types of glass – building windows float processed, building windows non-float processed, vehicle windows float processed, containers, tableware, and headlamps. Each observation consists of nine numeric attributes: refractive index, sodium, magnesium, aluminum, silicon, potassium, calcium, barium, and iron [10].

  • Breast Cancer Wisconsin: It consists of two classes that represent benign (239 objects) or malignant (444 objects) tumors. Each observation consists of nine features: clump thickness, uniformity of cell size, uniformity of cell shape, marginal adhesion, single epithelial cell size, bare nuclei, bland chromatin, normal nucleoli, and mitoses [10].

  • Wine data set: This is the result of a chemical analysis of wines grown in the same region in Italy but derived from three different cultivars [10]. There are 178 samples: 59 in the first class, 71 in the second class, and 48 in the third class. Each observation consists of 13 numeric features: alcohol, malic acid, ash, alkalinity of ash, magnesium, total phenols, flavanoids, non-flavanoid phenols, proanthocyanins, color intensity, hue, OD280/OD315 of diluted wines, and proline [10].

The following criteria were used to compare the clustering results:

  • The number of classes found.

  • Inter-cluster distance: the distance between the centroids of the clusters.

  • Intra-cluster distance: the distance between data vectors within a cluster.

  • Correct ratio Rc: the correct ratio of cluster numbers is defined by

    (22) R_c = \frac{TCNC}{RT} \times 100\%,

where TCNC is the number of times the algorithm obtained the correct number of clusters and RT is the total number of runs.

AGFCM was compared with dynamic clustering particle swarm optimization (PSO) [40] and the standard GA (SGA). As the three compared algorithms are stochastic in nature, we performed 100 independent runs. The results are reported as mean values for each (algorithm, data set) pair.

The algorithms discussed in this section were implemented in C++ and run on a Core i7 PC with 2 GB of memory under Debian.

In Table 1, we report the mean number of classes found by the compared algorithms. The inter-cluster distance and intra-cluster distance obtained for the three algorithms are given in Tables 2 and 3.

Table 1:

Mean Number of Classes.

Data set SGA PSO AGFCM
Iris 2.23±0.07 2.50±0.08 3.04±0.01
Glass 4.71±0.03 5.68±0.04 6.05±0.02
Wine 3.71±0.07 2.68±0.04 3.05±0.05
Cancer 2.22±0.06 3.01±0.04 2.04±0.04
Table 2:

Inter-cluster Distance.

Data set FCM SGA PSO AGFCM
Iris 2.05±0.05 2.105±0.08 2.412±0.09 2.598±0.15
Glass 840.20±6.15 898.20±9.15 869.42±8.01 853.12±3.08
Wine 2.15±0.15 2.20±0.15 2.42±0.06 3.12±0.04
Cancer 2.15±0.15 2.121±0.09 2.621±0.08 3.251±0.06
Table 3:

Intra-cluster Distance.

Data set FCM SGA PSO AGFCM
Iris 3.890±0.15 3.662±0.15 3.967±0.12 3.114±0.07
Glass 670.80±5.454 663.30±4.34 661.12±3.15 563.12±2.19
Wine 6.15±1.2 5.95±1.9 5.13±0.10 4.12±0.06
Cancer 5.01±0.25 4.984±0.25 4.538±0.10 4.037±0.08

The results in Tables 1–3 indicate that the AGFCM algorithm succeeds in obtaining the most appropriate number of classes over 100 runs. It also performs well in terms of the inter-cluster and intra-cluster distances. AGFCM thus obtains a better clustering of the data.

Table 4 gives the comparative data based on the correct ratio. It can clearly be seen that the AGFCM algorithm remains consistently superior to its competitors.

Table 4:

Correct Ratio (%) of Cluster Numbers for Different Methods.

Data set SGA PSO AGFCM
Iris 48 73 97
Glass 61 94 94
Wine 64 80 88
Cancer 44 81 96

The statistical results [18] of comparing AGFCM with SGA and PSO are given in Table 5. We used the one-tailed t-test with 58 degrees of freedom at a 0.05 level of significance, with the comparison repeated for changes occurring every τ generations. The notation used in Table 5 to compare each pair of algorithms is “+,” “++,” or “≈” when the first algorithm is better than, significantly better than, or statistically equivalent to the second algorithm, respectively. Table 5 clearly demonstrates that the performance of AGFCM surpasses that of the other algorithms. When the population size N is small, the difference between the algorithms is not very significant. However, when the population size is large, AGFCM outperforms the other algorithms.

Table 5:

T-test Results of Comparing the Different Algorithms.

τ T-test results Population size: 50, 100, 200, 500
10 AGFCM-SGA + + ++
10 AGFCM-PSO + + +
50 AGFCM-SGA + + ++ ++
50 AGFCM-PSO ++ ++ ++
100 AGFCM-SGA + + ++ ++
100 AGFCM-PSO + ++ ++ ++
200 AGFCM-SGA + ++ ++ ++
200 AGFCM-PSO + + ++ ++
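
For completeness, a small C++ sketch of the pooled two-sample t statistic underlying such comparisons is given below (with equal group sizes of 30, a pooled test has 58 degrees of freedom, and the one-tailed critical value at the 0.05 level is roughly 1.67). The paper does not give its exact test implementation, so this is only an illustration.

#include <cmath>
#include <vector>

// Pooled two-sample t statistic for two groups of run results.
double tStatistic(const std::vector<double>& a, const std::vector<double>& b) {
    auto meanVar = [](const std::vector<double>& x, double& mean, double& var) {
        mean = 0.0;
        for (double v : x) mean += v;
        mean /= x.size();
        var = 0.0;
        for (double v : x) var += (v - mean) * (v - mean);
        var /= (x.size() - 1);                    // unbiased sample variance
    };
    double ma, va, mb, vb;
    meanVar(a, ma, va);
    meanVar(b, mb, vb);
    const double na = static_cast<double>(a.size()), nb = static_cast<double>(b.size());
    const double pooled = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2);
    return (ma - mb) / std::sqrt(pooled * (1.0 / na + 1.0 / nb));
}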

5 Conclusions

In this paper, we presented a hybrid GA for setting the appropriate number of clusters and determining the initial cluster centroids for FCM. AGFCM uses two heuristic approaches, namely differential evolution and GSA, as a basis for its genetic operators, so as to exploit the most promising regions of the search space and to explore new ones. The use of MES as a selection operator controls the selection pressure. The experimental results on four real data sets indicate that the proposed algorithm improves the outcomes and outperforms two state-of-the-art clustering techniques. Future research may focus on integrating the automatic clustering scheme of AGFCM with other metaheuristics such as PSO or differential evolution.

Bibliography

[1] A. Agrawal and I. Mitchell, Selection enthusiasm, in: Proceedings of the 6th International Conference on Simulated Evolution and Learning, pp. 449–456, Springer-Verlag, Berlin, 2006. doi:10.1007/11903697_57.

[2] K. S. Al Sultan, A Tabu search approach to the clustering problem, Pattern Recogn. 28 (1995), 1443–1451. doi:10.1016/0031-3203(95)00022-R.

[3] G. J. Babu and E. D. Feigelson, Statistical Challenges in Modern Astronomy II, vol. 1, Springer, New York, 1997. doi:10.1007/978-1-4612-1968-2.

[4] S. Bandyopadhyay and U. Maulik, Genetic clustering for automatic evolution of clusters and application to image classification, Pattern Recogn. 35 (2002), 1197–1208. doi:10.1016/S0031-3203(01)00108-X.

[5] A. M. Bensaid, L. O. Hall, J. C. Bezdek and L. P. Clarke, Partially supervised clustering for image segmentation, Pattern Recogn. 29 (1996), 859–871. doi:10.1016/0031-3203(95)00120-4.

[6] J. C. Bezdek, Mathematical models for systematics and taxonomy, in: Proceedings of Eighth International Conference on Numerical Taxonomy, vol. 3, pp. 143–166, W.H. Freeman, San Francisco, 1975.

[7] J. C. Bezdek, Pattern recognition with fuzzy objective function algorithms, Kluwer Academic Publishers, Norwell, MA, USA, 1981. doi:10.1007/978-1-4757-0450-1.

[8] J. C. Bezdek and N. R. Pal, Some new indexes of cluster validity, IEEE Trans. Syst. Man Cybern. B Cybern. 28 (1998), 301–315. doi:10.1109/3477.678624.

[9] J. C. Bezdek, J. Keller, R. Krisnapuram and N. Pal, Fuzzy models and algorithms for pattern recognition and image processing, The Handbooks of Fuzzy Sets Series, vol. 4, Springer US, New York, NY, USA, 1999. doi:10.1007/b106267.

[10] C. Blake, E. Keogh and C. J. Merz, UCI repository of machine learning databases (http://www.ics.uci.edu/mlearn/MLRepository.html), 1998. Accessed 27 January 2018.

[11] A. Bouroumi and A. Essaïdi, Unsupervised fuzzy learning and cluster seeking, Intell. Data Anal. 4 (2000), 241–253. doi:10.3233/IDA-2000-43-406.

[12] D.-X. Chang, X.-D. Zhang, C.-W. Zheng and D.-M. Zhang, A robust dynamic niching genetic algorithm with niche migration for automatic clustering problem, Pattern Recogn. 43 (2010), 1346–1360. doi:10.1016/j.patcog.2009.10.020.

[13] R. Cucchiara, C. Grana, S. Seidenari and G. Pellacani, Exploiting color and topological features for region segmentation with recursive fuzzy c-means, Mach. Graphics Vis. 11 (2002), 169–182.

[14] S. Das, A. Abraham and A. Konar, Automatic clustering using an improved differential evolution algorithm, IEEE Trans. Syst. Man Cybern. A Syst. Hum. 38 (2008), 218–237. doi:10.1109/TSMCA.2007.909595.

[15] S. Das, A. Abraham and A. Konar, Metaheuristic pattern clustering – an overview, in: Metaheuristic Clustering, Studies in Computational Intelligence, vol. 178, pp. 1–62, Springer, Berlin, 2009. doi:10.1007/978-3-540-93964-1_1.

[16] D. L. Davies and D. W. Bouldin, A cluster separation measure, IEEE Trans. Pattern Anal. Mach. Intell. 1 (1979), 224–227. doi:10.1109/TPAMI.1979.4766909.

[17] J. De Andrés, P. Lorca, F. J. de Cos Juez and F. Sánchez-Lasheras, Bankruptcy forecasting: a hybrid approach using fuzzy c-means clustering and multivariate adaptive regression splines (MARS), Expert Syst. Appl. 38 (2011), 1866–1875. doi:10.1016/j.eswa.2010.07.117.

[18] J. Derrac, S. García, D. Molina and F. Herrera, A practical tutorial on the use of nonparametric statistical tests as a methodology for comparing evolutionary and swarm intelligence algorithms, Swarm Evol. Comput. 1 (2011), 3–18. doi:10.1016/j.swevo.2011.02.002.

[19] D. Dovžan and I. Škrjanc, Recursive fuzzy c-means clustering for recursive fuzzy identification of time-varying processes, ISA Trans. 50 (2011), 159–169. doi:10.1016/j.isatra.2011.01.004.

[20] M. B. Eisen, P. T. Spellman, P. O. Brown and D. Botstein, Cluster analysis and display of genome-wide expression patterns, Proc. Natl. Acad. Sci. 95 (1998), 14863–14868. doi:10.1073/pnas.95.25.14863.

[21] V. Feoktistov, Differential Evolution: In Search of Solutions, Springer Optimization and Its Applications, vol. 5, Springer Science+Business Media, LLC, Boston, MA, 2006.

[22] R. A. Fisher, The use of multiple measurements in taxonomic problems, Ann. Eugen. 7 (1936), 179–188. doi:10.1111/j.1469-1809.1936.tb02137.x.

[23] G. Garai and B. Chaudhuri, A novel genetic algorithm for automatic clustering, Pattern Recogn. Lett. 25 (2004), 173–187. doi:10.1016/j.patrec.2003.09.012.

[24] D. E. Goldberg, Genetic algorithms in search, optimization and machine learning, 1st ed., Addison-Wesley Longman Publishing Co., Inc., Boston, MA, 1989.

[25] L. O. Hall, A. M. Bensaid, L. P. Clarke, R. P. Velthuizen, M. S. Silbiger and J. C. Bezdek, A comparison of neural network and fuzzy clustering techniques in segmenting magnetic resonance images of the brain, IEEE Trans. Neural Netw. 3 (1992), 672–682. doi:10.1109/72.159057.

[26] L. O. Hall, I. B. Ozyurt and J. C. Bezdek, Clustering with a genetically optimized approach, IEEE Trans. Evol. Comput. 3 (1999), 103–112. doi:10.1109/4235.771164.

[27] Y. Han and P. Shi, An improved ant colony algorithm for fuzzy clustering in image segmentation, Neurocomputing 70 (2007), 665–671. doi:10.1016/j.neucom.2006.10.022.

[28] J. A. Hartigan, Clustering algorithms, John Wiley & Sons, Inc., New York, USA, 1975.

[29] K. Jebari, A. Bouroumi and A. Ettouhami, Parameters control in GAs for dynamic optimization, Int. J. Comput. Intell. Syst. 6 (2013), 47–63. doi:10.1080/18756891.2013.754172.

[30] P. M. Kanade and L. O. Hall, Fuzzy ants as a clustering concept, in: 22nd International Conference of the North American Fuzzy Information Processing Society (NAFIPS 2003), pp. 227–232, IEEE, Chicago, IL, USA, 2003. doi:10.1109/NAFIPS.2003.1226787.

[31] P. M. Kanade and L. O. Hall, Fuzzy ants and clustering, IEEE Trans. Syst. Man Cybern. A Syst. Hum. 37 (2007), 758–769. doi:10.1109/TSMCA.2007.902655.

[32] D. Karaboga and C. Ozturk, Fuzzy clustering with artificial bee colony algorithm, Sci. Res. Essays 5 (2010), 1899–1902.

[33] D. Karaboga and C. Ozturk, A novel clustering approach: artificial bee colony (ABC) algorithm, Appl. Soft Comput. 11 (2011), 652–657. doi:10.1016/j.asoc.2009.12.025.

[34] M. K. Kerr and G. A. Churchill, Bootstrapping cluster analysis: assessing the reliability of conclusions from microarray experiments, Proc. Natl. Acad. Sci. 98 (2001), 8961–8965. doi:10.1073/pnas.161273698.

[35] T. Littmann, An empirical classification of weather types in the Mediterranean basin and their interrelation with rainfall, Theor. Appl. Climatol. 66 (2000), 161–171. doi:10.1007/s007040070022.

[36] Z. Liu and R. George, Mining weather data using fuzzy cluster analysis, in: Fuzzy Modeling with Spatial Information for Geographic Problems, F. E. Petry, V. B. Robinson, M. A. Cobb, eds., pp. 105–119, Springer, Berlin, Heidelberg, 2005. doi:10.1007/3-540-26886-3_5.

[37] U. Maulik and S. Bandyopadhyay, Genetic algorithm-based clustering technique, Pattern Recogn. 33 (2000), 1455–1465. doi:10.1016/S0031-3203(99)00137-5.

[38] U. Maulik and I. Saha, Automatic fuzzy clustering using modified differential evolution for image classification, IEEE Trans. Geosci. Remote Sens. 48 (2010), 3503–3510. doi:10.1109/TGRS.2010.2047020.

[39] M. K. Ng and J. C. Wong, Clustering categorical data sets using tabu search techniques, Pattern Recogn. 35 (2002), 2783–2790. doi:10.1016/S0031-3203(02)00021-3.

[40] M. Omran, A. Salman and A. Engelbrecht, Dynamic clustering using particle swarm optimization with application in unsupervised image classification, in: Fifth World Enformatika Conference (ICCI 2005), Prague, Czech Republic, pp. 199–204, Citeseer, 2005.

[41] S. Paterlini and T. Krink, Differential evolution and particle swarm optimisation in partitional clustering, Comput. Stat. Data Anal. 50 (2006), 1220–1247. doi:10.1016/j.csda.2004.12.004.

[42] K. V. Price, R. M. Storn and J. A. Lampinen, Differential Evolution: A Practical Approach to Global Optimization, Springer-Verlag, Berlin, Heidelberg, Germany, 2005.

[43] E. Rashedi, H. Nezamabadi-Pour and S. Saryazdi, GSA: a gravitational search algorithm, Inform. Sci. 179 (2009), 2232–2248. doi:10.1016/j.ins.2009.03.004.

[44] X. Rui and D. C. Wunsch, Clustering, IEEE Press, USA, 2009.

[45] S. Saha and S. Bandyopadhyay, A new point symmetry based fuzzy genetic clustering technique for automatic evolution of clusters, Inform. Sci. 179 (2009), 3230–3246. doi:10.1016/j.ins.2009.06.013.

[46] S. Saha and S. Bandyopadhyay, A symmetry based multiobjective clustering technique for automatic evolution of clusters, Pattern Recogn. 43 (2010), 738–751. doi:10.1016/j.patcog.2009.07.004.

[47] S. Z. Selim and K. Alsultan, A simulated annealing algorithm for the clustering problem, Pattern Recogn. 24 (1991), 1003–1008. doi:10.1016/0031-3203(91)90097-O.

[48] S. Selinski and K. Ickstadt, Cluster analysis of genetic and epidemiological data in molecular epidemiology, J. Toxicol. Environ. Health Pt. A 71 (2008), 835–844. doi:10.1080/15287390801985828.

[49] P. Shelokar, V. K. Jayaraman and B. D. Kulkarni, An ant colony approach for clustering, Anal. Chim. Acta 509 (2004), 187–195. doi:10.1016/j.aca.2003.12.032.

[50] R. Storn, On the usage of differential evolution for function optimization, in: 1996 Biennial Conference of the North American Fuzzy Information Processing Society (NAFIPS), pp. 519–523, IEEE, USA, 1996. doi:10.1109/NAFIPS.1996.534789.

[51] R. Storn and K. Price, Differential evolution – a simple and efficient heuristic for global optimization over continuous spaces, J. Glob. Optim. 11 (1997), 341–359. doi:10.1023/A:1008202821328.

[52] C. S. Sung and H. W. Jin, A tabu-search-based heuristic for clustering, Pattern Recogn. 33 (2000), 849–858. doi:10.1016/S0031-3203(99)00090-4.

[53] L. Y. Tseng and S. Bien Yang, A genetic approach to the automatic clustering problem, Pattern Recogn. 34 (2001), 415–424. doi:10.1016/S0031-3203(00)00005-4.

[54] X. L. Xie and G. Beni, A validity measure for fuzzy clustering, IEEE Trans. Pattern Anal. Mach. Intell. 13 (1991), 841–847. doi:10.1109/34.85677.

Received: 2018-01-28
Published Online: 2018-04-25

©2020 Walter de Gruyter GmbH, Berlin/Boston

This work is licensed under the Creative Commons Attribution 4.0 Public License.
