Open Access (CC BY-NC-ND 3.0 license). Published by De Gruyter, June 18, 2013

An Efficient Weakly Supervised Approach for Texture Segmentation via Graph Cuts

  • Arnav V. Bhavsar

Abstract

We propose an approach for texture segmentation based on weakly supervised learning. The weak supervision implies that the user marks only a single small patch for each class in the input image. These patches are used for training. We employ the method of graph cuts for the segmentation task. Our work demonstrates that even under such weak training, texture segmentation can be achieved efficiently and with good accuracy via graph cuts. Moreover, our approach uses a simpler feature representation than those in similar contemporary segmentation approaches. We also provide a brief discussion of the reasons behind the good performance of our approach. We validate our method on various standard texture mosaics and also on segmentation of natural images with large texture variations.

1 Introduction

Texture segmentation is an active area of research with applications in remote sensing, medical image analysis, scene understanding, etc. Given an image with multiple textures, texture segmentation involves the task of classifying each pixel into one of several classes. However, unlike many general classification problems (including texture classification [16]), segmentation approaches typically do not assume the availability of a separate training dataset. In the texture segmentation problem, the observed information to label a pixel employs only a small neighborhood around that pixel, unlike its classification counterpart, where all the pixels (ranging from thousands to millions) in the observed test image are available for its classification. Thus, texture segmentation is a complex combinatorial labeling problem, where each pixel must be labeled based only on the information in the input image.

The concern of lack of training data in segmentation methods has led to the research on sophisticated unsupervised methods [2, 18]. These approaches trade off good performance with more complexity, more so in texture segmentation where one requires multipixel data to compute the texture properties. In contrast, a class of segmentation approaches addresses the lack of separate training data by involving a small amount of user interaction [4, 8, 15, 19], wherein the user easily marks a small subset of pixels from each class, which can be used as training data. Such a minor user interaction arguably helps in mitigating the sophistication and complexity typically required in an unsupervised case. The proposed approach falls in the category of such interactive segmentation methods. It is “weakly supervised” as the amount of training data selected by the user is much less than what is usually available in a supervised classification approach. Weak supervision also helps in mitigating the second concern about limited data from the input image to make a decision for each pixel. Even with a small amount of training data, one can avoid the chicken–egg scenario. Moreover, as the training data are extracted from the input image, the distribution of the training data would not be far off from that in the rest of the input image in terms of statistical variation in intensities, geometry, etc. This also makes the features learnt from the training data more reliable when used as a reference to compare with the test features extracted from a small amount of information available around each input pixel during the pixel-labeling process.

We explore the idea of weak supervision for texture segmentation to develop a simple and efficient approach for texture segmentation that requires little human interaction. For instance, in our method, the selection process can be as simple as clicking at some interior locations in each of the textured regions in the input image, around which small patches can be selected with a predefined size (e.g., 100 × 100 or 200 × 200, depending on the image size). In fact, in this article, for most cases, we use only a single training patch for each texture class.

Notwithstanding the above-mentioned advantages of weak supervision, we acknowledge that a segmentation problem involving labeling of each pixel is essentially an ill-posed combinatorial problem. Hence, one needs some regularization in the estimation process. However, we must keep in mind that it is important that the regularized estimation approach is efficient in addition to being sufficiently accurate (as such qualities would be expected in an approach involving user interaction). We show that even with such a small amount of training data, the task of texture segmentation can be achieved with good accuracy, when using a strong and efficient combinatorial optimization technique such as graph cuts. Moreover, we highlight that this can be achieved with a much simpler feature representation as compared with what is used in similar contemporary interactive segmentation approaches. This further conforms with our need for simplicity and efficiency. More specifically, we use the cluster means of texture features (i.e., just the first-order statistic) to formulate the data cost of graph cuts. Also, the texture features used in our method are the popular filter response features [16] and intensity neighborhoods [23], which are themselves straightforward and efficient to compute and are also used such that they represent textures at different scales. Thus, this work signifies that with a little user interaction, texture segmentation can be addressed using simple feature extraction and modeling, via efficient graph-cuts optimization.

1.1 Relation to Previous Work

Texture segmentation is a well-established area, spanning supervised [1] as well as unsupervised methods. Many of these approaches rely on sophisticated modeling and segmentation methods, such as modeling regional feature distributions and shape extraction using active contour [18], computing multidimensional probability distributions over feature space [17, 22], non-parametric neighborhood statistics using kernel methods [2], structure tensors with partial differential equation-based diffusion [20], etc. While a detailed survey on segmentation methods is beyond the scope of this article, we focus on those closer to our work that incorporate interactive (or weakly supervised) segmentation and/or use graph cuts for segmentation.

Indeed, the weakly supervised approaches are quite popular in the pattern recognition community to mitigate the complexity of a problem. The motivation behind weak learning is to provide some more but (very) partial information (involving very little user effort) to simplify a complex problem. For example, the work by Vasconcelos et al. [24] for image segmentation uses manually annotated images rather than an exactly segmented image for training. The idea is that annotation requires much less human intervention (than an exact segmentation that is used in supervised approaches), but still provides some extra information than that in a completely unsupervised problem. A similar philosophy is followed by Vezhnevets et al. [25] for natural image segmentation, where the user specifies only the classes in the training images but not their exact locations. The work reported by Fragkiadaki and Shi [7] exploits the figure-ground information from multiple images containing the same object to detect that object from an unknown image. This also falls into the category of weakly supervised approaches, as more data are used (in the form of figure-ground information) to aid the detection task than what would have been used in a completely unsupervised detection.

In this context, there exist some graph-cut-based interactive image segmentation methods that are more closely related to our work [4, 15, 19], in that they involve similar user interaction as in our method. However, unlike our method, which involves multiclass segmentation, these methods mainly focus on the task of foreground–background segmentation. In general, in these methods, the user initially marks some pixels as foreground and some as background. These pixels are used as seeds to learn probability distributions [feature histograms or Gaussian mixture model (GMM)]. A binary labeling is then performed using graph cuts, which use region-based costs that involve these learned likelihood probabilities, and boundary costs that utilize an a priori edge extraction in the images. While the overall segmentation philosophy in these approaches is similar to ours, there are some important differences.

To reliably learn probability distributions (such as GMM), as carried out in refs. [4, 15, 19], typically requires a reasonable number of training samples, and involves learning of multiple parameters (i.e., means, variances, and weights for different Gaussians). In this respect, our method only uses means of feature clusters to compute our data costs, which does not involve learning multiparametric probability distributions. Thus, our data modeling is arguably much simpler and more efficient. Moreover, as mentioned above, the methods in refs. [4, 15, 19] consider a two-class problem. It is not known how reliably the data modeling (by learning probability distributions) would scale up to a multiclass scenario. Also, for a binary labeling problem, the method of graph cuts is proven to reach global minima, but it only promises "strong" local minima for a multiclass problem [5]. Thus, the performance of such an approach for a multiclass texture segmentation method is as yet unclear. Finally, unlike these methods that consider the problem of natural image segmentation, our method does not employ the boundary costs; the reason being that we address a texture segmentation problem, wherein each textured region can yield too many edges that are not a part of the boundary between two textured regions. Our approach only uses the regional costs.

Indeed, the method of graph cuts has been recently considered for texture segmentation [11, 12, 14]. However, like in the methods discussed above, the approaches in refs. [11, 12] also employ GMMs to learn complex data distributions, thus bringing in concerns of parameter learning and more training data. To our knowledge, the method in ref. [14] is the only method that performs an unsupervised segmentation. However, in addition to the multivariate mixture modeling as in the above approaches, the unsupervised graph-cuts approach also comes at an additional expense of an iterative method that looks for new segments by decomposing the modeled distribution, computing the model parameters, and performing graph cuts on resulting subgraphs [14].

On the basis of the above discussion, our main contributions can be summarized as follows: (i) Our work empirically demonstrates that even for a multiclass segmentation problem as in our case, such a strong local minimum can yield a reasonably accurate solution. (ii) We achieve this with a simpler feature modeling as compared with the above approaches. (iii) Furthermore, our cost computation is also based only on the regional features and does not employ the edge information. (iv) We also provide a brief intuitive discussion on these salient aspects of our approach.

2 Segmentation Methodology

We now elaborate our overall approach. This essentially consists of two phases: (i) In the first part, we discuss our feature extraction from the training patches and the succinct modeling of these extracted features so as to represent each texture class. (ii) Once such representations are defined for each training class, we discuss our segmentation method using graph cuts.

2.1 Texture Features

While philosophically, no single definition of “textures” is universally agreed on, methods for texture classification and segmentation use different ways to extract texture features. The more popular of these are filter responses [16], which can be computed by convolutions of the input image with a set of filters. The idea of using filters to extract texture features is that textures can be considered as statistically repetitive signals with differently varying frequencies, scales, and orientations. Thus, these filters span a set of frequencies, scales, and orientations so that some of them bring out such characteristic properties of particular textures. Their discriminative ability can be appreciated by understanding that different filters would extract the salient characteristics for different textures, i.e., different filters would be tuned to different textures.

While there are various different filter banks, in our work we use the Gaussian filter bank from the influential work in texture classification [16]. This filter bank contains isotropic Gaussians, Laplacians, and first and second derivatives of one-dimensional non-isotropic Gaussians, at different scales and orientations. Other common filter banks are Gabor [10], wavelets [8], etc.

Thus, given an input M × M training patch Ii for the ith class, and a bank of K filters F1 to FK, a set of filtered patches Ii^1 to Ii^K are computed as

Ii^k = Ii * Fk,  k = 1, …, K,  (1)

where * denotes convolution. A K-dimensional vector

rf(x) = [Ii^1(x), Ii^2(x), …, Ii^K(x)]^T,  (2)

formed by accumulating the responses at each pixel location x from the K filtered patches, yields a filter response vector at that pixel.

Such a filter response vector characterizes the texture at that pixel. For the training patch Ii, such M2 filter response vectors collected from all M2 pixels form the filter response feature set for the class i.
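As a concrete illustration, the following is a minimal sketch of this computation, with a toy bank of isotropic Gaussians standing in for the full filter bank of ref. [16]; the function name and the choice of filters are illustration assumptions, not the paper's exact setup.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def filter_response_features(patch, sigmas=(1.0, 2.0, 4.0)):
    """Return an (M*M, K) array; row x is the K-dim response vector rf(x).

    Toy bank: K isotropic Gaussians (a stand-in for the bank of [16])."""
    responses = [gaussian_filter(patch.astype(float), s) for s in sigmas]
    # Stack the K filtered patches, then read off one K-vector per pixel.
    return np.stack([r.ravel() for r in responses], axis=1)

patch = np.random.rand(100, 100)   # stand-in for a training patch Ii
feats = filter_response_features(patch)
print(feats.shape)                 # (10000, 3): M^2 vectors, K = 3 filters
```

Each row of `feats` is the filter response vector at one pixel of the training patch.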

Another relatively straightforward idea of modeling textures with raw patches of image intensities is also successfully explored in ref. [23]. We term this the joint response. This is simply the N²-dimensional vector formed by accumulating the N × N patch around a pixel (including that pixel) [23], where N is usually small (3, 5, etc.). Thus, for each pixel location x in Ii, we have a neighborhood response feature vector at x as

rn(x) = [Ii(x), Ii(n1), …, Ii(nl)]^T,  (3)

where Ii(n1) to Ii(nl) are the l = N² − 1 neighboring pixels of Ii(x). Similar to above, the joint response vectors collected from all pixels in the image form the joint response feature set.
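A minimal sketch of this accumulation follows; the function name and the `sliding_window_view` vectorization are illustration choices, not from the paper.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def joint_response_features(img, N=3):
    """Return ((M-N+1)^2, N*N) joint response vectors rn(x) for the
    interior pixels of img: each row is a flattened N x N neighborhood."""
    windows = sliding_window_view(img, (N, N))  # (M-N+1, M-N+1, N, N)
    return windows.reshape(-1, N * N)

img = np.arange(25.0).reshape(5, 5)
feats = joint_response_features(img, N=3)
print(feats.shape)   # (9, 9): 3x3 interior positions, 9-dim vectors
print(feats[0])      # first window: [0. 1. 2. 5. 6. 7. 10. 11. 12.]
```

Boundary pixels are skipped here for brevity; in practice one would pad the image to obtain a response at every pixel.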

The idea behind the joint response stems from the notion that textures are often modeled as Markov random fields (MRF) [23]. The MRF framework statistically relates the state of one pixel to its neighbors. In this sense, a simple accumulation of pixels from small patches, when done for all pixels, represents a local joint distribution of the pixel intensities across such patches. While this representation is perhaps one of the simplest (arguably, even simpler than the filter response representation), it is quite effective in its representative and discriminative properties [23]. This is again because, as indicated above, textures can be interpreted as statistical repetitive signals. The local joint distributions, as defined by the joint features, characterize the statistics or the variation of local structures. As different textures tend to have different local statistics, the joint responses are quite effective for discrimination.

Clearly, the computation of the filter responses as well as the neighborhood responses is both straightforward and efficient. The former involves convolutions (which can be implemented via a fast Fourier transform) and the latter involves mere accumulations over 3 × 3 or 5 × 5 patches.

2.2 Texture Modeling

While we now have the feature sets for the training patches for all classes, the labeling process typically requires a more succinct representation or modeling of these features [4, 15, 19]. As discussed above, this is typically achieved by learning a probability distribution based on such features.

Instead of this, we follow a simpler representation. Again, invoking the idea that a “texture” is a statistically repeating phenomenon allows one to represent a texture patch by a few responses. The feature responses for each texture typically form clusters, which indicates that each texture only has a small number of variation patterns. The centroids (or means) of the feature response clusters, commonly known as textons [16], represent the most significant responses out of the overall response set, which can themselves characterize the texture sufficiently. Thus, instead of using all the feature responses, one can only use the textons as reference features in the decision-making process.

The clustering and the resultant texton computation is carried out by k-means clustering, which can be carried out very efficiently (e.g., ref. [6]). Importantly, such a texture representation by textons involves only the first-order statistics of the response, thus making the learning process easier. We demonstrate that even considering the simple first-order statistics (means) is sufficient in many examples of texture segmentation under weak supervision.
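The texton computation can be sketched with a naive Lloyd-style k-means as follows; the paper uses an accelerated variant [6], and the deterministic initialization below is purely an illustration choice.

```python
import numpy as np

def compute_textons(features, k=3, iters=20):
    """Naive k-means: return the (k, D) cluster means (textons) of the
    feature set of one class."""
    # Deterministic init from evenly spaced samples (illustration choice).
    idx = np.linspace(0, len(features) - 1, k).astype(int)
    centers = features[idx].astype(float)
    for _ in range(iters):
        # Assign each feature vector to its nearest center.
        d = np.linalg.norm(features[:, None] - centers[None], axis=2)
        assign = d.argmin(axis=1)
        # Recompute each center as the mean of its assigned features.
        for j in range(k):
            if np.any(assign == j):
                centers[j] = features[assign == j].mean(axis=0)
    return centers

# Two well-separated toy clusters; the textons recover their means.
feats = np.vstack([np.zeros((50, 2)), np.full((50, 2), 10.0)])
textons = compute_textons(feats, k=2)
print(np.sort(textons[:, 0]))   # [ 0. 10.]
```

Only these k mean vectors per class (for each feature type) are retained for the labeling stage.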

2.3 Segmentation via Graph Cuts

The above process yields, for each class, two sets of textons, i.e., for the filter response features and for the joint response features. As indicated above, these textons are used for labeling each pixel in the image via the efficient graph-cuts labeling framework. Over the last decade, the graph-cuts approach has proven very successful for various labeling problems, such as denoising and image segmentation [5], in terms of efficiency as well as in achieving good-quality solutions.

Graph cuts is a technique for minimizing energy functions of the form

E = Σ_v Ed(v, y_v) + λ Σ_{(v,w)∈N} Es(v, w).  (4)

The energy is defined over a grid, with the variables v and observations y, where N denotes the set of neighboring variable pairs. It consists of a data term Ed involving the variable labels and observations, and a smoothness term Es, which enforces constraints over the labels for the neighboring variables v and w. The task is to minimize the energy function to compute the labels for all the variables on the grid.

The method of graph cuts defines a graph over the variable and observation nodes. In the case of two labels, the min-cut on this graph corresponds to the minimum of the energy function. The link strengths between the nodes correspond to the data and smoothness costs. While classically the min-cut is defined for the binary labeling case (with source and sink nodes corresponding to the two labels), there are different approaches to decompose a multilabel problem into many binary labeling problems. For our problem, we employ the α-expansion method [5].

An important aspect behind its success is its good regularization ability. The graph-cuts method can handle different kinds of priors and yet maintains its properties of achieving a good local minimum and efficiency [5]. Thus, one can choose an appropriate prior so as to achieve a smooth solution that resists artifacts within the same texture regions, while respecting inter-texture discontinuities. Such regularization abilities are important in an application where the inherent features may not be strongly representative of the classes, such as in our case. Thus, graph cuts provide a good optimization framework, and still help in maintaining the efficiency, both of which are crucial to a weakly supervised method.

In our case, the graph is defined with each node corresponding to a pixel of the input image to be segmented. Given Kf filter-based textons and Kn neighborhood-based textons, we compute the filter response rf(x) and the neighborhood response rn(x) at each pixel (graph node) x. The data cost at the pixel x, for a class L, is then defined as

Ed(x, L) = df(x, L) + dn(x, L),  (5)

where

df(x, L) = min_{k=1,…,Kf} ||rf(x) − tf,k^L||,  dn(x, L) = min_{k=1,…,Kn} ||rn(x) − tn,k^L||.  (6)

Here, tf,k^L denotes the filter response texton of class L that yields the minimum cost among all the filter response textons when compared with the feature response at x, and tn,k^L is a similarly defined joint response texton. The minimum costs in each case are then added to compute the complete data cost at a pixel x [Eq. (5)].
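A sketch of this per-pixel data cost, with toy texton sets (all names and numeric values below are illustrative, not from the paper):

```python
import numpy as np

def data_cost(rf, rn, textons_f, textons_n):
    """Data cost of one class at one pixel [Eq. (5)]:
    rf: (K,) filter response; rn: (N*N,) joint response;
    textons_f, textons_n: (num_textons, dim) texton sets of that class."""
    df = np.linalg.norm(textons_f - rf, axis=1).min()  # nearest filter texton
    dn = np.linalg.norm(textons_n - rn, axis=1).min()  # nearest joint texton
    return df + dn

tf = np.array([[0.0, 0.0], [5.0, 5.0]])  # toy filter-response textons
tn = np.array([[1.0], [9.0]])            # toy joint-response textons
cost = data_cost(np.array([0.0, 0.0]), np.array([1.0]), tf, tn)
print(cost)   # 0.0: both responses coincide with a texton of this class
```

In the full method this cost is evaluated at every pixel for every class, and the resulting cost volume feeds the graph-cuts optimization.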

For the smoothness cost, we consider the second-order neighborhood (eight neighbors for a pixel). For the task of segmentation, one usually desires hard boundaries between the classes. Thus, the smoothness cost for neighboring labels L(x1) and L(x2) at neighboring pixels x1 and x2 is defined using the Potts model [5] as

Es(L(x1), L(x2)) = 0 if L(x1) = L(x2), and Es(L(x1), L(x2)) = λ otherwise.  (7)

The idea behind using the Potts model for the smoothness is that in a segmentation problem, there is no preference for the closeness of neighboring labels, except for exactly the same neighboring labels. Hence, one has a low penalty for exactly the same labels (0 penalty in this case), and equal non-zero penalties if the neighborhood labels differ by any amount. Hence, this cost helps in providing equal weight to any label discontinuities, which is natural for a segmentation problem.
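Putting the pieces together, the full energy can be sketched on a toy grid as follows. This merely evaluates the energy for a given labeling (the paper minimizes it via α-expansion graph cuts [5]), and it uses a 4-neighborhood for brevity where the paper uses 8; all names here are illustrative.

```python
import numpy as np

def energy(labels, data_costs, lam=1.0):
    """Evaluate data + Potts energy.
    labels: (H, W) ints; data_costs: (H, W, num_classes); lam: Potts weight."""
    h, w = labels.shape
    # Data term: cost of the chosen label at every pixel.
    e = data_costs[np.arange(h)[:, None], np.arange(w)[None, :], labels].sum()
    # Potts term: lam per discontinuity between 4-neighbors.
    e += lam * np.count_nonzero(labels[:, 1:] != labels[:, :-1])  # horizontal
    e += lam * np.count_nonzero(labels[1:, :] != labels[:-1, :])  # vertical
    return e

dc = np.zeros((2, 2, 2))
dc[..., 1] = 1.0   # toy costs: class 0 is cheap everywhere
print(energy(np.zeros((2, 2), int), dc))                 # 0.0
print(energy(np.array([[0, 1], [0, 1]]), dc, lam=0.5))   # 2.0 + 2*0.5 = 3.0
```

A good labeling is one that keeps both the per-pixel data costs and the number of label discontinuities low; graph cuts searches this trade-off efficiently.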

2.4 Discussion on Salient Aspects

Having discussed our approach, we now provide some insight into the salient aspects of this work that we believe are important to gauge its good performance.

It is indeed interesting to ask why such a simple approach works well. In this respect, some hints can be obtained from the graph-cuts-based segmentation approaches [3, 4, 19], some of which have a similar overall structure to our method. These binary segmentation methods [3] emphasize the property that graph cuts yield a globally optimum solution. While this property holds only for a two-label problem, the early work on graph cuts for the multilabel case suggests that even for a multiclass scenario, the locally optimum solution can be theoretically related to the global solution [5]. In fact, when using the Potts model, the local solution can be shown to be within a constant factor of the globally minimum solution [5]. This property partially explains the good performance of our approach.

Another important question concerns the encouraging performance despite the simplicity of the modeling of the training data: our training data are modeled using only the k-means cluster centers for each class, rather than an explicit probability distribution (such as histograms or GMMs), which often involves multiple parameters. This may be reasoned by the observation that while the k-means method explicitly computes only the means, it implicitly captures the variability in the data (a measure of the second-order statistic) when optimizing the Euclidean distances between the means and the other training points. Thus, the means computation also effectively approximates the data as a mixture of Gaussians that represents the class variability.

The works in refs. [3, 4] also point out that the quality of the solution of graph cuts, with guarantees for a global or a strongly local solution, can be directly related to the cost function. Note that our cost function uses two strong well-known texture features, i.e., the filter banks, which are at a relatively larger scale, and MRF features at a smaller scale. Acknowledging the importance of scale in texture classification, such a representative data cost helps in better discriminating the textures as they operate at different scales.

3 Experimental Results

We tested our segmentation method on textured mosaics, including some from the Prague texture segmentation dataset [9] and the SIPI dataset [21]. Such texture mosaics are commonly used for validation of texture segmentation methods. In addition, we also experimented on some images from the Berkeley segmentation dataset [13], which contain natural images with textured objects and background.

The parameters in our method are the filter kernel size, the neighborhood extent for computing the joint responses, the number of textons, and the λ used in the graph-cuts smoothness cost. We experimented with square kernel sizes of 7 × 7 and 11 × 11 for computing filter responses. The standard deviation of the Gaussian kernel for the filters was chosen between 1 and 2. The neighborhood extent for the joint responses was 3 × 3. Note that these sizes indicate that the filter kernels and the neighborhood responses capture features at different scales. The number of textons used in our method is 10–30. Our λ values are set such that the data cost and the smoothness cost are of the same order. All our results are obtained with only a single iteration of the α-expansion method over all labels.

We begin with a synthetically generated example, which serves as a good validation of the ability of our approach to discriminate simple textures (Figure 1). For this synthetic example, we used two 50 × 50 patches, one for each class, for training. Note that the segmentation output (Figure 1B) is quite smooth, with no errors within the segmented regions. The localization of the segment boundaries is also quite accurate, except for a little jaggedness at the discontinuities due to the boundary effects of the kernel. However, we believe this is minor, given that the kernel sizes are large compared with the magnitude of the localization errors, and also considering the fact that our approach does not involve any explicit boundary costs.

Figure 1. Synthetic Texture Example: (A) Input Image with Superimposed Segmented Boundaries. (B) Segmentation Label Map.

3.1 Texture Mosaics

After the initial validation, we show examples on real texture mosaics (Figure 2) with five to six texture classes. The training patch size was 100 × 100. As one can observe (unlike the previous example), these examples contain statistically varying textures, with intra-texture appearance variations, as one would expect with natural textures. Nevertheless, similar to the previous result, the labeling results (Figure 2C, D, G, H) are contiguous and with good localization.

Figure 2. Texture Mosaics: (A, B, E, F) Input Image with Superimposed Segmented Boundaries. (C, D, G, H) Corresponding Segmentation Label Maps.

Note that most of the small errors typically occur close to the image boundaries (Figure 2C, D) or at triple junctions (Figure 2D, G, H), where the data available from each class, for patches in such regions, is relatively low. However, the overall efficacy in the demarcation between textures is clearly visible in Figure 2A, B, E, and F. Two of the examples involve color images and the other two involve gray-scale images. The results indicate that even on gray-scale images the texture segmentation is reasonable. Thus, our approach does not seem to rely on color information for yielding good segmentation.

3.2 Natural Scenes

Finally, we provide some results on natural images from the Berkeley dataset (Figure 3). In some of these cases, the problem is a two-class problem of segmenting object and background using texture cues. We operate on gray-scale images to emphasize that only the texture cue (and no color information) is used. These examples are relatively more difficult, as these images are captured in an uncontrolled environment and involve large variations within and between natural objects, e.g., scale and orientation variations in the zebra stripes, appearance variations in the starfish and birds images, and similarity between the background and object intensities in the cheetah and bird images.

Figure 3. Texture Segmentation on Natural Images: (A, D, G, J) Segmentation for the Zebra, Starfish, Birds, Cheetah Images, Respectively. (B, C,) (E, F) (H, I) (K, L, M) Training Patches from the Classes in the Corresponding Images.

In spite of the variations, we select only a single patch from each texture for all examples, shown (at a slightly larger scale) beside each segmented image in Figure 3. The segmentation for these natural data is quite acceptable, considering the weak learning under such large variations. For instance, note that in the zebra image, the stripes at different scales and orientations are, at large, segmented correctly, with training at only one scale and orientation. Similarly, the appearance variations in the background in the cheetah and the birds images do not much affect the segmentation. Indeed, the striped tail of the cheetah is also segmented with the dotted texture on the body, perhaps because the texture on the tail is closer to that on the body than to that in the background. Similarly, in the starfish example, regions with considerably different appearance in both the background and foreground are segmented correctly.

Here, we also make a passing observation about the weakly supervised approach. Note that in the synthetic example of Figure 1, the striped texture is similar to that of the zebra in Figure 3A. In Figure 1, the two classes of horizontal and vertical stripes are enforced to behave as separate textures by the user, whereas in Figure 3A, a common "striped" class is assigned one label in the training phase and a different texture is assigned another label. Hence, in Figure 3A, the approach does not get confused between the different orientations and scales of the striped texture, as the second label is still easily discriminated (for the most part) owing to the user-defined weak supervision. Thus, weakly supervised learning allows one to "choose" the discriminability between textures; something that is not obvious in an unsupervised framework.

Finally, to give a practical idea of the efficiency: the complete process of computing the features and clustering them into textons takes about 20–25 s for five training patches (one patch from each of five classes) of the order of 100 × 100 pixels each, for a Matlab implementation on a Xeon 3.2 GHz processor with 12 GB RAM. The graph-cuts segmentation process, for an image of the order of 400 × 400, takes a few (<10) seconds.

4 Conclusion

We proposed a weakly supervised graph-cut-based texture segmentation method (with a very manageable user interaction), with features that can be easily extracted and a feature representation that can be efficiently computed. Our results clearly validate our approach even for challenging cases. Thus, the work shows that weakly supervised learning helps in representing textures reliably enough in a straightforward manner, which, when used in conjunction with a good combinatorial labeling method, makes it possible to achieve the segmentation task with good accuracy and efficiency. Encouraged by the performance of our approach, we aim to further analyze its behavior with noise and parameter variation.


Corresponding author: Arnav V. Bhavsar, PhD, Indian Institute of Technology Madras, Chennai – 600036, India

Bibliography

[1] O. S. Al-Kadi, Supervised texture segmentation: a comparative study, in: IEEE Jordan Conference on Applied Electrical Engineering and Computing Technologies, pp. 1–5, 2011. doi: 10.1109/AEECT.2011.6132529.

[2] S. P. Awate, T. Tasdizen and R. T. Whitaker, Unsupervised texture segmentation with nonparametric neighborhood statistics, Technical Report UUSCI-2006-011, SCI Institute, University of Utah, 2006. doi: 10.1007/11744047_38.

[3] Y. Boykov and G. Funka-Lea, Graph cuts and efficient N-D image segmentation, Int. J. Comput. Vis. 70 (2006), 109–131. doi: 10.1007/s11263-006-7934-5.

[4] Y. Boykov and M. Jolly, Interactive graph cuts for optimal boundary and region segmentation of objects in N-D images, in: International Conference on Computer Vision (ICCV 2001), pp. 105–112, 2001.

[5] Y. Boykov, O. Veksler and R. Zabih, Fast approximate energy minimization via graph cuts, IEEE Trans. Pattern Anal. Mach. Intell. 23 (2001), 1222–1239. doi: 10.1109/34.969114.

[6] C. Elkan, Using the triangle inequality to accelerate k-means, in: International Conference on Machine Learning, pp. 147–153, 2003.

[7] K. Fragkiadaki and J. Shi, Figure-ground image segmentation helps weakly-supervised learning of objects, in: European Conference on Computer Vision (ECCV 2010), pp. 561–574, 2010. doi: 10.1007/978-3-642-15567-3_41.

[8] K. Fukuda, T. Takiguchi and Y. Ariki, Graph cuts by using local texture features of wavelet coefficient for image segmentation, in: IEEE International Conference on Multimedia and Expo, pp. 881–884, 2008. doi: 10.1109/ICME.2008.4607576.

[9] M. Haindl and S. Mikes, Texture segmentation benchmark, in: International Conference on Pattern Recognition, 2008. doi: 10.5772/6243.

[10] A. K. Jain and F. Farrokhnia, Unsupervised texture segmentation using Gabor filters, in: IEEE International Conference on Systems, Man and Cybernetics, pp. 14–19, 1990.

[11] M. Jirik, T. Ryba and M. Zelezny, Gabor filter and graph cut based texture analysis, Pattern Recogn. Image Anal. 22 (2012), 215–220. doi: 10.1134/S1054661812010208.

[12] M. Jirik, T. Ryba and M. Zelezny, Texture based segmentation using graph cut and Gabor filters, Pattern Recogn. Image Anal. 21 (2012), 258–261. doi: 10.1134/S105466181102043X.

[13] D. Martin, C. Fowlkes, D. Tal and J. Malik, A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics, in: International Conference on Computer Vision (ICCV 2001), pp. 416–423, 2001.

[14] J. S. Kim and K. Hong, Color–texture segmentation using unsupervised graph cuts, Pattern Recogn. 42 (2009), 735–750. doi: 10.1016/j.patcog.2008.09.031.

[15] M. Kulkarni and F. Nicolls, Interactive image segmentation using graph cuts, in: Annual Symposium of the Pattern Recognition Association of South Africa (PRASA 2009), 2009.

[16] T. Leung and J. Malik, Representing and recognizing the visual appearance of materials using three-dimensional textons, Int. J. Comput. Vis. 43 (2001), 29–44. doi: 10.1023/A:1011126920638.

[17] T. Ojala and M. Pietikainen, Unsupervised texture segmentation using feature distributions, Pattern Recogn. 32 (1999), 477–486. doi: 10.1016/S0031-3203(98)00038-7.

[18] N. Paragios and R. Deriche, Geodesic active regions and level set methods for supervised texture segmentation, Int. J. Comput. Vis. 46 (2005), 223–247.

[19] C. Rother, V. Kolmogorov and A. Blake, GrabCut – interactive foreground extraction using iterated graph cuts, ACM Trans. Graphics 23 (2004), 309–314. doi: 10.1145/1015706.1015720.

[20] M. Rousson, T. Brox and R. Deriche, Active unsupervised texture segmentation on a diffusion based feature space, Technical Report N4695, INRIA, 2003.

[21] Signal and Image Processing Institute, University of Southern California, http://sipi.usc.edu/database/database.php?volume=textures. Accessed May 2012.

[22] S. Todorovic and N. Ahuja, Texel-based texture segmentation, in: International Conference on Computer Vision (ICCV 2009), pp. 841–848, 2009. doi: 10.1109/ICCV.2009.5459308.

[23] M. Varma and A. Zisserman, Texture classification: are filter banks necessary?, in: IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2003), pp. 691–698, 2003.

[24] M. Vasconcelos, G. Carneiro and N. Vasconcelos, Weakly supervised top-down image segmentation, in: IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2006), pp. 1001–1006, 2006.

[25] A. Vezhnevets, V. Ferrari and J. M. Buhmann, Weakly supervised structured output learning for semantic segmentation, in: IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2012), pp. 845–852, 2012. doi: 10.1109/CVPR.2012.6247757.

Received: 2013-05-14
Published Online: 2013-06-18
Published in Print: 2013-09-01

©2013 by Walter de Gruyter Berlin Boston

This article is distributed under the terms of the Creative Commons Attribution Non-Commercial License, which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
