Image Anal Stereol 2007;26:101-109 Original Research Paper MORPHOLOGICAL SEGMENTATION OF HYPERSPECTRAL IMAGES Guillaume Noyel, Jesús Angulo and Dominique Jeulin Centre de Morphologie Mathématique, Ecole des Mines de Paris, 35 rue Saint-Honoré, Fontainebleau, F-77305, France e-mail: {guillaume.noyel, jesus.angulo, dominique.jeulin}@ensmp.fr (Accepted October 2, 2007) ABSTRACT The present paper develops a general methodology for the morphological segmentation of hyperspectral images, i.e., with an important number of channels. This approach, based on watershed, is composed of a spectral classification to obtain the markers and a vectorial gradient which gives the spatial information. Several alternative gradients are adapted to the different hyperspectral functions. Data reduction is performed either by Factor Analysis or by model fitting. Image segmentation is done on different spaces: factor space, parameters space, etc. On all these spaces the spatial/spectral segmentation approach is applied, leading to relevant results on the image. Keywords: factor analysis, hyperspectral imagery, mathematical morphology, watershed segmentation. INTRODUCTION Hyperspectral images are multivariate discrete functions with typically several tens or hundreds of spectral bands. In a formal way, each pixel of an hyperspectral image is a vector with values in wavelength, in time, or associated with any index j. To each wavelength, time or index corresponds an image in two dimensions called channel. In the sequel we use only the term of spectrum and spectral channel. The number of channels depends on the nature of the specific problem under study (satellite imaging, spectroscopic images, temporal series, etc.). Let fl : E x T L fl(x) fa(x), fa(x),..., fa(x) (1) be an hyperspectral image, where: • EcR2,TcRandTi = TxTx...xT • x = xi \ i e {1,2,... ,P} is the spatial coordinates of a vector pixel fx(xi) (P is the pixels number of E) • fxj \ j G {1,2,...,L} is a channel (L is the channels number) • fxj(xi) is the value of vector pixel fx(xi) on channelf^. In this paper, we introduce a general methodology for hyperspectral image segmentation, using a watershed based approach. The watershed transformation is a powerful tool for mathematical morphology segmentation (Serra, 1982; Soille, 2003). Watershed segmentation requires a gradient (i.e., a scalar function) and markers on the target structures to obtain a correct image segmentation (Beucher and Meyer, 1992). A gradient on a multivariate function can be obtained in different ways. One way is to calculate on each image channel a modulus of a gradient, and to take the sum or the supremum of the gradients (Meyer, 1992; Angulo and Serra, 2003). Another way is to use vectorial gradients based on distance between vector pixels (Angulo and Serra, 2003; Evans and Liu, 2006). We consider here various alternatives for hyperspectral images. Moreover, when dealing with hyperspectral images, the large number of channels generates data redundancies. Consequently, it is necessary to reduce the amount of data, to extract pertinent information. To do this, two ways are explored: a linear factor analysis and a model approach. Previously in the literature, several approaches to multispectral image segmentation were explored. Flouzat et al. (1998) use a spatial and a spectral segmentation based on the filtering of the image adjacency graph. Paclík et al. (2003) obtain, with statistical classifiers, the pixels probabilities of membership to clusters for spectral domain and the pixels probabilities of membership to clusters for spatial domain. The pixels probabilities of membership to a cluster are obtained by multiplying both probabilities, because they assume independence between spatial and spectral information. The pixels are classified and the process is repeated until convergence. Li and Xiao (2004) compute on each channel a morphological multiscale gradient by summation of morphological gradients with increasing size structuring elements. As watershed requires _ 101 Noyel G et al: Morphological segmentation of hyperspectral images a scalar gradient (i.e., one channel), the channels gradients are summed with a weight equal to one. After morphological filtering on the gradient, the local minima are used as markers for watershed segmentation. Scheunders (2001) computes a gradient by summation of channels gradients followed by a watershed segmentation. Soille (1996) combines spectral classification on histograms and spatial segmentation. The multidimensional histogram is segmented, using the watershed algorithm, to obtain a classified image. On the classified image the minima of the gradient of the hyperspectral image are imposed. With the gradient and the markers he applies the watershed segmentation. The methodology and the different alternatives studied in the current paper are illustrated by means of a 60 channels image acquired by active thermography on plastic lids. The size of the image is 256 x 256 pixels x 60 channels. The aim is to segment glue occlusions within plastic lids. This sequence of images comes from Laboratoire Le2i, Le Creusot, France. Legrand et al. (2002) explain how the image was acquired. They also present a segmentation that was done on a channel, using difference of lid images with and without glue, thresholding and filtering. Their segmentation is used as a reference for comparison (Fig. 1). In our case, only the sequence with glue occlusions is available. That’s why we cannot use image difference or any kind of calibration with a reference image. an fl60 reference Fig. 1. Three channels of the original image and the reference obtained by Legrand et al. (2002) method. FRAMEWORK FOR MORPHOLOGICAL SEGMENTATION ON MULTIVARIATE DATA The watershed transformation is one of the most powerful tools for segmenting images. According to a flooding paradigm, the watershed lines associate a catchment basin to each minimum of the function (Beucher and Meyer, 1992). Typically, the function to flood is a gradient function which defines the transitions between the regions. Using the watershed on a scalar image without any preparation leads to a strong over-segmentation (large number of minima). There are two alternatives in order to get rid of the over-segmentation. The first one consists in initially determining markers for each region of interest: using the homotopy modification, the local minima of the gradient function are only the region markers. A difficult issue is to determine the markers, especially for generic images. The second alternative involves hierarchical approaches based on non-parametric merging of catchment basins (waterfall algorithm) or based on the selection of the most significant minima according to different criteria (dynamics, area or volume extinction values) (Meyer, 2001). In this paper, we focus on markers based segmentation. In fact, we consider that the hyperspectral images have enough information to define markers from a spectral classification. Multivariate gradients A gradient image, in fact the norm, is a scalar function with values in the reduced interval [0,1], i.e., V : E —> [0,1]. This normalization is always applied to multivariate gradients given below in order to have homogeneous gradient functions. To define a gradient, four approaches are considered: a morphological gradient on one channel, a metric-based gradient on all channels, a gradient defined as the supremum, or as the sum, of morphological gradients on each channel. The morphological gradient is a marginal gradient (i.e., it can only be applied on scalar images) defined as the difference between channel dilation and erosion with a structuring element Bx which is the neighbourhood of a point x G E (Serra, 1982): g(fkj (x)) = dBx (fXj (x) ) - eBx (fxj (x) ) = Vsx (fxj (x))- /\Bx (fxj (x)). (2) The gradient supremum of morphological gradients on each channel is a vectorial gradient defined as: Vvfx(x) = V[ g(Aj(x)),jG{1,...,L}]. (3) 102 Image Anal Stereol 2007;26:101-109 Each morphological gradient must be normalized between [0,1] before taking the supremum. The gradient weighted sum of morphological gradients is given by: Ñ+fl(x) = å wlj g(flj (x)) j=1 (4) where wlj denotes the weight of the gradient of channel flj . The metric-based gradient is a vectorial gradient defined as the difference between the supremum and the infimum of a defined distance on a unit neighbourhood B(x): Ñdf l (x) = V [d{f l (x), f l (y) ) ,y G B(x)] -A[d(fl(x),fl(y)),y£B(x)]. (5) Various metric distances are available for this gradient such as: • Euclidean distance: dE(fl(x),fl(y)) \ XCf l j x-f l j y))2. (6) j = 1 Mahalanobis distance: dM(fl(x),fl(y)) (fl(x) -fl(y))tS-l(fl(x) -fl(y)) , (7) where S is the covariance matrix between variables (channels) of fl. If channels are uncorrelated, the covariance matrix is diagonal. The diagonal values are equal to channels variance sl \ jG j {1,2,... ,L}. Therefore, the Mahalanobis distance becomes: dM(f l(x),f l(y)) = • chi-squared distance: dc2{f l{xi),f l xi)) \ (8) L å N 2 (9) with f.lj = åPi= 1 flj (xi), fxi. = åJj= 1 flj (xi) and N = åLj= 1 åPi= 1 flj (xi). Markers by spectral clustering The markers defining the targets are obtained with an unsupervised classification based on clustering. We have used in this study the clustering algorithm “Clara” (Kaufman and Rousseeuw, 1990). This is a similar way, but more robust than the “kmeans” classification, in order to cluster large numbers of points. Then, the pertinent clusters are selected to be the markers. To perform clustering, each channel is considered as a variable. Therefore each pixel has its own value on each variable. DATA REDUCTION AND FILTERING USING FACTOR ANALYSIS Due to the redundancy of channels, a data reduction is performed using Factor Correspondence Analysis (FCA) (Benzécri, 1973). We prefer an FCA in place of a Principal Component Analysis because image values are positive and the spectral channels can be considered as probability distributions. The metric used in FCA is the chi-squared normalized by channels weight. This metric is adapted to probability laws. FCA can be seen as a transformation from image space to factorial space (Eq. 10). In factorial space the coordinates of the pixels vector on each factorial axis are called pixels factors. The pixels factors can be considered as an hyperspectral image whose channels correspond to factorial axes: z: T L fl(x) T K /K . mrksfx;- -J Watershed (b) Fig. 14. (a) Flowchart of different transformations: data denoising and reduction, (b) diagram for Watershed segmentation. In this paper we have considered four multidimensional spaces for image segmentation (Fig. 14): 107 Noyel G et al: Morphological segmentation of hyperspectral images • space 1: the image space fl(x) after image filtering by FCA; • space 2: the factorial space of image ca f(x) after image filtering and data reduction using FCA; • space 3: the parameters space p(x) after image filtering by FCA and data reduction by model fitting; • space 4: the factors parameters space cb p (x) after image filtering by FCA, data reduction by model fitting and parameters orthogonalisation by PCA. These spaces can be grouped in two different approaches. The first one is data reduction by FCA. Space 2 belongs to this approach. The second approach reduces data by model fitting. Spaces 3 and 4 belongs to this approach. Space 1 provides a direct approach to be compared to the others. On each space, the same method of segmentation can be applied. First of all, a filtering is done on image fl(x) using FCA. Then the components of the spaces are generated: fl(x), ca f(x), p(x), c b p(x). These components generate new hyperspectral images corresponding to the components space. In each space, the same method is applied on hyperspectral images. The segmentation combines a spectral and a spatial part. The spectral part consists of a classification in the considered space to obtain the markers. The spatial part consists in computing a gradient on hyperspectral images (Fig. 14). Then, with the markers of the spectral part and the gradient of the spatial part, a watershed segmentation is performed on the considered space of hyperspectral images. CONCLUSIONS This paper has presented a watershed-based segmentation for hyperspectral images. This approach combines a spectral part (the markers) and a spatial part (the gradient). A temporal series example is used to illustrate our methodology. Comparing the results obtained for the various segmentations and the reference of Legrand et al. (2002), it is obvious that a data reduction approach is necessary. For the current image, the data reduction based on a model is better than the one based on factor correspondence analysis. In fact, for an hyperspectral image, it is better to use a model, when it can be fitted, because it reduces the image entropy while it keeps the inner data structure. Besides, the choice of pertinent parameters with low noise contribution is crucial for segmentation quality. Moreover, multivariate gradients behave better than any marginal gradient on parameters. The multivariate gradients are based on an adapted distance to the considered space and on the supremum, or weighted sum, of morphological gradients on channels. The two kinds of gradients give similar results. Besides, to obtain relevant segmentations, they must be applied on a space with a low level of noise. This underlines the importance of a pertinent choice for parameters factors. About the markers, the corresponding spaces to compute them must be also with a low level of noise, to get pertinent ones. In conclusion, a relevant multivariate segmentation requires an adapted data reduction, which gives parameters or factors with a low level of noise, is crucial ; a necessary condition to get pertinent markers and gradients. In the future, we will develop multivariate filtering on spectral bands. More precisely, we will focus on the levelings. They are usually necessary to enhance the functions on which the gradients are computed. As for greyscale images, we will define new types of multivariate levelings. They will be adapted to peculiarities of these functions and they will also simultaneously filter all the spectral channels of hyperspectral images. In the present example we have only used a simple approach for markers extraction, i.e., a clustering by “Clara”. It is necessary to test other classification methods combining spectral and spatial information in order to improve markers detection. We are considering to do clustering with lambda flat zones and clustering after a principal coordinates analysis, using a weight/distance matrix (Benzécri, 1973; Gower, 1966). REFERENCES Angulo J, Serra J (2003). Color segmentation by ordered mergings. In Proc. of the IEEE International Conference on Image Processing (ICIP’2003) 2:125–8. Benzécri JP (1973). L’Analyse des Données. L’Analyse des Correspondances. Vol. 2. Paris: Dunod. Beucher S, Meyer F (1992). The Morphological Approach to Segmentation: The Watershed Transformation. In: E. Dougherty, Ed., Mathematical Morphology in Image Processing, Marcel Dekker, 433–81. Evans AN, Liu XU (2006). A morphological gradient approach to color edge detection. IEEE Trans Image Process 15(6):1454–63. Flouzat G, Amram O, Cherchali S (1998). Spatial and spectral segmentation of satellite remote sensing imagery using processing graphs by mathematical 108 Image Anal Stereol 2007;26:101-109 morphology. Proceedings of IEEE Geoscience and Remote Sensing Symposium (IGARSS ’98) 4:1769–71. Gower JC (1966). Some distance properties of latent root and vector methods used in multivariate analysis. Biometrika 53:325–38. Green AA, Berman M, Switzer P, Craig MD (1988). A transformation for ordering multispectral data in terms of image quality with implications for noise removal. IEEE Trans Geosc Rem Sens 26:65–74. Kaufman L, Rousseeuw PJ (1990). Finding Groups in Data. An Introduction to Cluster Analysis. Ch. 2 and 3. John Wiley and Sons, 28–160. Legrand AC, Meriaudeau F, Gorria P (2002). Active infrared non-destructive testing for glue occlusion detection within plastic lids. NDT&E International 35:177–87. Li P, Xiao X (2004). Evaluation of multiscale morphological segmentation of multispectral imagery for land cover classification. Proceedings of IEEE Geoscience and Remote Sensing Symposium (IGARSS ’04) 4:2676–9. Meyer F (1992). Color image segmentation. In Proceedings of the 4th International Conference on Image Processing and its Applications, Maastrich, Holland, 303–6. Meyer F (2001). An overview of morphological segmentation. Int J Pattern Recogn 15(7): 1089– 118. Meyer F (2004). Levelings, image simplification filters for segmentation. J Math Imaging Vis 20:59–72. Paclík P, Duin RPW, Van Kempen GMP, Kohlus R (2003). Segmentation of multi-spectral images using the combined classifier approach. Image Vision Comput 21:473–82. Scheunders P (2001). Multivalued image segmentation based on first fundamental form. Proceedings of 11th International IEEE Conference on Image Analysis and Processing 185–90. Serra J (1982). Image Analysis and Mathematical Morphology. Vol. 1. London: Academic Press. Soille P (1996). Morphological partitioning of multispectral images. J Electron Imaging 5(3):252–63. Soille P (2003). Morphological Image Analysis: Principles and Applications. Springer-Verlag, 2nd edition. 109