ImageAnalStereol2009;28:195-203 TechnicalNote TOUCHINGGRAINKERNELS SEPARATIONBY GAP-FILLING MATTHIEU FAESSEL1ANDFRANCISCOURTOIS2 1MINES Paristech,Centre deMorphologieMath´ ematique,Mat h´ ematiquesetSyst` emes,35rue SaintHonor´ e, 77305FontainebleauCedex,France,2UMR 1145,INRA, AgroParisTech,1 AvenuedesOlympiades,917 44 MassyCedex,France e-mail: matthieu.faessel@ensmp.fr,francis.courtois@a groparistech.fr (AcceptedSeptember29,2009) ABSTRACT Separation of touching grain kernels is a recurring problem in image analysis. Morphological methods to separatemergedobjectsinbinaryimagesaregenerallybase donthewatershedtransformappliedtotheinverse of the distance function. This method is efficient with rough ly circular objects, but cannot separate objects beyonda certainelliptic shapenorwhen thecontact zonesar e toonumerousor toolarge. Thispaperpresents a gap-fillingmethodappliedto the skeletonofthe imageback groundasan alternativetechniqueto gofurther in the fused objects separation process. Open lines resulti ng from skeletonizationare prolongedaccordingto theirdirectionfromcorrespondingendpoints. Ifthedista ncebetweentwolinesissmallerthanacertainvalue, theirrespectiveendpointsareconnected. Resultsofcombi neduseofwatershedandgap-fillingbasedmethods are presented on sample binary images. An example of its use o n an particularly complex image containing rice grains shows that it allows to segment up to 90% of the gra ins when classical watershed methods allow onlyto segment25% of the grains. An applicationto breakage and cracksassessing of parboiledrice kernels ispresented. Keywords:gap-filling,mathematicalmorphology,overlapp inggrains,segmentation,skeleton. INTRODUCTION Fused grain kernels separation is a classical problem in binary images segmentation. For the general case, solutions are proposed with different approches (Schmitt and Hasse, 2009): contour or active contour based techniques, graph or level set approaches, parametric fitting or morphological methods. Morphological methods used to separate fused objects in binary images are generally based on the watershed transform (Beucher, 1990) on the complementary of the distance function (Talbot and Appleton, 2002). However, when fused particles have elliptic shape or when they are fused beyond a certain point, watershed algorithms may be insufficient to produce an appropriate segmentation (Shatadal et al., 1995; Talbot and Appleton, 2002; Wang and Paliwal, 2006). Fig. 1 shows the watershed transform on couples of circular fused objects with decreasing distances between their centers. As we can see, classicalwatershedbasedalgorithmsareeffectiveeven foranimportantdegreeofoverlap( d=1.2R).Indeed, we knowthataslong asthe centersoftwo diskslie on each side of their disecting line, watershed algorithm willsucceedinseparatingthem(BeucherandVincent, 1990). d=1.8R1.7R 1.6R 1.5R1.4R1.3R1.2R Fig. 1.Watershed based segmentation on circular fusedobjects. On the contrary, Fig. 2 shows the result of watershed based segmentation on elliptical fused objects with a constant distance between their centers andagrowing Rmax/Rminratio. Rmax Rmin=11.11.21.31.41.5 1 .6d=1.5Rmin Fig. 2.Watershed based segmentation on elliptical fusedobjects. It appears that above a certain shape factor, watershed algorithm fails to separate two overlapping objects. Those functions operate on markers (intrinsic markers such as the maxima of the euclidean distance function, or markers defined arbitrarly by means of more elaboarted technics) of fused objects and their success depends on the ability to obtain as many markers as the number of fused kernels. Fig. 3 represents the ultimate points (the maxima of the distancefunction)oftheelliptical fusedkernels. 195 FAESSELMET AL:Touchinggrainkernelsseparationbygap-filling Fig. 3.Ultimate points (white) of elliptical fused objects(black). It appears that the more the objects have an important Rmax/Rminratio, the more their markers position convergeuntil they merge. From that point, it is impossible for watershed based algorithm to detect more thanasinglekernel. Severaltechniqueshave been explored to improve the determination of markers of elliptical objects: the use of the elliptical distance in place of Euclidean distance (Talbot and Appleton, 2002), the bisector (Meyer, 1979) or conditional bisector (Talbot and Vincent, 1992) functions, etc..., but those methods are inefficient or difficult to use when fused objects are numerous, not perfect disks, or when contact zones are important. It is also possible to use methodsbasedonellipse fitting (TalbotandAppleton, 2002; Evans et al., 2004; Zhang et al., 2005), which appear efficient as long as clusters contain only a few grains,orwhoseperformancedependsonthefitofthe contourto anidealellipse (Baia etal.,2009). When concavities corresponding to contact zones areimportant,themostevidentsolutionremainstofind a way to locate them and to link them together (Visen et al., 2001; van den Berg et al., 2002; Schmitt and Hasse,2007).Inthispaper,wepresentamethodusing the skeleton of the background image to locate the concavities and to draw clipping lines, in order to improvethesegmentationofnon-circularfusedobjects in binaryimages. The initial scope of this work was to develop image analysis techniques to quantify some quality parameters on rice grains: global properties (amount of broken kernels, ...), but also individual grain properties (dimensions, degree of crack-free kernels, ...), which implies to be able to individualize each grain ofasample. MATERIALSANDMETHODS The material used in this study is based on long white rice grains of type “Ariete” (origin Camargue, France). Thecharacteristic dimensionsfor this type of grainsare2mmforthewidth and7mm forthelength (which representsa Rmax/Rminratio of3.5).The main purpose of this work was to develop a light and fast solution to extract individual properties such as length, width or state (broken, cracked,...) from a rice grain sample. Some studies give an important initial effort to position manually grains to make easier the stage of image analysis (Shatadal etal.,1995;vanDalen,2004).Inourcase,theprocess had to be as simple and as fast as extract a grain sample from a batch, place it on the acquision system and launch the acquisition and analysis process. Furthermore, samples sometimes contain hundreds of grainsandmanualgrainplacementwouldleadtoanon negligible waste of time and a need of an important surfacefor imageacquisition. Theacquisitionsystemisasimplestandardflatbed scanner. Grains are tabled directly on the glass of the scannerandarequicklyapportionedtoavoidexcessive overlapping. Then a black box is used to recover the grains and isolate them from ambient light. To reducethesizeoftheimages,onlya115mm ×70mm surface was used to place the grain sample (about 2– 3 g corresponding to about 200 grains) and a 300 dpi resolutionand8bits depth(grayscale)wereused. Fig.4.Sampleimageobtainedwiththeflatbedscanner (850×800pixels). Thealgorithmsweredeveloppedandimplemented under Java using the ImageJ software1(Girish and Vijayalakshmi,2004). ImageJ is a public domain Java image processing program inspired by NIH Image for the Macintosh and developped by Wayne Rasband at the Research Services Branch, National Institute of MentalHealth,Bethesda,Maryland,USA. 1http://rsbweb.nih.gov/ij/index.html 196 ImageAnalStereol2009;28:195-203 IMAGE BINARIZATION It is important to notice that the information contained in the grey-level image is generally useful in order to split fused objects. In Fig. 5 is represented the morphological gradient calculated on the grey- level image. As we can see, gradient between grains isveryweakcomparedtothevariationscorresponding to internal grain fissures and therefore can not be exploitedto splitgrains. Fig.5.Morphologicalgradientcalculatedonthegrey- levelimage ofFig.4 (detail). As our images have two distinct classes of pixels (grains and background), we use the Otsu algorithm (Otsu, 1979), which allows to find automatically an optimum threshold. Ostu algorithm consists in separating the two classes so that their combined spread (within-class variance) is maximal. Theresulting binaryimageis presentedin Fig. 6. Fig. 6.Binaryimageobtainedwith theOtsualgorithm appliedonthe grey-levelimage (Fig.4).CONCAVITIESLOCATION In order to locate concavities between grains, it is possible to use the cluster convex hull. Another possibility is to analyse the concavity of the contours of thegrain clusters (Baia etal., 2009),but becauseof noise or of the image grid along contours, it is usual to detect false concave regions, and to obtain an over- estimationofthe numberofconcavepoints. In our case, we use the morphological skeleton to detect the corners of the complementary image and then,extracttheconcavepointsfromtheskeletonopen lines. The use of the morphological skeleton to detect object corners in a binary image can be found with different implementations (Dinesh and Guru, 2006). The main advantages are that it is a robust and non- parametricmethod. The skeleton is calculated on the background of the binary image. It is obtained by means of a morphological thinning (Serra, 1982). The algorithm used here is the single pass thinning algorithm developped by Zhou et al.(1995). The result is presentedin Fig.7. a b c Fig. 7.Skeleton of the background image. (a) Binary image,(b) Backgroundimageskeleton(c)a overb. GAP-FILLINGMETHOD Automatic reconnection of linear segments is also a classical problem in image analysis. When portions of missing lines are sufficiently small, morphological operations such as closing (dilation followed by erosion) or localisation of nearest neighbours are commonlyusedinordertoconnectthem(Zhang etal., 1999; Visen et al., 2001). In our case, the distance betweenthe endpointsof the lines we wantto connect is often larger than the width of the grains, and the size of closing necessary to link them would lead to completelyfillskeletoncells. The pairing procedure used here to join endpoints is quite similar to the “linking edge-chains” method 197 FAESSELMET AL:Touchinggrainkernelsseparationbygap-filling described by Ferrari et al.(2006), except that we use 2D spherocylinders in place of isocele trapezia to extend and join open-lines. First, segment endpoints arelocalized.An“average”directionofthesegmentis calculated on a given length from its endpoint. Open linesarethenlinearlyextendedfrom theirendpointon a specified distance until they join (or until another extending line reach their neighborhood in a given radius). ENDPOINTSDETERMINATION Endpoints are detected from the skeleton thanks to the morphological hit-or-miss operation (Serra, 1982).Consideringacoupleoftwodisjointstructuring elements T={T′,T′′}, the hit-or-miss operation is defineduponimage Xas: ηT(X) ={z:T′′(z)⊆Xc;T′(z)⊆X} =εT′(X)∩εT′′(Xc), whereεT′(X)andεT′′(Xc)arerespectivelytheerosion ofXwithT′and the erosion of the complementary imageXcwith thestructuring element T′′. In order to extract the extremity of open lines (Fig. 9a), we use the hit-or-miss operation with a “U” structuring element and its seven successive rotations (Fig. 8). T′: T′′: T: r1 r2 r3 r4 ... Fig. 8.Definition of the “U” structuring element and its firstrotations. LINESDIRECTIONCALCULATION Starting from the endpoints, we apply successive morphological thinnings (Serra, 1982) with the same “U” structuring element (and rotations) to thin the openlinesonacertainlength,oruntilidempotence(in this case, the junction of several lines). The thinning operationcorrespondstotheresiduebetweenanimage andits hit-or-miss transformation: ΘT(X) =X\ηT(X) =X\[εT′(X)∩εT′′(Xc)] The “average” direction of a line is calculated between an end point and the last point obtained by thinningthe correspondingline (Fig. 9b).a b c d Fig. 9.Gap-Filling steps. a) End-points localisation. b) Open-lines direction evaluation. c) Open-lines prolongation.d)End-pointsjunction. LINESPROLONGATION This line of average direction is then extended fromterminalpointsonafixedlength.Ateverystepof reconstruction,weextendthelineofapoint(pixel)and determineifit intersectswith anotherone(Fig. 9c). Inordertodetecttheintersectionbetweendifferent lines, each line is represented on a black image with a different value. Each time we extenda line ifrom one pixel,wecheckifthecorrespondingpixelontheblack imagehasthevalue0orthe valueofanotherline j. In thefirstcase,wecontinuetheextendingprocess,andin the second one, we define a new contact pair between linesiandj.Whenacontactpairhasbeenestablished, we draw as straight line between the corresponding endpoints. This line is then subtracted to the binary imageto splittheconnexgrains. Note that line crossings are banned: when prolongatingaline,ifthenextpixelcorrespondstothe skeleton or to a previous link, the prologation process oftheline is stopped. IMPLEMENTATION During the pairing procedure, some problems can occur: –two corresponding lines may never connect ; this happens if two lines are parallel or if they don’t intersect between the two end-points. To ensure that two corresponding contact zone lines join, at 198 ImageAnalStereol2009;28:195-203 eachstepofprolongation,welookforthepresence of a nearby line within a radius of fixed size Rn (Fig. 10). As previously, each time we extend a line ifrom one pixel, we checkon the black image (on which extending lines are represented) the presence of a pixel with a different value in a radius of Rn. If a pixelwith a value jis found, we set a new contact pairbetweenlines iandj. Rn Fig. 10.Detectionoftwo lines proximity. –the average direction calculated from the end- point to the open line root doesn’t fit the global curvature of the line, which will be prolonged in anunexpecteddirection.Forthisreason,itisoften better to calculate the average direction on a fixed lengthLcratherthanonthewholelengthoftheline (Fig. 11a). –if a line is prolonged on a too long distance, unexpected contacts can occur between lines correspondingto different contact zones.To avoid this situation, lines are prolonged only on a fixed distance Lp, generally smaller than the specific lengthofa grain(Fig. 11b). Lcroot a b Fig. 11. Lenght of direction calculation a). Prolongationlength b). In order to obtain good results, it is necessary to execute the procedure sequentially with growing values for the parameters Lc,LpandRn. First, small values of Lc,LpandRnare used in order to fill small gaps.Thentheoperation is repeatedwith higher values, to join progressively more distant lines. The maximum values for the parameters are respectivelythe specific length and width of a grain for LpandRn (Lcis limited bythelengthofthe open-line). The figure 12 illustrates the sequence of pairing procedures with growing parameters: at the first step, we use small values of Lp,RnandLc. Only two lines (on the bottom left) are linked. At the second step, valueschosenfortheparametersallowtobuildthetwo upperlinksofthefigure,andthelastlinkisdetermined only at the third step. At each step, the particles revealed by the new links are measured. If their width and length are lower or equal to those of a typical grain, they are considered as correctly segmented and are removed from the image. The procedure is then continued with higher parameters, until the image is empty(oruntilthemaximumvaluesofparametersare reached). a b c Fig.12.Sequenceofpairingprocedures: a)Firststep with low values of parameters L p, Rnand Lc. b) Second step with medium values. c) Third step with highvalues. Theresultofsuchasequenceappliedontheimage 7a is observed in Fig. 13. As we can see, the result is exactly what we expect to, except for the two grains on the top of the image which are not separated. In thisparticularconfiguration,theskeletondidnotreveal two, but only one concavity (see Fig. 13a) and thus, onlyoneopen-line.Then,theleft open-linecannotbe pairedandnosplitline canappear. 199 FAESSELMET AL:Touchinggrainkernelsseparationbygap-filling (a) -Gap-Filling. (b)-Resulton binaryimage. Fig.13.Resultofthegap-filling(a)andresultinggrain splitting (b). Non splitted grains(atthe top). APPLICATION TO BREAKAGE AND CRACKS ASSESSING OFPARBOILED RICEKERNELS The criterion used by industrials for evaluating quality of paddy rice is the measurement of the head rice yield (HRY, in %) after milling. The rice is processed in an pilot rice-mill, generally consisting of a husker with rubber disks and a mill with an abrasive cone. Husked kernels are then separated in two fractions using a grain sorter: a whole fraction and a broken one. A kernel is considered broken if its length is smaller than 3 /4 of this of a whole kernel. The result is expressed in terms of rice processing yield,whichistheratiooftheweightofwhite(husked) kernels to the total weight and in terms of head rice yield, corresponding also to the ratio of the weight of whole kernels to the total weight of white rice (norm ISO 6646:2000. Rice – Determination of the potential milling yieldfrom paddyandfrom huskedrice). Ouralgorithmusesthesameclassificationcriteria: once the grains are individualized, they are classified accordingtotheirdimensions.Thelengthofthegrains is measured using the Feret’s diameter. A grain is considered as broken if its length is smaller than 3/4 ofthis ofawholekernel. Furthermore, in order to characterize the cracking level of a grain, we evaluate the number of parts it contains. As we can see in Fig. 4, when a grain contains fissures, cracking zones present important grey level contrasts (due to different cracking planes orientation). Thus, parts can be easily identified using information such as the gradient. In order to close the contours corresponding to the different cracking zones of a grain, the gradient image is flooded by a watershed. To control the level of segmentation, the number of local minima is reduced using a gaussian filteronthegradientimagebeforethefloodingprocess. Once the different zones are revealed, the grains are classified according to the number of zones theycontain: strongly cracked (more than 3 zones), slighly cracked(between2 and3 zones)andintact(1zone). h=0 h=1 h=4 Fig. 14. Result of the segmentation obtained using the watershed transform on the distance function with markerscorrespondingtotheh-minimaofthedistance function. 200 ImageAnalStereol2009;28:195-203 RESULTS In Fig. 14, we can see the segmentation obtained using the classical watershed transformation on the distance function on a grain sample image with a particularly difficult configuration: lots of conglomerates, important contact zones (only 15% of the grains are single). In order to control the level of segmentation obtained by the watershed, the markers used to perform the flooding are determined using the h-minima (Soille, 1999) of the distance function. As we cansee,evenwith arelativelyhighvalueof h(h= 4), there are still unwantedgrain clippings. In orderto completely avoid any inappropriate cuttings, we have to process the watershed with h≥5. This handles onlyverysimplecontactcases(clusterscontainingtwo or threee grains with small contact zones), and the amountofmergedgrainsremainsverylarge. Fig. 15 represents the result of our method on the same image (resulting countours have been projected ontheoriginalgrey-levelimage).Grainsconsideredas successfullysegmentedare representedwith a colored contour (in this example, grains are classified with a color code according to their state: blue for intact grains, green for broken, yellow and red for slightly andstronglycrackedgrains). Fig. 15. Gap-filling based segmentation on a complex image of many touching rice grains. Grains consideredassuccessfullysegmentedareanalysedand represented with a colored contour: blue for intact grains, green for broken, yellow and red for slightly andstronglycrackedgrains. We observe that most of the time, contours of grains are perfectly detected. Only a few grains are notcorrectlysegmented.Asitwassaidpreviously,this happends generally when concavities are not clearlyrevealed by the skeleton (inexistant or too short open- lines). The rate of successfully segmented grains is estimated using the ratio of the surface of correctly segmentedgrainstothesurfaceofallthegrains(easily determined by counting the foreground pixels on the binary image). In this case, segmentation based on watershed algorithms detects successfully only 25% of the grains total surface, while gap-filling algorithm reaches90% of correctly segmentedgrains surface. In the case of more simple configurations, our algorithm allows us to obtain from 95% to 100% of successful segmentation(Fig. 16). Fig. 16.Gap-filling based segmentation on a simpler case(1000×1000pixels). In order to validate the method, percentages of broken kernels obtained using image analysis were compared to measures obtained by manually separating and counting broken and intact grains. Different samples of processed, or not, parboiled rice dried in various conditions were used. As previously, each sample contains about 2–3 g (corresponding to about 200 grains). As we can see in Fig. 17, the estimates from the image analysis approach match perfectly both the measures using the ISO norm and the manualcounting. This algorithm was used on several hundred of images with more or less complicated cases, containing from ten to one thousand rice grains, with the same rate of success. The time needed to process one image depends of the number of grains and of the complexity of grains appointment, and is about several seconds (the time for the segmentation of the 850×800 pixels image in Fig. 15, is about 8 seconds ona standardlaptopcomputer). 201 FAESSELMET AL:Touchinggrainkernelsseparationbygap-filling 0102030405060708090 0 10 20 30 40 50 60 70 80 90% Broken grains (Image Analysis) % Broken grains (lab)parboiled paddy rice linear fit processed parboiled rice Fig. 17. Comparison between breakage ratios as obtained from laboratory visual inspection or ISO norm analysis, and sorting and image analysis estimation. The fitted slope of the line is 1.00434 ± 0.00898. CONCLUSION Separation of touching grains involves specific problems when grains have an elliptic shape and when clusters of grains contains many objects. We have seen that classical watershed based algorithms are not sufficient in this case. In this paper, a gap- filling method using the skeleton of the background image was introduced to draw clipping lines between touching grains. Then, a concrete implementation of the algorithm was presented in order to fulfil the separation process. Therefore, for images composed of many touching grain kernels with elliptic shapes, the described “gap-filling” based algorithm allows us to improve the segmentation process. More detailed application of this algorithm in food industry is presented in an upcoming article (Courtois et al., 2009). ACKNOWLEDGEMENTS The work was financially supported by the Office National Interprofessionel des C´ er´ eales (ONIC), France. REFERENCES Baia XZ, Sunb CM, Zhou FG (2009). Splitting touching cellsbasedonconcavepointsandellipsefitting.Pattern Recogn42:2434–46.Beucher S (1990). Segmentation d’images et morphologie math´ ematique. Ph.D. thesis, Ecole des Mines, Paris, France. Beucher S, Vincent L (1990). Introduction aux outils morphologiques de segmentation. In: ANRT, ed., Traitement d’image en microscopie ` a balayage et en microanalyseparsonde ´ electronique.Paris. Courtois F, Faessel M, Bonazzi C (2009). Assessing breakage and cracks of parboiled rice kernels by image analysis techniques. Food Control DOI: 10.1016/j.foodcont.2009.08.006. Dinesh R, Guru D (2006). Corner detection using morphologicalskeleton:Anefficientandnonparametric approach. In: Narayanan PJ, Nayar SK, Shum HY, eds., Comput Vis ACCV 2006. Lect Notes Comput Sci 3852:752–60. Evans C, Berman M, Talbot H (2004). Offline fast object splitting.Tech.Rep.CMIS04/55,CSIROMathematical andInformationSciences,NorthRyde,NSW,Australia. FerrariV,TuytelaarsT,VanGoolL(2006).Objectdetection bycontoursegmentnetworks.In: LeonardisA, Pinz A, eds., Computer Vision – Proc ECCV 2006. Lect Notes ComputSci 3953:14-28. Girish V, Vijayalakshmi A (2004). Affordable image analysisusingNIHImage/ImageJ.IndianJCanc41:47. Meyer F (1979). Cytologie quantitative et morphologie math´ ematique.Ph.D.thesis, ´EcoledesMinesde Paris. Otsu N (1979). A threshold selection method from gray- levelhistograms.IEEETransSyst ManCyb 9:62–6. Schmitt O, Hasse M (2007). Radial symetries based decomposition of cell clusters in binary and gray level images.PatternRecogn41:1905–23. Schmitt O, Hasse M (2009). Morphological multiscale decomposition of connected regions with emphasis on cellclusters.ComputVisImageUnd113:188–201. Serra J (1982). Image Analysis and Mathematical Morphology.London:AcademicPress. ShatadalP,JayasD,BulleyN(1995).Digitalimageanalysis for software separation and classification of touching grains:I. DisconnectAlgorithm.Am Soc AgrBiol Eng 38:635–43. Soille P (1999). Morphological Image Analysis: Principles andApplications.Springer-Verlag,170–1. Talbot H, Appleton BC (2002). Elliptical distance transforms and object splitting. In: Proc VI Int SympMathMorph,April3–5.Sydney,Australia. Talbot H, Vincent L (1992). Euclidean skeletons and conditionalbisectors.In: VisCommunImageProc’92. ProcSPIE 1818:862–76. van Dalen G (2004). Determination of the size distribution and percentage of broken kernels of rice using flatbed 202 ImageAnalStereol2009;28:195-203 scanningandimageanalysis.FoodResInt37:51–8. van den Berg E, Meesters A, Kenter J, Shlager W (2002). Automated separation of touching grains in digital imagesofthinsections.ComputGeosci 28:179–90. Visen N, Shashidhar N, Paliwal J, Jayas D (2001). Identification and segmentation of occluding groups of grain kernels in a grain sample image. J Agr Eng Res 79:159–66. Wang W, Paliwal J (2006). Separation and identification of touching kernels and dockage componentsin digital images.CanBiosyst Eng48:7(7pp).Zhang C, Murai S, Baltsavias E (1999). Road network detectionbymathematicalmorphology.In:ProcISPRS Worksh 3D Geospat Data Product Meeting Appl Requir,April7–9.Paris,France. Zhang G, Jayas DS, White NDG (2005). Separation of touching grain kernels in an image by ellipse fitting algorithm.Biosyst Eng92:135–42. Zhou RW, Quek C, Ng GS (1995). A novel single-pass thinning algorithm and an effective set of performance criteria.PatternRecognLett16:1267–75. 203