Image Anal Stereol 2011;30:167-178 Original Research Paper INTENSITY ESTIMATION OF STATIONARY FIBRE PROCESSES FROM DIGITAL IMAGES WITH A LEARNED DETECTOR Paola M.V. Rancoita1'2, Alessandro Giusti1 and Alessandra Micheletti3 1Istituto Dalle Molle di Studi sull'Intelligenza Artificiale, Galleria 2, 6928 Manno-Lugano, Switzerland; 2Institute of Oncology Research, via Vela 6, 6500 Bellinzona, Switzerland; 3Dipartimento di Matematica, Universita degli Studi di Milano, via Saldini 50, 20137 Milano, Italy e-mail: paola@idsia.ch, alessandrog@idsia.ch, alessandra.micheletti@unimi.it (Accepted October 4, 2011) ABSTRACT Stationary fibre processes are processes of curves in a higher dimensional space, whose distribution is translation invariant. In practical applications, they can be used to model several real objects, such as roots, vascular networks and fibres of materials. Often it is required to compare processes showing similar shape, thus a quantitative approach to describe their stochastic geometry is necessary. One of the basic geometric characteristics of these processes is the intensity (i.e., mean total length per unit area or volume). Here, a general computational-statistical approach is proposed for the estimation of this quantity from digital images of the process, thus only planar fibre processes or projections of processes onto a plane are considered. Differently from approaches based on segmentation, it does not depend on the particular application. The statistical estimator of the intensity is proportional to the number of intersections between the process under study and an independent motion invariant test fibre process. The intersections are detected on the real digital image by a learned detector, easily trained by the user. Under rather mild regularity conditions on the fibre process under study, the method also allows to estimate approximate confidence intervals for the intensity, which is useful especially for comparison purposes. Keywords: intensity estimator, intersection detector, machine learning, stationary fibre process. INTRODUCTION In many fields, such as biomedicine, material sciences, agronomy, remote sensing, structures of interest can be modeled as stationary fibre processes (curves/lines in a higher dimensional space). Here, we focus our attention only on planar fibre processes or on projections of 3D processes onto a plane. The characterization of such processes from digital images is a common task, which is frequently solved first by attempting an automated segmentation of the fibres (which is dependent on the particular application). If a trusted segmentation is available, then an appropriate fibre tracking algorithm can be applied and the process can be characterized by computing several geometric measures. For example, the intensity (i.e., mean total length per unit area) can be estimated as the ratio between the total length of the fibres in the window of observation W (i.e., the image) and the area of W. Unfortunately, in many practical scenarios the required segmentation can not be reliably obtained and often is not even a well-posed problem, for example in cases where fibres appear blurred or very thin. Other intensity estimators are usually based on Crofton-like formulas (see, e.g., Ohser, 1981 and Stoyanetal., 1995), but they have no asymptotic properties because they are obtained by means of the intersection with a finite deterministic test fibre system (such as a finite grid of circles or segments). To overcome the request of a segmentation for the estimation of the intensity and to ensure good asymptotic properties of the estimation procedure, we propose a computational-statistical technique, which is based instead on a computationally simpler estimator, proportional to the number of intersections with an independent motion invariant (i.e., invariant with respect to translations and rotations) test fibre process. The detector of the intersections is learned incrementally using a random forest classifier trained on few user inputs (i.e., examples of intersections in the image), until the user is satisfied with the detector accuracy. Afterwards, the test fibre process is simulated several times on the image, obtaining an estimate of the intensity of the process under study and an approximation of the variance of the estimator. Thus, we can also provide approximate confidence intervals for the intensity of the process. Due to its generality, the approach is successfully applied on images of very dissimilar structures, in a variety of application scenarios, avoiding the requirement of an ad-hoc segmentation method for the identification of the fibres. In the following sections, we describe the computational-statistical technique we propose for the estimation of the intensity of stationary fibre processes. The explanation of the procedure is divided into two sections: the former regarding the statistical estimator of the intensity (based on the intersection with an independent motion invariant test fibre process) and the estimation of its variance, the latter regarding the learned detector of the intersections from a digital image. Finally, we present the results obtained on simulated and real data for the evaluation of the performance of the learned detector and the comparison between the methods for the variance estimation. THE STATISTICAL ESTIMATOR OF THE INTENSITY AND THE ESTIMATION OF ITS VARIANCE A planar fibre process <ä> is a random variable taking values in the space of the systems of fibres in R^ (Stoyan et al., 1995). A fibre can be defined as a curve of class ^^ with finite length and a system of fibres as the union of at most countably many fibres that can have only the endpoints in common (see Stoyan et al. (1995) for a more formal definition and motivations for this choice). Thus, many processes represented by lines/curves in a two-dimensional space (such as, images of capillaries and roots) can be modeled as a planar fibre process. An example of fibre process is the Boolean fibre process. (1) where © denotes the Minkowski sum {i.e., AQ)B = {a + b\ a e A, be B}), is a homogeneous Poisson point process with intensity A {i.e., mean number of points per unit area) and {r„}„ is a sequence of i.i.d. random fibres independent of If the process is stationary {i.e., its distribution is translation invariant), we can define the intensity La, as the constant such that E[;U(Z?)] = LaV2{B), for all Borel sets B, where iMp{B) denotes the total length measure of the fibres of <ä> in ß {i.e., lMp{B) := Vi(<ä>nß)) and v^, d = 1,2, is the i-dimensional Lebesgue measure. The intensity La represents the mean total length of fibres per unit area; for example, in the case of the Boolean processes (Eq. 1) L^ = Am, with m = E[vi(ri)]. As a consequence, the intensity is one of the basic characteristics of the stochastic geometry of a stationary fibre process. As we mentioned in the Introduction, we can estimate unbiasedly the intensity with Mw) measure -A (W) = V2(W) ' where W is a compact window of observation. If <ä> is ergodic, then the estimator is strongly consistent, when enlarging the window of observation (from Corollary 10.2.V in Daley and Vere-Jones, 1998). Nevertheless, the computation of the estimator requires to measure the total length of the fibres in W. In practical applications, the sets under study are represented in a digital image, thus to calculate the total length of the fibres we need a reliable segmentation and an appropriate fibre tracking algorithm. To avoid these requirements, we can use another estimator which is unbiased, computationally simpler and has also good asymptotic properties. La,IW = V2(W) 2 La,2 (2) where <ä>i is the fibre process under study, <ä>2 is a test independent motion invariant {i.e., invariant with respect to translations and rotations) fibre process with intensity La,2, and A'jn2(-) is the counting measure associated to the the point process of the intersections <ä>i n <ä>2, i.e., A'jn2(W^) is the number of the intersection points of <ä>i and <ä>2 in W (Micheletti and Rancoita, 2009). Due to the stationarity and independence of <ä>i and <ä>2, the point process <ä>i n<ä>2 is a.s. stationary and, since <ä>2 is also isotropic, its intensity is 2LA,iLA,2/7r (Lemma 3.2 in Mecke, 1981), which implies the unbiasedness of La,i. La,i is a generalization of the estimator defined by Ohser (1981), based on the intersection with a finite deterministic test fibre system. If the point process of the intersections <ä>i n <ä>2 is ergodic, La,i is strongly consistent when enlarging the window of observation (it follows from Corollary 10.2.V in Daley and Vere-Jones, 1998). If <ä>i n<ä>2 has suitable mixing properties (such as the independence of the process in disjoint sets, having a distance greater than / < 00), the estimator is also asymptotically normal (Jolivet, 1991; Rancoita, 2010). For example, if the point process of the centers of the fibres is independent in sets with distance greater than I and the fibres have maximum length smaller than I, with I < 00, (like for Boolean fibre processes where the length of the fibres is smaller than I < 00), then the fibre process is independent at distance I. It is easy to verify that if both i and <ä>2 have this property, then it holds also for <ä>i n <ä>2 and all the described asymptotic properties of the estimator hold. In Rancoita (2010) and Rancoita and Micheletti (2011), the authors showed the behavior of the estimator on both simulated "continuous" and digital images (where the fibres are approximated with pixels), using different types of (even not isotropic) Boolean fibre processes as <ä>i, different isotropic Boolean fibre process as <ä>2 and different sizes of W. On both "continuous" and digital images, the approximation of the distribution of the estimator with the normal distribution was already good in square window of observation with side equal to 200-300 units or pixels. In case of digital images, the authors studied also the issue of over/underestimation of the number of intersections due to the pixel approximation: the eventual over/underestimation depended mainly on the shape of the fibres of the process <ä>i, thus the estimator can always be used for the comparison of the intensity of fibre processes with fibres showing the same shape (since the bias of the estimation is the same). Under the assumptions needed for the asymptotic normality of La,i, if we can estimate the variance of La,i, we can provide an approximate confidence interval for the intensity. In practical applications, only a few or even one single image of the process under study is usually provided, thus it is not possible to estimate Var(LA,i) with the sample variance of the estimator computed on the set of images. In this situation, we need an accurate way to approximate the variance of the point process of the intersections on a single image. Heinrich and Prokesova (2010) proved that if ^ is a stationary point process, with milder mixing conditions than the ones required for the asymptotic normality of the corresponding counting measure, and if {Wn}n is a sequence of enlarging bounded convex windows of observation, then / ^ lß(0,l) -nibWviiW)) --' (5) where bw is a bandwidth (whose choice depends on the size of W, for convergence properties) and T is the Q-anslation operator {i.e., Tx{B) = {x + b\b ^ B}). On simulations with small windows of observation, and different types of point process models, kernels and bandwidths, Heinrich and Prokesova (2010) obtained that the estimator, which achieved the lowest relative mean square error (E[((7^ - a^), was the one with the cylinder kernel with a value of Z'wa/v^W) in the interval [^3]. Therefore, we can derive an estimator for Var(LA,i) from Eqs. 4 and 5, / % \ V2(W) (6) V2LA,2y We will refer to this estimator as the Method 1 for the estimation of the variance. Rancoita and Micheletti (2011) proposed another way for the approximation of the variance. Let us generate n i.i.d. test processes <ä>2 (which are also independent of <ä>i) and call {UYi=i the corresponding estimators La,i(W), obtained on the same realization of <ä>i. If we assume that. V/ / j, Cov(L;,Lj) < ciVar(L;) with ci < 1, then an upper bound for the variance is given by. -\2 Lj-L Um--=: cri < 00 . (3) It follows that, if W is sufficiently large and ^ = <ä>i n<ä>2. Var 0% N^jwy V2(W) ; V2(W) Var(LA,i(W)) 71 \2la,2 2 fjl V2(W) (4) Heinrich and Prokesova also defined a class of kernel estimators for a^ and showed their good theoretical asymptotic properties. Choosing a cylinder kernel, the estimator becomes. = V2{W) " with L = Li/n, and we can approximate the variance with the estimate of its upper bound. Var(L;) = 1 1 " i—^—Y -cm-ijti —\ 2 Lj-L (7) We will call this method for the approximation of the variance Method 2. By using Lemma 2.3 in Mecke (1981), it is easy to prove that, Cov(La.) = ^^^ = Var(L,—(W)) , for any i ^ j, i.e., the covariance between the estimators computed on the same image depends only on the variance of process <ä>i (in fact the two processes <ä>2, by which L; and L j are computed, are generated independentiy). Since the co variance among the estimators is equal to the variance of the following proposition and corollary (see the Appendix for their proofs) allow us to show that the constant ci < 1 exists at least when the test fibre process <ä>2 is an isotropic Poisson segment process {i.e., a Boolean fibre process with fibres that are segments uniformly oriented). In the applications, we will use this type of test fibre process. Proposition 1 Let <ä>i be a stationary planar fibre process with intensity La,i and <ä>2 be a test isotropic Poisson segment process with intensity La^, then LaaTI 2LA,2V2{W) ' for any compact window of observation W in R^. Corollary 1 Let <ä>i be a stationary planar fibre process with intensity and <ä>2 be a test isotropic Poisson segment process with intensity La^, then there exists at least one constant ci < 1 such that < ciVar(LA,i(W)) , for any compact window of observation W in R^. In Rancoita (2010) and Rancoita and Micheletti (2011), the constant ci was found always strictly lower than one (ci G [0,0.6]), on both simulated "continuous" and digital images, by using different types of (even not isotropic) Boolean fibre processes as <ä>i, different isotropic Boolean fibre process as <ä>2 (including the isotropic Poisson segment process) and different sizes of W. Among the types of processes <ä>i, there were the anisotropic Boolean fibre processes whose fibres were either horizontal segments or arcs of parabola (with horizontal orientation) and thus resembled the real angiogenic processes that we will study in the applications. In general, if we assume that a condition similar to Eq. 3 holds for <ä>i, i.e., Var(Mi(W„)) ^ hm--=: cr<„ < oo MWn) or even the following milder condition holds, hmsup--< C7| < 00 , V2(W„) (8) (9) then (W„))/(V2(W„))^ = 0, that is, for large windows, the covariance (and Var(L™®^®"''®(W))) is close to zero and we can approximate the constant ci with zero. This approximation is reasonable, since ^measure already strongly consistent by assuming only the ergodicity of <ä>i (which is a milder condition than the ones required for the asymptotic normality of La,i). The hypothesis in Eq. 8 is verified, for example, for Boolean fibre processes whose fibres are boundaries of compact sets with a finite maximum diameter (Theorem 6.1 in Molchanov and Stoyan, 1994), and the hypothesis in Eq. 9 holds for general fibre processes independent at distance / (/ < 00) thanks to the following proposition and corollary (see the Appendix for their proofs). Proposition 2 Let ^ be a stationary planar fibre process independent at distance I < 00. Then, for any bounded convex set B in R^, yar{lMs,{B)) V2{B) ,VVar(Mi is independent at distance I < 00 and <ä>2 is an isotropic Poisson segment process, by using Proposition 2 together with the technique used to prove Corollary 1. In the applications, we will use ci = 0, since the side of the squared window will be greater than 100 units/pixels, which can be considered a big size, according to the simulations performed in Rancoita (2010) and Rancoita and Micheletti (2011), where the variance was estimated via replicates of <ä>i, and ci = 0.6, which was the maximum value obtained in the same simulations. Actually the optimal value for ci could be internal to the interval [0,0.6], thus more values in this interval should be tested. Anyway, both because the aim of this paper is to give general guidelines on the estimation procedure and because of space reason, we will restrict our analysis to the extreme of the interval. We leave to subsequent papers the problem of identifying an optimal value for c1, depending on the types of processes and 02 and the size of W. Note that the types of process used in the simulations in Rancoita (2010) and Rancoita and Micheletti (2011) comprise: the same processes that will be used in the simulations performed in this paper, and two non-isotropic processes with geometric characteristics similar to the angiogenic process that we will consider here as real application. Therefore, it is reasonable to assume that c1 will belong to the interval [0,0.6] in both our simulated and real data. THE LEARNED DETECTOR OF THE INTERSECTIONS In order to apply the approach described above on real images, a system is needed to automatically detect intersections between fibres in (which are visible on the image) and fibres in 02 (named segments in the following, since in the applications we will use an isotropic Poisson segment process). In very simple scenarios, intersections may be detected easily, e.g., as local maxima of image intensity along the segment. However, the majority of real applications requires more sophisticated image analysis approaches, and different types of images can have very different appearances of both the fibres and the background. We propose a general approach for intersection detection based on a classifier learned from training data. In the current implementation we use a random forest classifier (Breiman, 2001). The classifier is applied for each pixel of the segment and uses the data (called also features) obtained from a neighborhood of the pixel. Namely, we look at a rectangular window with side lengths of 17 and 5 pixels, which is centered at the pixel under consideration and oriented such that the long sides are parallel to the segment (see Fig. 1). Pixel values from the original image are resampled at the real-valued locations, defined by a grid inside the window, by using bilinear interpolation. For dimensionality reduction, the features considered are the first 15 principal components of the resampled raw pixel values in the neighborhood. For each pixel of the segment, the classifier returns the probability to be an intersection. Fig. 1. ^^^ervi^^ of the intersection detector. The usage of raw pixel data rather than higherlevel features for the classification allows us to remain completely scenario-neutral, since we do not need any assumption about the fibre appearance. The values of the side lengths of the rectangular window were set, in a very conservative way, based on the applications explained in the subsequent sections, but they are not critical parameters (i.e., the use of other reasonable values would not significantly affect the performance of the detector). As general guidelines, the length of the long sides should be greater than the maximum width of the fibres we expect in the image (thus, the rectangular window contains also the boundaries of the fibre around the intersection point), while the value of the length of the short sides is even less crucial. A length of 1 pixel is sufficient in most cases and slightly larger values may improve the performance of the detector in datasets of noisy images. Training the classifier. The classifier is trained by means of user-labeled segments obtained from a set of training images. Each segment is randomly generated (on a training image) and is shown to the user, who is asked to identify the intersections by clicking on them. All the clicked intersections are considered positive training examples for the classifier; negative examples are generated from the remaining pixels of the segment. The classifier is retrained as new user-labeled segments are available. After each retraining, the classifier performance on segments not belonging to the training set is visualized to the user, who may stop the training phase when accuracy is deemed sufficient. Once trained, the classifier can be applied to detect intersections in new images with a similar appearance to the ones used for training. As any supervised method, this approach works as long as the training set is representative of the testing set; appearance variations are correctly handled, assuming that some examples for different appearances are available in training data. Applying the classifier to a test segment. The number of intersections with a test segment is computed by applying the classifier to each pixel of the segment. The classifier returns, for each pixel, the probability of being an intersection and thus, on the whole, a profile of intersection probabilities along the segment: peaks of such profile are considered as candidate intersections with the corresponding probability (see Fig. 2). The candidate intersection points of a given segment are assumed to be conditionally independent Bernoulli random variables. The total number of intersections (with the segment) is estimated as the expected value of the sum of the candidate intersections. Fig. 2. Example of profile of probability of being an intersection for all pixels in a segment. RESULTS ON SIMULATED AND REAL DATA This section has three aims: (1) to compare the two methods for the estimation of the variance in order to select one of them, (2) to evaluate the performance of the learned detector, (3) to show a practical application of the proposed approach for intensity estimation. Some artificial data are used to address the first issue, while three real datasets are used for the second one. The practical application consists in a quantitative comparison between the intensities of real angiogenic processes, produced under the effect of different antiangiogenic treatments. DATA DESCRIPTION For the comparison of the two previously described methods for the variance estimation in situations similar to real applications, we use artificial datasets of digital images, generated with straightforward computer graphics techniques: namely, the fibres are rasterized as curves with a 1-pixel-wide stroke without antialiasing. In the simulations, both and O2 are Boolean fibre processes (see Eq. 1), with intensity of the Poisson point process Ai and A2, respectively. The window of observation W is squared and its side is called dimension (or dim). The possible values of the parameters used in the simulations are: - Ai = 0.004 and A2 G {0.002,0.004,0.008}; - the fibres of have been generated as: either horizontal segments with length /1 = 20 (Poisson horizontal segment process), or segments with uniform orientation and length /1 = 20 (isotropic Poisson segment process) or circles of radius ri = 10 (Poisson circle process); - the fibres of O2 are segments with uniform orientation and length /2 G {20,60}; - the dimension of W is 500. For each type of process Oi, we generated 100 datasets (i.e., images) and, to compute the variance with Method 2, for each image of and type of process O2, we simulated 100 times the test process. To see the improvement of the estimations when enlarging the window of observation, for each image, we computed La,i and its variance (with the two methods) also in subwindows of dimension 100, 200, 300 and 400. For the evaluation of the performance of the learned detector of the intersections, we consider three datasets: - Simulated roots: 49 simulated realistic root images (Dowdy etal., 1998); - DRIVE: 20 training and 20 testing images of retina blood vessels (Staal et al., 2004); - Angiogenesis: 16 images of angiogenic processes of mouse cornea (Corada et al, 2002). All datasets provide a ground truth {i.e., a reference segmentation of the image made by an expert) for all images and, for the 20 testing images of the dataset DRIVE, a second ground truth is also available. The dataset Angiogenesis will be also used to show a real application of the intensity estimation. Examples of images of these datasets are given in Fig. 3. Fig. 3. Examples of images from the three datasets Simulated roots, DRIVE and Angiogenesis, respectively. COMPARISON BETWEEN THE METHODS FOR VARIANCE ESTIMATION We compared the two methods for the estimation of Var(LA,i(T^)) on datasets of simulated digital images (both the methods and the data were described previously). Regarding the estimator described in Eq. 6, since in the experiments in Heinrich and Prokesova (2010) the optimal bandwidth bw was found such that d := b^^/v^J^ G [1,3], we chose bw such that either J = 2 or J = 3. In fact, due to the pixel approximation, we can choose only integer values for d (in Eq. 5 we need to count the pairs of intersections (x,_y) such that |x-_y| <= d) and d=l provides too narrow neighborhood to detect pairs of intersections. We computed the mean square error (MSE) and the relative mean square error (relMSE), as in Heinrich and Prokesova (2010), to compare the quality of the estimation. For the relMSE, we used as reference (true) variance of the estimator the sample variance of the estimator calculated over the 100 independent images of the specific pair (i,2). In Fig. 4, we show, as an example, the results obtained when Ol is the Poisson horizontal segment process and O2 has the lowest and highest intensity (among those used in the simulations). Fig. 4. Mean square error and relative mean square error of the estimated Var(LA,i(T¥)), where Oi is a Poisson horizontal segment process = 0.004 and l\ =20) and O2 is a isotropic Poisson segment process. For the estimation we used: Method 1 with d = 2 (solid-circle line) or d = 3 (solid-star line) and Method 2 with ci = 0 (dash-circle line) or c\ = 0.6 (dash-star line). For all methods, fixed a type of Oi and O2, the MSE was decreasing as the dimension increased, since also the variance itself decreases with the dimension for the convergence of the intensity estimator. Moreover, the MSE of the four methods are usually similar, except for the smallest considered side of the window {dim = 100). In that case, usually Method 2 with ci = 0 achieved the lowest error and with ci = 0.6 the highest. Regarding the relMSE, the error was almost constant as the dimension of W increased, since it is less sensitive to changes in the value of the variance. Furthermore, Method 1 with d = 3 achieved only a slightly smaller error than with d = 2 and, with both values of J, the error slightly increased with La,2- This effect might be due to the pixel approximation of the image, so that we find a smaller number of pairs of intersection points (x,_y) such that |x-_y| <= J, than what expected. In fact, we obtained almost always an underestimation of the variance and thus the error represents a quantification of this underestimation. Instead, Method 2 with ci = 0.6 almost always overestimated it, due to its formulation (Eq. 7). Moreover, the error decreased as La,2 increased, because the value of ci = 0.6 is an approximation obtained heuristically for high values of the intensity of O2. Using = 0, the properties of the estimator of the variance are similar to the ones of Method 1, but with a lower relMSE and a smaller underestimation. In conclusion, when Eq. 8 or Eq. 9 hold, it is better to estimate Var(LA i(W)) with Method 2 using ci = 0 (when the window of observation is sufficiently large), otherwise it is better to use a test process with high intensity La,2 and (over)estimate the variance with Method 2, using ci = 0.6 (for processes with characteristics similar to ours) or trying to estimate it, since in this case Method 1 leads to an underestimation of the variance, and, for comparison purposes, it is preferable to be more conservative and obtain larger approximated confidence intervals. VALIDATION OF THE LEARNED DETECTOR We evaluated the accuracy of the learned detector by verifying that the number of intersections computed for each segment estimates well the reference number of intersections, which is computed from the ground truth segmentation. For the vahdation, we used the images of the three real datasets described previously. For each dataset, we generated 1000 random segments and, for each segment, we computed the number of intersections from the ground truth and, using the detector, from the corresponding real image. We observed that the p-value of the test on the Pearson's correlation coefficient was always lower than 10"^^ and the slope in the linear regression model was always positive and lower than one. Thus, even in challenging datasets, the detector found a number of intersections well-correlated to the one in the ground truth, with a slight expected underestimation, as tangent intersections might be ignored by the detector but counted in the ground truth. In Fig. 5, we can also observe that the relative absolute error is similar in all datasets, with a median close to 0.2 even in the more difficult case. Instead, for example, we would have obtained a greater performance variability among the datasets, if simply we had considered as intersections all peaks with probabihty greater than 0.5. In the DRIVE dataset, which is regarded as a standard benchmark for vessel segmentation, the detector performance was comparable to state of the art segmentation methods (designed specifically for such dataset), in the task of finding the number of fibre-segment intersections. Beyond the good estimation of the number of intersections, we also qualitatively verified that the detected intersections were at the expected positions. Fig. 6 demonstrates this by showing that the spatial distribution of the intersections detected in many segments (white-background image labeled Det. Tnf/yrv ^ mafrhpQ fhp arfiial nnQifinn of fhp ranillaripQ Fig. 5. Boxplot of the relative absolute error between the number of intersections in the ground truth and the corresponding computed either with our method (here called soft) or by identifying the intersections as peaks with probability greater than 0.5 (hard). Fig. 6. Original image (Original), ground truth segmentation (GT) and spatial distribution of detected intersections (Det. Inters.) for an image in the dataset AuQioQenesis. REAL APPLICATION OF THE INTENSITY ESTIMATION We show an apphcation of the intensity estimation using the real dataset Angiogenesis. The images were produced at IFOM (FIRC Institute of Molecular Oncology Foundation, Milan), by the research group of Prof. Dejana, during a research project regarding the inhibition of angiogenic processes {i.e., formation of new capillary blood vessels) for cancer therapy. They considered several antibodies against the protein VE-Cadherin and studied their ability of inhibiting the angiogenesis on mouse corneas. In each experiment, firQf fhpv imnlanfpri a npllpf rnnfainina hrFOF-9 (which induces the formation of capillaries and thus simulates a tumor) in the cornea of a mouse and then they treated the mouse with an antibody either putting it in the pellet together with the angiogenic factor (non-systemic treatment) or injecting it intraperitoneally starting from the day after the pellet implantation (systemic treatment). The four mice used as control were treated (two of them non-systemically and the others systemically) with the antibody nonimmune rat IgG (Rat-IgG), which has no antiangiogenic effect. For each type of treatment, two experiments were executed (i.e., they used the treatment on two mice) and thus two eye images were available. The evaluation of the "performance" of the antibody was done by the biologists, by comparing the images of the specific antibody with the control ones of Rat-IgG. In Fig. 7 we show a qualitative comparison between the estimation of the intensity and of the variance (with Method 2 and both c1 = 0 and c1 = 0.6), using the ground truth and the real image. All estimates are nicely correlated, with a slight underestimation in the real image (due to the difficulties encountered by the detector). Moreover, we can see that differences between the estimates of the variance are lower by using c1 = 0 than c1 = 0.6. Fig. 8. Confidence interval of the intensities of the mice non-systemically treated with antibodies Rat-IgG and 19E6. Fig. 7. Comparison of the intensity and variance estimation by using the ground truth and the real image in the dataset Angiogenesis. Fig. 9. Confidence interval of the intensities of the mice systemically treated with antibodies Rat-IgG and 6D10. To evaluate the ability of the antibodies in inhibiting the angiogenic process, we computed the approximate confidence interval of the intensity for each image of eye, separately, and compared them with the corresponding ones obtained on the control eyes. The variance was estimated with both c1 = 0 and c1 = 0.6. Figs. 8 and 9 show two examples of comparison. In the former, we can observe that in the eyes of mice treated with antibody 19E6 the intensity looks significantly lower than in the control eyes. In the latter, we cannot say that the intensity corresponding to the antibody 6D10 is significantly different from the one in the control eyes. In fact, in this case the treatment was systemic, that is the antibody was given the day after the implantation of the pellet, when the network of vessel was already partially formed, and the major differences are in the area occupied by the vessels and in their width, more than in the intensity. Moreover in Fig. 9 we can also note the high biological variability of the two replicates corresponding to antibody 6D10, which may suggest that systemic treatments give less reliable effects with respect to the non-systemic ones. DISCUSSION We propose a computational-statistical technique for the estimation of the intensity of stationary planar fibre processes from their digital images. The method is based on a statistical estimator (La,i) proportional to the number of intersections between the process under study <ä>i and an independent motion invariant test fibre process (simulated on the real digital image of <ä>i). The points of intersection in the image are identified by a learned detector (based on a classifier), which is trained on few user inputs (examples of intersection and non-intersection points). The main advantage of such approach is that it can work on any suitable dataset after a simple and quick training by the user. Moreover, the resulting detector works well also in hard datasets, such as DRIVE and Angiogenesis, where a clear segmentation is not easy to determine even visually. Uncertainty and ambiguity in training data is well-handled by exploiting the probabilistic information returned by the detector, rather than forcing binary outputs. Both the statistical and the computational components of the method have good characteristics so that it performs well even on real challenging data. The intensity estimator La,i is asymptotically normal, under suitable mixing conditions {e.g., the independence of the point process of the intersections in set separated by a distance greater than /, / < oo) and, in the applications, these assumptions are often satisfied. Due to the asymptotic normality, the method can also provide an approximate confidence interval for the intensity, once computed the variance of the estimator. In case only one image of the process is available, we considered two possible ways for the approximation of the variance and, based on the results with simulated data, we identified which of them is more suitable for applications with digital images (where the curves are approximated with pixels). Thus, the approximate confidence interval can be calculated even in case of one single image of the process. geometric In real applications, also other characteristics can be of interest, such as the distribution of the orientation of the fibres and their width (by considering the objects under study as two-dimensional random sets, instead of random fibres, which are one-dimensional). We leave to subsequent papers the study of estimators of these quantities and the extension of the detector to provide also the information required for their computation. Furthermore, the performance of the detector can be improved by implementing an active learning technique for its training. In this case, the segments shown to the user are chosen automatically in order to provide the largest amount of useful information to the classifier. Finally, we intend to release soon an open source code of our method in Matlab. APPENDIX In this section we provide the proofs of all propositions and corollaries stated in the article. Proof of Proposition 1. As in Weiss and Nagel (1994), let us assume that the segments of <ä>2 are numbered and, for any x G <ä>i n <ä>2, n{x) denotes the (a.s. unique) number of the segment of <ä>2 to which x belongs. Thus, we can rewrite the second moment of A'in2(-) as. EKn2W]=E _xein2 x,y(£in2'- n(x)My) . n{x)=n{y) where %(•) is the indicator function of the set W. The first term is the first moment of A',n2(-) and thus it is equal to E[A/jn2(W^)] = '2.La,iLa,2V2{W)/n = 2LA,2E[iMs>^{W)]/n (see Lemma 3.2 in Mecke, 1981). The' second term can be explicitly computed by using Theorem 4.1 in Weiss and Nagel (1994) and it is equal to E[ßl^{W)]i2LA,2/n)^. Finally, since the third term is greater or equal than zero, we obtain that 2La.2 2La,iLa,2V2{W) that is, ' \ n J 2La,ILA,2V2{W) + ■ n because = 2La,2HiM>i{W)]/7t:. The thesis can now be derived, by using the definition of the estimators La,i and L^^asure Proof of Corollary 1. From Proposition 1, Var(U,i(W)) > Var(L7-"-(W)) + , 2LA,2V2{W) for any compact window of observation W in R^. Thus, for any constant c> 0 such that c< 2LA,2V2(W)Var(L; measure m we have that Var(Lr-"-(W)) < — Var(U,i(W)) . By setting ci = 1/(1 +c), we obtain the thesis and ci < 1 since c> 0. Proof of Proposition 2. Let ß be a bounded convex set in R^, then we can define a finite grid of squares of side I that covers B and can have only sides or vertices in common, or, more in general, intersections with null v2-measure: ^ = {ß^: ß^ nß / 0, ß^ n ß/ = A with V2(A) = 0, Mi / j}, where Q\ denotes a square of side I. We can divide the collection of squares into two groups whether they lie completely inside B or not. Let us define the corresponding sets of indices as h = {i-. Qi B} (with cardinality |/i \=Ni) and h = {i- Q\ B} (with I/2I = N2). Moreover, for each index i ^ {k = \,2), let us define the set of indices in 4 corresponding to squares adjacent to Q\, i.e., H^ = {j G h: j / i, d{&i,Qi) = 0} (where d{.,.) is the Euclidean distance between sets). Obviously, |//f| < 8, for all/, L Now, we can decompose ß{B) as Y^ieii M(ß/) + Lieh M(ß/ n ß) =: X + F and it is easy to verify that Var(;U(ß)) < Var(X) + Var(F) + 2 VVar(X)Var(F) , (10) by a suitable majorization of Cov{X,Y). In order to obtain an upper bound of the right-hand side of Eq. 10, firstly let us derive an upper bound for E[X^], E[X2] = Y^ElßUm + ^ ^ ElMffi)MQj)] + 1: L E[M<,(ßi)]E[M(ß/)] (ß/)) + (E[X])2, by using the stationarity of the fibre process, the independence of the process at distance I, the Cauchy-Schwarz inequality and the inequality \Hf\ < 8. Therefore, Var(X) < 9MVar(;U(ß/)). Regarding by using a similar procedure we obtain, ieh + Y.I. E[M(ßinß)M(ß/nß)] iehjeHf + ^ ^ E[iM.{QinB)]E[ß^{QjnB)] ''^'2jeh\{HfL>i} <9N2E[ßl{Q])] + {E[Y]f , since Mi(ß/ nß) < ßliQ]) a.s., for any i, and im>{.) > 0 a.s.. Thus, Var(F) < 9A^2E[Mi(ß/)]• By applying all these results to Eq. 10, YarjiMfiB)) ViiB) <9 NiYmi^iQ})) N2E[^1{Q})] V2(ß) V2(ß) (11) In order to properly majorize the ratios Ni/V2{B) and N2/v2{B), let us consider some properties of convex sets. Let us define: dB as the boundary of B, p{B) as its perimeter, d{B) as its diameter, r{B) as its inradius and Qs{x) as a square of side 5- centered at x. Because of the definition of the sets of indices h, we have that [^iehQ\ ^ «-e-, Nif < V2{B). Moreover, since B can be always inscribed in a square of side Nßl := {\d{B)/l'\ + 1)/ (where [x] = min{;t eZ : x