Image Anal Stereol 2008;27:143-149 Original Research Paper SAMPLING INTENSITY WITH FIXED PRECISION WHEN ESTIMATING VOLUME OF HUMAN BRAIN COMPARTMENTS R M M G´-F~ HIANNON AUDSLEY1 AND ARTA ARCIA INANA2 1Department of Mathematics and Statistics, Lancaster University, LA1 4YF, Lancaster, UK; 2Centre for Medical Statistics and Health Evaluation, Faculty of Medicine, University of Liverpool, L69 3GS, Liverpool, UK e-mail: Rhiannon.Maudsley@quanticate.com, martaf@liv.ac.uk (Accepted September 10, 2008) ABSTRACT Cavalieri sampling and point counting are frequently applied in combination with magnetic resonance (MR) imaging to estimate the volume of human brain compartments. Current practice involves arbitrarily choosing the number of sections and sampling intensity within each section, and subsequently applying error prediction formulae to estimate the precision. The aim of this study is to derive a reference table for researchers who are interested in estimating the volume of brain regions, namely grey matter, white matter, and their union, to a given precision. In particular, this table, which is based on subsampling of a large brain data set obtained from coronal MR images, offers a recommendation for the minimum number of sections and mean number of points per section that are required to achieve a pre-defined coefficient of error of the volume estimator. Further analysis on MR brain data from a second human brain shows that the sampling intensity recommended is appropriate. Keywords: Cavalieri sampling, coefficient of error, grey matter, human brain, magnetic resonance imaging, point counting, stereology, volume estimator, white matter. INTRODUCTION Observer-based methods, where the region of interest is visually identified by the observer, are often regarded as “gold standard” methods in quantitative magnetic resonance (MR) imaging studies. For example, in some applications, volumetric measurements of brain structures are obtained by manually delineating the region of interest on consecutive MR images (see for example, Salmenpera¨ et al., 1998; Bernasconi et al., 2003). However, this procedure is usually highly time consuming, and therefore, it may not be feasible in studies where a large group of subjects need to be investigated. An alternative technique, namely point counting, has been widely applied in combination with the Cavalieri method to estimate the volume of internal brain compartments, such as hippocampus, amygdala, ventricle, Broca’s area, white and grey matter, etc. (e.g., Doherty et al., 2000; Keller et al., 2002; Garc´ia-Fin~ana et al., 2003; Gong et al., 2005; Salmenpera¨ et al., 2005; Schmitz and Hoff, 2005, and references therein). In these studies, a set of parallel and equidistant MR images with a uniform random position are selected, and the area is estimated for each section by randomly superimposing a grid of systematic points and counting the number of points which fall inside the region of interest (see for example, Gundersen and Jensen, 1987). The question of identifying an effective sampling density for volumetric measurements using Cavalieri and point counting has previously been addressed (e.g., Gundersen and Jensen, 1987; Regeur and Pakkenberg, 1989; Roberts et al., 1997). In particular, it is suggested that counting a total of 150 points per structure is usually appropriate to achieve a coefficient of error of the volume estimator between 3% and 10% (Gundersen and Jensen, 1987). In this paper, we analyse a large brain data set (acquired from coronal MR images) to derive a detailed reference table for the minimum number of sections and sampling intensity per section that are required to achieve a pre-defined coefficient of error when estimating the volume of human grey matter, white matter and their union. In the following section, we introduce Cavalieri sampling, the point counting technique, and the error prediction formulae currently available to predict their precision. Those error predictor formulae are useful for a posterior assessment of the coefficient of error using the data sample. Then, a description of the data set and methodology is given, and the results are derived. In particular, two tables are derived to allow prior selection of the sampling intensity. Finally, the main conclusions and a discussion can be found at the end of the paper. 143 Maudsley R et al: Sampling intensity for human brain volume estimation CAVALIERI SAMPLING AND VARIANCE PREDICTION The volume of a three dimensional bounded structure, V, can be expressed as the one-dimensional integral JRf(x) dx, where f(x) represents the area of the intersection between the structure and a plane at the point of abscissa x. An unbiased estimator of V can be defined from a set of equidistant observations of f by applying: V = T ^f(UT+ kT); TgR+, (1) keZ (see, e.g., Baddeley and Jensen, 2005, and references therein) where T, called the sampling period, is the distance between consecutive observations along the sampling axis and where, in order to guarantee unbiasedness, U is a uniform random variable in the interval [0,1). For simplicity, U will denote indistinctly a variable or a realization of it. We assume that f : R —> R+ is a piecewise infinitely differentiable function which vanishes outside a bounded interval [a, b]. The variance of V can be expressed as follows: oo Var(V) = 2 ^T G(kjT) , (2) k=1 (see Yates, 1948; Moran, 1950) where G is the Fourier transform of the covariogram of f, g, and which is defined as g(h) = f(x) f(x + h) dx , h G R R (3) where q G [0,1] and T() and L() denote the gamma function and the Riemann Zeta function, respectively (see Garc´ia-Finana and Cruz-Orive, 2004; Cruz-Orive, 2006, where a table with numerical values for tt(q) is provided, and references therein). The smoothness constant q can be estimated from the data sample as described in Souchet (1995) and Garc´ia-Finana and Cruz-Orive (2004). It is often the case that the section areas {fi; i = 1,...,n} are not calculated precisely but estimated using a second level of sampling. In particular, with the point counting technique, a test grid of equidistant points with a distant u apart is superimposed on each section with a uniform random position. In this case, Eq. 1 becomes: V = Tu ^P(UT + kT); T G R+ , (6) keZ where P(UT + kT) represents the number of points hitting the region of interest for a section at the point of abscissae UT + kT. Point counting is also applied to structures where manual delineation of the section boundaries is possible. The reason is that point counting is less time consuming and this is a clear advantage in studies where volume estimation for a large number of subjects is required. When the grid of points is superimposed with isotropic orientation (as well as with a uniform random position as described above), the contribution of the variance due to point counting can be expressed as (Matheron, 1971): Var PC(V) =u T v , Several estimators of Var(F) have been derived based on mathematical expansions of Eq. 2 in powers of T (e.g., Matheron, 1965; 1971; Gundersen and Jensen, 1987; Cruz-Orive, 1989; Kellerer, 1989; Kie^u, 1997; Gual-Arnau and Cruz-Orive, 1998; Gundersen et al, 1999; Garc´ia-Finana and Cruz-Orive, 2004). A general and currently accepted variance estimator reads as follows: where V = 0.0724 B P ' (7) (8) var(V) = oc(q) T (3C0 - 4C1 +C2) (4) where q, called the smoothness constant, is the order of the first non-continuous (which could be fractional) derivative off. The quantities Ck with k = 0,1,2 are defined as Ck = Jln - 1 kff+k where {f; i = 1,...,n} denotes the set of equidistant observations offwithin the interval [a, b]. The analytical expression of oc(q) is given by: a(q) T(2q + 2) C(2q + 2) cos(nq) (2*)2q+2(1-22 q-1) (5) and where A and B represent the mean boundary length and mean area of the sections, respectively, and P is the mean number of points counted per section (Gundersen and Jensen, 1987). Eq. 4 has been extended to take into account the contribution due to point counting given by Eq. 7. The corresponding variance estimator becomes: var(F) = T 2 u 4 [a(q) (3 (C0 - v) - 4C1 +C>2) + v] , (9) where the quantities C'k with k = 0,1,2 are now defined as C'k = Xn -=1 PiPi+k where Pi ; i = 1,...,n, represents the number of points counted on the i-th section (see Cruz-Orive, 1999; Kie^u et al., 1999; Garc´ia-Finana et al, 2003, and references therein). The estimator v can be directly derived from Eq. 8, where P is estimated from the data sample and the shape coefficient B/A1'2 _ 144 Image Anal Stereol 2008;27:143-149 can be empirically estimated following the procedure as described in Gundersen and Jensen (1987). The shape coefficient is considered to be stable for a given type of shape. In most studies, the researcher will arbitrarily choose the number of sections and the point grid density. An estimation of the variance of the volume estimator is then obtained based on the acquired data sample by applying Eq. 9. However, to optimise the sampling density, it is useful to have an idea beforehand of the appropriate sampling parameters. In the next section, we derive, based on Eq. 7 and empirical resampling, a reference table with the approximate number of sections and sampling intensity per section required to obtain reasonable coefficients of error of the volume estimators of human cerebral grey matter, white matter and the union of both (namely total). DATA AND METHODOLOGY The material under study is the grey matter (GM) and white matter (WM), which includes glial tissue, of a human brain, and the union of both compartments, namely ‘Total’. A total of 217 non-empty consecutive coronal images with a distant 1mm apart were acquired by magnetic resonance imaging. The overall coefficient of error of V (defined as the root square of the variance of V divided by V) can be decomposed into: CE(V) = ( CE S (V) +CE PC (V) j , (10) where CES(F) and CEPC(V) are the coefficient of errors due to sectioning (variability between sections) and point counting (mean variability within sections), respectively. The two terms on the right hand side of Eq. 10 are here calculated for the volume estimator of the three brain compartments GM, WM and Total. We consider sample sizesn G {1,2, ...,20} and assume that the mean number — of counted points per section takes values in the set P G {1,2,..., 100}. For each MR section, the area of GM, WM and Total was measured by automatic pixel counting (pixel side = 1mm), and the samples of section areas here obtained were considered to be free of measurement errors (see McNulty et al, 2000 for a more detailed description of the data set). Empirical calculation of CES{V) is obtained by extracting subsamples of the initial data set (with 217 area data) via the following procedure: 1. The distance between parallel sections is fixed as T = s(mm) where s is initially setto s = 10. 2. The s possible systematic samples of section areas with a distance T apart are extracted. 3. The volume is estimated for each sample and the set of volume estimates {Vi ; i = 1,2, ...,s} is computed. 4. The empirical coefficient of error of V due to sectioning is calculated by applying 1 \s 1 (Vi-V /v , - z,i-i Vi — V/V , where V = 1 Ysi 1 Vi. s '-'i—i. ' s ^'— ' 5. Steps 1-4 are repeated with s = 11,12,...,217. For simplicity and since Eq. 7 provides satisfactory results for a wide variety of geometrical shapes (Gundersen and Jensen, 1987), we have used Eq. 7 to obtain the contribution to the coefficient of error from point counting, instead of deriving an empirical coefficient of error based on several random positions and orientations of the square grid on each section. In particular, VarPC(V') is calculated as a function of the number of sections andthe mean number of points counted per section P by applying Eq. 7. Note that CE2 C (V) can be written as (0.0724B/A1'2) P~3'2n~1. Estimates of the shape coefficient, B/A1'2, have been previously calculated for the three brain compartments here considered (GM, WM and Total), and take values 19.3, 11.5 and 7.7, respectively (see Garc´ia-Finanaetal., 2003). By using the statistical package R, a code was developed to calculate the contribution to the coefficient of error from the two levels of sampling and to construct the reference tables given in the next section. Table 1. Optimum values for n and P, such that minimum effort (i.e., minimum total number of points counted) is required to achieve coefficients of error of 10%, 5% and 2.5% for the volume estimator of grey matter, white matter and Total. 10% — Compartment n P GM WM Total 6 9 6 7 4 8 5% — n P 8 20 13 10 7 11 2.5% — n P 20 25 20 16 12 18 145 Maudsley R et al: Sampling intensity for human brain volume estimation Grey matter White matter Total Number of Sections Number of Sections Number of Sections Fig. 1. Ratio of the coefficient of error of the volume estimator due to point counting and sectioning as a function of the number of sections when P = 20. The value of n for which the ratio is equal to 1 is given by the abscissa of the intersection between the continuous curve and the horizontal non-continuous straight line. Grey matter White matter Total o Q. a; -Q 3 C C Number of sections Number of sections Number of sections Fig. 2. Values ofn and P for coefficients of error ofV equal to 10%,_5% and 2.5%. The straight broken lines represent constant values of the total number of points counted E = nP. 146 Image Anal Stereol 2008;27:143-149 RESULTS The contribution of the overall coefficient of error of the volume estimator of each brain compartment depends both on the number of sections and the mean number of points counted per section. Fig. 1 shows the ratio of the two components of the coefficient error decomposition given in Eq. 10 when P = 20. For small n, the greater contribution comes from the first component (sectioning), whilst as n increases, point counting emerges as the greatest source of error. The relative contribution from each error component differs between structures. Fig. 2 shows the values ofn and P that are required to reach fixed pre-defined levels of CE(V), namely, 10%, 5% and 2.5%. As we would expect, as the CE(V) decreases, the intensity of sampling required increases. Also, for a given value of CE(V), as the number of sections increases, the mean number of points required per section decreases. Optimum values of n and P for a given coefficient of error are regarded here as those that minimize the total number of points counted. Therefore, if we define the effort E as a quantity proportional to nP, E is an indicator of the time required to estimate the volume of the structure under study. The curves in Fig. 2 exhibit oscillations which are produced by the systematic nature of the sampling and are connected with the oscillating term of Var(V) called Zitterbewegung (Matheron, 1971; Kie^u, 1997; Garc´ia-Finana and Cruz-Orive, 2000). Due to the oscillating behaviour of the curves in Fig. 2, however, there may exist several pairs (n,P) which minimise E, each corresponding to a local minimum of E. To avoid this obstacle, optimum values were calculated based on spline interpolated upper bounds of the curves (see Table 1). Therefore, the recommended values ofn and P are conservative approximations, and as such, we expect that the values of the exact coefficient of error of V would tend to be lower than the initially defined level. It is acknowledged that in practice, it is not always possible or desirable to take the optimum n. For this situation, Table 2 provides a reference for the average of points per section which should be taken for fixed values ofn. The optimum sampling parameters recommended in Tables 1 and 2 are strongly related to the geometry of the brain structures under consideration, and probably also to the cutting direction. Therefore, although these tables have been derived from MR data of one human brain, we expect that Tables 1 and 2 will provide an appropriate reference for the brain compartments: GM, WM and Total of any other adult human brain. We analysed possible deviations from the pre-defined levels of precision after following the recommendations of Table 2 for a second brain. The imaging protocol and stereological quantification for the second brain was similar to the one described in Section Data and Methodology, although the distance between images for this second brain was equal to 0.625 mm. Table 2. Mean number of points (P) required for a range of values of the number of sections (n) and predefined levels ofCE(V). GM (%) WM(%) Total (%) n 10 5 2.5 10 5 2.5 10 5 2.5 1 2 3 74 39 4 41 13 8 5 11 72 8 63 6 15 96 6 9 28 7 32 5 13 57 7 8 21 6 18 5 11 34 8 7 20 83 5 16 82 4 10 28 9 7 18 69 5 14 59 4 9 26 10 6 15 58 5 13 46 4 8 24 11 6 15 50 4 11 40 3 8 22 12 6 13 45 4 10 37 3 8 18 13 5 13 41 4 10 35 3 7 18 14 5 13 39 4 9 33 3 7 17 15 5 12 36 4 9 32 3 7 16 16 5 11 34 4 9 29 3 6 15 17 5 11 31 3 8 26 3 6 15 18 4 11 28 3 8 23 3 6 14 19 4 10 26 3 8 19 3 6 14 20 4 10 25 3 7 16 2 6 13 Table 3 shows that the empirical values of CE(F) that a researcher will obtain after following the recommendation given by Table 2 with the second MR data are appropriate. Table 3. Empirical levels ofCE(V) (%) obtained when the recommended sampling intensities given in Table 2 are used. GM (%) WM (%) Total (%) n 10 5 2.5 10 5 2.5 10 5 2.5 5 10.3 5.9 9.9 5.3 9.0 5.0 2.6 10 9.9 5.1 2.3 9.0 5.0 3.1 8.4 5.0 2.3 15 9.2 4.9 2.4 8.5 5.3 2.5 8.5 4.5 2.4 20 9.5 4.9 2.8 9.2 5.3 3.5 9.9 4.4 2.4 147 Maudsley R et al: Sampling intensity for human brain volume estimation DISCUSSION This study has focused on the precision of the Cavalieri estimator in combination with point counting to estimate the volume of the human brain compartments: grey matter, white matter and their union. We have derived a reference table with the approximate number of sections and number of points per section that are required to estimate the volume with a pre-defined coefficient of error (10%, 5% and 2.5%). For example, the optimum values to achieve a coefficient of error of 5% are 8,13 and 7 sections with 20,10 and 11 mean number of points counted per section, for GM, WM and Total, respectively. The mean number of points recommended per section is directly connected to the size of the grid required — Table 2 was derived to provide the optimum value of P in applications where the number of sections is already fixed. Structures with a complex geometry will tend to need a more dense sampling than smooth structures to achieve the same level of precision. This effect can be directly observed in Table 2 where larger values of n and P are required for grey matter when compared to white matter, and of white matter when compared to Total, to achieve the same CE(V). The tables were subsequently checked using a second MR brain data set, and although predictions of the optimum level of sampling were not always conservative, the empirical coefficient of error was within a reasonable interval (see Table 3). The optimum sampling parameters depend on the sampling direction. In this study we consider coronal sections of the brain, and although we suspect that a different direction might provide similar results, this would need to be confirmed in an additional study. The shape coefficientB/A1'2 also depends on the sampling direction. For adifferent sampling orientation (e.g., axial, sagittal), B/A1'2 can be calculated for practical purposes from the analysis of few sections. Tables 1 and 2 are provided as an aid for researchers, to enable some element of confidence that the level of sampling implemented will provide a reasonable coefficient of error. However, these results are only a guideline, and error prediction formulae would need to be applied to estimate the level of precision achieved based on the data sample acquired as described in Section Cavalieri sampling and variance prediction. In studies that are based on the comparison of two or more groups, the researcher often needs to predict how many subjects (e.g., animals, patients, etc.) are required to detect a given difference between the population means for specified values of the significance level and power. Information is then required about the variability of the data due to the different levels of sampling involved. Let us assume, for example, that our aim is to compare grey matter volume between patients with schizophrenia and a control group, where volume is estimated applying Cavalieri sampling and point counting. In this case, an estimate of the biological variability within each group as well as the stereological variance, considered in this paper, would be required for the sample size calculation (see Cruz-Orive et al., 2004). Optimal sampling intensities in a two stage simple random sampling design have been previously investigated based on exact expressions of the variance (e.g., Cochran, 1977). Finding an analogous and general procedure to identify optimal sampling parameters in systematic sampling is a challenging problem due to the complex expressions of the variance involved, and this is left for future work. ACKNOWLEDGMENTS The authors are grateful to Dr. Deborah Costain for her helpful comments and Prof. Neil Roberts for the brain data sets used in this study. We thank the two referees for their helpful and constructive comments. This work was supported by the Spanish Ministry of Sciences and Technology I+D Project no. MTM2005-08689-C02-01. REFERENCES Baddeley A, Jensen E (2005). Stereology for Statisticians. London: Chapman and Hall/CRC. Bernasconi DN, Bernasconi A, Caramanos Z, Antel S, Andermann F, Arnold D (2003). Mesial temporal damage in temporal lobe epilepsy; a volumetric MRI study of the hippocampus, amygdala and parahippocampal region. Brain 126:462–9. Cochran WG (1977). Sampling Techniques. 3rd ed. New York: J. Wiley & Sons, Section 10.6. Cruz-Orive LM (1989). On the precision of systematic sampling: a review of Matheron’s transitive methods. J Microsc 153:315–33. Cruz-Orive LM (1999). Precision of Cavalieri sections and slices with local errors. J Microsc 193:182–98. Cruz-Orive LM, Insausti A, Insausti R, Crespo D (2004). A case study from neuroscience involving stereology and multivariate analysis. In: Evans SM, Janson AM, Nyengaard JR, eds. Quantitative Methods in Neuroscience, pp. 16–64. Practical Neuroscience Series. Oxford: Oxford University Press. Cruz-Orive LM (2006). A general variance predictor for Cavalieri slices. J Microsc 222(3):158–65. Doherty C, Fitzsimons M, Holohan T, Mohamed H, Farrel M, Meredith G, Staunton H (2000). Accuracy and 148 Image Anal Stereol 2008;27:143-149 validity of stereology a a quantitative method for assessment of human temporal love volumes acquired by magnetic resonance imaging. Magn Reson Imaging 18:1017–25. Garc´ia-Fin~ana M, Cruz-Orive LM (2000). New approximations for the variance in Cavalieri sampling. J Microsc 199:224–38. Garc´ia-Fin~ana M, Cruz-Orive LM, Mackay C, Pakkenberg B, Roberts N (2003). Comparison of MR imaging against physical sectioning to estimate the volume of human cerebral compartments. NeuroImage18:505–16. Garc´ia-Fin~ana M, Cruz-Orive LM (2004). Improved variance prediction for systematic sampling on R. Statistics 38(3):243–72. Gong QY, Sluming V, Mayes A, Keller SS, Barrick T, Cezayirli E, Roberts N (2005). Voxel based morphometry and stereology provide convergent evidence of the importance of medial prefrontal cortex for fluid intelligence in healthy adults. NeuroImage 25:1175–86. Gual-Arnau X, Cruz-Orive LM (1998). Variance prediction under systematic sampling with geometric probes. Adv Appl Probab 30:889–903. Gundersen HJG, Jensen EB (1987). The efficiency of systematic sampling in stereology and its prediction. J Microsc 147:229–63. Gundersen HJG, Jensen EB, Kie^u K, Nielsen J (1999). The efficiency of systematic sampling in stereology – reconsidered. J Microsc 193:199–211. Keller SS, Mackay C, Barrick T, Wieshmann U, Howard M, Roberts N (2002). Voxel based morphometric comparison of hippocampal and extrahippocampal abnormalities in patients with left and right hippocampal atrophy. NeuroImage 16:23–31. Kellerer AM (1989). Exact formulae for the precision of systematic sampling. J Microsc 153:285-300. Kie^u K (1997). Three lectures on Systematic Geometric sampling. Memoirs 13/1997, University of Aarush: Department of Theoretical Statistics. Kie^u K, Souchet S, Istas J (1999). Precision of systematic sampling and transitive methods. J Statist Plan Inf 77:263–79. Matheron (1965). Les Variables Re´gionalise´es et Leur Estimation (Thesis). Paris: Masson et Cie. Matheron (1971). The Theory of Re´gionalized Variables and its Applications. Les Cahiers du Centre de Morphologie Mathe´matique de Fontainebleau, No.5. Ecole Nationale Supe´rieure des Mines de Paris. F-Fontainebleau. McNulty V, Cruz-Orive LM, Roberts N, Holmes CJ, Gual-Arnau X (2000). Estimation of brain compartment volume from MR Cavalieri slices. J Comput Assist Tomogr 24:466–77. Moran PAP (1950). Numerical integration by systematic sampling. Proc Camb Phil Soc 46:111–5. Regeur L, Pakkenberg B (1989). Optimizing sampling designs for volume measurements of components of human brain using a stereological method. J Microsc 155:113–21. Roberts N, Cruz-Orive LM, Bourne M, Herfkens RJ, Karwoski RA, Whitehouse GH (1997). Analysis of cardiac function by MRI and stereology. J Microsc 187:31–42. Salmenpera¨ T, Ka¨lvia¨inen R, Partanen K, Pitka¨nen A (1998). Hippocampal damage caused by seizures in temporal lobe epilepsy. Lancet 351:35. Salmenpera¨ T, Kononen M, Roberts N, Vanninen R, Pitka¨nen A, Ka¨lvia¨inen R (2005). Hippocampal damage in newly diagnoses focal epilepsy: a prospective MRI study. Neurology 64:62–8. Schmitz C, Hoff PR (2005). Design-based stereology in Neuroscience. Neurosciences 130:813–31. Souchet DC (1995). Pre´cision de l’estimateur de cavalieri. Rapport de stage, DEA de statistiques et mode´les ale´atoires applique´s a´ la finance, Universite´ Paris-VII. Laboratoire de biome´trie, INRA Versailles. Yates F (1948). Systematic sampling. Phil Trans Roy Soc A. 241:345–77 149