Strojniški vestnik - Journal of Mechanical Engineering 60(2014)11, 725-734
© 2014 Journal of Mechanical Engineering. All rights reserved.
DOI:10.5545/sv-jme.2014.1741
Original Scientific Paper
Received for review: 2014-02-08
Received revised form: 2014-04-02
Accepted for publication: 2014-04-04

Investigating Prior Parameter Distributions in the Inverse Modelling of Water Distribution Hydraulic Models

Daniel Kozelj1,* - Zoran Kapelan2 - Gorazd Novak3 - Franci Steinman1
1 University of Ljubljana, Faculty of Civil and Geodetic Engineering, Slovenia
2 University of Exeter, School of Engineering and Computer Science, U.K.
3 Institute for Hydraulics Research, Slovenia

Inverse modelling concentrates on estimating water distribution system (WDS) model parameters that are not directly measurable, e.g. pipe roughness coefficients, and can therefore only be estimated by indirect approaches. Estimation of the parameter and predictive uncertainty of WDS models is an essential part of the inverse modelling process. Recently, Markov Chain Monte Carlo (MCMC) simulations have gained in popularity in uncertainty analyses due to their effective and efficient exploration of posterior parameter probability density functions (pdf). A Bayesian framework is used to update prior parameter information via a likelihood function into plausible ranges of the posterior parameter pdf. Improved estimates of parameter and predictive uncertainty are achieved through the incorporation of prior pdf of parameter values and the use of a generalized likelihood function. We used three prior information sampling schemes to infer the pipe roughness coefficients of WDS models. A hypothetical case study and a real-world WDS case study were used to illustrate the strengths and weaknesses of a particular selection of a prior information pdf. The results obtained show that the level of parameter identifiability (i.e. sensitivity) is an important property for prior pdf selection.
Keywords: Bayesian inference, calibration, generalized likelihood, Markov Chain Monte Carlo, differential evolution adaptive metropolis, pipe networks, hydraulics, water distribution systems

0 INTRODUCTION

Inverse modelling is the reciprocal process of the forward modelling problem in which a physical theory is used to predict the behaviour of a real system. In inverse modelling, unknown model parameters are inferred from indirect observations so that the model adequately represents the observed system behaviour. Inverse modelling of water distribution system (WDS) models, commonly referred to as calibration or parameter estimation, has been investigated extensively since the 1980s, providing valuable insight for modellers when tackling the nonlinear and highly combinatorial calibration process. Throughout this period, different types of model parameters have been estimated, e.g. pipe friction coefficients, pipe diameters, nodal demands, etc. Approaches to WDS model calibration can be divided into three categories: iterative trial-and-error approaches, explicit models, and implicit models, e.g. optimization approaches. Implicit models have proved to be the most effective in exploring the non-linear parameter space. A wide variety of global optimization methods has been studied for parameter estimation problems. These methods can be divided into non-evolutionary and evolutionary methods. Among the evolutionary methods, genetic algorithms in particular have proved their applicability to large and complex real-world calibration problems with multimodal search (parameter) spaces. For a comprehensive review of calibration methods, we refer the reader to Savic et al. [1]. The assessment of parameter and predictive uncertainty is an essential part of the modelling process in order to perform model comparison and selection [2].
One shortcoming of the summarized optimization methods is that they can only identify near-optimal parameter values and cannot estimate the parameter and predictive uncertainty. However, formulating the inverse modelling problem in a probabilistic Bayesian framework and solving it with a Markov Chain Monte Carlo (MCMC) method makes it possible to estimate parameter values and their associated parameter and predictive uncertainties in a single optimization run [3]. Alternatively, a recent study proposed an uncertainty analysis of pipe roughness coefficients using grey numbers, which led to uncertainty intervals without defining any probability distribution [4].

Bayesian inference is a concept of probability theory whereby model parameters are represented as probabilistic variables having a joint posterior probability density function (pdf). The joint posterior pdf is derived by combining information on the prior distribution of model parameters with the data likelihood. Bayesian-type approaches have some distinct advantages in comparison to existing WDS calibration methods: a probabilistic definition of the prior pdf of parameters, retrieval of joint and marginal posterior pdf, and no requirement of derivative calculation [3].

*Corr. Author's Address: University of Ljubljana, Faculty of Civil and Geodetic Engineering, Jamova cesta 2, 1000 Ljubljana, Slovenia, daniel.kozelj@fgg.uni-lj.si

Recently, developments have led to significant improvements in the efficiency of MCMC simulations and extended their feasibility to complex, multimodal search problems [5], [6]. The differential evolution adaptive Metropolis (DREAM) scheme is a new MCMC sampler, which runs multiple chains simultaneously for global exploration and automatically tunes the scale and orientation of the proposal distribution during the search process.
We use a recent variant of DREAM, called DREAM(ZS), which uses sampling from past states and a mix of parallel direction and snooker updates to generate proposals in each chain [5].

The aim of this paper is to demonstrate the benefits of including prior information to improve the identifiability of estimated parameters. We investigate the effect of different sampling strategies of pipe roughness coefficients in the inverse modelling of WDS hydraulic models. The paper is organized as follows: following this introduction, we provide the governing equations that constitute the WDS forward modelling approach. Afterwards, a formulation of the Bayesian inference approach is given. The presented approach is applied in Sections 3 (artificial case study) and 4 (real-world case study) to estimate the parameter and predictive uncertainty of WDS model parameters. The results of each case study are discussed in their corresponding sections. Section 5 summarizes our findings and draws the relevant conclusions.

1 WATER DISTRIBUTION SYSTEM MODELLING

The main purpose of a WDS is to supply its users with the required quantities of water under adequate pressure for various loading conditions. Common constituents of a WDS are water sources (i.e. reservoirs, pumping stations), distribution storage (water tanks), and distribution pipe networks. To appropriately perform operational tasks, as well as development and rehabilitation measures, the utility operator is assisted by WDS models. Hydraulic simulations of WDS models provide insight into the flow and pressure conditions of even the most complex WDS. The interconnection of the WDS components is governed by the conservation of energy and the conservation of mass.
The conservation of energy means that the difference in energy between nodes is equal to the pipe friction and minor losses and the energy added to the flow in components between the observed nodes:

$$\sum_i h_{L,i} + \sum_j h_{P,j} = \Delta E, \tag{1}$$

where $h_{L,i}$ is the energy loss in pipe network component $i$, $h_{P,j}$ the added energy by pump $j$, and $\Delta E$ the difference in energy between observed nodes [7]. A commonly used frictional energy loss model is the Darcy-Weisbach equation:

$$h_L = \lambda \frac{L}{d} \frac{v^2}{2g}, \qquad \lambda = \lambda(Re, \varepsilon/d), \tag{2}$$

where $L$ is pipe length, $d$ pipe diameter, $v$ fluid flow velocity, $g$ gravitational constant, $\lambda$ the Darcy-Weisbach friction factor dependent on the Reynolds number ($Re$) and relative pipe roughness ($\varepsilon/d$), and $\varepsilon$ equivalent roughness. The minor (i.e. local) energy losses of valves and fittings are typically expressed as:

$$h_{L,local} = \zeta \frac{v^2}{2g}, \tag{3}$$

where $\zeta$ is an empirical coefficient. The pump energy gain is given by:

$$h_P = -\omega^2 \left( h_0 - r \left( Q/\omega \right)^n \right), \tag{4}$$

where $h_0$ is pump shutoff head, $\omega$ variable pump speed, and $r$ and $n$ pump curve coefficients. The conservation of mass at each junction node is:

$$\sum Q_{in} - \sum Q_{out} = q_{ext}, \tag{5}$$

where $Q_{in}$ and $Q_{out}$ are pipe flows into and out of a junction node, and $q_{ext}$ is the external demand at the junction node [7]. When steady-state simulations are extended to extended-period simulations, which mimic a quasi-dynamic WDS behaviour, the conservation of mass Eq. (5) is extended to account for storage in tanks:

$$\sum Q_{in} - \sum Q_{out} - \frac{dV}{dt} = q_{ext}, \tag{6}$$

where $dV$ is a change in storage volume, and $dt$ the time period between steady-state simulations. Changing tank water levels are updated by:

$$dH_T = \frac{dV}{A_T}, \tag{7}$$

where $dH_T$ is a change in tank level and $A_T$ the tank cross-section. The set of mass continuity and energy equations for a WDS model is most efficiently solved by the gradient method, and its implementation can be found in the widely known EPANET2 network solver [8].
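The head-loss relations in Eqs. (2), (3) and (7) can be illustrated with a short Python sketch. It is not the EPANET2 implementation: the friction factor closure (laminar 64/Re and the explicit Swamee-Jain approximation for turbulent flow), the kinematic viscosity default and all function names are assumptions made here for illustration, and roughness is passed in metres.

```python
import math

G = 9.81  # gravitational acceleration [m/s^2]


def friction_factor(re, rel_roughness):
    """Darcy-Weisbach friction factor (assumed closure, not from the paper).

    Laminar flow uses 64/Re; turbulent flow uses the explicit Swamee-Jain
    approximation of the Colebrook-White equation.
    """
    if re <= 0.0:
        raise ValueError("Reynolds number must be positive")
    if re < 2000.0:
        return 64.0 / re
    return 0.25 / math.log10(rel_roughness / 3.7 + 5.74 / re ** 0.9) ** 2


def pipe_head_loss(length, diameter, velocity, roughness, nu=1.0e-6):
    """Friction head loss [m] as in Eq. (2); inputs in m, m/s and m^2/s."""
    if velocity == 0.0:
        return 0.0  # no flow, no friction loss
    re = abs(velocity) * diameter / nu
    lam = friction_factor(re, roughness / diameter)
    return lam * (length / diameter) * velocity ** 2 / (2.0 * G)


def minor_head_loss(zeta, velocity):
    """Local head loss [m] as in Eq. (3)."""
    return zeta * velocity ** 2 / (2.0 * G)


def tank_level_change(q_net, dt, tank_area):
    """Eq. (7): level change [m] for a net inflow q_net [m^3/s] over dt [s]."""
    return q_net * dt / tank_area


if __name__ == "__main__":
    # 1 km of DN 300 pipe at 1 m/s with 0.5 mm equivalent roughness
    print(pipe_head_loss(1000.0, 0.3, 1.0, 0.0005))
```

Because the head loss grows with the equivalent roughness, pressure observations carry information about the roughness only where flow velocities are high enough for friction losses to be noticeable.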
2 INVERSE MODELLING

The inverse modelling problem is usually based on a nonlinear regression model [3]. First, let us consider a model $f$ that simulates a vector of model predictions. In a general form, the model can be written as:

$$Y = f(X|\theta)\,\mu_s, \tag{8}$$

where $Y$ is a vector of model predictions, $X$ a vector of known model inputs, $\theta$ a vector of unknown model parameters, and $\mu_s$ a bias factor to account for model input error, which is defined as:

$$\mu_s = \exp(\mu_h Y_s), \tag{9}$$

where $\mu_h$ is a bias parameter to be inferred from the observations [2].

In order to provide a measure of model adequacy, it is common to compare the simulated response $Y$ of the model $f$ with measurements of the observed system behaviour $\tilde{Y}$. The nonlinear regression model treats the residuals as the random component, i.e. the difference between the deterministic model predictions of a WDS model, $Y$, and the observations, $\tilde{Y}$:

$$e(\theta) = Y(X|\theta) - \tilde{Y}, \tag{10}$$

where $e(\theta)$ is a vector of residuals $\{e_1, ..., e_N\}$ and $N$ the number of observations $\tilde{Y}$. The residuals $e(\theta)$ are described by a statistical model that expresses their a priori expected behaviour. Frequently, residuals are assumed to be independent and identically distributed (i.i.d.) according to a normal distribution with zero mean and a constant variance (i.e. homoscedasticity), and to show no autocorrelation. Occasionally, these assumptions are violated and an alternative description of the residuals is needed. In this study, we adopt the generalized likelihood function of Schoups and Vrugt [2] that can account for residual errors that are correlated, heteroscedastic, and non-Gaussian. First, we describe the statistical model of residuals, while the generalized likelihood function is provided in Section 2.1.
To account for correlation and non-normality, the residuals $e(\theta)$ are described by:

$$\Phi_p(B)\, e_s(\theta) = \sigma_{e,s}\, a_s, \tag{11}$$

where $\Phi_p(B)$ is an autoregressive polynomial with parameters $\phi$, $B$ a backshift operator, $\sigma_{e,s}$ a standard deviation of residuals, and $a_s$ an i.i.d. random error described by a skew exponential power distribution $a_s \sim SEP(0, 1, \xi, \beta)$ with zero mean, unit variance, and with the parameters $\xi$ and $\beta$ accounting for skewness and kurtosis. The heteroscedasticity of residuals is accounted for by assuming that the standard deviation $\sigma_{e,s}$ increases linearly with model predictions:

$$\sigma_{e,s} = \sigma_0 + \sigma_1 Y_s, \tag{12}$$

where $\sigma_0$ and $\sigma_1$ are parameters to be inferred from the observations. Details of this approach can be found in Schoups and Vrugt [2].

2.1 Likelihood Function

If the inverse problem is stated in a probabilistic framework, the criterion (i.e. measure) used to evaluate the residuals between the model response variables and the observations is called the likelihood. The likelihood $L(\theta|\tilde{Y})$ quantifies the "probability" that the observed data were simulated by a particular set of parameters [9]. The general likelihood function presented in Schoups and Vrugt [2] is adopted to account for correlation, non-constant variance (i.e. heteroscedasticity), and non-normality of residuals. Their formulation of the log-likelihood function $\ell(\theta|\tilde{Y})$ is:

$$\ell(\theta|\tilde{Y}) = N \log \frac{2 \sigma_\xi \omega_\beta}{\xi + \xi^{-1}} - \sum_{s=1}^{N} \log \sigma_{e,s} - c_\beta \sum_{s=1}^{N} \left| a_{\xi,s} \right|^{2/(1+\beta)}, \tag{13}$$

and residual errors are given as:

$$a_{\xi,s} = \xi^{-\mathrm{sign}(\mu_\xi + \sigma_\xi a_s)} \left( \mu_\xi + \sigma_\xi a_s \right), \tag{14}$$

where $\mu_\xi$, $\sigma_\xi$, $c_\beta$ and $\omega_\beta$ are variables defined as functions of $\xi$ and $\beta$, which are provided in Appendix A of Schoups and Vrugt [2].

2.2 Parameter Uncertainty

By considering model parameters as the only source of uncertainty, the posterior parameter pdf $p(\theta|\tilde{Y})$ can be estimated from the Bayes theorem:

$$p(\theta|\tilde{Y}) = \frac{p(\theta)\, p(\tilde{Y}|\theta)}{p(\tilde{Y})}, \tag{15}$$

where $p(\theta)$ is the prior parameter pdf, $p(\tilde{Y})$ a normalization constant or "model evidence", and $p(\tilde{Y}|\theta) = L(\theta|\tilde{Y})$ the likelihood function.
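To make the role of the heteroscedastic error model in Eq. (12) concrete, the following Python sketch evaluates a log-likelihood for the Gaussian special case only. This is a deliberate simplification assumed here, not the generalized likelihood of Schoups and Vrugt [2]: the random errors are taken as standard normal, with no skewness, no excess kurtosis, no autocorrelation and no bias correction. The function name and arguments are hypothetical.

```python
import math


def gaussian_log_likelihood(predictions, observations, sigma0, sigma1):
    """Log-likelihood of observations under a simplified residual model.

    Assumes independent, normally distributed errors whose standard
    deviation grows linearly with the prediction, as in Eq. (12).
    """
    if len(predictions) != len(observations):
        raise ValueError("predictions and observations differ in length")
    total = 0.0
    for y_sim, y_obs in zip(predictions, observations):
        sigma = sigma0 + sigma1 * y_sim  # heteroscedastic standard deviation
        if sigma <= 0.0:
            return float("-inf")  # inadmissible error model
        a = (y_sim - y_obs) / sigma  # standardized residual, sign as in Eq. (10)
        total += -0.5 * math.log(2.0 * math.pi) - math.log(sigma) - 0.5 * a * a
    return total


if __name__ == "__main__":
    simulated = [41.2, 38.9, 45.3]  # hypothetical pressure heads [m]
    observed = [41.0, 39.1, 45.2]
    print(gaussian_log_likelihood(simulated, observed, 0.1, 0.0))
```

A parameter set whose predictions lie closer to the observations receives a higher log-likelihood, which is the quantity the MCMC sampler combines with the prior.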
Since only parameters are of interest, we can ignore the normalization constant $p(\tilde{Y})$ and infer parameter samples from the posterior parameter pdf $p(\theta|\tilde{Y})$, which is proportional to the prior parameter pdf $p(\theta)$ multiplied by the likelihood function $L(\theta|\tilde{Y})$:

$$p(\theta|\tilde{Y}) \propto p(\theta)\, L(\theta|\tilde{Y}). \tag{16}$$

Parameter uncertainty after observing data is directly derived from the posterior parameter pdf $p(\theta|\tilde{Y})$ [9]. The term $p(\theta)$ denotes prior knowledge of the parameter vector $\theta$ before it is conditioned on the observational data $\tilde{Y}$. In the present case, pipe roughness coefficients are under investigation. Since prior information on the parameter pdf is limited and vague, we consider three cases for sampling the parameter sets $\theta$ from the prior parameter space.

The first prior is a continuous uniform pdf (also known as rectangular distribution) of the parameter space. The continuous uniform distribution $p(\theta) \sim U(\theta_{low}, \theta_{up})$ is a bounded domain distribution and samples values between given lower $\theta_{low}$ and upper $\theta_{up}$ bounds. The second type of prior pdf is the normal (Gaussian) distribution. The prior $p(\theta) \sim N(\mu_\theta, \sigma_\theta)$ is given by the mean parameter value $\mu_\theta$ and its standard deviation $\sigma_\theta$. Finally, we use the gamma distribution for describing the prior information on parameters. The gamma distribution $p(\theta) \sim \Gamma(\alpha, \beta)$ is defined by a shape parameter $\alpha$ and a scale parameter $\beta$. The gamma distribution closely approximates a normal distribution, with the advantage that it has density only for positive real numbers, which complies with the physical nature of our parameters. All samples (parameter values) generated from the prior pdf are truncated at 0 (only positive values are allowed).
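The three prior types and the unnormalized posterior of Eq. (16) can be sketched in Python as follows. The function names, the tuple-based prior specification and the independence of the priors across pipe groups are assumptions made for this illustration; the normal prior is renormalized after truncation at zero, which the text does not specify. The example prior list mirrors the mixed scheme used later in the paper (uniform U(0.001, 15) for PG1 to PG4, gamma with the Table 1 values for PG5 and PG6).

```python
import math

NEG_INF = float("-inf")


def log_uniform(x, lower, upper):
    """Log-density of the continuous uniform prior U(lower, upper)."""
    if lower <= x <= upper:
        return -math.log(upper - lower)
    return NEG_INF


def log_truncated_normal(x, mean, std):
    """Log-density of a normal prior restricted to positive values."""
    if x <= 0.0:
        return NEG_INF
    z = (x - mean) / std
    # Probability mass of the untruncated normal above zero.
    mass = 0.5 * (1.0 + math.erf(mean / (std * math.sqrt(2.0))))
    return (-0.5 * z * z - math.log(std * math.sqrt(2.0 * math.pi))
            - math.log(mass))


def log_gamma_prior(x, shape, scale):
    """Log-density of a gamma prior with the given shape and scale."""
    if x <= 0.0:
        return NEG_INF
    return ((shape - 1.0) * math.log(x) - x / scale
            - math.lgamma(shape) - shape * math.log(scale))


PRIOR_FUNCTIONS = {
    "uniform": log_uniform,
    "normal": log_truncated_normal,
    "gamma": log_gamma_prior,
}


def log_prior(theta, priors):
    """Sum of independent log-priors; priors is a list of (kind, a, b)."""
    return sum(PRIOR_FUNCTIONS[kind](x, a, b)
               for x, (kind, a, b) in zip(theta, priors))


def log_posterior(theta, priors, log_likelihood):
    """Unnormalized log-posterior, i.e. the logarithm of Eq. (16)."""
    lp = log_prior(theta, priors)
    if lp == NEG_INF:
        return NEG_INF  # skip the model run outside the prior support
    return lp + log_likelihood(theta)


# Mixed scheme: uniform for PG1 to PG4, gamma (shape 2, scale 1) for PG5, PG6.
MIXED_PRIORS = [("uniform", 0.001, 15.0)] * 4 + [("gamma", 2.0, 1.0)] * 2
```

Returning early when the prior density is zero avoids a hydraulic simulation for parameter sets that can never be accepted.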
Prior information about parameters can significantly improve parameter identifiability and provides an effective and robust approach to parameter value estimation [6]. Additional information on the selected prior pdf parameter values is provided in Section 3. With the Bayesian framework assembled, i.e. the prior pdf of model parameters and the likelihood function in Eq. (13), the joint posterior parameter pdf can be calculated using Eq. (16). MCMC simulations are used to efficiently derive the joint posterior parameter pdf by repeated sampling of parameter sets [2], [3].

2.3 Predictive Uncertainty

In addition to the evaluation of parameter uncertainty, the predictive uncertainty is also of significant interest. The predictive uncertainty derives from predictive percentiles $Y_\alpha$, which correspond to the exceedance probability $P(Y < Y_\alpha | X)$, and can be calculated as:

$$P\left( \left[ Y(X|\theta) + e(\theta) \right]_{N_\theta} < Y_\alpha \,\middle|\, X \right) = \alpha, \tag{17}$$

where $Y_\alpha$ is the prediction percentile with exceedance probability $1-\alpha$, $\alpha$ the significance level, and $N_\theta$ the number of MCMC sampled parameter sets $\theta$. The prediction percentiles $Y_\alpha$ are obtained from the set of J predictions of the sampled parameter set $\theta$ and its corresponding response $Y(X|\theta)$ and residuals $e(\theta)$. Evaluating the 95% predictive uncertainty bands requires the selection of $\alpha = 0.025$ and $\alpha = 0.975$, i.e. the 2.5% and 97.5% prediction percentiles, respectively [10].

2.4 Sensitivity Analysis and Parameter Identifiability

A complex real-world WDS comprises numerous uncertain model parameters (e.g. pipe roughness coefficients, nodal demands, pipe diameters, etc.) that could be investigated. To reduce the number of calibrated parameters, a sensitivity analysis of the pipe roughness coefficients was performed by applying the forward finite difference approximation of the first derivative of the model response with respect to all investigated model parameters [11], [12].
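The forward finite-difference sensitivity just described can be sketched in a few lines of Python. The relative step size, the ranking by summed absolute derivatives and the toy linear model standing in for a hydraulic solver are all assumptions made here for illustration; in practice the `model` callable would wrap a hydraulic simulation that returns pressures at the measurement nodes.

```python
def forward_sensitivities(model, theta, rel_step=0.01):
    """Forward finite-difference derivatives of the model response.

    Returns jac with jac[k][s] approximating d prediction_s / d theta_k.
    """
    base = model(theta)
    jac = []
    for k, value in enumerate(theta):
        step = rel_step * abs(value) if value != 0.0 else rel_step
        perturbed = list(theta)
        perturbed[k] = value + step  # perturb one parameter at a time
        shifted = model(perturbed)
        jac.append([(ys - yb) / step for ys, yb in zip(shifted, base)])
    return jac


def rank_parameters(jac):
    """Order parameter indices from most to least sensitive."""
    totals = [sum(abs(d) for d in row) for row in jac]
    return sorted(range(len(totals)), key=lambda k: totals[k], reverse=True)


def toy_model(theta):
    """Hypothetical stand-in for a hydraulic solver with two observations."""
    return [2.0 * theta[0] + 0.001 * theta[1], theta[0] - 0.002 * theta[1]]


if __name__ == "__main__":
    jac = forward_sensitivities(toy_model, [1.0, 2.0])
    print(rank_parameters(jac))  # parameter 0 dominates the response
```

Each derivative costs one additional model run, so the whole analysis needs one base simulation plus one simulation per parameter.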
This approximate approach is warranted since it serves only as a measure of model parameter identifiability for the given measurement layout (the model structure is assumed to be given and not addressed here). The low sensitivity of the model response to a parameter can lead to the reduced identifiability of the investigated parameters [12]. Sensitivity and uncertainty are closely related, e.g. greater parameter sensitivity results in greater uncertainty propagation from that parameter. The sensitivity analysis facilitates the selection and differentiation between more and less identifiable (i.e. sensitive) model parameters. The classification between the cases is applied in conjunction with the model parameter's prior information. If prior information on model parameters is vague, sensitive parameters could still be identifiable by applying a uniform prior pdf, and less sensitive ones by applying an informative (e.g. normally distributed) prior pdf.

2.5 DREAM(ZS) Algorithm

MCMC simulations are an increasingly popular method in a wide range of engineering problems [3], [6], [13], [14]. In inverse modelling, Bayesian frameworks have proved their ability to effectively estimate the posterior pdf of parameters. In our study, we used the DREAM(ZS) algorithm [5] provided by J. A. Vrugt. The DREAM MCMC scheme runs multiple Markov chains simultaneously for effective global exploration of the parameter space and provides efficient evolution of the proposal distribution to its target distribution, especially for complex, highly non-linear and multimodal target distributions [5]. The DREAM(ZS) algorithm differs from its predecessor by using sampling from past states and a mix of parallel direction and snooker updates to generate proposals in each chain.
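The idea of generating proposals from past states can be illustrated with a heavily reduced Python sketch. It is not the DREAM(ZS) code used in this study: it implements only a parallel direction update with a single pair of archived states and a Metropolis acceptance step, and it omits snooker updates, crossover, thinning of the archive and convergence diagnostics. The jump rate 2.38/sqrt(2d), the small jitter term and all names are assumptions made for illustration.

```python
import math
import random


def sample_from_past(log_density, start_points, n_iter, seed=0):
    """Differential-evolution MCMC that draws jump directions from an archive.

    Returns the archive of all stored states (initial points included).
    """
    rng = random.Random(seed)
    dim = len(start_points[0])
    gamma = 2.38 / math.sqrt(2.0 * dim)  # commonly used default jump rate
    chains = [list(p) for p in start_points]
    logp = [log_density(x) for x in chains]
    archive = [list(p) for p in start_points]  # past states
    for _ in range(n_iter):
        for i in range(len(chains)):
            x = chains[i]
            z1, z2 = rng.sample(archive, 2)  # two distinct past states
            proposal = [xi + gamma * (a - b) + rng.gauss(0.0, 1e-6)
                        for xi, a, b in zip(x, z1, z2)]
            lp_new = log_density(proposal)
            # Metropolis rule; 1 - random() lies in (0, 1], so log is defined.
            if math.log(1.0 - rng.random()) < lp_new - logp[i]:
                chains[i], logp[i] = proposal, lp_new
        archive.extend(list(x) for x in chains)  # append current states
    return archive


if __name__ == "__main__":
    # Hypothetical target: a one-dimensional standard normal density.
    draws = sample_from_past(lambda x: -0.5 * x[0] ** 2,
                             [[-3.0], [0.5], [3.0]], 4000, seed=1)
    kept = [d[0] for d in draws[len(draws) // 2:]]  # discard burn-in half
    print(sum(kept) / len(kept))
```

Because the jump directions come from the archive rather than from the current positions of other chains, a small number of chains is sufficient, which is the property the text attributes to sampling from the past.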
Some of the distinct advantages of DREAM(ZS) are that sampling from the past reduces the need for a large number of chains, outliers can be redirected to the region of exploration, and the independence of the current state of chains enables integration in multi-processor environments [5], [6]. These improvements accelerate convergence to the target distribution, especially for high-dimensional problems (d > 20, where d is the number of parameters). DREAM(ZS) can work with d up to 50 to 100 with far fewer chains, e.g. N_ZS = 3, while still accurately assessing the target distribution once convergence has been achieved [5]. Other DREAM(ZS) algorithm parameters are DE_pair, the number of chain pairs used to generate candidate points; N_CR, the crossover value; p_up, the fraction of parallel direction updates; k, the thinning parameter for appending the position of chains and corresponding posterior density values to the sample history; Z_m0, the initial size of the thinned sample history (past states); p_jump, the probability of selecting a jump rate of 1; and N_eval, the number of function evaluations.

3 HYPOTHETICAL WDS CASE STUDY

This study aims to demonstrate the performance of the suggested approach of parameter and predictive uncertainty analysis by applying the Bayesian framework to the "Anytown" WDS model, which has been used in various calibration studies [3], [11], [15]. In a previous study of the Anytown model, a Bayesian-type procedure was applied to investigate the uncertainties of Hazen-Williams (HW) C-factor pipe roughness estimations [3]. The present study aims to investigate pipe roughness coefficients for the equivalent roughness ε of the Darcy-Weisbach (DW) friction model. The Anytown model consists of 34 pipes, and their roughness coefficients are grouped into six pipe roughness groups (PG) (N_g = 6). Their true DW ε values are provided in Table 1. Observational data sets are generated by simulating the model response via the Epanet2 hydraulic solver [8].
Pressure measurements collected at four junction nodes (i.e. 40, 90, 120 and 140) for five independent loading conditions (LC) represent the observational data (N = 20) for the presented case. The imperfect observational data was generated by adding normally distributed random noise with zero mean and a standard deviation of 0.10 m to the perfect observational data.

Prior knowledge on the calibration parameters (DW equivalent roughness ε) is incorporated by using three prior information pdfs. A continuous uniform prior pdf p(θ) ~ U(0.001, 15) is first used for all PGs. Then, the pdf parameters of the normal and gamma distributions are estimated on the basis of approximate equivalent roughness ε values by consulting literature sources relating the original HW C-factors [3] to the DW ε values used in this study [16]. The distribution parameters of the normal and gamma priors are provided in Table 1.

Table 1. Anytown: True DW ε values for PGs and parameters of normal and gamma prior pdf

| PG | True DW ε | Normal μ_θ | Normal σ_θ | Gamma α | Gamma β |
|-----|-------|------|-----|------|-----|
| PG1 | 0.525 | 0.75 | 0.5 | 1.0 | 1.0 |
| PG2 | 11.75 | 11.0 | 1.0 | 10.0 | 1.0 |
| PG3 | 2.5 | 2.5 | 1.0 | 3.0 | 1.0 |
| PG4 | 0.3 | 0.5 | 0.5 | 0.5 | 1.0 |
| PG5 | 1.2 | 1.25 | 1.0 | 2.0 | 1.0 |
| PG6 | 1.2 | 1.25 | 1.0 | 2.0 | 1.0 |

The generalized likelihood (GL) function given by Eq. (13) is used with fixed values of the residual model parameters φ = 0 and μ_h = 0, while the parameters σ_0, σ_1, β and ξ are inferred in addition to the model parameters. Uniform prior pdfs are assumed for the GL parameters, with the following lower and upper bounds: σ_0 [0, 1], σ_1 [0, 1], β [-1, 1] and ξ [0.1, 10]. This results in a total number of N_θ = 10 parameters. The DREAM(ZS) algorithm was set up with the following parameters: N_θ = 10, N_ZS = 3, DE_pair = 1, N_CR = 3, p_up = 0.9, Z_m0 = 10 × N_dim = 60, p_jump = 0.2, N_eval = 50,000. The DREAM(ZS) algorithm converged in approximately 35,000 function calls with a total simulation time of 315 s on a 403 MFLOPS PC.
The inferred residual model parameters θ_e of the GL function were evaluated at σ_0 = 0, σ_1 = 0.0011, β = 1, and ξ = 0.583 for the proposal prior pdf p(θ) scheme. Very similar values were also observed in the other simulations. The GL function parameters indicate that residuals are non-normally distributed and heteroscedastic. The SEP parameters β and ξ indicate that the residual distributions are peaked (β = 1) and negatively skewed (ξ = 0.583).

The parameter uncertainty results are shown in Fig. 1. The box plots of PG1 to PG6 provide the following statistics: median (middle line), lower (first) and upper (third) quartiles (i.e. the interquartile range (IQR)) of the posterior parameter pdf samples, and the 95% confidence interval (vertical lines). The actual parameter values are given in Table 1. The obtained parameter statistics show that even "uninformative" prior distributions (e.g. uniform pdf) can adequately identify parameter values. This can be observed for parameter groups PG1 to PG4. However, PG5 and PG6 show greater deviations of the median parameter estimates as well as wider IQRs and 95% confidence intervals. This is caused by their small parameter sensitivity for the given observational layout. Incorporating prior information therefore narrows the IQR by supplying independent information on the pipe roughness state. Identification of parameters with small parameter sensitivity can be very difficult, since the given observational data do not contain sufficient information to yield reasonable parameter estimates and a narrow posterior pdf [17]. PG5 and PG6 are not identifiable with the uniform pdf, while the normal and gamma prior pdfs lead to slightly different marginal posterior pdfs.
Applying a normal or gamma prior distribution narrows the parameter uncertainty. The differences in shift and width of the IQR and 95% confidence intervals arise from the prior pdf used and the observational information available. By examining Fig. 1, the shape and position of both the normal and gamma pdfs are identifiable from the marginal posterior parameter statistics. This indicates that the posterior pdf and the estimated values of insensitive parameters benefit or suffer from the applied prior distribution, since for these parameters the likelihood function does not force the joint posterior parameter pdf towards the "true" values. Here lies the true added value of prior information for calibration parameter estimates.

Based on the information given in Sections 2.2 and 2.4 and the findings from the previous paragraph, we used a fourth (proposal) prior information scheme that combines prior parameter information with parameter sensitivity. A continuous uniform prior pdf p(θ) ~ U(0.001, 15) is used for PGs with higher parameter sensitivity (PG1 to PG4), while PGs with lower sensitivity (PG5 and PG6) are estimated with their associated gamma prior distribution given in Table 1. When compared to the normal and gamma prior pdf results, a close resemblance in terms of parameter mean, IQR and 95% confidence intervals can be observed.

Table 2. Anytown: Model fit statistics for the four prior pdf schemes

| Statistic | Uniform | Normal | Gamma | Proposal |
|------|--------|--------|--------|--------|
| RMSE | 0.068 | 0.071 | 0.062 | 0.065 |
| R2 | 0.998 | 0.998 | 0.998 | 0.999 |
| Bias | -0.035 | -0.021 | -0.016 | -0.011 |

The predictive uncertainty results in terms of root-mean-squared error (RMSE), coefficient of determination R2 and bias are presented in Table 2. All prior pdfs generated an excellent model fit.

Fig. 1. Anytown: DW ε roughness statistics for the marginal posterior pdf of parameters PG1 to PG6

Fig. 2 presents the histograms of marginal parameter distributions and a two-dimensional correlation plot of posterior parameter samples for the fourth (i.e. mixed prior) approach. Correlations between pairs of parameter groups are low; only PG3 and PG4 share a high correlation coefficient of -0.882. These features were also observed in [3].

Fig. 2. Anytown: histograms of marginal distributions and two-dimensional correlation plots of posterior parameter samples (PG1 to PG6)

4 REAL-WORLD WDS CASE STUDY

The aim of this section is to demonstrate the Bayesian framework of parameter estimation and the prior information approach presented in this paper on a real-world WDS network, and to show the effect of assumed prior pdfs on calibrated parameter values. The selected model parameters are the equivalent roughness ε of the DW pipe friction model. The analysed system is part of a bigger WDS, but hydraulically independent of the rest of the WDS. The WDS of Šentvid serves a population of approximately 34,000 inhabitants, and its estimated average demand is 93.87 l/s. From the available WDS data, an Epanet2 hydraulic model was assembled consisting of three reservoirs, two tanks, three pumps, one pressure reducing valve, 812 junction nodes and 1072 pipes. The complete measurement campaign consisted of 11 fire flow tests performed throughout the WDS network. Sixteen pressure loggers (PL) (Memmy NT, measurement range: 0 to 20 bar, measurement error: ±0.05% max.
measurement range), four ultrasonic flow metering devices (Krohne UFM 610P, measurement range: 0.006 to 14.89 m/s, measurement error: ±2.0% (v > 1 m/s) and ±0.02 m/s (v < 1 m/s)) and the SCADA system (five flow meters and two tank level gauges) recorded the measurements. In this study, 11 steady-state hydraulic simulations were performed to represent the 11 fire flow loading conditions (LC) during the measurement campaign. A total of 176 observations (16 PL x 11 LC) are considered in the observational data set. Flow and SCADA measurements were used to define the boundary conditions of the hydraulic simulations.

The PGs were established based on the criteria of pipe diameter, material and age, resulting in a total of 93 PGs. A second grouping involved only pipe material and age, resulting in 25 PGs. Only the latter grouping was investigated, since the quantity of observational data would not support the higher dimensionality of the parameter estimation problem [18].

Fig. 3. Marginal posterior densities of the individual PGs (1 to 8) for the real-world WDS network and GL residual model parameters (9 to 12) (x indicates the maximum a posteriori (MAP) values). Panel labels recovered from the figure: LZ DN 300, AC DN 150, PE d 90, NL DN 80.

For the GL residual model, we used fixed values of the residual model parameters φ = 0 and μ_h = 0, while the parameters σ_0, σ_1, β and ξ were inferred with uniform prior distributions as described in Section 3. This resulted in a total of N_θ = 29 parameters. The prior pdfs of the PGs were estimated from pipe roughness values ε given in the literature for different pipe materials [19].
The parameters of the gamma prior pdf $p(\theta) \sim \Gamma(\alpha, \beta)$ were chosen so that the pdf stays close to the higher estimates of roughness values for new pipes (controlled by the $\alpha$ parameter), while its right-tailed shape allows a drift towards higher roughness values if some pipe aging is present (controlled by the $\beta$ parameter). The DREAM(ZS) algorithm was set up with the same parameters as in Section 3, except Ndim = Ng = 29 and Neval = 75,000.
Approximate posterior pdfs of the equivalent roughness $\varepsilon$ [mm] and the corresponding maximum a posteriori (MAP) values are given in Fig. 3. Additionally, posterior densities of the GL parameters $\theta_e$ are provided in Fig. 3 (numbers 9 to 12). The inferred GL parameters $\beta = 1$ and $\xi = 1$ indicate that the SEP distribution of the residuals reduces to a symmetric double-exponential distribution. The standard deviation parameters $\sigma_0$ and $\sigma_1$ indicate only slight heteroscedasticity. The first four PGs (Fig. 3, 1 to 4) show high parameter sensitivity; therefore, a uniform prior pdf $p(\theta) \sim U(0.001, 15)$ was applied to them. All other PGs had a gamma prior pdf applied. Parameter uncertainty can be expressed in terms of the spread of the posterior marginal parameter pdf: a greater spread indicates higher uncertainty. The asbestos-cement (AC) and ductile iron (NL) PGs have a narrower posterior pdf in combination with a uniform prior pdf, indicating smaller parameter uncertainty for those two PGs. In contrast, some PGs (e.g. cast iron (LZ)) show higher parameter uncertainty, as reflected by their greater spread. The next four PGs (Fig. 3, 5 to 8) have smaller parameter sensitivity values and were inferred using a gamma prior pdf $p(\theta) \sim \Gamma(\alpha, \beta)$. In their posterior pdfs, the general shape of the gamma prior is still recognizable, while the likelihood function shifted the pdfs towards the information content of the observations.
Fig. 4 illustrates how the marginal posterior pdf (i.e. parameter uncertainty) translates into a 95% pressure head predictive uncertainty. The light grey region depicts the predictive uncertainty, while the dark grey region corresponds to parameter uncertainty.
Fig. 4. Real-world WDS: 95% posterior parameter (dark grey) and prediction (light grey) uncertainty ranges and corresponding pressure observations (solid circles); horizontal axis: number of observations
Fig. 5. Real-world WDS: a) scatter plot of observational data against model predictions, b) residuals as a function of model predictions; recoverable axis names: observational data [m], standard normal quantiles; in-plot annotations: RMSE = 0.96408, R2 = 0.9903, Bias = -0.028863
These and other results from both case studies are shown for the calibration data set only, i.e. no results are shown for a validation data set. Since the observational data are very limited, all of them were used for calibration. Ideally, validation on an independent data set should be done. The WDS model fits the observational data very well, with an associated RMSE of 0.458 m. Additionally, Fig. 4 shows that all observations fall inside the 95% predictive uncertainty bounds.
In a post-processing analysis, the underlying assumptions made in Section 2.1, i.e. those of the likelihood function, have to be assessed. Two diagnostic tests were conducted to verify the assumptions on the statistical model of residuals. Fig. 5a plots the model predictions against the observational data. In addition to the RMSE, the coefficient of determination $R^2 = 0.997$ and bias = -0.052 indicate a very good model fit. Fig. 5b presents the residuals as a function of model predictions. It can be observed that the residuals show some heteroscedastic behaviour, which the GL residual model accounts for through $\sigma_0$ and $\sigma_1$. We can, therefore, conclude that the model residual distribution, the posterior parameter pdf and the predictive uncertainties are adequately represented.
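The goodness-of-fit measures quoted above (RMSE, coefficient of determination, bias) and the check that observations fall inside the 95% predictive bounds can be sketched in a few lines of Python. This is only a minimal illustration, not the post-processing software used in the study: the function names `fit_diagnostics` and `coverage` are hypothetical, the residual sign convention (prediction minus observation) and the definition of R2 as one minus the ratio of residual to total sum of squares are assumptions, and the numbers in the usage lines are made-up values rather than data from the measurement campaign.

```python
import math


def fit_diagnostics(observed, predicted):
    """Return (RMSE, R^2, bias) for paired observations and predictions.

    Assumptions (not taken from the paper): residual = predicted - observed,
    bias = mean residual, R^2 = 1 - SS_res / SS_tot.
    """
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("observed and predicted must be non-empty and of equal length")
    n = len(observed)
    residuals = [p - o for o, p in zip(observed, predicted)]
    ss_res = sum(r * r for r in residuals)
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0.0:
        raise ValueError("R^2 is undefined when all observations are identical")
    rmse = math.sqrt(ss_res / n)
    bias = sum(residuals) / n
    r2 = 1.0 - ss_res / ss_tot
    return rmse, r2, bias


def coverage(observed, lower, upper):
    """Fraction of observations lying inside the [lower, upper] bounds."""
    inside = sum(1 for o, lo, hi in zip(observed, lower, upper) if lo <= o <= hi)
    return inside / len(observed)


# Hypothetical pressure heads in metres, for illustration only.
obs = [40.0, 45.0, 50.0, 55.0]
sim = [40.5, 44.5, 50.5, 54.5]
rmse, r2, bias = fit_diagnostics(obs, sim)
print(rmse, r2, bias)
print(coverage(obs, [39.0, 44.0, 49.0, 54.0], [41.0, 46.0, 51.0, 56.0]))
```

A coverage value of 1.0 corresponds to the situation described for Fig. 4, where all observations fall inside the 95% predictive uncertainty bounds.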
5 CONCLUSIONS
This paper presents an uncertainty analysis of pipe roughness parameter estimates, covering both their parameter and predictive uncertainties. The analyses were conducted on a hypothetical and a real-world WDS model. Identifying pipe roughness parameters is difficult, especially in a real-world WDS model, because of the limited information content of the observational data. Mapping samples from prior distributions of the parameter space to the likelihood space identifies plausible ranges of parameter sets for the given observational data and allows both types of uncertainty to be estimated. The generalized likelihood function was used to adequately represent the residual distribution. Using this formal Bayesian approach, the inference should lead to unbiased parameter estimates [2]. Incorporation of the prior distribution has proved to be an efficient and effective approach to estimating the posterior parameter pdf. We used three different prior pdfs. The results of this study demonstrate that prior information on pipe roughness parameters and a correct representation of residual distributions significantly improve identifiability and reduce parameter and predictive uncertainties. Since defining a prior pdf is difficult, we suggested an approach in which the choice of prior pdf follows the identifiability (i.e. sensitivity) of each parameter. Accurate prior information proves to be important in order to narrow the uncertainty ranges of the posterior parameter pdfs and to gain confidence in the optimised/expected parameter values [17]. Using this approach, we successfully inferred the posterior parameter pdf and derived parameter and predictive uncertainties for a real-world WDS model.
6 ACKNOWLEDGEMENTS
We are obliged to Jasper A. Vrugt and Cajo ter Braak for providing the code of the DREAM(ZS) algorithm and graphical post-processing software.
7 REFERENCES
[1] Savic, D.A., Kapelan, Z.S., Jonkergouw, P.M.R. (2009).
Quo vadis water distribution model calibration? Urban Water Journal, vol. 6, no. 1, p. 3-22, DOI:10.1080/15730620802613380.
[2] Schoups, G., Vrugt, J.A. (2010). A formal likelihood function for parameter and predictive inference of hydrologic models with correlated, heteroscedastic, and non-Gaussian errors. Water Resources Research, vol. 46, no. 10, p. 1-17, DOI:10.1029/2009WR008933.
[3] Kapelan, Z.S., Savic, D.A., Walters, G.A. (2007). Calibration of water distribution hydraulic models using a Bayesian-type procedure. Journal of Hydraulic Engineering, vol. 133, no. 8, p. 927-936, DOI:10.1061/(ASCE)0733-9429(2007)133:8(927).
[4] Alvisi, S., Franchini, M. (2010). Pipe roughness calibration in water distribution systems using grey numbers. Journal of Hydroinformatics, vol. 12, no. 4, p. 424-445, DOI:10.2166/hydro.2010.089.
[5] Ter Braak, C.J., Vrugt, J.A. (2008). Differential evolution Markov chain with snooker updater and fewer chains. Statistics and Computing, vol. 18, no. 4, p. 435-446, DOI:10.1007/s11222-008-9104-9.
[6] Scharnagl, B., Vrugt, J.A., Vereecken, H., Herbst, M. (2011). Inverse modelling of in situ soil water dynamics: Investigating the effect of different prior distributions of the soil hydraulic parameters. Hydrology and Earth System Sciences, vol. 15, no. 10, p. 3043-3059, DOI:10.5194/hess-15-3043-2011.
[7] Mays, L.W. (2000). Water Distribution System Handbook. McGraw-Hill Professional, New York.
[8] Rossman, L.A. (2000). EPANET 2 - User manual. United States Environmental Protection Agency, Washington, D.C.
[9] Box, G.E.P., Tiao, G.C. (1992). Bayesian inference in statistical analysis. Wiley Classics Library. John Wiley & Sons, New York, DOI:10.1002/9781118033197.
[10] Branisavljević, N., Prodanovic, D., Ivetic, M. (2009). Uncertainty reduction in water distribution network modelling using system inflow data. Urban Water Journal, vol. 6, no. 1, p. 69-79, DOI:10.1080/15730620802600916.
[11] Banovec, P., Kozelj, D., Šantl, S., Steinman, F. (2006). Sampling design for water distribution system models by genetic algorithms. Strojniški vestnik - Journal of Mechanical Engineering, vol. 52, no. 12, p. 817-834.
[12] Kang, D.S., Pasha, M.F.K., Lansey, K. (2009). Approximate methods for uncertainty analysis of water distribution systems. Urban Water Journal, vol. 6, no. 3, p. 233-249, DOI:10.1080/15730620802566844.
[13] Seifollahi-Aghmiuni, S., Haddad, O.B., Omid, M.H., Marino, M.A. (2013). Effects of pipe roughness uncertainty on water distribution network performance during its operational period. Water Resources Management, vol. 27, no. 5, p. 1571-1599, DOI:10.1007/s11269-013-0259-6.
[14] Kuščer, L., Diaci, J. (2013). Measurement uncertainty assessment in remote object geolocation. Strojniški vestnik - Journal of Mechanical Engineering, vol. 59, no. 1, p. 32-40, DOI:10.5545/sv-jme.2012.642.
[15] Ormsbee, L.E. (1989). Implicit network calibration. Journal of Water Resources Planning and Management, vol. 115, no. 2, p. 243-257, DOI:10.1061/(ASCE)0733-9496(1989)115:2(243).
[16] Travis, Q.B., Mays, L.W. (2007). Relationship between Hazen-William and Colebrook-White roughness values. Journal of Hydraulic Engineering-ASCE, vol. 133, no. 11, p. 1270-1273, DOI:10.1061/(ASCE)0733-9429(2007)133:11(1270).
[17] Kapelan, Z., Savic, D.A., Walters, G.A. (2004). Incorporation of prior information on parameters in inverse transient analysis for leak detection and roughness calibration. Urban Water Journal, vol. 1, no. 2, p. 129-143, DOI:10.1080/15730620412331290029.
[18] Giustolisi, O., Berardi, L. (2011). Water distribution network calibration using enhanced GGA and topological analysis. Journal of Hydroinformatics, vol. 13, no. 4, p. 621-641, DOI:10.2166/hydro.2010.088.
[19] Lamont, P.A. (1981). Common pipe-flow formulas compared with the theory of roughness. Journal American Water Works Association, vol. 73, no. 5, p. 274-280.