Metodološki zvezki, Vol. 4, No. 2, 2007, 165-176

The Scaling Problems in Service Quality Evaluation

Michele Gallo1

Abstract

In service quality evaluation we have to deal with data measured on different kinds of scales. To obtain a measure of the service quality level, a conventional ordinal rating scale is used for each attribute of a service. Moreover, additional information on the customers or on the objective characteristics of the service is usually available (on an interval, ordinal and/or categorical scale). In addition, the importance or weight assigned to the different items must also be considered (compositional scale). These different kinds of data must be analyzed with particular care: the perceived (expected) quality data are transformed to a quantitative scale before a multidimensional data analysis is carried out. In the literature, several techniques have been proposed for quantifying ordinal data while preserving their original characteristics. The aims of this paper are to analyze different ways of quantifying ordinal data and to illustrate how the additional information on the customers or on the service can be used in the multidimensional analysis as external information.

1 Introduction

Customer satisfaction has become a vital concern for companies and organizations in their efforts to improve product and service quality and to maintain customer loyalty within a highly competitive marketplace. It is conceptualized as an affectively laden “fulfillment response” to the service received (Oliver, 1997). Obtaining a measure of satisfaction is not a simple matter, because satisfaction is mostly due to psychologically conditioned assessments. It reflects both emotional and cognitive elements (Oliver, 1993). In the last decade, several Customer Satisfaction Indexes (CSIs) have been proposed (e.g., USA, Fornell et al., 1996; European Union, ECSI Technical Committee, 1998); the structure of all of them is continually under review and subject to modification.
Although the structure of the CSIs is continually subject to modification, the core of the model is in most respects standard. It is encased within a system of cause and effect running from the antecedents of overall customer satisfaction (quality expectation, quality perceived, perceived value and image) to the consequences of overall customer satisfaction (customer loyalty and customer complaints). Eklöf (2000) proposed to distinguish between perceived product quality and perceived service quality, where perceived service quality is the evaluation of the recent consumption experience of associated services such as customer service, conditions of product display, range of services and products, etc. On this basis, we posit that service quality is primarily an antecedent of customer satisfaction (Fornell et al., 1996).

1 Department of Social Science, University of Naples - L’Orientale, L.go S. Giovanni Maggiore 20, 80156 Naples, Italy; mgallo@unior.it

In this work, we do not address the problems associated with the choice of the “better” CSI measurement framework. In the following, only the SERVQUAL model (Parasuraman et al., 1988) is considered for measuring service quality (SQ); most of the results can nevertheless be generalized to other models (D’Ambra and Gallo, 2006). SERVQUAL has a structure based on a set of attributes and dimensions: each attribute is evaluated by an item, and sets of items give the evaluation of the dimensions. All items share the same ordinal rating scale with seven scores. Moreover, an importance or weight (C) is attached to each dimension (or attribute); it is principally used to weight the gap between performance perception (R) and service quality expectation (E). The weights could, however, also be analyzed independently to obtain information on the nature and causes of the interrelationships between the quality dimensions (or attributes).
In this case, we should take into account the constrained ratio scale of the importance data. Likewise, in these studies some additional information is available, which permits us to investigate the degree of satisfaction of homogeneous customer clusters (gender, age, profession, level of education, etc.) with respect to some objective characteristics of the service (in a hospital, for example: procedure, illness, etc.). Including this additional information in the multidimensional analysis of SQ data can yield more accurate results.

Let $R_{N,J}$ ($E_{N,J}$) be the matrix of perceived (expected) results, where $N$ is the number of customers and $J$ is the number of ordinal variables (one variable for each item), all with the same number of categories $l$. Let $C_{N,D}$ be the matrix of importance with $D$ variables.

The nature of the data should be considered before we carry out a multidimensional analysis. In particular, the perception (expectation) evaluation has an ordinal scale. This scale establishes an explicit rank, but not all arithmetic transformations are meaningful, because the distances between points on an ordinal scale are not meaningful. The importance data have a constrained ratio scale. For these data, intervals between values and ratios of values are meaningful, but the unit-sum constraint of the compositional scale causes further problems (Aitchison, 1986). Finally, the additional information can have different kinds of scales: a categorical (or nominal) scale, where there is no explicit ranking of the category labels; an interval scale, where the distances between data are meaningful but zero is not; and ordinal and ratio scales.

The principal purpose of this paper is to point out the necessity of transforming the raw data before carrying out a multidimensional statistical analysis, with a transformation that respects the original nature of the data.
Moreover, the inclusion of additional information is particularly appropriate in the framework of SQ, because it allows us to know the degree of satisfaction of homogeneous clusters of subjects and so to evaluate the SQ in a more precise and objective way.

In Section 2, we consider the quantification of perceived (expected) data. In particular, after a brief review of some techniques for the optimal scaling of ordinal data, where optimal scaling is defined in terms of the correlation matrix of the quantified variables, we propose a new way to quantify the perceived (expected) data based on preserving the different subjective scale of each customer. In Section 3, we define the properties of compositional data and propose a logcontrast transformation of the matrix $C_{N,D}$ in order to perform multivariate analysis. In Section 4, we show how the additional information can be included in a multidimensional analysis of SQ data.

2 Some approaches to the optimal quantification of ordinal data

In the literature, different approaches have been proposed to quantify ordinal data. Most of them minimize a loss function defined on the transformed variables; one possibility is to use the mean squared Euclidean distance between the transformed variables and one hypothetical common variable. A generic loss function is

$J^{-1} \sum_{j=1}^{J} \mathrm{SSQ}(X\beta_j - f_j(h_j))$    (2.1)

where $\mathrm{SSQ}(\cdot)$ denotes the sum of squares of the elements of its argument, $X$ is a matrix of basis vectors of order $(N \times s)$, $\beta_j$ is a vector of $s$ loadings and $f_j(h_j)$ may be any non-linear function of the variable $h_j$ ($j = 1, \dots, J$). Equation (2.1) is used by Kruskal and Shepard (1974), Young, De Leeuw and Takane (1976), and many others. De Leeuw and van Rijckevorsel (1980) and Gifi (1982) use the following alternative loss function

$J^{-1} \sum_{j=1}^{J} \mathrm{SSQ}(X - f_j(h_j))$    (2.2)

where the weights $\beta_j$ are incorporated into the function $f_j(\cdot)$.
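As a minimal sketch of how a loss of the form (2.2) can be evaluated, the following Python fragment treats the one-dimensional case ($s = 1$), where $X$ reduces to a vector of $N$ scores and each $f_j(h_j)$ replaces the categories of variable $j$ by numeric values. The function name `homogeneity_loss`, the toy ratings and the two quantification tables are illustrative assumptions and are not taken from the paper.

```python
def homogeneity_loss(x, ratings, quantifications):
    """Evaluate J^-1 * sum_j SSQ(x - f_j(h_j)) for s = 1.

    x               : list of N scores (the common variable X)
    ratings         : list of J lists; ratings[j][n] is the category
                      chosen by customer n on item j
    quantifications : list of J dicts mapping category -> numeric value,
                      i.e. f_j applied to the categories of item j
    """
    J = len(ratings)
    total = 0.0
    for j in range(J):
        fj = [quantifications[j][cat] for cat in ratings[j]]
        total += sum((xn - fn) ** 2 for xn, fn in zip(x, fj))
    return total / J


# Hypothetical toy data: N = 4 customers, J = 2 items, 3 categories.
ratings = [[1, 2, 3, 3], [1, 1, 2, 3]]
x = [-1.0, -0.5, 0.5, 1.0]

# Equally spaced quantification for both items.
linear = [{1: -1.0, 2: 0.0, 3: 1.0}] * 2
# Quantification given by the mean of x within each category of each item.
adapted = [{1: -1.0, 2: -0.5, 3: 0.75}, {1: -0.75, 2: 0.5, 3: 1.0}]

loss_linear = homogeneity_loss(x, ratings, linear)    # 0.5
loss_adapted = homogeneity_loss(x, ratings, adapted)  # 0.125
```

In this toy case the category means give a smaller loss than the equally spaced scores, which is the kind of improvement an optimal scaling procedure looks for while also updating $X$.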
Equation (2.2) is used because the treatment of missing data becomes simpler, and because variables with a single quantification and variables with multiple quantifications can be analyzed simultaneously. We interpret $f_j(\cdot)$ as an approximation of a function based on a finite amount of data on that function. In the present case, we would like to incorporate into the analysis the underlying monotone structure of the data, so that each variable is treated as ordinal. Moreover, rank-one restrictions are included in (2.2), which implies that the quantifications of each variable $h_j$ in $s$-dimensional space become proportional to each other (Gifi, 1982). For the generic variable $h_j$, we can write $f_j(h_j) = G_j q_j \beta_j'$, where $q_j$ is a vector of single category quantifications, $\beta_j$ is a vector of loadings and $G_j$ is the indicator matrix of the categories of the $j$th variable. Here we propose a B-spline transformation because of its attractive properties (van Rijckevorsel and De Leeuw, 1988). Nevertheless, we could also use fuzzy coding or monotone splines (van Rijckevorsel, 1987; Winsberg and Ramsay, 1983).

2.1 B-spline transformation

The B-spline takes a variable as input and produces more than one variable as output. With a B-spline, the range of a variable $h_j$ ($h_j \in [a,b]$ with $a \neq b$) is partitioned into a number of intervals, each delimited by two boundary points called knots: $a \leq k_1 < \dots$ […] $B_k(h_j)$, the unknown spline coefficient $\alpha_i$ is the only determining parameter of $f_j(\cdot)$ (van Rijckevorsel, 1987). Nevertheless, other methods to determine the optimal coefficients have been proposed in the literature (De Boor, 1978; Schumaker, 1981).

2.2 A quantification of perceived/expected data for the multidimensional analysis

In Section 2.1 we considered an approach based on the difference between the variables, where each categorical value in the $j$th column of $R_{N,J}$ ($E_{N,J}$) is replaced by the corresponding score of the vector $q_j$.
In this way, categories of the same variable are quantified similarly, and categories of different variables are quantified differently ($q_j \neq q_{j'}$ with $j \neq j'$). This characteristic is not required in the quantification of perceived (expected) data. A further effect of this kind of transformation is that customers who give the same rating to an item obtain the same quantification, whereas a customer who gives the same rating to different items obtains different quantifications.

An alternative approach is based on the difference between the “psychological” scales used by different customers. Following this approach, the quantification procedure has to preserve, for each customer, the different origin of the measuring system and the different distance between two points on the scale, as well as the non-linear distance between two subsequent points. Finally, at the time a SERVQUAL questionnaire is completed, each customer has his or her own specific reference system. This is the same for each item of the questionnaire, and it does not change in the short term but only after a long time.

Figure 1: The approach based on the differences between the variables (a) and the single quantification based on the difference between the subjective scales of the customers (b).

To give a graphic example, Figure 1 compares the approach presented in Section 2.1 (a) with the alternative one based on the difference between the customers (b). It is clear that with the first approach non-homogeneous customers have the same quantification, while with the second one the difference between the customers is respected. Moreover, each customer has a single quantification for all the items of the questionnaire and, with respect to the specific reference system, each customer has a quantification that preserves the subjective origin of the measuring system and the different distance between two points on the scale, as well as the non-linear distance between two subsequent points.
To obtain a quantification with the characteristics of the alternative approach, we propose the following strategy. First, each row of the matrix $R_{N,J}$ ($E_{N,J}$) is crisp coded into a matrix $G_i$ ($J \times t$) in accordance with Section 2.1, so that the $j$th row of $G_i$ (with $j = 1, \dots, J$) codes the rating that the $i$th subject has given to the $j$th item. Afterwards, an Alternating Least Squares (ALS) algorithm is used to minimize the following loss function

$N^{-1} \sum_{i=1}^{N} \mathrm{SSQ}(X - G_i q_i \beta_i')$    (2.3)

with $\hat{1}_s X = 0_J$, $X'X = JI$ and $q_i' D_i q_i = 1$, where $\hat{1}_s$ is an $s$-dimensional vector of ones, $0_J$ is a $J$-dimensional vector of zeros, $X$ is a matrix of order $(J \times s)$, $\beta_i$ is a vector of scores, $q_i$ is the vector of single category quantifications of the $i$th customer, and $D_i = G_i'G_i$. Equation (2.3) includes the rank-one restrictions ($Y_i = q_i \beta_i'$), which imply that the quantifications of each customer in $s$-dimensional space are proportional to each other. To minimize the loss function (2.3), the ALS algorithm searches for the optimal solution by satisfying the two centroid principles with respect to $X$ and each $Y_i$ (van Rijckevorsel, 1987). The algorithm consists of the following steps:

- Step 0 Initialize the matrix $X$ by a singular value decomposition of $R$ ($E$) so that $\hat{1}_s X = 0$ and $X'X = JI$
- Step 1 Estimate the matrix $Y_i = D_i^{-1} G_i' X$, $i \in N$
- Step 2 Estimate the vector $\beta_i = Y_i' D_i q_i / q_i' D_i q_i$, $i \in N$
- Step 3 Estimate the vector of single quantifications $q_i = Y_i \beta_i / \beta_i' \beta_i$, $i \in N$
- Step 4 Update the matrix $\hat{Y}_i = q_i \beta_i'$, $i \in N$
- Step 5 Estimate the matrix $X = N^{-1} \sum_{i=1}^{N} G_i \hat{Y}_i$
- Step 6 Center and orthonormalize the matrix $X$
- Step 7 Go to Step 1 until the convergence criterion is reached.

Using the single quantification $q_i$ of each customer, we code the perceived (expected) matrix into the quantified matrix $R^*$ ($E^*$). In this way, the subjective scale of each customer is respected.
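The steps above can be sketched in a few lines of Python for the one-dimensional case ($s = 1$). This is a simplified illustration under stated assumptions, not the author's implementation: with $s = 1$ the rank-one restriction holds automatically, so Steps 2 to 4 reduce to rescaling the category centroids; $X$ is started from the centered item means instead of a singular value decomposition; Step 6 is read as column centering followed by rescaling; and no monotonicity restriction is imposed on the quantifications. The function names and the toy ratings are hypothetical.

```python
import math


def standardize(v):
    """Step 6 for s = 1: center v and rescale it so that v'v = len(v)."""
    n = len(v)
    mean = sum(v) / n
    centered = [e - mean for e in v]
    norm = math.sqrt(sum(e * e for e in centered))
    if norm == 0.0:
        raise ValueError("a constant vector cannot be standardized")
    return [e * math.sqrt(n) / norm for e in centered]


def category_centroids(x, ratings_i):
    """Step 1: y_i = D_i^{-1} G_i' x, plus the diagonal of D_i (counts)."""
    sums, counts = {}, {}
    for xj, c in zip(x, ratings_i):
        sums[c] = sums.get(c, 0.0) + xj
        counts[c] = counts.get(c, 0) + 1
    return {c: sums[c] / counts[c] for c in sums}, counts


def als_quantification(R, max_iter=500, tol=1e-12):
    """Return item scores x, per-customer quantifications q and R*."""
    N, J = len(R), len(R[0])
    # Step 0 (assumption): start from the item means rather than an SVD.
    x = standardize([sum(row[j] for row in R) / N for j in range(J)])
    for _ in range(max_iter):
        Y = [category_centroids(x, row)[0] for row in R]
        # Steps 5 and 6: average G_i y_i over customers, then standardize.
        new_x = standardize(
            [sum(Y[i][R[i][j]] for i in range(N)) / N for j in range(J)]
        )
        change = max(abs(a - b) for a, b in zip(new_x, x))
        x = new_x
        if change < tol:
            break
    q = []
    for row in R:
        y, counts = category_centroids(x, row)
        # Steps 2-4 with s = 1: beta_i is a scalar and q_i is y_i rescaled
        # so that q_i' D_i q_i = 1 (zeros if all centroids vanish).
        beta = math.sqrt(sum(counts[c] * y[c] ** 2 for c in y))
        q.append({c: (y[c] / beta if beta > 0 else 0.0) for c in y})
    R_star = [[q[i][c] for c in R[i]] for i in range(N)]
    return x, q, R_star


# Hypothetical ratings of N = 4 customers on J = 5 items (seven-point scale).
R = [
    [5, 6, 6, 7, 5],
    [2, 3, 3, 5, 2],
    [4, 4, 6, 7, 6],
    [6, 5, 7, 7, 6],
]
x, q, R_star = als_quantification(R)
```

Each row of `R_star` uses that customer's own quantification `q[i]`, so the same rating given by one customer to different items always receives the same value, while two customers giving the same rating may receive different values.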
A multidimensional analysis of the gap between performance and expectation, or of the performance and expectation data separately, can then be carried out.

3 Compositional data

Compositional data have particular properties that pose special problems for imputation, and they can rarely be analyzed with the usual multivariate statistical methods. For each row of the matrix $C_{N,D}$ we define $\tilde{c}_1, \dots, \tilde{c}_D$ as positive quantities with the same measurement scale, $\tilde{c} = (\tilde{c}_1, \dots, \tilde{c}_D)$ with $\tilde{c}_1 \geq 0, \dots, \tilde{c}_D \geq 0$, and $|\tilde{c}|$ as the trace (the sum of the elements) of $\tilde{c}$. The vector $\tilde{c}$ is the basis of the compositional data and $c = \tilde{c}/|\tilde{c}|$ is a composition vector.

More generally, we define $C_{N,D}$ as a compositional data matrix if all its elements are positive and each row is constrained to unit sum, $C 1_D = \hat{1}_N$, where $1_D$ and $\hat{1}_N$ are vectors of ones of dimension $D$ and $N$, respectively. Let $Q = N^{-1}[I_N - N^{-1} 1_N 1_N']$ be the product of $1/N$ and the usual centering projector; then $C'QC$ is the covariance matrix of $C$, called the crude covariance matrix (Aitchison, 1986).

The unit-sum constraint on each row of $C$ implies four difficulties: 1) negative bias, 2) subcomposition, 3) basis, 4) null correlation.

Each row and column of $C'QC$ sums to zero: $1_D' C'QC = 0_D'$, where $0_D$ is a $D$-dimensional vector of zeros. Therefore, for each variable the covariances with the other variables sum to the negative of its variance (the first difficulty). No relationship exists between the crude covariance matrix and the crude covariance matrix of a subcomposition. Therefore a change of subcomposition can substantially influence the covariance (the second difficulty).
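The zero-sum property behind the first difficulty can be checked numerically. The following Python sketch computes the crude covariance matrix $C'QC$ for a small, hypothetical matrix of importance weights (the values and the function name `crude_covariance` are illustrative assumptions) and compares, for each variable, the sum of its covariances with the other variables to its variance.

```python
def crude_covariance(C):
    """Covariance matrix C'QC with Q = N^-1 (I - N^-1 1 1')."""
    N, D = len(C), len(C[0])
    means = [sum(row[d] for row in C) / N for d in range(D)]
    return [
        [
            sum((row[a] - means[a]) * (row[b] - means[b]) for row in C) / N
            for b in range(D)
        ]
        for a in range(D)
    ]


# Hypothetical importance weights of N = 4 customers over D = 3 dimensions;
# every row sums to one.
C = [
    [0.50, 0.30, 0.20],
    [0.20, 0.50, 0.30],
    [0.40, 0.40, 0.20],
    [0.10, 0.30, 0.60],
]
S = crude_covariance(C)

# Each row of S sums to zero (up to rounding) ...
row_sums = [sum(row) for row in S]
# ... so the off-diagonal covariances of a variable add up to minus its variance.
off_diagonal_sums = [
    sum(S[a][b] for b in range(3) if b != a) for a in range(3)
]
```

Since every variance is positive, at least one covariance in each row must be negative, whatever the actual association between the components.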
Likewise, as with the subcomposition, it is not easy to select a basis $\tilde{c}$ for the composition (the third difficulty). The crude correlation matrix of $C$ inherits the unit-sum constraint from the crude covariance matrix. Therefore the correlation between two variables is not free to range over the usual interval $[-1, 1]$. The negative bias causes a radical difference from the standard interpretation of correlation between variables: zero correlation between two ratios does not mean that there is no association (the fourth difficulty).

Moreover, the uninterpretable crude covariance structure is not the only problem of compositional data. Unfortunately, compositional data often exhibit curvature when standard multivariate methods are employed.

Aitchison (1986) described the properties of compositional data in depth and proposed alternative forms of logratio, the most useful of which is based on the geometric mean $g(c)$. Replacing the natural non-negativity condition by the stronger assumption of strictly positive quantities, $w_{i1} > 0, \dots, w_{iJ} > 0$ (see Gallo, 2003), Aitchison (1982) proposes to transform each element $c_{ij}$ of $C$ into the logratio $\log[c_{ij}/g(c)]$, because the resulting matrix of centred logratios $Z$, with generic element $z_{ij} = \log[c_{ij}/g(c)]$, is adequate for a low-dimensional description of compositional variability. Moreover, a generalization of the logratios, called logcontrasts, has useful properties in compositional data analysis. A logcontrast of $c$ is any loglinear combination $u' \log c = u_1 \log c_1 + \dots + u_D \log c_D$ with $u_1 + \dots + u_D = 0$; with respect to the geometric mean $g(c)$, a logcontrast has the property $u' \log c = u' \log(c/g(c))$.

The study of compositions is essentially concerned with the relative magnitudes of $c_{i1}, \dots, c_{iD}$ rather than their absolute values. In this case, ratios between components are meaningful, and those ratios are independent of the arbitrary total.
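The centred logratio and the logcontrast property above can be verified on a single composition. In the following Python sketch the composition `c`, the coefficient vector `u` and the scaling constant 7 are arbitrary illustrative choices; the code also rescales the composition by a positive constant to show that the value of the logcontrast does not depend on the total.

```python
import math


def clr(c):
    """Centred logratio: log(c_j / g(c)), with g(c) the geometric mean."""
    logs = [math.log(v) for v in c]
    log_g = sum(logs) / len(logs)
    return [v - log_g for v in logs]


def logcontrast(u, c):
    """u' log c; the coefficients u are expected to sum to zero."""
    return sum(ui * math.log(ci) for ui, ci in zip(u, c))


c = [0.5, 0.3, 0.2]      # hypothetical composition (unit sum)
u = [1.0, -0.5, -0.5]    # coefficients summing to zero
z = clr(c)

direct = logcontrast(u, c)
# Same logcontrast computed from the centred logratios.
via_clr = sum(ui * zi for ui, zi in zip(u, z))
# Same logcontrast after multiplying every component by k = 7.
rescaled = logcontrast(u, [7.0 * v for v in c])
```

The three values `direct`, `via_clr` and `rescaled` agree up to rounding, and the centred logratios in `z` sum to zero.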
Moreover, any logcontrast is scale free: $u' \log c = u' \log kc$ (with $k > 0$). Aitchison (1986) describes in depth how the logcontrast transformation resolves the difficulties of compositional data. Barceló-Vidal, Martín-Fernández and Pawlowsky-Glahn (2001) show, from a mathematical point of view, that this transformation is not arbitrary.

4 External information in CS analysis

Before a multidimensional analysis of the matrix $R^*$, we can include the additional information available on the customers or on the process (Takane and Shibayama, 1991). In this way the additional information defines a priori levels of hierarchical sampling structures, since it permits us to investigate how well the structures supplied by the a priori information account for the data.

Let $H$ (of order $N \times Q$, where $Q$ is the number of predictor categories, $Q \leq N$) be the external information matrix, which can take a variety of forms. We can consider the following decomposition model of the matrix $R^*$:

$R^* = HT + E$    (4.1)

where $T = (H'H)^{-1}H'R^*$ is the estimated coefficient matrix and $E$ is the error matrix. The terms of the model are column-wise orthogonal, which implies that the sum of squares of $R^*$ is decomposed into the sums of squares of the components of (4.1). The problem of estimating $T$ is equivalent to minimizing $SS(E) = \mathrm{tr}(E'E)$, where $E = R^* - HT = R^* - P_H R^*$ and $P_H = H(H'H)^{-1}H'$ is the orthogonal projection operator onto the subspace spanned by the column vectors of $H$, so that (4.1) is decomposed into two additive components: $R^* = P_H R^* + P_H^{\perp} R^*$, where $P_H^{\perp} = I - P_H$ is the orthogonal projection operator onto the orthogonal complement of that subspace.

Once the data matrix is decomposed according to the additional information (External Analysis), a multidimensional analysis is carried out on $P_H R^*$ and $P_H^{\perp} R^*$ separately (Internal Analysis). An analysis of $P_H R^*$ allows us to incorporate the external information into the analysis, whereas the analysis of $P_H^{\perp} R^*$ allows us to exclude it.
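To make the decomposition concrete, consider the special case in which the external information is a single categorical variable, so that $H$ is the indicator matrix of the customer groups. Under that assumption $P_H R^*$ replaces each customer's row by the mean row of the customer's group, and $P_H^{\perp} R^*$ contains the deviations from those means. The Python sketch below uses hypothetical quantified data and group labels; the function names are illustrative.

```python
def decompose(R_star, groups):
    """Split R* into P_H R* (group means) and P_H^perp R* (residuals).

    groups[n] is the category of customer n, so H is the N x Q indicator
    matrix of these categories (an assumption of this sketch).
    """
    J = len(R_star[0])
    sums, counts = {}, {}
    for row, g in zip(R_star, groups):
        acc = sums.setdefault(g, [0.0] * J)
        for j, v in enumerate(row):
            acc[j] += v
        counts[g] = counts.get(g, 0) + 1
    fitted = [[sums[g][j] / counts[g] for j in range(J)] for g in groups]
    residual = [
        [v - f for v, f in zip(row, fit)] for row, fit in zip(R_star, fitted)
    ]
    return fitted, residual


def ss(M):
    """Sum of squares of all the elements of a matrix."""
    return sum(v * v for row in M for v in row)


# Hypothetical quantified scores of N = 4 customers on J = 2 items,
# with gender as the external information.
R_star = [[1.0, 2.0], [3.0, 4.0], [2.0, 2.0], [6.0, 0.0]]
groups = ["F", "F", "M", "M"]

fitted, residual = decompose(R_star, groups)
# fitted   == [[2.0, 3.0], [2.0, 3.0], [4.0, 1.0], [4.0, 1.0]]
# residual == [[-1.0, -1.0], [1.0, 1.0], [-2.0, 1.0], [2.0, -1.0]]
# ss(R_star) == ss(fitted) + ss(residual), i.e. 74 == 60 + 14
```

The two parts can then be passed separately to the chosen multidimensional method, the first describing the differences between groups and the second the variation within groups.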
By incorporating the external information, we evaluate the performance perception in a more precise and objective way, because we obtain the degree of satisfaction of homogeneous customer clusters. Conversely, by analyzing the second additive component ($P_H^{\perp} R^*$) we evaluate the performance perception while excluding the influence of the external information. Similarly, we can carry out an analysis with the external information on the matrices $E^*$, $R^* - E^*$, $C^*$ and so on.

It is also possible to include the external information before the quantification of the raw data (D’Ambra et al., 2002) to obtain a more parsimonious representation of the data. Nevertheless, we have to respect the original scale of the external information, which is not always possible if the external information is measured on different scale systems.

5 Conclusion and perspectives

The central theme of this paper is the quantification of expected (perceived) data. Different scaling methods have been proposed in the literature to quantify these data (see Zanella, 1999), and most of them have a large number of desirable properties. Nevertheless, we propose here a new approach to scaling expected (perceived) data, because preserving the subjective scale of each customer is necessary for an accurate multivariate analysis. A secondary aim of this paper is a strategy that preserves the role and the properties of the raw data collected in a SQ analysis. The strategy that we propose is based on the following steps:

- quantification of expected (perceived) data by an approach that preserves the subjective scale of each customer,
- inclusion of the additional information available on the customers or on the process,
- search for the latent factors through several independent analyses.

As further developments, we are comparing the different scaling methods that have the desirable monotonicity property.
For example, the M-splines are proportional to B-splines, and a basis of integrated M-splines (I-splines) has the characteristics of a probability distribution function. Moreover, we are checking on real data the benefit of using the approach that we have proposed.

With regard to the importance data, the compositional scale of this kind of data should be considered, because analyzing these data without transformation gives misleading information (Aitchison, 1986). The logcontrast transformation is the most widely used for compositional data; after this transformation, the data can be used in SQ analysis in accordance with the strategy given above.

Acknowledgement

August Viglione is thanked for revising the original text. The referees are also thanked for many useful comments. The present paper is financially supported by the University of Naples – L’Orientale (Department of Social Science 60% fund 2006).

References

[1] Aitchison, J. (1982): The statistical analysis of compositional data (with discussion). Journal of the Royal Statistical Society, 44, 139-177.
[2] Aitchison, J. (1986): The Statistical Analysis of Compositional Data. London: Chapman and Hall.
[3] Barceló-Vidal, C., Martín-Fernández, J.A., and Pawlowsky-Glahn, V. (2001): Mathematical Foundations of Compositional Data Analysis. Proceedings of IAMG’01 – The sixth annual conference of the International Association for Mathematical Geology. Electronic publication.
[4] D’Ambra, L., Amenta, P., and Gallo, M. (2002): Riflessioni sulla Valutazione dei Servizi di Day Surgery nel contesto dell’Analisi Multidimensionale dei dati. In Frosini et al. (Eds): Vita & Pensieri, 153-165.
[5] D’Ambra, L., Amenta, P., and Gallo, M. (2006): La valutazione della Customer Satisfaction. In Carpita et al. (Eds): Guerini Studio, 267-290.
[6] De Boor, C. (1978): A Practical Guide to Splines. Berlin: Springer-Verlag.
[7] De Leeuw, J. and van Rijckevorsel, J. (1980): Homals en Princals. In Diday et al. (Eds): Data Analysis and Informatics.
[8] ECSI Technical Committee (1998): European Customer Satisfaction Index, Foundation and Structure for Harmonized National Pilot Projects. Report Prepared for the ECSI Steering Committee.
[9] Eklöf, J.A. (2000): European Customer Satisfaction Index Pan-European Telecommunication Sector Report Based on the Pilot Studies 1999. Stockholm, Sweden. European Organization for Quality and European Foundation for Quality Management.
[10] Fornell, C., Johnson, M.D., Anderson, E.W., Cha, J., and Bryant, B.E. (1996): The American customer satisfaction index. Nature, purpose and findings. Journal of Marketing, 60, 7-18.
[11] Gallo, M. (2003): Partial Least Squares for compositional data: An approach based on the splines. Italian Journal of Applied Statistics, 15, 349-358.
[12] Gifi, A. (1982): Princals User’s Guide. Leiden: Department of Data Theory.
[13] Kruskal, J.B. and Shepard, R.N. (1974): A nonmetric variety of linear factor analysis. Psychometrika, 39.
[14] Oliver, R.L. (1993): A conceptual model of service quality and service satisfaction: compatible goals, different concepts. In Advances in Service Marketing and Management: Research and Practice, 2, JAI Press.
[15] Parasuraman, A., Zeithaml, V.A., and Berry, L.L. (1994): Reassessment of expectations as a comparison standard in measuring service quality: Implications for further research. Journal of Marketing, 58, 111-124.
[16] Ramsay, J.O. (1982): When data are functions. Psychometrika, 47, 379-396.
[17] Schumaker, L.L. (1981): Spline Functions. New York: John Wiley & Sons.
[18] van Rijckevorsel, J. and De Leeuw, J. (1988): Component and Correspondence Analysis Dimension Reduction by Functional Approximation. New York: John Wiley & Sons.
[19] van Rijckevorsel, J. (1987): The Application of Fuzzy Coding and Horseshoes in Multiple Correspondence Analysis. DSMO Press.
[20] Winsberg, S. and Ramsay, J.O. (1980): Monotonic transformations to additivity using splines. Biometrika, 67, 669-674.
[21] Winsberg, S. and Ramsay, J.O. (1983): Monotone spline transformations for dimension reduction. Psychometrika, 48.
[22] Young, F.W., De Leeuw, J., and Takane, Y. (1976): Regression with qualitative variables: an alternating least squares method with optimal scaling features. Psychometrika, 41.
[23] Young, F.W., Takane, Y., and De Leeuw, J. (1978): The principal components of mixed measurement level multivariate data: an alternating least squares method with optimal scaling features. Psychometrika, 46, 357-388.
[24] Zanella, A. (1999): A Stochastic model for the analysis of customer satisfaction: some theoretical aspects. Statistica, LIX.