Managing Global Transitions
International Research Journal
Volume 8, Number 2, Summer 2010, ISSN 1581-6311

Editor: Boštjan Antončič, University of Primorska, Slovenia

Associate Editors: Roberto Biloslavo, University of Primorska, Slovenia; Štefan Bojnec, University of Primorska, Slovenia; Evan Douglas, Sunshine State University, Australia; Robert D. Hisrich, Thunderbird School of Global Management, USA; Mitja Ruzzier, University of Primorska, Slovenia; Anita Trnavčevič, University of Primorska, Slovenia; Zvone Vodovnik, University of Primorska, Slovenia

Editorial Board: Zoran Avramović, University of Novi Sad, Serbia; Terrice Bassler Koga, Open Society Institute, Slovenia; Cene Bavec, University of Primorska, Slovenia; Jani Bekö, University of Maribor, Slovenia; Andrej Bertoncelj, University of Primorska, Slovenia; Heri Bezić, University of Rijeka, Croatia; Vito Bobek, University of Maribor, Slovenia; Branko Bučar, Pace University, USA; Suzanne Catana, State University of New York, Plattsburgh, USA; David L. Deeds, University of Texas at Dallas, USA; David Dibbon, Memorial University of Newfoundland, Canada; Jeffrey Ford, The Ohio State University, USA; Ajda Fošner, University of Primorska, Slovenia; William C. Gartner, University of Minnesota, USA; Tim Goddard, University of Prince Edward Island, Canada; Noel Gough, La Trobe University, Australia; George Hickman, Memorial University of Newfoundland, Canada; András Inotai, Hungarian Academy of Sciences, Hungary; Hun Joon Park, Yonsei University, South Korea; Stefan Kajzer, University of Maribor, Slovenia; Jaroslav Kalous, Charles University, Czech Republic; Maja Konečnik, University of Ljubljana, Slovenia; Leonard H.
Lynn, Case Western Reserve University, USA; Monty Lynn, Abilene Christian University, USA; Neva Maher, Ministry of Labour, Family and Social Affairs, Slovenia; Massimiliano Marzo, University of Bologna, Italy; Luigi Menghini, University of Trieste, Italy; Marjana Merkač, Faculty of Commercial and Business Sciences, Slovenia; Kevin O'Neill, State University of New York, Plattsburgh, USA; David Oldroyd, Independent Educational Management Development Consultant, Poland; Susan Printy, Michigan State University, USA; Jim Ryan, University of Toronto, Canada; Hazbo Skoko, Charles Sturt University, Australia; David Starr-Glass, State University of New York, USA; Ian Stronach, Manchester Metropolitan University, UK; Ciaran Sugrue, University of Cambridge, UK; Zlatko Šabič, University of Ljubljana, Slovenia; Mitja I. Tavčar, University of Primorska, Slovenia; Nada Trunk Širca, University of Primorska, Slovenia; Irena Vida, University of Ljubljana, Slovenia; Manfred Weiss, Johann Wolfgang Goethe University, Germany; Min-Bong Yoo, Sungkyunkwan University, South Korea; Pavel Zgaga, University of Ljubljana, Slovenia

Editorial Office: University of Primorska, Faculty of Management Koper, Cankarjeva 5, SI-6104 Koper, Slovenia. Phone: ++386 (0)5 610 2021. E-mail: mgt@fm-kp.si. www.mgt.fm-kp.si

Managing Editor: Alen Ježovnik
Editorial Assistant: Tina Andrejašič
Copy Editor: Alan McConnell-Duff
Cover Design: Studio Marketing JWT
Text Design and Typesetting: Alen Ježovnik

Table of Contents

123 Iranian Angle on Non-Audit Services: Some Empirical Evidence. Mahdi Salehi, Mehdi Moradi
145 Assessing Microfinance: The Bosnia and Herzegovina Case. Anne Welle-Strand, Kristinu Kjellesdal, Nick Sitter
167 Application of Bootstrap Methods in Investigation of Size of the Granger Causality Test for Integrated VAR Systems. Lukasz Lach
187 Development, Validity and Reliability of Perceived Service Quality in Retail Banking and its Relationship With Perceived Value and Customer Satisfaction. Aleksandra Pisnik Korda, Boris Snoj
207 Achieving Increased Value for Customers Through Mutual Understanding Between Business and Information System Communities. Dijana Močnik

Iranian Angle to Non-Audit Services: Some Empirical Evidence

Mahdi Salehi
Mehdi Moradi

The purpose of this paper is to present the views of Iranian accountants, as well as shareholders, on non-audit services and their effects on audit independence in Iran. In other words, the authors attempt to answer the following question: does the provision of non-audit services by an Iranian auditor impair audit independence? In order to gather usable data, a suitable questionnaire was designed and developed. The results of this study show that the participants strongly believe that non-audit services may impair audit independence. It is interesting to note that, although auditors offer non-audit services to their clients, they believe that offering such services makes audit independence questionable.
Further, the results reveal that literate participants moderately agree that NAS has a negative effect on audit independence, whereas illiterate participants strongly agree that NAS has a negative effect on audit independence. This is the first paper to include two groups of participants: the first group consists of auditors in general, whom we may call academicians with a claim to auditing literacy, and the second group consists of non-academicians, including stakeholders who may not have auditing literacy skills. This may be useful for future studies of non-audit services and their effect on audit independence.

Key Words: auditor, independence, non-audit services, Iran
JEL Classification: M41, M42

Dr Mahdi Salehi is an Assistant Professor in the Accounting and Management Department, Ferdowsi University of Mashhad, Iran.
Dr Mahdi Moradi is an Assistant Professor in the Accounting and Management Department, Ferdowsi University of Mashhad, Iran.
Managing Global Transitions 8 (2): 123-144

Introduction

This paper provides some preliminary empirical evidence on the determinants and consequences (impairment of auditor independence) of Non-Audit Services (NAS) provided by auditors in Iran. The requirement of auditor independence arises from the need to establish the independent auditor as an objective and trustworthy arbiter of the fair presentation of financial results (Salehi and Nanjegowda 2006; Salehi 2007). Indeed, Mautz and Sharaf (1964) and Berryman (1974) posit that independence is the cornerstone of the audit profession and an essential ingredient of users' confidence in financial statements. Since independent auditors occupy a position of trust between the management of the reporting entity and users of its financial statements, they must be perceived to be operating independently on the basis of sound auditing standards and strong ethical principles.
Over the years, an extensive literature on the subject of auditor independence has developed; a focal point of much of this literature has been to identify those factors which do and do not impact upon auditor independence. Among all the factors identified in the research which might threaten the independence of the auditor, the provision of NAS has been the subject of the most heated debate (Canning and Gwilliam 1999). In particular, the collapse of Enron in the US and the demise of Andersen have generally undermined confidence in the world's capital markets. Much of the concern has focused on accounting and auditing practices, and particularly on the independence of auditors. Auditor independence is fundamental to public confidence in the audit process and the reliability of auditors' reports (Salehi and Abedini 2008; Salehi 2008a). The audit report adds value to the financial statements provided by managers (capital seekers) to shareholders (capital providers) through the independent verification it provides (Johnstone, Sutton, and Warfield 2001; Salehi, Mansoury, and Pirayesh 2008). The audit is not just a benefit to investors. It also reduces the cost of information exchange for both sides (Dopuch and Simunic 1982) and benefits management by providing a signaling mechanism to the markets that the information which management is providing is reliable (Salehi, Mansouri, and Azar 2009). It has been further argued that the auditors' liability insurance serves to indemnify investors against losses. So, the auditors must be independent in order to be patrons of the shareholder(s). However, in recent years external auditing practice has become questionable precisely because of the provision of NAS to audit clients. Before going to the heart of the problem, we briefly explain the nature of independence.
Independence

One of the key factors of the auditor's work is independence; without independence, users of financial statements cannot rely on the auditors' report (Barzegar and Salehi 2008). In short, the external system of audit, with its final product, the audit opinion, adds credibility to the financial statements so that users can rely on the information presented and, as a result, the entire system of financial reporting is enhanced (Sucher and Maclullich 2004). Independence is the core of this system. At the same time, the concept of audit independence is fuzzy, the rules governing it are complex and burdensome, and a re-examination is long overdue (Elliott and Jacabson 1992; Salehi and Azary 2008). De Angelo (1981) defined auditor independence as the conditional probability of reporting a discovered breach. Arens et al. (1999) defined 'independence in auditing' as taking an unbiased viewpoint in the performance of audit tests, the evaluation of the results and the issuance of audit reports. Independence includes the qualities of integrity, objectivity and impartiality. Knapp (1985) approaches independence from a different angle. He views it as 'the ability to resist client pressure'. According to Flint (1988), independence is not a concept which lends itself to universal constitutional prescription, but one for which the constitutional prescription will depend on what is necessary to satisfy the criteria of independence in the particular circumstances. The Independence Standards Board (2000) defines independence as:

Freedom from pressures and other factors that impair, or are perceived to impair, an auditor's willingness to exercise objectivity and integrity when performing an audit; it is the absence of certain activities and relationships that may impair, or may be perceived to impair, an auditor's willingness to exercise objectivity and integrity when performing an audit.
There are two approaches to audit independence which have commonly been referred to as independence of fact and independence of appearance. According to Mautz and Sharaf (1964), there are three dimensions of auditor independence which can minimize or eliminate potential threats to the auditor's objectivity:

1. Programming independence includes: freedom from managerial interference with the audit program; freedom from any interference with audit procedures; and freedom from any requirement for the review of the audit work other than that which normally accompanies the audit process.
2. Investigative independence encompasses: free access to all records, procedures, and personnel relevant to the audit; active co-operation from management personnel during the audit examination; freedom from any management attempt to specify activities to be examined or to establish the acceptability of evidential matter; and freedom from personal interests on the part of the auditor leading to exclusions from or limitations on the audit examination.
3. Reporting independence includes: freedom from any feeling of obligation to modify the impact or significance of reported facts; freedom from pressure to exclude significant matters from internal audit reports; avoidance of intentional or unintentional use of ambiguous language in the statement of facts, opinions, and recommendations and in their interpretations; and freedom from any attempt to overrule the auditor's judgment as to either facts or opinions in the internal audit report.

The immediate objective of the audit is to improve the reliability of information used for investment and credit decisions; according to Elliott and Jacabson (1992) the principles of independence are as follows: Audit independence improves the cost-effectiveness of the capital market by reducing the likelihood of material bias by auditors that can undermine the quality of the audit. Therefore, they play a vital role in the economic sector.
However, some factors may have a negative effect on independence; these should be identified by professionals, and severe action should be taken to reduce such factors.

Factors Affecting Independence

Several situations may impair the auditor's independence, such as contingent fee arrangements, gifts, auditor's contact with personnel or operations, NAS, outsourcing, opinion shopping, reporting relationships, and other matters. Among the factors that affect auditor independence that have been studied are:

1. The effects of gifts (Pany and Reckers 1988)
2. The purchase discount arrangement (Pany and Reckers 1988)
3. The audit firm size (Shockley 1981; Gul 1989; Salehi 2008b)
4. The provision of Management Advisory Services (MAS) by the audit firm (Shockley 1981; Knapp 1985; Gul 1989; Bartlett 1993; Teoh and Lim 1996; Abu Bakar, Abdul Rahman and Abdul Rashid 2005)
5. The client's financial condition (Knapp 1985; Gul and Tsui 1992)
6. The nature of conflict issue (Knapp 1985)
7. The audit firm's tenure (Shockley 1981; Teoh and Lim 1996)
8. The degree of competition in the audit services market (Knapp 1985; Gul 1989)
9. The size of the audit fees or relative client size (Gul and Tsui 1992; Bartlett 1993; Teoh and Lim 1996; Pany and Reckers 1988)
10. The audit committee (Gul 1989; Teoh and Lim 1996; Salehi, Mansouri, and Azar 2009)
11. Practicing NAS by auditors (Beattie, Fearnley, and Brandt 1999; Raghunandan 2003; Salehi and Rostami 2009)

In this paper the authors have only attempted to clarify NAS and its effect on the independence of auditors.
The audit failures that have been reported have led to major criticism of the auditing profession worldwide by exposing the weaknesses of the profession in terms of safeguarding shareholders' and stakeholders' interests (Citron 2003; Gwilliam 2003; Higson 2003; Brandon, Crabtree, and Maher 2004; Cullinan 2004; Fearnley and Beattie 2004; Krishnan and Levine 2004; Mahadevaswamy and Salehi 2008; Salehi and Rostami 2009). Some of this criticism arose from NAS practices by auditors, which are the subject of this survey.

Non-Audit Services

NAS may be any services other than audit provided to an audit client by an incumbent auditor. As the demand for business expert services grew over the late 20th century, public accounting firms expanded the scope of their services to include corporate and individual tax planning, internal audit outsourcing, and consulting related to mergers and acquisitions, information systems, and human resources. Recent concerns about auditor independence have focused on the provision of NAS to audit clients. It has been found that auditors believe that their work will be used as a guide for investment, valuation of companies, and predicting bankruptcy; furthermore, third parties felt that there is a strong relationship between the reliability of the auditor's work and the investment decision. The auditor's work also facilitates the process of economic development through the presentation of reliable information concerning the financial position of companies (Wahdan et al. 2005). Public accounting firms have undergone dramatic changes in the last 25 years. Over the last decade the proportion of the revenue of large public accounting firms derived from providing NAS grew from 12 percent to 32 percent (Public Oversight Board 2000), suggesting that the economic bond between auditors and their clients strengthened over this time as auditors delivered more consulting-oriented services to their audit clients.
Based on the amounts reported in the Public Accounting Report, last year audit fees for the top seven accounting firms were approximately USD 9.5 billion. These accounting firms audited over 80 percent of all registrants, and virtually every company with a large market capitalization. What is more, the audit and accounting fees of the largest accounting firms, as a percentage of their revenue, have decreased significantly, from 70 percent of total revenue in 1976 for the Big Eight to 34 percent of total revenue for the same firms in 1998 (Ashbaugh 2004). Given the shift in revenue streams of public accounting firms, it is important to discuss the services that audit firms provide. An accountant becomes a Certified Public Accountant (CPA) to engage in attestation services, that is, to conduct audits. Scholars are concerned that benefits, either from cost savings or from increases in fee revenue, can strengthen the economic bond between auditors and their clients, which can further threaten auditor independence. Therefore, the main question that arises when auditors provide or could provide both audit and NAS is whether the auditors are able to conduct their audits impartially, without being concerned about losing or failing to gain additional services, and without considering the subsequent economic implications for the audit firm (Lee 1993). Auditors seek to provide NAS because of the considerable economies of scope that ensue, i.e. cost savings that arise when both types of service are provided by the same firm. These economies of scope are of two types: knowledge spillovers that originate in the transfer of information and knowledge, and contractual economies that arise from making better use of assets and/or safeguards already developed when contracting and ensuring quality in auditing. Thus far, globalization in accounting and assurance services has also created the multidisciplinary nature of large audit firms (Brierley and Gwilliam 2003).
These multidisciplinary firms offer audit and NAS to audit clients, and this has become one of the major concerns regarding the potential auditor independence dilemma (Quick and Rasmussen 2005). The prohibition of specified non-audit services is predicated on three basic principles:

• an auditor cannot function in the role of management,
• an auditor cannot audit its own work, and
• an auditor cannot serve in an advocacy role for its client.

The range of services now offered by audit firms to both the public and private sector is wide. These may be summarized as follows (Salehi, Mansouri, and Pirayesh 2009):

• system design and IT,
• training,
• services for payroll,
• risk management advice,
• taxation, including tax compliance and tax planning advice,
• corporate recovery and insolvency,
• forensic and litigation support,
• mergers and acquisitions services,
• transaction support and follow up,
• public offering,
• recruitment and human resources, and
• portfolio monitoring.

Provision of some of these services may pose a real threat to independence in the case of an audit client. The principal threats which arise from the provision of non-audit services are:

• Self-interest: the increase in economic dependence on the client.
• Self-review: taking management decisions and auditing one's own work.
• Advocacy: acting for the client's management in adversarial circumstances.
• Familiarity: becoming too close to the client's management through the range of services offered.

In the United States, the Sarbanes-Oxley Act of 2002 implemented a ban on nine non-audit services, as listed below:

1. Bookkeeping and other services related to the audit client's accounting records or financial statements
2. Financial information systems design and implementation
3. Appraisal or valuation services and fairness opinions
4. Actuarial services
5. Internal audit services
6. Management functions
7. Human resources plan
8. Broker-dealer services
9. Legal services

However, in some countries external auditors still provide NAS, which leads to auditor dependence.

Review of the Literature

After several scandals of international and national dimensions, especially after the Enron collapse, professionals, academics, and researchers have focused on non-audit services. Many writers maintain that NAS impair objectivity, as well as independence, whereas others argue that there exists no association between NAS and audit quality. In short, prior studies find that the impact of NAS on audit quality is negative, positive, or nil; that is, researchers have come to three different conclusions about the effect of NAS on audit independence. Below, we briefly explain these three schools of thought on NAS.

Studies Indicating Negative Effect of NAS on Auditor Independence

Several prior studies suggest that NAS has negative effects on auditor practices and auditor independence. Antle (1984) considers auditor independence to be an auditor's freedom from management influence as desired by the company's owners. He considered that since management controls the auditor's fee, an auditor can ignore independence in favor of management, unless a control mechanism is implemented. A survey carried out by Wines (1994) suggests that auditors receiving NAS fees are less likely to qualify their opinion than auditors who do not receive such fees, based on his empirical analysis of audit reports issued between 1980 and 1989 by 76 companies publicly listed on the Australian stock exchange. He found that auditors of companies with clean opinions received a higher proportion of non-audit fees than did auditors of companies with at least one qualification. In relation to management advisory services (MAS), Gul and Tsui (1992) conducted a survey, also using Australian companies, indicating that provision of management advisory services affects the informativeness of earnings.
They found evidence that the explanatory power of earnings for returns is less for firms that provide MAS. Frankel, Johnson, and Nelson (2002) found empirically that levels of discretionary accruals are higher for firms whose auditors provided NAS than for firms whose auditors do not provide such services. According to Beeler and Hunton (2002), contingent economic rents, such as potential non-audit revenue, increase unintentional bias in the judgments of auditors. Frankel, Johnson, and Nelson (2002) and Larcker and Richardson (2004) found some evidence of potential links between NAS and earnings management measures. Beck, Frecka, and Solomon (1988) argue that non-audit fees further increase the client-auditor bond by increasing the portion of the audit firm's revenue that is derived from serving a client. Hackenbrack and Elms (2002) revisit the ASR 250 fee disclosures and find a negative association between stock returns and non-audit fees for sample companies with the highest ratio of non-audit fees. Brandon, Crabtree, and Maher (2004), opponents of the joint provision of audit and NAS, claimed that auditors would not perform their audit services objectively and that joint provision would impair perceived independence. Mitchel et al. (1993) believed that the joint provision of audit and NAS to audit clients would cause unfair competition due to the use of audit services to the same client.

Studies Indicating No Effect of NAS on Auditors' Independence

Several prior studies suggest that NAS has no effects on auditor practices and auditor independence. Glezen and Millar (1985); Corless and Parker (1987); Wines (1994); and Kinney, Palmrose, and Schoolz (2004) did not find systematic evidence showing that auditors violate their independence as a result of clients purchasing relatively more NAS.
According to Frankel, Johnson, and Nelson (2002), several studies have re-examined the negative effects of NAS on audit quality and found that NAS has no effect on auditors' independence. Abdel-Khalik (1990) reported no significant difference in audit fees between clients purchasing audit service only and those purchasing both audit and NAS. Using Discretionary Accruals (DA) as a surrogate for auditor objectivity, Reynolds, Deis, and Francis (2004) find no association between NAS and DA, and conclude that little evidence exists supporting the negative effects of NAS on auditors' objectivity. O'Keefe, Simunic, and Stein (1994) extended Davis, Ricchiute, and Trompeter (1993), using disaggregated labor hours by rank (partner, manager, senior, and staff) for clients of the Big Six firms in 1989, and also using the percentage of tax fees to audit fees and the percentage of management consulting fees to audit fees as independent variables. They fail to find evidence that audit effort is reduced in a joint provision scenario. Palmrose (1999) found that less than one percent of auditor litigation has NAS as part of the basis on which the lawsuits are founded. Jenkins and Krawczyk (2001) asked 83 Big Five and 139 non-Big Five accounting professionals and 101 investor participants to rate their perceptions of auditor independence, integrity, and objectivity for scenarios in which an auditor provides no NAS, a nominal amount of NAS (3 percent of total client revenues), or a material amount of NAS (40 percent). Although they found that investors' perceptions of independence and decisions on whether or not to invest were not affected by either level of non-audit service provision, investors (non-Big Five professionals) did consider the 40 percent level of NAS to be significant in their investment decisions.
Sori (2006) investigated the perception of Malaysian auditors, loan officers and senior managers of public listed companies on the effect of joint provision of audit and NAS on auditor independence. The majority of the respondents agreed with the provision of NAS to audit clients by the audit engagement team. Chung and Kallapur (2003) report no statistically significant association between abnormal accruals and the ratio of client fees to total audit firm fees.

Studies Indicating the Positive Effect of NAS on Auditors' Independence

Several prior studies suggest that NAS has positive effects on auditor practices and auditor independence. Gul (1989) studied the perceptions of bankers in New Zealand and found that the effect of provision of NAS was significantly and positively associated with auditor independence. In Malaysia, Gul and Yap (1984) reported that NAS provision increased respondents' confidence in auditor independence. Arruanda (1999) pointed out that joint provision of audit and NAS would reduce overall cost, raise the technical quality of auditing, and enhance competition. This would ultimately increase auditor independence. Carlton and Perloff (2005) emphasize that the outcome is a more efficient allocation of scarce resources without the need to duplicate efforts to recreate the required input. Kinney, Palmrose, and Schoolz (2004) noted that knowledge of a client's information system and tax accounting could spill over to the audit, improve the information available to the auditor and thus improve audit quality, which in turn would increase the probability that problems are discovered. Auditors' concern for reputation (Dopuch and Simunic 1982) and legal liability (Palmrose 1988; Shu 2000) should drive auditors to maintain their independence. Larcker and Richardson (2004) also document the relation between the level of NAS fees and accruals, especially for firms with weak governance.
Their results suggest that auditors of firms that purchase large amounts of NAS are less likely to allow the firm to make choices that lead to large abnormal accruals. They interpret their findings as suggesting that auditors working for firms with weak governance may play a more important role in the governance process in limiting choices of abnormal accruals and that enhanced knowledge through NAS has a merely incremental positive effect on audit quality. Ghosh, Kallapur, and Moon (2006) studied 8,940 firm-year observations over the 2000-2002 period and found that the NAS fees ratio (ratio of NAS fees to total fees from the given client) is negatively associated with Earnings Response Coefficients (ERC). Sharama (2006) studied the impact of auditor-provided NAS and audit-firm tenure on audit efficiency. He was opposed to regulations restricting the joint provision of audit and NAS. His study provided evidence that an increase in the provision of NAS reduces audit lag. He also provides evidence that extended audit-firm tenure reduces audit lag, while shorter audit-firm tenure increases audit lag. Gore, Pope, and Singh (2001) report a positive association between the provision of non-audit services and earnings management in UK companies, suggesting that auditors' reporting standards are affected by whether the auditor also provides non-audit services to the audit client. Lennox (1999) suggests that NAS increases auditors' knowledge of clients as well as the probability of discovering problems. His empirical data, collected from UK firms, show a significant, though weak, positive relationship between NAS fees and auditors, a surrogate for audit quality.

Motivation of the Study

Companies currently demand a broad set of NAS (Wallman 1996).
CPA firms are responding by offering such varied services as investment banking, strategic management planning, human resource planning, computer hardware and software installation, and internal audit outsourcing services (Berton 1995; AICPA 1997). Growth in the revenues earned from these services has been significant. In constant 1999 dollars, NAS fees grew from USD 2.8 billion in 1990 to USD 15.7 billion in 1999, an increase of over 460 percent (Antle 2000). One of the major public concerns to have emerged from the Enron collapse is the extent to which audit firms are providing NAS to their audit clients. Much of the current publicly expressed concern about the integrity of auditors and the influence of NAS on auditor independence is based on opinion and assertion relating principally to the current cases, and observers generally are not looking beyond these cases. Further, the Sarbanes-Oxley Act of 2002 prohibited auditing firms from providing certain NAS to audit clients and left open the possibility that other currently non-prohibited services could also be banned. However, Iranian legislators have not yet applied these rules to the Iranian environment (Salehi 2008b). Further, the review of the literature shows that researchers have not reached the same conclusion; in other words, they have come to three divergent conclusions. So, in this study, we investigate whether the provision of NAS has a positive effect, a negative effect or no effect on audit independence.

Research Methodology

In light of the above literature, the objective of this study is to examine the reaction of auditors and shareholders regarding NAS and consulting services provided by auditors to the same clients in Iran.
In order to provide an accurate answer to this question, the authors have designed and developed a questionnaire, suitable for gathering useful data, based on the method used by previous researchers (Jenkins and Krawczyk 2001; Frankel, Johnson, and Nelson 2002; Brandon, Crabtree, and Maher 2004; Krishnan and Levine 2004). Our selected method of investigation is a questionnaire, for three reasons. First, since it is acknowledged that current theory is not well specified in Iran, the general objective of this study is to incorporate qualitative behavioural factors concerning audit independence into the research design, in order to assess the relative influence of each factor type. This necessitated the use of a direct method. Second, other specific objectives necessitate the use of direct methods to elicit non-public information. Finally, closed-form questions can be identified from the extant auditor choice literature. The research instrument was designed with close reference to the literature on questionnaire design. The questionnaire contains two parts, namely (a) bio-data, and (b) a section including several questions regarding the rejection/acceptance level of NAS by participants in Iran. The questionnaire was designed on the basis of the Likert scale: all participants were requested to indicate their degree of agreement or disagreement with each question using integers from -2 to 2, where -2 represents strong disagreement, -1 disagreement, 0 neither, 1 agreement, and 2 strong agreement with the hypotheses.
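The response coding described above can be illustrated with a minimal Python sketch. The label wording, the `LIKERT_SCORES` mapping, the helper names and the sample answers are all hypothetical; the paper specifies only the integer range from -2 (strong disagreement) to 2 (strong agreement).

```python
from statistics import mean

# Hypothetical label-to-score mapping for the five-point scale:
# -2 = strong disagreement ... 2 = strong agreement, 0 = neither.
LIKERT_SCORES = {
    "strongly disagree": -2,
    "disagree": -1,
    "neither": 0,
    "agree": 1,
    "strongly agree": 2,
}

def score_responses(labels):
    """Convert a list of response labels to integer scores in [-2, 2]."""
    return [LIKERT_SCORES[label.strip().lower()] for label in labels]

def summarize(scores):
    """Return the mean score and the share of respondents who agree (score > 0)."""
    agreeing = sum(1 for s in scores if s > 0)
    return mean(scores), agreeing / len(scores)

# Illustrative (made-up) answers to a single questionnaire item.
answers = ["Agree", "Strongly agree", "Neither", "Disagree", "Agree"]
scores = score_responses(answers)
mean_score, share_agreeing = summarize(scores)
print(scores, mean_score, share_agreeing)
```

A positive mean score under this coding indicates net agreement with the statement, which is how the mean degrees of agreement reported in the results can be read.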
The questionnaires were distributed among the respondents from 1 June to 30 October 2008. The Cronbach's alpha coefficient, used to assess the reliability of the questionnaire, was 0.946 for the final questionnaire. On the basis of the important factors we postulated three hypotheses, as follows:

H1 The provision of bookkeeping services by auditors to the same clients has a negative effect on audit independence.
H2 The provision of managerial consultancy services to the same clients has a negative effect on audit independence.
H3 A large amount of audit fees has a negative effect on audit independence.

Results of the Study

For the data analysis, we first wanted to know from all participants' views of NAS which kinds of effects they have on audit independence, so the binomial test is used to assess how many of the participants accept the effects of the independent factors on the dependent one. Then the non-parametric ANOVA (Friedman) test is conducted. In the last part, the statistical population is sub-divided into two groups: the first group comprises those who have accounting and auditing literacy skills, which, according to table 1, amounted to 2009 participants, and the second group comprises those participants who do not have accounting and auditing literacy skills (142 participants). At this stage the authors want to know whether there are any differences between the literate group and the illiterate one, so the Mann-Whitney U test is employed.

Table 1 Bio-data of participants

Item               Field of work              Frequency   Percent
Academic degrees   Accounting and Auditing         2009     93.39
                   Other                            142      6.61
                   Total                           2151    100.00
Job position       Accountants                      825     38.35
                   Internal auditor                 741     34.48
                   Financial management             123      5.72
                   Financial analyst                144      6.68
                   Stockholders                     318     14.77
                   Total                           2151    100.00

In total, out of 2450 questionnaires distributed among the participants, 2151 were completed, a response rate of about 88 percent.
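The binomial test mentioned above can be sketched as follows. This is a minimal illustration, assuming a two-sided test against a test proportion of 0.5 and a normal approximation with continuity correction (table 2 reports an asymptotic significance, but the authors' software and exact settings are not stated). The function name is hypothetical; the counts are those reported in the text for the second hypothesis, where 1676 of 2151 participants agreed.

```python
from math import erfc, sqrt


def binomial_test_normal_approx(successes, n, p0=0.5):
    """Two-sided binomial test via the normal approximation.

    Assumption: a continuity correction of 0.5 is applied; this is one
    common convention, not necessarily the one used in the study.
    """
    expected = n * p0
    sd = sqrt(n * p0 * (1 - p0))
    # Shrink the absolute deviation by 0.5 before standardizing.
    deviation = max(abs(successes - expected) - 0.5, 0.0)
    z = deviation / sd
    # Two-sided tail probability of a standard normal variable.
    p_value = erfc(z / sqrt(2))
    return z, p_value


# Agreement count reported in the text for the second hypothesis.
agreeing, total = 1676, 2151
z, p = binomial_test_normal_approx(agreeing, total)
print(f"observed proportion = {agreeing / total:.4f}, z = {z:.2f}, p = {p:.3g}")
```

With a test proportion of 0.5, a small p-value indicates that the split between agreeing and disagreeing participants is unlikely to be an even one.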
Out of 2151 participants, 2009 (93.39 percent) had accounting knowledge; the remaining 142 participants (6.61 percent) had no accounting knowledge. The majority of participants therefore had accounting and auditing knowledge. By job position, 825 were accountants (38.35 percent), 741 were internal auditors (34.48 percent), 123 were financial managers (5.72 percent), 144 were financial analysts (6.68 percent), and 318 were stockholders (14.77 percent). The demographic characteristics of participants are summarized in table 1.

First the binomial test was conducted to assess how many of the participants accept the effects of the independent factors on the dependent one. For this purpose we divided participants into two groups: those agreeing and those disagreeing with the hypotheses. The results revealed that 1399 participants (65.04 percent) agreed that the provision of bookkeeping services by auditors to the same clients has a negative effect on audit independence; therefore the first hypothesis is significantly confirmed (p < 0.05). The mean degree of agreement for this hypothesis was 0.498 (sd = 0.121, 95 percent confidence interval from 0.32 to 0.56). The results also show that 1676 participants (77.92 percent) agreed that the provision of managerial consultancy services to the same clients has a negative effect on audit independence. This hypothesis (H2) is therefore accepted, and the mean degree of agreement for it was 0.244 (sd = 0.161, 95 percent confidence interval from 0.182 to 0.323).

Regarding the third hypothesis, the results reveal that the majority of participants confirmed that a large amount of audit fees has a negative effect on audit independence (H3); 1399 participants (65.04 percent) agreed with the third hypothesis, thus this hypothesis is also significantly confirmed (p < 0.05). The mean degree of agreement was 0.185 (sd = 0.041, 95 percent confidence interval from 0.112 to 0.255). The result of testing the hypotheses by the binomial test is shown in table 2.

Table 2 Dependent variable effect on detecting distortions and test result by binomial test

Hypothesis          Category      Frequency   Observed prop.   Test prop.   Asymp. sig.   Result
H1 (bookkeeping)    Disagreeing         452             0.21          0.5          0.00   Confirmed
                    Agreeing           1399             0.79
                    Total              2151             1.00
H2 (manag. cons.)   Disagreeing         475             0.22          0.5          0.00   Confirmed
                    Agreeing           1676             0.80
                    Total              2151             1.00
H3 (audit fees)     Disagreeing         452             0.21          0.5         0.011   Confirmed
                    Agreeing           1399             0.79
                    Total              2151             1.00

Table 3 reports the mean degree of agreement or disagreement that participants expressed on the Likert scale, together with other statistics.

Table 3 Mean degree of participants' agreement or disagreement and other statistics

Independent variable     Mean degree   Standard deviation   95% conf. int.
Bookkeeping                    0.498                0.121   0.32-0.51
Managerial consultancy         0.241                0.161   0.182-0.323
Audit fees                     0.185                0.041   0.112-0.255

Notes: Positive numbers represent the mean degree of agreement, while negative numbers represent the mean degree of disagreement.

The competency of auditors has the most effect on detecting important distortion by the auditor in order to improve audit independence. The three factors were compared in their effects on auditor independence by the non-parametric ANOVA (Friedman) test. The results showed no significant differences among the three factors.

In this part the statistical population is sub-divided into two groups: the first group comprises those who have accounting and auditing literacy skills, which according to table 1 amounted to 2009 participants, and the second group comprises those participants who do not have accounting and auditing literacy skills (142 participants). At this stage the authors wanted to know whether there are any differences between the literate and the illiterate group, so the Mann-Whitney U test is employed.

Table 4 Results of the Mann-Whitney U-test

Hypothesis          Respondents   Mean   Std. dev.   Mean difference   Mann-Whitney U-test   Z* test
H1 (bookkeeping)    ALP           2.36       0.81             -0.24                -2.578     0.010
                    AIP           2.60      1.142
H2 (manag. cons.)   ALP           2.50      0.909             -0.51                -5.873     0.000
                    AIP           3.01      1.095
H3 (audit fees)     ALP           2.46      0.878             -0.62                -6.468     0.000
                    AIP           3.08      1.178

Notes: ALP = accounting-literate participants; AIP = accounting-illiterate participants. Mean and std. dev. are group statistics; the mean difference is the paired difference between the two groups.

Table 4 indicates that accounting-literate participants have a different perception from accounting-illiterate participants regarding the first hypothesis (bookkeeping). In other words, although both groups strongly agree that NAS have a negative effect on audit independence, there is a gap between the two groups. As table 4 reveals, the mean value of literate participants stood at 2.36, whereas the mean value of illiterate participants stood at 2.60. So, we can conclude that the illiterate participants agree more strongly that providing NAS has a negative effect on audit independence. As shown in table 4, the results of the test confirmed that there is a gap between illiterate participants and literate participants on the negative effect of NAS on audit independence.
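The group comparison described above can be sketched as a Mann-Whitney U test computed from rank sums. This is a minimal illustration under stated assumptions: tied scores receive average ranks, the variance of U is tie-corrected, and significance comes from a two-sided normal approximation. The function names and the two small score lists are hypothetical and are not the study's data.

```python
from math import erfc, sqrt


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(group_a, group_b):
    """Return U for group_a, the z statistic and a two-sided p-value."""
    pooled = list(group_a) + list(group_b)
    n1, n2 = len(group_a), len(group_b)
    n = n1 + n2
    ranks = average_ranks(pooled)
    u_a = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    # Tie-corrected variance of U under the null hypothesis.
    counts = {}
    for value in pooled:
        counts[value] = counts.get(value, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:  # every score identical: no evidence of a difference
        return u_a, 0.0, 1.0
    z = (u_a - n1 * n2 / 2) / sqrt(var_u)
    return u_a, z, erfc(abs(z) / sqrt(2))


# Hypothetical agreement scores for the two groups, not data from the study.
literate = [2, 2, 3, 1, 2, 3, 2]
illiterate = [3, 2, 4, 3, 3]
u, z, p = mann_whitney_u(literate, illiterate)
print(f"U = {u}, z = {z:.3f}, p = {p:.3f}")
```

A negative z for the first group means that its scores tend to rank below those of the second group, which matches the direction of the negative mean differences in table 4.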
Turning to the second hypothesis, table 4 shows that illiterate participants strongly agreed (mean value 3.01) that managerial consultancy has a negative effect on audit independence, whereas the literate participants moderately agreed (mean value 2.50). The result of the test confirmed that there is a gap between the two groups on this matter. Concerning the last hypothesis, table 4 shows that the illiterate participants strongly agreed that audit fees have a negative effect on audit independence, whereas the literate participants moderately agreed with this statement (mean value 2.46). The results confirmed that there is a gap between the two groups of participants on this statement as well.

Conclusion and Remarks

According to the results of this survey, the provision of NAS to audit clients has strong negative effects on auditor independence. The surveys conducted in several countries and reviewed above obtained the same result as this study does for Iran: NAS has a negative effect on audit independence. With regard to the results of table 4, we can conclude that illiterate participants have more negative perceptions than literate participants. In other words, although both groups agree that NAS has a negative effect on audit independence, the literate participants only moderately agree. A large gap was found in this area. To close this gap, two options may be available. First, NAS should be banned, as in other countries around the world; and second, illiterate participants should be given more information related to accounting and auditing. In sum, accounting and auditing legislators in many countries around the world have enacted rules and regulations for reducing NAS.
However, unfortunately in Iran there are still no rules or regulations governing the provision of NAS by external auditors to their audit clients. Last but not least, it is worth noting that the Iran Audit Organisation is the only body that enacts accounting regulations, yet this important and vital organization itself offers both legal audit services and NAS to clients. In short, to improve external auditors' independence, this organization should enact new rules for this economic environment. Otherwise, the same old story, such as another Enron collapse, may happen to Iranian corporations.

References

Abdel Khalik, A. R. 1990. The jointness of audit fees and demand for MAS: A self-selection analysis. Contemporary Accounting Research 6 (2): 295-322.
Abu Bakar, A., A. Abdul Rahman, and H. Abdul Rashid. 2005. Factors influencing auditor independence: Malaysian loan officers' perceptions. Managerial Accounting 20 (8): 804-22.
AICPA. 1997. Serving the public interest: A new conceptual framework for auditor independence. New York: Commerce Clearing House.
Antle, R. 1984. Auditor independence. Journal of Accounting Research 22 (1): 1-20.
Antle, R. 2000. Comment letter regarding the revision of the Commission's auditor independence requirements. New Haven, CT: Yale University Press.
Arens, A., J. Loebbecke, T. Iskandar, S. Susela, and M. Boh. 1999. Auditing in Malaysia: An integrated approach. Selangor: Prentice-Hall.
Arruanda, B. 1999. The provision of non-audit services by auditors: Let the market evolve and decide. International Review of Law and Economics 19 (4): 513-31.
Ashbaugh, H. 2004. Ethical issues related to the provision of audit and non-audit services: Evidence from academic research. Journal of Business Ethics 52 (2): 143-8.
Bartlett, R. W. 1993. A scale of perceived independence: New evidence on an old concept. Accounting, Auditing & Accountability Journal 6 (2): 52-67.
Barzegar, B., and M. Salehi. 2008. Re-emerging of the agency problem: Evidences from practicing non-audit services by independent auditors. Journal of Audit Practice 5 (2): 55-66.
Beattie, V., S. Fearnley, and R. Brandt. 1999. Perception of auditor independence: UK evidence. Journal of International Accounting, Auditing and Taxation 8 (1): 67-107.
Beck, P. J., T. J. Frecka, and I. Solomon. 1988. A model of the market for MAS and audit services: Knowledge spillovers and auditor-auditee bonding. Journal of Accounting Literature 7:50-64.
Beeler, J., and J. Hunton. 2002. Contingent economic rents: Insidious threats to auditor independence. Advances in Accounting Behavioral Research 5:21-50.
Berryman, R. G. 1974. Auditor independence: Its historical development and some proposals for research. In Contemporary auditing problems: Proceedings of the 1974 Arthur Andersen/University of Kansas symposium on auditing problems, ed. H. F. Stettler, 1-15. Lawrence, KS: University of Kansas.
Berton, L. 1995. Big six's shift to consulting accelerates. Wall Street Journal, September 12.
Brandon, D., A. Crabtree, and J. Maher. 2004. Non-audit fees, auditor independence and bond ratings. Auditing: A Journal of Practice and Theory 23 (2): 89-103.
Brierley, J., and D. Gwilliam. 2003. Human resource management issues in audit firms: A research agenda. Managerial Auditing Journal 18 (5): 431-8.
Canning, M., and D. Gwilliam. 1999. Non-audit services and auditor independence: Some evidence from Ireland. European Accounting Review 8 (3): 401-19.
Carlton, D. W., and J. M. Perloff. 2005. The firm and costs in modern industrial organization. New York: Addison Wesley.
Chung, H., and S. Kallapur. 2003. Client importance, non-audit services, and abnormal accruals. The Accounting Review 78 (4): 931-55.
Citron, D. 2003. The UK's framework approach to auditor independence and the commercialization of the accounting profession. Accounting, Auditing and Accountability Journal 16 (2): 244-72.
Corless, J. C., and L. M. Parker. 1987. The impact of MAS on auditor independence: An experiment. Accounting Horizons 1 (3): 25-9.
Cullinan, L. 2004. Enron as a symptom of audit process breakdown: Can the Sarbanes-Oxley Act cure the disease? Critical Perspectives on Accounting 15 (6/7): 853-64.
Davis, L. R., D. N. Ricchiute, and G. Trompeter. 1993. Audit effort, audit fees and the provision of non-audit services to audit clients. The Accounting Review 68 (1): 135-50.
De Angelo, L. E. 1981. Auditor independence, low balling, and disclosures regulation. Journal of Accounting and Economics 3:183-99.
Dopuch, N., and D. Simunic. 1982. Competition in auditing: An assessment. In Fourth symposium on auditing research, 401-5. Chicago, IL: University of Illinois.
Elliott, R. K., and P. Jacobson. 1992. Audit independence: Concept and application. The CPA Journal 62 (3): 34-9.
Fearnley, S., and V. Beattie. 2004. The reform of the UK's auditor independence framework after the Enron collapse: An example of evidence based policymaking. International Journal of Auditing 8 (2): 117-38.
Flint, D. 1988. Philosophy and principles of auditing: An introduction. London: Macmillan.
Frankel, R. M., M. F. Johnson, and K. K. Nelson. 2002. The relation between auditors' fees for non-audit services and earnings quality. The Accounting Review 77 (supplement): 71-105.
Ghosh, A., S. Kallapur, and D. Moon. 2006. Audit and non-audit fees and capital market perceptions of auditor independence. Working paper, Baruch College.
Glezen, G. W., and J. A. Millar. 1985. An empirical investigation of stockholder reaction to disclosures required by ASR No. 250. Journal of Accounting Research 23 (2): 859-70.
Gore, P., P. Pope, and A. Singh. 2001. Non-audit services, auditor independence, and earnings management. Working paper, Lancaster University.
Gul, F. 1989. Bankers' perceptions of factors affecting auditor independence. Accounting, Auditing & Accountability Journal 2 (3): 40-51.
Gul, F., and J. Tsui. 1992. An empirical analysis of Hong Kong bankers' perceptions of auditor ability to resist management pressure in an audit conflict situation. Journal of International Accounting, Auditing and Taxation 1 (2): 177-90.
Gul, F. A., and T. H. Yap. 1984. The effects of combined audit and management services on public perception of auditor independence in developing countries: The Malaysian case. The International Journal of Accounting Education and Research 20 (1): 95-108.
Gwilliam, D. 2003. Audit methodology, risk management and non-audit services. London: The Institute of Chartered Accountants in England and Wales.
Hackenbrack, K., and H. Elms. 2002. Mandatory disclosure and the joint sourcing of audit and non-audit services. Working paper, University of Florida.
Higson, A. 2003. Corporate financial reporting: Theory and practice. London: Sage.
Independence Standards Board. 2000. Statement of independence concepts: A conceptual framework for auditor independence. Exposure draft, Independence Standards Board.
Jenkins, J. G., and K. Krawczyk. 2001. The relationship between non-audit services and perceived auditor independence. Working paper, North Carolina State University.
Johnstone, K. M., M. H. Sutton, and T. D. Warfield. 2001. Antecedents and consequences of independence risk: Framework for analysis. Accounting Horizons 15 (1): 1-18.
Kinney, W. R., Z. Palmrose, and S. Scholz. 2004. Auditor independence, non-audit services, and restatements: Was the US government right? Journal of Accounting Research 42 (3): 561-88.
Knapp, M. 1985. Audit conflict: An empirical study of the perceived ability of auditors to resist management pressure. The Accounting Review 60 (2): 202-11.
Krishnan, J., and B. Levine. 2004. Discipline with common agency: The case of audit and non-audit services. Accounting Review 79 (1): 173-200.
Larcker, D. F., and S. A. Richardson. 2004. Fees paid to audit firms, accrual choices, and corporate governance. Journal of Accounting Research 42 (3): 625-58.
Lee, A. 1993. Corporate audit theory. Devon: Chapman and Hall.
Lennox, C. S. 1999. Non-audit fees, disclosure and audit quality. The European Accounting Review 8 (2): 239-52.
Mahadevaswamy, G. H., and M. Salehi. 2008. Audit expectation gap in auditor responsibilities: Comparison between India and Iran. International Journal of Business and Management 3 (11): 134-46.
Mahdi S., and V. Rostami. 2009. Audit expectation gap. International Journal of Academic Research 1 (1): 140-46.
Mautz, R. K., and R. Sharaf. 1964. The philosophy of auditing. Sarasota, FL: American Accounting Association.
Mitchel, A., P. Sikka, A. Puxty, and H. Wilmott. 1993. A better future for auditing. London: University of East London.
O'Keefe, T. B., D. A. Simunic, and M. T. Stein. 1994. The production of audit services: Evidence from a major public accounting firm. Journal of Accounting Research 3 (2): 241-61.
Palmrose, Z. V. 1999. Empirical research on auditor litigation considerations and data. Sarasota, FL: American Accounting Association.
Pany, K., and P. Reckers. 1988. Auditor performance of MAS: A study of its effects on decisions and perceptions. Accounting Horizons 2 (2): 31-8.
Public Oversight Board. 2000. The panel on audit effectiveness report and recommendations. Stamford, CT: Public Oversight Board.
Quick, R., and W. Rasmussen. 2005. The impact of MAS on perceived auditor independence: Some evidence from Denmark. Accounting Forum 29 (2): 137-68.
Raghunandan, K. 2003. Non-audit services and shareholders ratification of auditors' appointments. Auditing: A Journal of Practice and Theory 22 (1): 155-63.
Reynolds, J. K., D. R. Deis, Jr., and J. R. Francis. 2004. Professional service fees and auditor objectivity. Auditing: A Journal of Practice and Theory 23 (1): 29-52.
Salehi, M. 2007. Reasonableness of audit expectation gap: Possible approach to reducing. Journal of Audit Practice 4 (3): 50-9.
Salehi, M. 2008a. Corporate governance and audit independence: Empirical evidences from Iran. International Journal of Business and Management 3 (12): 46-54.
Salehi, M. 2008b. Evolution of accounting and auditors in Iran. Journal of Audit Practice 5 (4): 57-74.
Salehi, M., and B. Abedini. 2008. Iranian angle: Worth of audit report. SCMS Journal of Indian Management 5 (2): 82-90.
Salehi, M., and Z. Azary. 2008. Fraud detection and audit expectation gap: Empirical evidence from Iranian bankers. International Journal of Business and Management 3 (10): 65-77.
Salehi, M., A. Mansouri, and Z. Azar. 2009. Audit independence and audit expectation gap: Empirical evidence from Iran. International Journal of Economic and Finance 1 (1): 165-74.
Salehi, M., A. Mansouri, and R. Pirayesh. 2009. Audit committee: Iranian imperative. SCMS Journal of Management 6 (2): 70-81.
Salehi, M., A. Mansouri, and R. Pirayesh. 2008. A study to evaluate the need of an audit committee: An empirical evidence from Iran. Knowledge Hub 4 (2): 39-50.
Salehi, M., and K. Nanjegowda. 2006. Audit expectation gap: The concept. Journal of Audit Practice 3 (4): 69-73.
Salehi, M., and V. Rostami. 2009. Reactions to non-audit services: Empirical evidence of emerging economy. Interdisciplinary Journal of Contemporary Research in Business 1 (4): 17-39.
Sharama, D. S. 2006. Evidence on the impact of auditor provided non-audit services and audit-firm tenure on audit efficiency. Working paper, Auckland University of Technology.
Shockley, R. A. 1981. Perceptions of auditors' independence: An empirical analysis. The Accounting Review 56 (4): 785-800.
Shu, S. 2000. Auditor resignations: Client effects and legal liability. Journal of Accounting and Economics 29:173-205.
Sori, Z. M. 2006. Empirical research audit, non-audit services and auditor independence, staff paper the Malaysian case. The International Journal of Accounting Education and Research 20 (1): 95-108.
Sucher, P., and K. Maclullich. 2004. A construction of auditor independence in the Czech Republic: Local insights. Accounting, Auditing and Accountability Journal 17 (2): 276-305.
Teoh, H. Y., and E. Lim. 1996. An empirical study of the effects of audit committees, disclosure of non-audit fees and other issues on audit independence: Malaysian evidence. Journal of International Accounting, Auditing and Taxation 5 (2): 231-48.
Wahdan, M. A., P. Spronck, H. F. Ali, E. Vaassen, and H. J. van den Herik. 2005. Auditing in Egypt: A study of challenges, problems and possibility of an automatic formulation of the auditor's report. http://ticc.uvt.nl/~pspronck/pubs/Wahdan2005c.pdf.
Wallman, S. M. H. 1996. The future of accounting. Part III: Reliability and auditor independence. Accounting Horizons 10 (4): 76-97.
Wines, G. 1994. Auditor independence, audit qualifications and the provision of non-audit services. Accounting and Finance 34 (1): 75-86.

Assessing Microfinance: The Bosnia and Herzegovina Case

Anne Welle-Strand
Kristian Kjollesdal
Nick Sitter

Microfinance is often hailed both as a tool for fighting poverty and as a tool for post-conflict reconciliation. This paper explores the use of microfinance in post-civil war Bosnia and Herzegovina, assessing its results in terms of both goals. As it combined high unemployment with a highly educated population in an institutionally open context, Bosnia and Herzegovina provides a crucial test of the effect of microfinance. If unambiguous signs of success cannot be found in a case with such favorable conditions, this would raise serious questions about the potential benefits of microfinance. The paper draws together evidence from a series of independent reviews of microfinance in Bosnia and Herzegovina, to assess its impact in terms of economic performance, the economic system, social welfare and post-conflict integration. Based on this case study, microfinance appears a better tool for dealing with poverty than with social integration or institution building.
Key Words: microfinance, post-conflict, poverty alleviation, economic development, Bosnia and Herzegovina
JEL Classification: G21, O1

Dr Anne Welle-Strand is a Professor at the Norwegian School of Management BI, Norway. Kristian Kjollesdal is a researcher at the Comte Analysebyrå, Norway. Dr Nick Sitter is a Professor at the Norwegian School of Management BI, Norway, and Central European University, Hungary.
Managing Global Transitions 8 (2): 145-166

Introduction

Microfinance is often hailed both as a tool for fighting poverty and as a tool for post-conflict reconciliation. Since the 2000s microfinance has increasingly been employed as a means of poverty reduction. Its international profile as a tool for poverty alleviation was secured in 2006, when Muhammad Yunus and Grameen Bank were awarded the Nobel Peace Prize. Yet, since the 1970s, microfinance has also been seen as part of a post-conflict reconstruction strategy. It is in recent years that the strategy has been more frequently used (Nagarajan and McNulty 2004). In Bosnia and Herzegovina (hereinafter BH), both approaches to microfinance were used after the 1992-1995 war left the country in ruin.

The BH microfinance sector developed rapidly, transforming microfinance institutions (MFIs) from donor-funded institutions into financially sustainable microcredit organizations (MCOs). The sector gradually became institutionalized. Many commentators have therefore proclaimed the BH microfinance sector a success. However, voices have also been raised against this evaluation, pointing to the country's stagnation in developing the small and medium enterprise (SME) sector (Bateman various years).

Microfinance is extensively used as a development tool, and is largely supported by existing financial, technical and political resources. However, less is known about microfinance in post-conflict situations.
For instance, Woodworth (2006) argues that more research is needed on the complexities of managing MFIs in times of conflict and post-conflict, while Nagarajan and McNulty (2004) find that implementing effective microfinance in specific post-conflict contexts is not yet well understood. These and similar studies suggest that while post-conflict microfinance strategies may indeed resemble normal strategies, there are also important differences, for example the trust-building aspect of microfinance emphasized by Doyle (1998). However, we can go even further in exploring the relationship between microfinance and conflict. For example, we can explore the relationship between the social environment and microfinance in order to discover what impact microfinance has on post-conflict environments. Such an inquiry would be in line with the general approach applied in the microfinance research field, which is mainly preoccupied with scrutinizing the social and financial impact of microfinance.

In the present study, evaluation of the financial impact is carried out in terms of: (i) the immediate effect on profit, business start-ups and economic performance, as well as (ii) the effect on the development of the broader economic system including institutions and the rule of law. In terms of its social impact, the focus is on (iii) the welfare and social equality of the users of microfinance, as well as (iv) post-conflict reconciliation and integration. BH represents a key test case for this type of inquiry because it mirrors a broader debate about the impact of microfinance: opinion among scholars and practitioners is divided about the impact and contribution of microfinance to the country's overall development.

Therefore, this report has three main objectives: first, it will provide a brief overview of microfinance as a tool for poverty reduction and as used in post-conflict situations, as well as a critique of microfinance.
Second, it will outline the case of the BH microfinance sector: its establishment, development, current trends and impact. Third, it will identify what we can learn from this case as well as offer avenues of further research, focusing on issues of evaluation.

Microfinance, Poverty and Post-Conflict Reconciliation

Microfinance (MF) is a development tool designed to address issues of poverty, under-development and marginalization. It is based on a simple idea: to provide poor individuals with access to microloans. These microloans allow a client to start a micro-business. In the best-case scenario, such micro-businesses develop into small and medium enterprises. The MFIs that provide these loans are often supported by government or international donor funding. Over time, these institutions aim at commercializing their operations and achieving financial self-sustainability.

As a development tool, microfinance has been used both to reduce poverty and to support countries recovering from a conflict or a major disaster. MF is seen as an important factor in reaching the Millennium Development Goals (Littlefield, Morduch and Hashemi 2003). MFIs that receive donor funding usually include poverty reduction in their mission. Based on a review of impact studies that took place in the period 1994-2002, Littlefield et al. (2003) find that microfinance goes beyond just business loans: it affects investments in health and education, management of household requirements, and other cash needs. For example, since 1989 'Freedom from Hunger' has worked with local partners to develop and distribute a cost-effective strategy to improve the nutritional status and food security of poor households in rural areas of Africa, Latin America and Asia. MkNelly and Dunford conducted impact evaluation studies of these programs in Ghana in 1998 and in Bolivia (CRECER, Credit with Education Program) in 1999.
In Ghana, clients' economic, social and health status were shown to improve due to microfinance (MkNelly and Dunford 1998). In Bolivia, MkNelly and Dunford (1999) documented that microfinance led to improved nutritional and health status of the clients' families, as well as higher involvement in local government. These studies point to what Boudreaux and Cowen (2008) named 'the micromagic of microcredit.' Boudreaux and Cowen (2008, 31) explain: 'With microcredit, life becomes more bearable and easier to manage. The improvements may not show up as an explicit return on investment, but the benefits are very real.'

According to Boudreaux and Cowen (2008), microcredit is an alternative to money lending, which is the traditional way of lending money to the poor part of the population. As such, microcredit is a more humane way of providing access to credit for the poor. They compare the microcredit and moneylender approaches and suggest some advantages of the microcredit approach, such as easing debt burdens as well as formalizing and institutionalizing business relationships. Compared to traditional banks, on the other hand, microcredit institutions also lend to people who work in the informal sector (Boudreaux and Cowen 2008). In short, microfinance initiatives reduce poverty, promote the education of children, improve health and empower women.

Because poverty and war are sometimes linked, microfinance is also often used in post-conflict contexts. Conflicts cause degradation of both quality of life and the economic situation. They lead to dysfunctional states; in some cases, states practically cease to exist. Many post-conflict environments suffer from a lack of financial and social capital, infrastructure and functioning relationships (Nagarajan 1999). Business activities are adversely affected by such unstable macro-economic frameworks.
Humanitarian aid projects are suitable for the immediate post-conflict period, helping the population overcome starvation and diseases. Microfinance can be a means for managing the transition from humanitarian relief to economic reconstruction and sustainable development (Seibel 2006; Hudon and Seibel 2007). It can be an integral part of 'jump-starting' a crippled economy, aiming to rebuild and recover local economies (Nagarajan 1999). It can also foster legitimate business activities by facilitating linkages between refugee and local markets, as well as undermining informal lending.

However, microfinance may also have a broader impact in terms of peace and reconciliation. For example, Nagarajan (1999) emphasizes the possibility of MFIs playing a role in the restoration of social capital when they provide long-term viable services. Doyle (1998) shows how microfinance initiatives in Rwanda helped local Hutu and Tutsi populations to overcome their differences following the civil war in the 1990s, and to find common ground in business development through microfinance. Generalizing about such effects may still be premature, but there are signs that microfinance might help restore and build social capital through creating a meeting point and facilitating business relations.

The main criticism of microfinance is that it may have an adverse effect on broader development. This applies particularly to post-communist states, because the legacy of communism (the 'second world') may be very different from conditions in developing countries (the 'third world'). A 2007 issue of Development and Transition addressed the current state of the private sector in post-communist Eastern Europe and asked how development agencies can strengthen their work in five areas, one of which was microfinance.
A central argument focused on the inadequate attention paid to the local and global conditions and post-socialist legacies when designing 'blanket' reforms such as generic SME support, commercializing the microfinance sector and liberalizing the business environments (Hughes and Slay 2007). A stronger critique asks whether microfinance can actually undermine medium-term economic development because it supports inefficient activities. This relates specifically to commercial microfinance. Bateman (2007a) criticized 'new wave commercial microfinance' institutions that avoid international and government financial support and supervision, while extending microcredit to many poor individuals and communities. These commercial MFIs tend to support microenterprises that operate well below the minimum efficient scale and have little realistic chance of long-run survival. This, Bateman argues, has two negative consequences: first, a high rate of microenterprise exit caused by the saturation phenomenon within the informal sector; and second, high opportunity costs for the countries, since a standard commercial microfinance business model does not make it possible for microenterprises to deploy advanced technologies, skills and product and process innovations. Bateman found that there is little solid evidence to confirm that commercial microfinance facilitates sustainable economic and social development. Reviewing the data from Central and Eastern Europe, Galusek (2007) found that MFIs in the region had served over 4 million clients (predominantly low-income families and microenterprises) by the end of 2005, but that, despite the rapid growth in microfinance in the region, there was no evidence that MFIs were successful in reaching the low-income groups on a large scale (Galusek 2007).
He emphasized that the commercialized approach of MFIs is a consequence of the donor community having encouraged such a model, and called for microfinance strategies and products to respond to the needs of the region.
In a recent paper, Roodman and Morduch (2009) argue that many of the oft-cited positive impacts of microfinance originate in two studies from Bangladesh, Pitt and Khandker (1998) and Khandker (2005), and are in fact based on faulty statistical methods. After replicating the analyses of the same datasets, with slightly different statistical techniques, Roodman and Morduch found that 'reverse or omitted-variable causation is driving the results, and that the endogenous credit-consumption relationship varies substantially by subsample, as well as borrower sex, which can explain the seeming gender differential in impact' (Roodman and Morduch 2009, 3). Their paper argues that, with few exceptions, the ideas that microfinance effectively reduces poverty, that poverty reduction is especially effective when the borrowers are female, and that the poorer the recipient the greater the benefit, are results of incorrectly applied statistical analyses rather than evidence of the true effect of microfinance.
A broader debate concerns the issue of whether microfinance fits a country's strategy for economic growth. Calling for 'jobs, not microcredit,' Karnani (2007a) reviewed macroeconomic data and found that although microcredit yields some non-economic benefits, it does not significantly reduce poverty. Even though microfinance can make life better at the 'bottom of the pyramid,' creating jobs and increasing worker productivity is a better way to get rid of poverty. Unless microfinance directly affects the jobless, it is merely a way of transforming employees into micro-entrepreneurs - simply by replacing old businesses with microcredit-funded micro-businesses. Such crowding-out results neither in net job gains nor in income gains (Storey 1994).
Therefore, Karnani (2007b) suggested that romanticizing the poor as 'resilient and creative entrepreneurs' harms the same poor individuals in two ways: it under-emphasizes modern enterprises as well as legal, regulatory and social mechanisms to protect the poor; and it overemphasizes the impact of microcredit. Accordingly, Karnani (2007a) suggested that governments, businesses and civil society should work together to reallocate their resources away from microfinance and instead support larger enterprises in labor-intensive industries. This formula, he claims, worked well in China, Korea, and Taiwan.
Even though the microfinance concept has been present for more than thirty years, the evidence on its effects is at best mixed. It is still difficult to identify any sustainable development trajectories arising from this experience. Countries such as India, China, Thailand, Vietnam, Malaysia, Taiwan, Brazil and South Korea have been successfully dealing with underdevelopment by employing a variety of state and non-state interventions as well as institutional innovations not related to microfinance. China, Vietnam and South Korea have significantly reduced poverty in recent years with little microfinance activity. Bangladesh, Bolivia and Indonesia have not been as successful at reducing poverty despite the influx of microcredit (Karnani 2007a). Somewhere in between these extremes lie the countries of Eastern Europe, which started channeling donor funds into microfinance in the 1990s as a response to a number of processes, ranging from the conflict and post-conflict era to deindustrialization.
Microfinance in Bosnia and Herzegovina
Using Bosnia and Herzegovina as a case study allows for testing microfinance both as a tool for post-conflict mitigation and as a means for economic development and poverty reduction in a post-communist economy. BH represents what can be labeled a best-case scenario for testing the impact of microfinance initiatives.
First, BH still had a highly educated work force when the conflict ended - what Demirguc-Kunt et al. (2007) called an 'entrepreneurial middle class.' However, war activities resulted in a per capita GDP drop of 75 percent (from 2,400 USD to 600 USD) registered over a five-year period, from 1990 to 1995 (Demirguc-Kunt et al. 2007). Consequently, three out of four could not afford basic family necessities. Second, most commonly identified favorable conditions were present. The state-owned enterprises (SOEs) had been broken up during the war - leaving highly educated people unemployed and displaced. There was also a lack of capital for business start-ups (SDN 2001). Consequently, there was an opportunity for microfinance to provide sought-after capital to suitable borrowers. There may be parallels here to the importance of trust (social capital) and affordable credit (loans to small enterprise) in post-conflict reconstruction in other European states, such as Italy after the Second World War (Weiss 1988; Putnam 1993). Third, weak political and economic institutions meant that MFIs could be set up with relative ease. In the formative years of the microfinance institutions the regulatory framework was very limited, which allowed the MFIs significant leeway to establish their lending policies. These factors meant that the 'new poor' in post-conflict BH represented a different clientele than the poor in Africa and Asia, since they were educated and usually possessed some physical assets. Hartarska and Nadolnyak (2007, 4) offer their description of the new poor:
The potential microentrepreneurs were people who before the war might have had sophisticated private businesses but were displaced or, alternatively, people who, before the war, were factory workers but became unemployed after the industry collapsed (post-war unemployment reached as much as 85%).
Painting a less optimistic picture, Ohanyan (2002, 398) has underlined the complexity of BH's post-socialist, post-conflict context in the following way:
The post-Communist conflict cases are characterized by the problem of multiple transitions, which refer to the variety of policy goals that international organizations working in the country bring to the table. Specifically, transitions (1) from war to peacetime economy, (2) from command to market economy, and (3) from humanitarian assistance to long-term development assistance are the major policy goals driving the international involvement in BH.
Although it warrants some focus on possible favorable conditions, it is argued that microfinance in a post-conflict environment is not dramatically different from other, 'normal' environments (Larson 2001). The core idea still is to provide small amounts of credit to those deemed not creditworthy in a traditional, commercial banking setting. A core condition is that there must be a low intensity of conflict. This is necessary, among other things, to ensure the safety and working conditions of the MFI staff, their offices and the clients. This is the first of three essential conditions set out by Doyle (1998). Doyle's second condition is that markets be reopened. In order for an MF initiative to be successful there must be a certain resumption of economic activity, so that the local population can believe in the changes at hand. The third condition concerns long-term displacement, which is needed in order to ensure some reiteration of relations. Doyle finds that displacement of at least 18 months is reasonable for creating a possibility for the economic investments to yield returns. This is especially important if the main objective of MF is to initiate economic growth, not to serve humanitarian goals in the wake of disaster. Adopting a long-term view also makes MF seem more stable, rather than a quick, one-off fix (Woodworth 2006).
All three conditions were present in BH. Doyle also lists a range of other helpful conditions, including the existence of a functioning commercial banking system; an absence of hyperinflation, which secures a predictable value of money over time; and a certain density of population that increases the possibility for economic linkages. Enabling legislation for MFIs is also preferred in order to make the future more predictable for the microfinance lenders. It is also preferable for the providers of MF to have access to locally skilled and educated labor to staff the organization, while more social capital is preferable to less, due to its effect on demand, scale, training needs and operational efficiency. Finally, trust in the local currency and financial institutions is preferred. MFIs should also be cautious about their client target when offering credit in a post-conflict setting (Woodworth 2006). They should not offer credit to everyone as a peace-building measure, but keep in mind the probability of repayment and economic development.
BH is sometimes hailed as an excellent example of success of a microcredit enterprise in post-conflict situations, considering its financially sustainable microfinance institutions (Woodworth 2006). The MFIs initiated their activities shortly after the 1995 Dayton Peace Agreement ended the war, and over the ten following years 50 MFIs were established (Ribic 2005). The increase in the number of MFIs in BH followed a general trend of increased funding for microfinance activities. This happened in an environment in which financial sector officials accepted NGOs giving loans and government officials made the registration process for NGOs quick and simple (Woodworth 2006). Today, the BH MFIs are some of the largest in Eastern Europe, financially self-sustainable and operating in a competitive environment. Both short- and medium-term microfinance strategies have been applied in BH (Ohanyan 2002).
While the first approach sees microfinance as an immediate peace-building instrument targeting segments of the population and offering financial services to these specific clients, the second approach is more commercially focused on the long-term sustainability of the microfinance institutions. The first approach tends to be favored by humanitarian organizations - such as the UNHCR - in order to reintegrate minorities, offer a social safety net and minimize the social cost of transition (Ohanyan 2002; Hudon and Seibel 2007). The second approach was favored by some of the international donors funding the MFIs, such as the World Bank. Overall there seems to be a consensus that the environment for growth in microfinance in BH was good. Financial officials accepted that NGOs were providing the credit, and government officials facilitated the registration process.
The brief history of microfinance in BH can be divided into two periods, before and after 2000. The initial period, from 1996 to 2000, covered the early years of microcredit during which legislation was nearly nonexistent. This coincided with the first of two World Bank-financed microfinance projects in BH: Local Initiatives Project I and II, covering the time periods 1996-2000 and 2002-2005, respectively. The World Bank was joined by the United Nations High Commissioner for Refugees (UNHCR), the Netherlands, Italy, Japan, Switzerland and Austria in financing the LIP I project, which had the following three objectives (World Bank 2001):
• LIP I should respond to urgent needs by targeting demobilized soldiers, displaced persons, returning refugees and widows.
• It should commence a process of establishing financially sustainable MFIs in 5-10 years' time.
• Further, it should improve the business and regulatory environments for self-employment, micro- and small-enterprises, as well as the regulatory environment for non-financial MFIs.
In the World Bank report (2004), LIP I was judged a successful project that had produced results beyond original expectations: Through micro-credit organizations (MCOs) established under the project, some 20,000 micro-enterprises with up to five employees had received 50,261 loans with maturity ranging between 6 and 18 months. The loaned sums were small, averaging 1,600 USD. Repayment records were very good, likely helped by the incentive of receiving larger loans if repaid on time. Fifty percent of recipients were female (war widows), 21 percent were displaced persons, while 5 percent were returning refugees. Seventeen NGOs were funded in order to initiate microcredit activities in 1996, and by the end of 2000, eight of nine MFIs funded under LIP I had become self-sustainable. On the basis of these results the second phase of the project was launched in 2001.
The second period is the post-2000 era, when a more advanced legal framework was elaborated. The first two years, 2000-2001, saw the adoption of the microcredit organization (MCO) laws that were put in place to cope with the institutions that had developed in the immediate postwar period. It was not an easy political environment in which to operate, and the adopted legislative agenda was limited. Cicic and Sunje (2002) analyzed the proposed regulation, and found that it proved difficult to have MFIs assume the form of a finance company - a privately owned lending institution not necessarily limited to microenterprise lending. Neither was it possible to regulate them as savings and credit associations that would allow them to mobilize capital from their members in the form of savings, nor as microfinance institutions that would be authorized to accept deposits from the general public (Cicic and Sunje 2002). Rather, the legislation allowed the MFIs to be regulated as microcredit organizations that were non-profit, credit-only institutions.
The MCO law was passed in the BH parliament in 2000 and the RS parliament in 2001, allowing MCOs to operate in both the Federation of Bosnia and Herzegovina and the Republika Srpska (Lyman 2005). The World Bank was an important driver in furthering this legislative agenda. In 2002-2003 the broader legislative developments concerning the financial system took place. The Central Bank is the main monetary authority of BH, while the Banking Agencies conduct supervision and enforcement of banking regulations (de Montoya et al. 2006). Since 2004 the microcredit sector has been maturing, and the Central Bank has emerged as the likely dominant force in the future development of the BH financial system (Lyman 2005). During this period, the World Bank's second Local Initiatives Project (LIP II) focused on facilitating the transformations of MCOs from NGOs to commercial-legal organizations. This furthered financial market legislation and regulation, as well as MCO market consolidation (Lyman 2005). These developments gave the MCOs the opportunity to operate as MFIs - offering both credit and deposit, as well as making it possible for them to transform into for-profit institutions.
LIP II came into effect in the spring of 2002, with a budget of some 24 million USD, of which the World Bank/International Development Association financed 20 million USD and the two Entity governments the remainder (Dunn 2005). The project's aim was to raise incomes, develop businesses and increase employment through the use of microfinance institutions. This was to be done in two specific ways (World Bank 2005):
• LIP II should finance the growth and institutional development of high-performing microfinance institutions with the potential to provide sustainable financial services to the 'unbankable.'
• It should also support the transition of the microfinance sector towards sustainable sources of financing.
The World Bank found that LIP II produced satisfactory outcomes, as the project had significantly influenced the entrepreneurial poor, strengthened the MCOs' capacity for providing high-quality credit services to their clients, and had helped 'create and/or sustain more than 200,000 jobs in nearly 100,000 micro-businesses' (World Bank 2005, 4). Consequently, LIP I and II have had major influences on the development of a microfinance sector in BH (Dunn 2005). The LIPs were seen as necessary for providing micro-entrepreneurs with access to loans.
With the transition from the first to second phase, microfinance institutions in BH gradually acquired more autonomy and financial independence. Initially, the Ministry of Finance wanted direct control of the microcredit project, while the World Bank preferred the MFIs to be autonomous entities. The two compromised by using the existing Employment and Training Foundations to create the Local Initiative Departments (LIDs) in both the Federation and Republika Srpska. After this proved successful they agreed to channel all the funds through the LIDs (SDN 2001). Most MFIs became financially self-sufficient shortly after their establishment. Under the LIP I Project, eight MFIs became self-sustainable and five were fully financially sustainable - perhaps most notably among them the Pro-Credit Bank (Cicic and Sunje 2002). Five MFIs thus had a positive fully adjusted return on assets, and hence sufficient income to cover all costs including inflation, market costs of funds and adjustments of subsidies. MFIs tripled their total assets and gross loan portfolio in the 1999-2003 period (Berryman and Pytkowska 2003). Long-term sustainability can be challenged when the MFIs operate in limited markets, have similar groups of beneficiaries, offer similar product portfolios, and share similar missions.
In BH, however, competition both inside the MFI sector and from the commercially oriented banking sector prevented this outcome, and prompted major consolidation of microfinance institutions (Berryman and Pytkowska 2003). Therefore, LIP II could reduce the number of organizations that received funds to 8, a significant downscaling from the 17 receiving funds through LIP I (Dunn 2005). Since the introduction of new MCO legislation in 2006, the organizations can be transformed into non-profit microcredit foundations (MCFs) or for-profit microcredit companies (MCCs) (Mehmedovic and Sapundzhieva 2009). This legislation opens the capital structure of the MFIs/MCCs to investors - securing further access to finance.
The Impact of Microfinance in Bosnia
Several independent studies have assessed the impact of microfinance activities in BH. The results vary from positive, through neutral, to negative. It is likely that the varying assessments have to do with issues of methodology. For example, as found by Matul and Tsilikounas (2004), the impact on entrepreneurs' finances has been underemphasized in BH, as the social impact was the main interest in the post-war environment. The following section provides an overview and assessment of this range of studies and reports. These findings are organized under four labels: (i) the immediate effect on profit, business start-ups and economic performance, as well as (ii) the effect on the development of the broader economic system including institutions and the rule of law. In terms of its social impact, the focus is on (iii) the welfare and social equality of the users of microfinance, as well as (iv) on post-conflict reconciliation and integration.
First, assessments of the success of MFIs and their effect on growth have generally been positive.
Although it is inadequate to assess the success of the microfinance initiative by assessing the sustainability of the MFIs themselves, because profitability is not necessarily correlated with the sustainability of the businesses to which they lend, this is a first step. Because the MFIs in BH competed for clients, they quickly learned the use of methods like focus groups, exit interviews and market research in order to better learn clients' needs and how to meet them with improved financial services (Hartarska and Nadolnyak 2007). This market solution, in turn, led to a growth in products offered, and as a result, the range of MFIs was better able to meet the needs of a diverse group of entrepreneurs (Goronja 1999). Thus, MFI profitability can be seen as a measure of selling successful products to the client population. The World Bank's own evaluation (2004) of LIP I found that the project had made an important contribution to the resumption of economic activity in post-war BH, through supporting start-ups and the expansion of small businesses (World Bank 2004). Specific data on business start-ups were not collected in the baseline survey, however. In the evaluation of LIP II, the World Bank (2005) found that microcredit services had helped create and/or sustain more than 200,000 jobs and served 98,852 active clients. However, LIP II, and consequently the self-evaluation, focused mainly on the development of the sector rather than its broader impact on the economy and society. The sustainability of these micro-businesses is also called into doubt, as World Bank researchers have found that just under half of new microenterprises established in BH between 2002 and 2003 failed within just one year (Demirguc-Kunt et al. 2007). It is also noteworthy that the project's accomplishments in documenting its impact at the client level were rated not higher than satisfactory.
There have also been independent assessments of the LIPs, as well as of the BH microfinance initiative as a whole. In a study of 1,437 microcredit clients and rejected clients,2 Cicic and Sunje (2002) found that microcredit is positively correlated with new business-generation, the addition of new lines to existing businesses, a growth in production, and, less regularly, employment. By using data collected from the World Bank's Living Standards Measurement Survey, Hartarska and Nadolnyak (2007) evaluated the impact of the BH microfinance industry as a whole. Their findings showed that MFIs improved access to credit in municipalities where two or more MFIs offered competing financial products.
The main critique has centered on the broader impact of microfinance on the BH economy. Bateman (2007b) argues that the commercial microfinance approach has led to deindustrialization and infantilization of the BH economy. It has atomized the local enterprise sector, programmed to reach only the minimum efficient scale of operations, while simultaneously facilitating trade deficits and destruction of local social capital. He argues that there was never an attempt to establish a local industrial policy for BH, although 'the comparatively high level of industrial development, skills and technology in 1995 represented a once only opportunity to establish a core of small-scale, innovative, relatively technology-intensive, industry-related ventures' (2007b, 214). These high opportunity costs are due to the commercial microfinance model, which represented 'the only major local financial support structure in BH' and was used for funding of 'largely unsustainable trade- and household-based economic activities.'
Bateman has identified an adverse selection of the microfinance clients, and an anti-industrial bias characterized by '[f]iltering out those potential entrepreneurs wishing to work in the industrial sector but who cannot hope to service the onerous terms and conditions offered by the commercial microfinance institutions, and filtering in those ventures incorporating only the very simplest of non-industrial business ideas that just about can' (Bateman 2007b, 214). Rather than providing credit to microenterprises, it has been argued that credit should be offered on affordable terms and maturities to small businesses, drawing on the experiences from the military-industrial complex in Northern Italy, amongst other places, that was central in rebuilding, restructuring and developing the post-Second World War economy (Weiss 1988). Bateman (2007a) argues that microfinance has been mostly directed towards the informal sector. That way, he argues, it does not create businesses but rather subsidizes microenterprises, and it has a redistributive effect. Microfinance has 'undermined most local economic and social development triggers, such as cumulative and coordinated investments, capturing economics of scale and scope, technological innovation, inculcating social capital, or incorporating technical skills and knowledge' (Bateman 2007a, 4). Bateman (2007a) has tried to nuance the impression of microfinance in BH by pointing out that 30 per cent of borrowers from MFIs during the LIP I period that were surveyed in 2002 had failed within two years.
Second, most studies point to a positive effect of microfinance on BH's economic and legal system, although there are some broader criticisms of its small-scale and liberal focus. Again, the World Bank's own assessment is the most positive. The transition of MCOs to increasingly rely on the market as a source of finance, as well as the MCOs' project management, were both rated highly satisfactory.
Its impacts in furthering legal and regulatory reform, providing institutional support for MCOs, and documenting microcredit impact were rated satisfactory (World Bank 2005). This indicates a highly professional microfinance sector that has been granted due attention by the World Bank. In a study of the MF experience in BH, Dunn (2005) analyzes data on the effect of MF for clients and non-clients alike. In her impact assessment of LIP II, Dunn found that microcredit indeed has positive impacts on household welfare, business development and business start-ups, employment, workers' wages, as well as livestock development, amongst other things. However, isolating the impact of MF is difficult, because MF lending is not the only source of finance for the micro-businesses. Dunn found that 29 percent of entrepreneurs had borrowed funds from alternative sources, in addition to the MFIs with which they were associated at the time. For example, foreign-owned commercial banks, incentivized by the profitable microcredit sector, did increase their supply of microloans to BH. Chen and Chivakul (2008) attribute much of the real growth in credit to BH households between 2001 and 2006 to the emergence of these banks on the BH scene, standing at nearly 50 percent compared to the real growth rate for credit to enterprises of 13.5 percent.
Critics of microfinance in BH point to the close association with neo-liberalism, trade and privatization, and its adverse effects on investment and the overall economic system. Bateman and Chang (2009) criticize the microfinance model on several counts. For example, the focus on MF ignores economies of scale by stimulating individual start-ups. This leads to economic activity being organized in the least efficient way, they argue. Proponents of MF lending have also drawn attention away from other perverse effects.
For example, Bosnian MFIs provided financing for their clients to purchase cows, resulting in over-supply of milk and consequently decreasing prices (Bateman and Chang 2009). The indirect effect of MF lending thus adversely affected the one-cow farmers and incumbent milk producers alike. MF favors easy business ventures that are able to meet the short maturity of the loans, while capital-intensive businesses are under-represented in terms of MF (Bateman and Chang 2009). The lack of capital-intensive investments further leads to a trade deficit since such goods must be imported.
Third, in terms of the overall effect of microfinance on welfare and social equality, the verdict is even more mixed. But measurement is also more problematic. It is not unusual for MFIs to be designed for reaching certain especially sought-after groups. However, as Meissner (2005) points out, reaching only these groups may have the perverse effect of creating jealousy among other groups - thus undermining the very restoration of social capital. Rather, good programs should be available for everyone - including the conflict-affected groups. In designing products and strategies, the institutions offering MF should take care not to discriminate between potential clients, as MF can best help refugees integrate into society by offering them access to credit along the same lines as non-refugees. Furthermore, mere MF outreach does not constitute MF impact; while reaching the sought-after groups is a necessary condition for impact, it is the actions undertaken with the borrowed funds that result in real impact. Measuring and determining any causal effects is difficult - perhaps especially so with regard to the social goals of MF in post-conflict settings. Determining whether the rebuilding of social capital can be credited to MF, or whether it is due to resumption of 'normal' social activity, is challenging.
As noted by Meissner (2005), it might take years before the effects of social capital construction initiatives emerge. Matul and Tsilikounas (2004) also point towards methodological issues arising from an evident selection bias. They found that, in most cases, the non-clients either did not know of the possibility for receiving microcredit, or else were concerned that their 'lack of entrepreneurial spirit, skills, or ability to plan' would negatively affect their ability to repay the loans (Matul and Tsilikounas 2004, 8). The World Bank's self-evaluations of the LIP projects concluded that implementations were more or less successful. The World Bank's (2004) self-evaluation of LIP I found that the project had successfully assisted both the economically disadvantaged and other underserved groups in resuming economic activities, as well as supporting the establishment of an institutional framework for micro-credit. The evaluation also found bank performance and borrower performance to be highly satisfactory. Dunn (2005) found that over the period 2002-2004, MFI clients' annual per capita income increased, while the change in non-clients' income was insignificant. Over the same period, the figures showed that only the clients that had a relationship with the MFIs for more than two years experienced a lower share of households below the national poverty level - set at KM 1,100 per person. This effect was not identified with the clients that had a relationship extending for less than two years. Dunn's data indicate that 'income trends were generally positive' (2005, 23). But an alternative reading is also possible, inasmuch as the positive income effect from MF does not extend beyond the client base. Furthermore, the poverty-alleviating effect is not well documented. The data are somewhat unclear on this point, and may indicate that time is of the essence and that such effects may take a few years to materialize.
Critics such as Bateman (2007b, 220) argue that:

Very little evidence has emerged in BH to suggest that the commercial microfinance model actually possesses the required 'transformative capacity' to secure genuinely sustainable poverty reduction, through genuinely sustainable local economic and social development. On the contrary, the commercial microfinance model is quite centrally implicated in the evolution of the disturbingly weak, unsophisticated, anti-social, disconnected and unfair economic and social structures we see in BH today.

Fourth, and finally, the evidence that microfinance has contributed to post-conflict reconstruction is limited, but positive. Conflicts and acts of war traumatize the entire civilian population. After the conflict has ended, the demobilized soldiers may experience issues related to having participated in the conflict, and to reconciling with former enemies. Generally speaking, post-conflict societies also tend to be characterized by low social capital. As trust is normally expected, at least partially, to be built on iteration of relations (Hardin 2003), microfinance may have a positive impact as it facilitates network development by providing an arena for clients to meet. Goronja (1999) finds evidence that MF in BH did indeed produce such reconciliatory effects. Dunn's 2005 study found that microfinance did indeed reach displaced persons (36 percent of beneficiaries and 34 percent of new beneficiaries were dislocated), as well as the demobilized, disabled and widowed.

Conclusion

The overall picture is mixed. BH may be a good critical test case, but the results are not unambiguous. It is clear that MF has made some positive contributions in this post-conflict situation, and that MFI sustainability in BH has been remarkable. However, critics suggest that MF may actually have impeded SME development in the country, and that it remains a challenge to make MF have an impact beyond the client base.
The MFI model applied in BH has produced self-sustainable MF institutions with, amongst other improvements, acceptable repayment rates. But equating the success of MF institutions and their clients with the success of MF is not enough. MF also affects the broader legal and economic system, and seems to have had a positive impact in the BH case. Even so, MF does not necessarily translate into success among the wider population. This point might be especially true in post-conflict situations where some of the goals are of a social nature. There is therefore some concern that a focus on MFI commercialization can be seen as a mission drift. For a poverty alleviation and post-conflict strategy to be truly successful, it cannot be limited to improving the standard of living for the MF clients alone. Such a strategy would be significantly impacted by the issue of adverse selection of MF clients. It can hardly be based on MF institutions' assessment of the potentially most profitable clients. MF may indeed have a positive effect on both the economic and social arena, and working to maintain and enhance these beyond the immediate post-conflict situation should still be a priority for MFIs and governments alike. However, the evidence reviewed in this article indicates that MF is no silver bullet, neither for securing economic growth nor for post-conflict reconciliation. It should therefore be carefully evaluated in each setting how MF is to be designed, and how it can best complement other programs for economic and social enhancement. Yet, as Christen and Drake (2002, 16) point out, the 'ultimate irony of microfinance is that broad outreach is possible only if MFIs are commercialized.'

The principal lessons of the short history of microfinance in Bosnia-Herzegovina seem to be that microfinance is more impressive as a tool to improve the economic performance of the MF institutions and their clients than to reach broader social goals.
It appears to be a better tool for aiding its participants' quest for wealth and escape from poverty on an individual basis than for addressing wider problems of social integration or institution building. The BH case provides an almost ideal test case, since it featured well-educated citizens with incentives to re-build wealth after the destruction of war, and the evidence suggests that this did indeed take place. Moreover, in a context where political and economic institutions had been weakened by war, MF and MFO legislation provided a building block toward rebuilding a small- and medium-scale financial system. Broader social goals, from integration and reconciliation to inclusion, seem mainly to have been achieved (if at all) through MF's effect on its individual participants. Perhaps the clearest conclusion from the MF experience in BH is that even microfinance can only be used to pursue one goal directly; the broader goals may be achieved indirectly, if at all.

Acknowledgments

The authors would like to thank Dr. Dijana Tiplic and Professor Arild Tjeldvoll for their valuable comments and contributions.

Notes

1 This is an online journal published by the United Nations Development Program (UNDP) and the London School of Economics and Political Science.
2 1,032 were current microcredit clients, and 405 were people who applied for loans but were not approved.

References

Bateman, M. 2007a. Undermining sustainable development with commercial microfinance in Southeast Europe. Development and Transition, no. 7:2-4.
Bateman, M. 2007b. De-industrialization and social disintegration in Bosnia. In What's wrong with microfinance? Ed. T. Dichter and M. Harper, 207-23. Warwickshire: Intermediate Technology Publications.
Bateman, M., and H. J. Chang. 2009. The microfinance illusion. Mimeo, University of Juraj Dobrila and University of Cambridge.
Berryman, M., and J. Pytkowska. 2003. A review of the Bosnian micro-finance sector: Move to financial self-sufficiency.
Microfinance Information Exchange. http://www.mfc.org.pl/doc/Publication/Review.pdf.
Boudreaux, K., and T. Cowen. 2008. The micromagic of microcredit. Wilson Quarterly 32 (1): 27-31.
Chen, K., and M. Chivakul. 2008. What drives household borrowing and credit constraints? Evidence from Bosnia and Herzegovina. IMF Working Paper 08/202.
Christen, R. P., and D. Drake. 2002. Commercialization: The new reality of microfinance? In Commercialization of microfinance: Balancing business and development, ed. D. Drake and E. Rhyne, 2-21. Bloomfield, CT: Kumarian Press.
Cicic, M., and A. Sunje. 2002. Micro-credit in transition economies. In Small enterprise development in South-East Europe: Policies for sustainable growth, ed. W. Bartlett, M. Bateman, and M. Vehovec, 145-70. Boston, MA: Kluwer.
Demirguc-Kunt, A., L. F. Klapper, and G. A. Panos. 2007. The origins of self-employment. Washington, DC: World Bank.
De Montoya, M. L., R. Deshpande, and J. Glisovic-Mezieres. 2006. Bosnia-Herzegovina: Country level savings assessment. Washington, DC: Consultative Group to Assist the Poor.
Doyle, K. 1998. Microfinance in the wake of conflict: Challenges and opportunities. Bethesda, MD: Microenterprise Best Practices.
Dunn, E. 2005. Impact of microcredit on clients in Bosnia-Herzegovina: Final report. Washington, DC: World Bank.
Galusek, G. 2007. Commercial microfinance: Too much, or not enough? Development and Transition, no. 7:4-7.
Goronja, N. 1999. The evolution of microfinance in a successful post-conflict transition: The case study for Bosnia-Herzegovina. Paper presented at the Joint ILO/UNHCR Workshop: Microfinance in Post-Conflict Countries, Geneva.
Hardin, R. 2003. Gaming trust. In Trust and reciprocity: Interdisciplinary lessons for experimental research, ed. E. Ostrom and J. Walker, 80-102. New York: Russell Sage Foundation.
Hartarska, V., and D. Nadolnyak. 2007. An impact analysis of microfinance in Bosnia-Herzegovina. World Development 36 (12): 2605-19.
Hudon, M., and M.
D. Seibel. 2007. Microfinance in post-disaster and post-conflict situations: Turning victims into shareholders. Savings and Development 31 (1): 5-22.
Hughes, J., and B. Slay. 2007. The private sector and poverty reduction. Development and Transition, no. 7:9-12.
Karnani, A. 2007a. Microfinance misses its mark. Stanford Social Innovation Review 5 (3): 34-40.
Karnani, A. 2007b. Romanticizing the poor harms the poor. Ross School of Business Working Paper Series 1096.
Khandker, S. R. 2005. Microfinance and poverty: Evidence using panel data from Bangladesh. World Bank Economic Review 19 (2): 263-86.
Larson, D. 2001. Searching for differences: Microfinance following conflict vs. other environments; MBP Microfinance following conflict. Bethesda, MD: Microenterprise Best Practices.
Littlefield, E., J. Morduch, and S. Hashemi. 2003. Is microfinance an effective strategy to reach the millennium development goals? Focus Note, CGAP.
Lyman, T. R. 2005. Legal and regulatory environment for microfinance in Bosnia-Herzegovina: A decade of evolution and prognosis for the future. Essays on Regulation and Supervision 2. Washington, DC: CGAP.
Matul, M., and C. Tsilikounas. 2004. Role of microfinance in the household reconstruction process in Bosnia and Herzegovina. Journal of International Development 16 (3): 429-66.
Mehmedovic, D., and R. Sapundzhieva. 2009. Bosnia and Herzegovina microfinance analysis and benchmarking report 2008. Association of Microfinance Institutions in Bosnia and Herzegovina / Microfinance Information Exchange.
Meissner, L. K. 2005. Microfinance and social impact in post-conflict environments. Washington, DC: School of International Service of American University.
MkNelly, B., and C. Dunford. 1998. Impact of credit with education on mothers and their young children's nutrition: Lower rural bank credit with education in Ghana. Freedom from Hunger Research Paper 4.
MkNelly, B., and C. Dunford. 1999.
Impact of credit with education on mothers and their young children's nutrition: CRECER credit with education program in Bolivia. Freedom from Hunger Research Paper 5.
Nagarajan, G. 1999. Microfinance in post-conflict situations: Towards guiding principles for action. Paper presented at the Joint ILO/UNHCR Workshop on Microfinance in Post-Conflict Countries, Geneva.
Nagarajan, G., and M. McNulty. 2004. Microfinance amid conflict: Taking stock of available literature; Accelerated microenterprise advancement project. Washington, DC: USAID.
Ohanyan, A. 2002. Post-conflict global governance: The case of microfinance enterprise networks in Bosnia-Herzegovina. International Studies Perspectives 3 (4): 396-416.
Pitt, M. M., and S. R. Khandker. 1998. The impact of group-based credit on poor households in Bangladesh: Does the gender of participants matter? Journal of Political Economy 106 (5): 958-96.
Putnam, R. 1993. Making democracy work: Civic traditions in modern Italy. Princeton: Princeton University Press.
Ribic, M. 2005. Mikrofinansiranje u BiH od 1996 do 2005 godine. Mikrofinansije 5:1-4.
Roodman, D., and J. Morduch. 2009. The impact of microcredit on the poor in Bangladesh: Revisiting the evidence. Working Paper 174, Center for Global Development.
SDN. 2001. Innovative approaches to microfinance in post-conflict situations: Bosnia local initiatives project. Social Development Notes 50, Social Development Family in the Environmentally and Socially Sustainable Development Network of the World Bank.
Seibel, H. D. 2006. Financial linkages in Mali: Self-reliance and liquidity balancing versus liquidity supply and donor dependence. Small Enterprise Development 17 (1): 40-9.
Storey, D. J. 1994. Understanding the small business sector. London: Routledge.
Weiss, L. 1988. Creating capitalism: The state and small business since 1945. Oxford: Blackwell.
Woodworth, W. 2006. Microcredit in post-conflict, conflict, natural disaster, and other difficult settings.
Provo, UT: Brigham Young University.
World Bank. 2001. Project evaluation for credit of 20 million USD to Bosnia-Herzegovina for project local (microfinance) initiatives II. World Bank Report.
World Bank. 2004. Project performance assessment report Bosnia and Herzegovina. World Bank Report.
World Bank. 2005. Implementation completion report local initiatives (microfinance) project II. World Bank Report.

Application of Bootstrap Methods in Investigation of Size of the Granger Causality Test for Integrated VAR Systems

Lukasz Lach

This paper examines the size performance of the Toda-Yamamoto test for Granger causality in the case of trivariate integrated and cointegrated VAR systems. The standard asymptotic distribution theory and the residual-based bootstrap approach are applied. A variety of types of distribution of error term is considered. The impact of misspecification of initial parameters as well as the influence of an increase in sample size and number of bootstrap replications on size performance of Toda-Yamamoto test statistics is also examined. The results of the conducted simulation study confirm that standard asymptotic distribution theory may often cause significant over-rejection. Application of bootstrap methods usually leads to improvement of size performance of the Toda-Yamamoto test. However, in some cases the considered bootstrap method also leads to serious size distortion and performs worse than the traditional approach based on the $\chi^2$ distribution.

Key Words: bootstrap methods, simulation, Granger causality, VAR models
JEL Classification: C12, C15

Introduction

The causal relationship (in the Granger sense) between some considered variables is one of the most important issues in modern economics. The existence of this type of dynamic link guarantees that the knowledge of past values of one considered time series is useful in predicting current and future values of another one.
Since the development of this concept (Granger 1969) a number of studies examining properties of different testing methods have been published. One of the first approaches was the standard Wald test based on asymptotic distribution theory. The biggest advantage of this method was its simplicity and clarity. However, in the case of variables which are integrated of order one (I(1)) or cointegrated, the standard asymptotic approach turned out to be an improper tool for testing the causal effects. These nonstandard asymptotic properties of the Wald test were investigated by Granger and Newbold (1974, empirical findings) and Phillips (1986, theoretical framework).

Lukasz Lach is an assistant in the Department of Economics and Econometrics, AGH University of Science and Technology, Poland. Managing Global Transitions 8 (2): 167-186

As a cure for this problem the idea of the Vector Error Correction Model (see Engle and Granger (1987) and Granger (1988)) was developed. Although theoretically it was a useful tool for testing for causality in integrated-cointegrated VAR systems, the complicated pretesting procedure (estimation of unit roots, analysis of cointegration properties and sensitivity to improper lag selection) turned out to be a serious difficulty in empirical applications. Another solution was proposed by Toda and Yamamoto (1995). This approach ensures that asymptotic distribution theory is valid for VAR systems, regardless of the order of integration of the considered variables or the dimension of the cointegration space. Furthermore, an important advantage of this method is its simplicity, since it is just a small modification of the standard Wald test. The absence of pretesting bias made this procedure one of the most widely applied approaches in recent economic research. However, when some standard assumptions do not hold (especially concerning the distribution of the error term) the Toda-Yamamoto approach is also likely to fail.
Application of the bootstrap approach may often provide better results since bootstrapping does not strictly depend on model specification (for more details on bootstrap see Efron (1979)).

The properties of the augmented Wald test in both the asymptotic and bootstrap variant were examined by a number of authors in recent years. Dolado and Lütkepohl (1996) conducted a simulation exercise to examine the power of the considered testing method in the case of the integrated VAR model (in their paper the error term was independently drawn from an identical multivariate normal distribution). Their outcomes show that in high-dimensional VARs with a small true lag length a significant reduction of power of the considered causality test may occur, especially for small samples. Mantalos (2000) conducted similar studies of size and power properties of eight versions of the Granger causality test (this time the error term was only i.i.d. $N(0, I_2)$). His findings indicate that the standard asymptotic approach may often lead to significant size distortion. Application of the residual-based bootstrap technique usually improves the size and power performance of causality tests. Hacker and Hatemi (2006) examined size properties of the TY (Toda-Yamamoto) test for two-dimensional VAR systems. In contrast to previously mentioned authors, they also investigated the simple ARCH(1) case for the error term series, finding that the bootstrap technique performed relatively well in all cases. On the other hand, they restricted the research to models without cointegration.

This paper is a generalization of previous studies concentrated on the investigation of size properties of the TY test. The simulation study contained in this article (in both asymptotic and bootstrap variants) examines three-dimensional integrated and cointegrated VAR models. All possible cointegration ranks are also considered.
To check the size properties of the investigated test (also in cases where some standard assumptions do not hold) a variety of distributions of the error term is applied in the DGP (spherical multivariate normal distribution, highly correlated error terms, structural break, mixture of distributions, ARCH(2) effect). The impact of misspecification of initial parameters is also examined in each case. Finally, the impact of an increase in sample size (from small to medium) as well as the influence of an increase in the number of bootstrap replications on size performance of the TY test is examined in some specific cases. To the knowledge of the author, the results of this kind of study of size performance of the TY test in both asymptotic and bootstrap variants have not been published so far.

This paper is organized as follows. The next section contains the main research hypotheses to be tested by the simulation study. Section 3 provides details on the methodology of the TY test, the specification of VAR models used for simulation purposes and the considered bootstrap technique. Section 4 contains results of all conducted simulations. Section 5 concludes the paper.

Main Hypotheses

The main objective of this paper is the investigation of size properties of the Toda-Yamamoto test for Granger causality. The first important point that distinguishes this study from the existing literature is the use of the trivariate VAR model for simulation purposes. Most of the previous papers examine two-dimensional models. In the three-dimensional case the structure of causal links may be more extended. Another important point is the fact that this paper examines all possible dimensions of the cointegration space. As already mentioned, former studies concentrating on a similar topic provided evidence of poor performance of the modified Wald procedure in the case of nonstationary variables.
Thus, it seems reasonable to formulate:

H1 The Toda-Yamamoto test (asymptotic variant) often tends to over-reject the null hypothesis for integrated and cointegrated VAR systems (with various cointegration ranks).

There are some ways to avoid the mentioned problem. One of the possibilities is the application of bootstrap methods. This approach has been commonly used in recent years despite its numerical complexity. Thus, one may be interested in testing the following hypothesis:

H2 The residual-based bootstrap method usually improves size performance of the TY test.

In practice the proper specification of the VAR model is often difficult to obtain. One of the most common problems is the misspecification of the lag parameter. Previous studies (see Hacker and Hatemi (2006) and Mantalos (2000)) show that in this case the size performance of the TY test (asymptotic variant) may significantly worsen. It may be interesting to determine how the bootstrap-based technique performs in this case. Therefore, we should test:

H3 Misspecification of the lag parameter in the VAR model leads to considerable deterioration of size performance of TY only in the asymptotic variant.

Despite the fact that bootstrap methods are often a useful tool to overcome the problem of size distortion in the TY test, there are some specific cases where this approach may also fail. One important point that distinguishes this study from the existing literature is the fact that, in order to perform a suitable simulation, a variety of types of error term distribution was used (cases where some standard assumptions about the structure of the considered VAR models and TY methodology are unfulfilled are examined). Therefore, this paper contains verification of the following:

H4 Residual-based bootstrap is likely to fail in some specific cases and therefore should not be used without second thought.
One of the main problems with the application of standard asymptotic distribution theory is the sample size. Previous papers provided empirical evidence that an increase in sample size may significantly improve size performance of the TY test (see Dolado and Lütkepohl 1996; Hacker and Hatemi 2006; Mantalos 2000). However, this process may strongly depend on model specification (especially the error term structure). Thus, it seems interesting to test the following hypothesis:

H5 When standard assumptions hold, the increase of sample size improves size performance of the TY test (asymptotic variant).

In order to apply the bootstrap technique the researcher must establish the number of bootstrap replications. In previous papers this number varied significantly (from dozens to hundreds). It may be interesting to investigate whether a change in the number of bootstrap replications may lead to significant improvement of size performance of the TY test in some specific cases (namely, cases of relatively significant size distortion). This problem may be captured in verification of the following:

H6 There is a relationship between the number of bootstrap replications and size performance of the TY test in some specific cases.

In order to test the above research hypotheses a simulation study must be performed. In the first step, a comprehensive analysis of the considered methodology and DGP should be presented. The next section contains some essential information concerning methodology and data.

Methodology and the Data Generating Process

In this article the Toda-Yamamoto approach for testing Granger causality is considered. This method has been commonly applied in recent studies since it is relatively simple to perform and free of complicated pretesting procedures. Another issue worth underlining is the fact that this method is useful for integrated and cointegrated systems.
To understand the idea of this type of causality testing consider the following $n$-dimensional VAR($p$) process:

$$y_t = c + \sum_{i=1}^{p} A_i y_{t-i} + \varepsilon_t, \tag{1}$$

where $y_t = (y_t^1, \ldots, y_t^n)^{tr}$, $c = (c_1, \ldots, c_n)^{tr}$ and $\varepsilon_t = (\varepsilon_{1,t}, \ldots, \varepsilon_{n,t})^{tr}$ are $n$-dimensional vectors, and $\{A_i\}_{i=1}^{p}$ is a set of $n \times n$ matrices of parameters for appropriate lags (in this paper the transpose of matrix $M$ is denoted by $M^{tr}$). The order $p$ of the process is assumed to be known. Furthermore, we shall assume that the error vector is an independent white noise process with nonsingular covariance matrix (the elements of which are constant over time). In this article cases where these standard assumptions do not hold are also investigated. We also assume that the condition $E|\varepsilon_{k,t}|^{2+s} < \infty$ holds true for all $k = 1, \ldots, n$ and some $s > 0$. The Toda-Yamamoto (1995) idea of testing for causal effects is based on estimating the augmented VAR($p + d$) model (circumflex indicates the OLS estimator of a specific parameter):

$$y_t = \hat{c} + \sum_{i=1}^{p+d} \hat{A}_i y_{t-i} + \hat{\varepsilon}_t. \tag{2}$$

The value of parameter $d$ is equal to the maximum order of integration of the considered variables $y^1, \ldots, y^n$. We say that the $k$-th element of $y_t$ does not Granger-cause the $j$-th element of $y_t$ ($k, j \in \{1, \ldots, n\}$) if there is no reason for rejection of the following hypothesis:

$$H_0: a^{s}_{jk} = 0 \quad \text{for } s = 1, \ldots, p, \tag{3}$$

where $A_s = [a^{s}_{pq}]_{p,q=1,\ldots,n}$ for $s = 1, \ldots, p$. According to Toda and Yamamoto (1995) the parameters of the extra $d$ lags are left unrestricted in the test, since their role is to guarantee the use of asymptotic theory. In order to present the test statistics we shall make use of the compact notation ($T$ denotes the considered sample size) presented in table 1.

Table 1 Compact notation used to formulate TY test statistics
Object | Description
$Y := (y_1, \ldots, y_T)$ | $n \times T$ matrix
$\hat{D} := (\hat{c}, \hat{A}_1, \ldots, \hat{A}_p, \ldots, \hat{A}_{p+d})$ | $n \times (1 + n(p + d))$ matrix
$Z_t := (1, y_t^{tr}, y_{t-1}^{tr}, \ldots, y_{t-p-d+1}^{tr})^{tr}$ | $(1 + n(p + d)) \times 1$ matrix, $t = 1, \ldots, T$
$Z := (Z_0, \ldots, Z_{T-1})$ | $(1 + n(p + d)) \times T$ matrix
$\hat{\delta} := (\hat{\varepsilon}_1, \ldots, \hat{\varepsilon}_T)$ | $n \times T$ matrix
The initial point of the considered procedure is the calculation of $S_U = \hat{\delta}\hat{\delta}^{tr}/T$, the variance-covariance matrix of residuals from the unrestricted augmented model (i.e. model (2)). Then we can define $\beta := \mathrm{vec}(c, A_1, \ldots, A_p, 0_{n \times nd})$ and $\hat{\beta} := \mathrm{vec}(\hat{c}, \hat{A}_1, \ldots, \hat{A}_p, \ldots, \hat{A}_{p+d})$, where $\mathrm{vec}(\cdot)$ denotes the column stacking operator and $0_{n \times nd}$ stands for the $n \times nd$ matrix filled with zeros. Using this notation one can write the Toda-Yamamoto test statistic for testing for causal effects between variables in $y_t$ in the following form:

$$TY := (C\hat{\beta})^{tr}\left[C\left((ZZ^{tr})^{-1} \otimes S_U\right)C^{tr}\right]^{-1}(C\hat{\beta}), \tag{4}$$

where $\otimes$ denotes the Kronecker product and $C$ is the matrix of suitable linear restrictions. In our case (testing for causality from one variable in $y_t$ to another) $C$ is a $p \times n(1 + n(p + d))$ matrix, the elements of which take only the value of zero or one. Each of the $p$ rows of matrix $C$ corresponds to a restriction of one parameter in $\beta$. The value of an element in a row of $C$ is one if the associated parameter in $\beta$ is zero under the null hypothesis, and it is zero otherwise. There is no association between matrix $C$ and the last $n^2 d$ elements in $\beta$. This approach allows us to write the null hypothesis of non-Granger causality in the following form:

$$H_0: C\beta = 0. \tag{5}$$

Finally we shall note that the TY test statistic is asymptotically $\chi^2$ distributed with the number of degrees of freedom equal to the number of restrictions to be tested (in our case this value is equal to $p$). In other words, the TY test is just a standard Wald test applied to the first $p$ lags obtained from the augmented VAR($p + d$) model.

In order to examine the size properties of the TY test some I(1) models are considered. Causality tests are conducted in the case of various cointegration ranks. At this point we shall once again consider model (1). This process can be rewritten in the following error correction form:

$$\Delta y_t = c + \Pi y_{t-1} + \sum_{i=1}^{p-1} \Gamma_i \Delta y_{t-i} + \varepsilon_t, \tag{6}$$

where $\Pi = -I + \sum_{i=1}^{p} A_i$ and $\Gamma_i = -\sum_{j=i+1}^{p} A_j$.
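As an illustration of statistic (4), the following is a minimal sketch in Python with NumPy (not the Gretl procedure used for the paper). The function name `ty_statistic`, its argument layout and the 0-based variable indices are assumptions made only for this example; the first $p + d$ observations are used as start-up values, so the effective sample is shorter than the input.

```python
import numpy as np

def ty_statistic(y, p, d, cause, effect):
    """Wald statistic (4) for H0: variable `cause` does not Granger-cause `effect`.

    y is a (T, n) array; cause/effect are 0-based column indices (illustrative API).
    """
    T_all, n = y.shape
    m = p + d                                   # lags in the augmented VAR(p + d)
    Y = y[m:].T                                 # n x T_eff
    T_eff = Y.shape[1]
    # Z stacks a constant and the m lagged vectors: (1 + n*m) x T_eff
    Z = np.vstack([np.ones((1, T_eff))] +
                  [y[m - i:T_all - i].T for i in range(1, m + 1)])
    ZZ_inv = np.linalg.inv(Z @ Z.T)
    D_hat = Y @ Z.T @ ZZ_inv                    # OLS estimate of (c, A_1, ..., A_{p+d})
    resid = Y - D_hat @ Z
    S_U = resid @ resid.T / T_eff               # residual covariance matrix
    beta_hat = D_hat.flatten(order="F")         # vec(): column stacking
    # One restriction per lag s = 1..p: coefficient of `cause` in the `effect` equation
    C = np.zeros((p, beta_hat.size))
    for s in range(p):
        col = 1 + s * n + cause                 # column of D_hat holding that coefficient
        C[s, col * n + effect] = 1.0
    Cb = C @ beta_hat
    cov = C @ np.kron(ZZ_inv, S_U) @ C.T
    return float(Cb @ np.linalg.solve(cov, Cb))
```

Under the null hypothesis the returned value would be compared with a $\chi^2(p)$ critical value, or with a bootstrap critical value. Forming the full Kronecker product is affordable here only because the systems considered are small.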
To ensure that $y_t$ is integrated of order one the following assumptions must hold (these assumptions are sufficient to prove the so-called Johansen-Granger representation theorem; for more details see Johansen 1991; 1996):
• The roots of the characteristic polynomial
$$\det(I_n - A_1 z - A_2 z^2 - \cdots - A_p z^p) \tag{7}$$
are either outside the unit circle or equal to one;
• The matrix $\Pi$ has reduced rank $r < n$ and therefore may be expressed as the product $\Pi = \alpha\beta^{tr}$, where $\alpha$ and $\beta$ are $n \times r$ matrices of full column rank $r$;
• The matrix $\alpha_{\perp}^{tr}\,\Gamma\,\beta_{\perp}$ has full rank, where $\Gamma = I - \sum_{i=1}^{p-1} \Gamma_i$ and where $\alpha_{\perp}$ and $\beta_{\perp}$ are the orthogonal complements to $\alpha$ and $\beta$.

If the first assumption holds, then the considered process is neither explosive (roots inside the unit circle) nor seasonally cointegrated (roots on the boundary of the unit circle different from $z = 1$; for more details on this issue see Hylleberg et al. 1990; Johansen and Schaumburg 1988). The second assumption ensures that there are at least $n - r$ unit roots. Cointegration occurs whenever $r > 0$, and the number of cointegrating vectors is equal to $r$. To restrict the process from being I(2), we shall assume the last condition, because together with the second one it ensures that the number of unit roots is exactly $n - r$.

In this paper trivariate VAR models are considered. In each case the process described by the model is integrated of order one and the parameter $p$ is equal to one. Therefore, we consider the following VAR(1) model which is used as a DGP:

$$y_t = c + A y_{t-1} + \varepsilon_t, \tag{8}$$

where $c = (0.01, 0.01, 0.01)^{tr}$ in all cases and the matrix $A$ provides specific cointegration properties (see the previously presented assumptions). For details about the matrices used in the simulation study see table 2.

Table 2 Specification of trivariate VAR models considered in this paper
Symbol | Matrix form | Properties
$A_1$ | $\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$ | No cointegration
$A_2$ | $\begin{pmatrix} 1 & 0 & -0.125 \\ 0 & 1 & 0 \\ 0.5 & 0.5 & 0.5 \end{pmatrix}$ | Two cointegrating equations
$A_3$ | $\begin{pmatrix} 0.25 & 0 & -0.125 \\ 0 & 1 & 0 \\ -0.75 & 0 & 0.875 \end{pmatrix}$ | One cointegrating equation
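The cointegration properties listed in table 2 can be checked numerically. The sketch below (Python with NumPy) enters the three matrices as we read them from table 2, computes the rank of $\Pi = A - I$ for the VAR(1) case, and provides a small simulator for model (8). The helper names `cointegration_rank` and `simulate_var1` are illustrative, not part of the original study.

```python
import numpy as np

# Coefficient matrices as read from table 2 (decimal commas converted to points).
A1 = np.eye(3)                                   # no cointegration
A2 = np.array([[1.0, 0.0, -0.125],
               [0.0, 1.0,  0.0],
               [0.5, 0.5,  0.5]])                # two cointegrating equations
A3 = np.array([[ 0.25, 0.0, -0.125],
               [ 0.0,  1.0,  0.0],
               [-0.75, 0.0,  0.875]])            # one cointegrating equation

def cointegration_rank(A):
    """Rank of Pi = A - I for a VAR(1), i.e. the number of cointegrating vectors."""
    return int(np.linalg.matrix_rank(A - np.eye(A.shape[0])))

def simulate_var1(A, c, eps, y0):
    """Generate y_t = c + A y_{t-1} + eps_t for t = 1..T, returning a (T, n) array."""
    y = np.empty_like(eps)
    prev = y0
    for t in range(eps.shape[0]):
        prev = c + A @ prev + eps[t]
        y[t] = prev
    return y

ranks = {name: cointegration_rank(A) for name, A in [("A1", A1), ("A2", A2), ("A3", A3)]}
```

With these entries the ranks come out as 0, 2 and 1, matching the "Properties" column, and the entry in row 1, column 2 is zero in every matrix, which is the non-causality from $y^2$ to $y^1$ used as the null hypothesis.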
Directly from table 2 we can obtain some essential information. Namely, in the $A_2$ and $A_3$ models $y^3$ is a causal variable for $y^1$. Furthermore, in all considered cases $y^2$ does not Granger-cause $y^1$ (this will be our null hypothesis for further analysis of size performance). We should underline that in three-dimensional VAR models the relationship between $y^3$ and $y^1$, as well as between $y^3$ and $y^2$, may have indirect impacts on links between $y^2$ and $y^1$. Besides various schemes of algebraic structure, some specific distributions of error vectors are also examined.

Table 3 Models used to generate distribution of error term
Distribution of error term | Parameters | Symbol
$N(0_{3\times1}, \sigma^2 I_3)$ | $\sigma = 1$ | $E_1$
$N(\mu, \Sigma)$ | $\mu = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}$, $\Sigma = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0.9 \\ 0 & 0.9 & 1 \end{pmatrix}$ | $E_2$
$N(0_{3\times1}, \sigma_1^2 I_3)$ for $t = 1, \ldots, T/2$; $N(0_{3\times1}, \sigma_2^2 I_3)$ for $t = T/2 + 1, \ldots, T$ | $\sigma_1 = 1$, $\sigma_2 = 2$ | $E_3$
$sN_1 + (1 - s)N_2$, where $N_1 \sim N(0_{3\times1}, \sigma_1^2 I_3)$, $N_2 \sim N(0_{3\times1}, \sigma_2^2 I_3)$, $P(s = 1) = p$, $P(s = 0) = 1 - p$ | $\sigma_1 = 1$, $\sigma_2 = 3$, $p = 0.7$ | $E_4$
$\varepsilon_{j,t} = w_{j,t}\sqrt{0.5 + 0.1\,\varepsilon_{j,t-1}^2 + 0.4\,\varepsilon_{j,t-2}^2}$ | $w_{j,t}$ i.i.d. $N(0, 1)$, $j = 1, 2, 3$, $t = 1, \ldots, T$ | $E_5$

At this point it should be noted that in previous studies concentrating on similar topics the error term was usually $N(0_{n\times1}, \sigma^2 I_n)$ distributed. The case in which components of the error vector $\varepsilon_t$ are highly correlated ($E_2$) is also investigated. In this case the variance-covariance matrix $S_U$ is 'nearly singular,' which may often lead to problems with the application of bootstrap methods (see Horovitz 1995; or Chou and Zhou 2006). Another specification of the distribution of the error term series is related to the structural break ($E_3$). It is a well known fact that in this case huge size distortions may occur while testing for Granger causality. Another question is whether application of the bootstrap approach may significantly improve the investigated size properties. The fourth examined possibility ($E_4$) is related to the idea of a mixture of distributions. The last considered DGP for the error vector ($E_5$) is a simple ARCH(2) model with constant unconditional variance (equal to one).
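The five error schemes can be sketched as follows (Python with NumPy). This is an illustration under stated assumptions rather than the generator used for the paper: the function name `make_errors` is hypothetical, the ARCH(2) coefficient 0.1 on the first lag is inferred from the requirement that the unconditional variance $0.5/(1 - 0.1 - 0.4)$ equals one, the ARCH recursion starts from zeros, and the $E_3$ break is produced by rescaling the second half of the $E_1$ draw.

```python
import numpy as np

def make_errors(kind, T, rng):
    """Return a (T, 3) array of error vectors for the schemes E1..E5 of table 3."""
    e1 = rng.standard_normal((T, 3))            # E1: N(0, I_3)
    if kind == "E1":
        return e1
    if kind == "E2":                            # correlated components via Cholesky factor
        sigma = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.9], [0.0, 0.9, 1.0]])
        return e1 @ np.linalg.cholesky(sigma).T
    if kind == "E3":                            # variance break at T/2: sd 1, then sd 2
        scale = np.where(np.arange(T) < T // 2, 1.0, 2.0)
        return e1 * scale[:, None]
    if kind == "E4":                            # mixture: sd 1 with prob. 0.7, else sd 3
        s = rng.random(T) < 0.7
        return np.where(s[:, None], e1, 3.0 * rng.standard_normal((T, 3)))
    if kind == "E5":                            # ARCH(2); 0.1 is an inferred coefficient
        eps = np.zeros((T + 2, 3))              # two zero start-up values (an assumption)
        for t in range(T):
            h = 0.5 + 0.1 * eps[t + 1] ** 2 + 0.4 * eps[t] ** 2
            eps[t + 2] = e1[t] * np.sqrt(h)
        return eps[2:]
    raise ValueError(kind)
```

Passing the same seeded generator to each call is one way to reuse a common underlying normal draw across error types, which keeps the results for different schemes comparable.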
A similar type of time dependence structure in the error term series was examined by Hacker and Hatemi (2006) (the authors used an ARCH(1) model for VAR(1) and VAR(2) processes).

As a cure for the effect of start-up values, 50 presample observations of $y_t$ are generated for each simulation study. Some of these data points (based on a random draw from the $N(0, 1)$ distribution) are used as the initial observations for VAR models. To make the results of the presented research more comparable, the same random draw from the $N(0, 1)$ distribution is also used for every type of error term analyzed. Namely, to create the $E_2 = (\varepsilon_{2,t})_{t=1,\ldots,T}$ series, the following transformation of the $E_1 = (\varepsilon_{1,t})_{t=1,\ldots,T}$ series is applied:

$$\varepsilon_{2,t} = Z\varepsilon_{1,t}, \tag{9}$$

where $t = 1, \ldots, T$ and $ZZ^{tr} = \Sigma$ (Cholesky decomposition). The values of the $E_1$ series are also used in the process of generation of the $E_4$ series and the $E_3$ series (for the first $T/2$ observations). In order to generate the $E_5$ series, initial observations are once again drawn from the $N(0, 1)$ distribution and $(w_{1,t}, w_{2,t}, w_{3,t})^{tr} = \varepsilon_{1,t}$ for $t = 1, \ldots, T$.

To examine the size properties of the considered test a set of simulated observations is generated each time (using model (1) with specific $A_i$ and $E_j$) and the TY test statistics are calculated to test the hypothesis that $y^2$ does not Granger-cause $y^1$. Typical significance levels (namely, 1%, 5% and 10%) are considered, and both the asymptotic distribution theory (as noted by Toda and Yamamoto) and a residual-based bootstrap approach are used to get suitable critical values.

Let us now briefly discuss the bootstrap methods used in this paper. All bootstrap simulations conducted for this article are based on resampling leveraged residuals. The application of leverages is a simple modification of raw regression residuals, which helps to stabilize their variance (for more details on this issue see Davison and Hinkley 1999; Hacker and Hatemi 2006).
First, the considered augmented VAR model (2) is estimated by OLS with the null hypothesis imposed (that is: $y_2$ does not Granger-cause $y_1$). Many authors use OLS in their empirical research, although other estimation methods are more adequate for their data. This paper partly investigates the influence of this practice on the performance of the considered causality tests. In the next step, raw regression residuals are transformed with the use of leverages (modified residuals will be denoted as $\{\hat{\varepsilon}^{m}_{i}\}_{i=1,\ldots,T}$). Finally, the following algorithm is conducted:
• draw randomly with replacement (each point has a probability measure equal to $1/T$) from the set $\{\hat{\varepsilon}^{m}_{i}\}_{i=1,\ldots,T}$ (as a result we get the set $\{\varepsilon^{**}_{i}\}_{i=1,\ldots,T}$);
• subtract the mean to guarantee that the mean of bootstrap residuals is zero (in this way we create the set $\{\varepsilon^{*}_{i}\}_{i=1,\ldots,T}$, such that $\varepsilon^{*}_{k,i} = \varepsilon^{**}_{k,i} - \frac{1}{T}\sum_{j=1}^{T}\varepsilon^{**}_{k,j}$, $i = 1, \ldots, T$, $k = 1, 2, 3$);
• generate the simulated data $\{y^{*}_{i}\}_{i=1,\ldots,T}$ through use of the original data ($\{y_{i}\}_{i=1,\ldots,T}$), coefficient estimates from the regression ($\hat{c}$, $\{\hat{A}_{i}\}_{i=1,\ldots,p+d}$) and the bootstrap residuals $\{\varepsilon^{*}_{i}\}_{i=1,\ldots,T}$;
• calculate the TY test statistics.
After repeating this procedure N = 250 times it is possible to create the empirical distribution of the TY test statistic and then obtain empirical critical values (bootstrap critical values). The procedure (which allows one to conduct every type of simulation presented in this article), written in Gretl, is available from the author upon request.

Empirical Results

In this section, results of the conducted causality tests are presented. The following tables contain the rejection rates obtained while testing the null hypothesis in the TY test with the application of both the standard asymptotic distribution theory and the residual-based bootstrap approach. In recent years the problem of establishing adequate significance levels for diagnostic applications has been intensively discussed.
Some researchers recommend relatively large levels (Maddala 1992), while others argue that typical values are the best choice (MacKinnon 1992). As already mentioned in this article, typical significance levels are considered. Thus the results of the presented simulations are more comparable with the similar research conducted by Hacker and Hatemi (2006) and Mantalos (2000). To judge whether empirical rejection rates are significantly different from the considered nominal sizes, for each significance level the 95% two-sided confidence interval was created by the following expression:

$T_s \pm 2\sqrt{\frac{T_s(1 - T_s)}{N_r}}$, (10)

where $T_s$ denotes the considered nominal size (1%, 5%, 10%) and $N_r = 1000$ stands for the number of repetitions. This value ($N_r = 1000$) was also used by Dolado and Lütkepohl (1996), Hacker and Hatemi (2006) and Mantalos (2000). Furthermore, the considered type of confidence interval was used by Dolado and Lütkepohl (1996) and Mantalos (2000). In this way, the intervals [0.4%; 1.6%], [3.6%; 6.4%], [8.1%; 11.9%] were established for the 1%, 5% and 10% significance levels respectively. This approach leads to a criterion of bad performance: the actual test size is significantly distorted whenever it lies outside the corresponding confidence interval. In the following tables these findings are indicated by bold typeface. In each case the parameter d (maximal order of integration of the considered variables) is equal to one (properly specified). For tables 4-9 the considered sample size is T = 40 (small sample size).

First we shall focus on cases where parameter p was chosen properly. The results are contained in tables 4-6. The results in table 4 show that the asymptotic distribution theory caused serious size distortions in almost all cases. The largest distortions were found in the case of structural change in the error term distribution (E3).
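The bounds given by expression (10) are easy to reproduce. The following sketch (function names `size_interval` and `is_distorted` are illustrative, not from the paper) computes the interval for a nominal size and flags a rejection rate as distorted when it falls outside it.

```python
import math


def size_interval(nominal, n_rep=1000):
    """95% interval of expression (10): nominal +/- 2*sqrt(nominal*(1-nominal)/n_rep)."""
    half = 2.0 * math.sqrt(nominal * (1.0 - nominal) / n_rep)
    return nominal - half, nominal + half


def is_distorted(rate, nominal, n_rep=1000):
    """True when an empirical rejection rate lies outside the interval."""
    low, high = size_interval(nominal, n_rep)
    return not (low <= rate <= high)


for nominal in (0.01, 0.05, 0.10):
    low, high = size_interval(nominal)
    print(f"{nominal:.0%}: [{100 * low:.1f}%; {100 * high:.1f}%]")
```

For example, `is_distorted(0.132, 0.10)` is true for the 13.2% asymptotic rejection rate reported in table 4, while `is_distorted(0.046, 0.05)` is false for the corresponding 4.6% bootstrap rate.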
Furthermore, it should be noted that whenever critical values were taken from the suitable $\chi^2$ distribution, over-rejection was indicated, which seems to support Hypothesis 1. The application of the bootstrap method improved the size properties of the TY test for all significance levels in the cases of the E1, E4 and E5 distributions. These results provide a strong basis for claiming that Hypothesis 2 is also true. Although significant over-rejection was still found for the E3 error distribution (except for the 10% level), size distortions were much smaller than in the non-bootstrap approach. However, one must note that the bootstrap test was found to under-reject the null hypothesis in the case of the E2 distribution, which led to significant size distortions at the 5% and 10% significance levels (even worse performance than for the $\chi^2$ distribution). The outcomes obtained by Hacker and Hatemi (2006) in corresponding research conducted for similar two-dimensional cases (A1 model, E1 and E5 error terms) are in line with the results presented in table 4.

Table 4: Size of TY test for Granger causality - no-cointegration case

                 $\chi^2$ distribution      Bootstrap distribution
(1)  (2)  (3)    1%     5%     10%      1%     5%     10%
A1   E1   1      1.7%   6.1%   13.2%    0.8%   4.6%   10.9%
     E2   1      1.9%   5.6%   11.6%    0.4%   2.9%   7.7%
     E3   1      7.7%   15.3%  20.6%    2.8%   7.4%   10.8%
     E4   1      1.7%   7.8%   12.4%    0.6%   4.2%   9.6%
     E5   1      1.4%   6.5%   11.2%    0.8%   5.2%   9.1%
Notes: Column headings are as follows: (1) algebraic structure, (2) distribution of error term, (3) lag p.

The outcomes contained in tables 5 and 6 also reveal some interesting regularities and provide no significant reason for rejection of Hypothesis 1 or Hypothesis 2. Firstly, they confirm that the TY test based on asymptotic distribution theory tends to over-reject the null hypothesis also when there is cointegration between the considered variables (Dolado and Lütkepohl (1996) and Mantalos (2000) examine cointegration ranks which are no greater than one). Secondly, they provide a basis for claiming that the application of bootstrap methods leads to a reduction of actual test size in comparison to the asymptotic method. However, this reduction is still insufficient for the A2 algebraic structure and E3 error distribution scheme (still over-rejection) and too strong for the A3 and E2 case (under-rejection, worse performance in comparison to the $\chi^2$ distribution at the 5% and 10% significance levels).

Table 5: Size of TY test for Granger causality - case of two cointegrating vectors

                 $\chi^2$ distribution      Bootstrap distribution
(1)  (2)  (3)    1%     5%     10%      1%     5%     10%
A2   E1   1      0.8%   3.5%   10.8%    0.9%   4.8%   9.9%
     E2   1      1.2%   5.5%   14%      1%     4.9%   11%
     E3   1      5%     14%    25%      3.6%   8.9%   18%
     E4   1      1.9%   6.7%   14%      1.1%   5.3%   12%
     E5   1      1.5%   6.8%   11%      1.1%   4.7%   10.5%
Notes: Column headings are as follows: (1) algebraic structure, (2) distribution of error term, (3) lag p.

Table 6: Size of TY test for Granger causality - case of one cointegrating vector

                 $\chi^2$ distribution      Bootstrap distribution
(1)  (2)  (3)    1%     5%     10%      1%     5%     10%
A3   E1   1      1.2%   7.4%   11.5%    0.9%   6.1%   10.6%
     E2   1      2.6%   5.8%   14.7%    0.2%   2.1%   5.2%
     E3   1      6.7%   11.6%  26%      2.4%   5.9%   11.4%
     E4   1      2.5%   8%     15.6%    0.8%   4.7%   10.6%
     E5   1      1.5%   5.9%   12.6%    0.7%   4.2%   9%
Notes: Column headings are as follows: (1) algebraic structure, (2) distribution of error term, (3) lag p.

In practice it is often difficult to establish the lag parameter properly before estimating the VAR model. Despite the variety of econometric methods (AIC, BIC, FPE information criteria, more recently Hatemi's (2003) criterion), many researchers still struggle to decide what lag length to choose for further analysis. In the context of our investigation this problem was examined by repeating all causality tests with a misspecified value of parameter p (set at the level of 2). For clarity it should be mentioned that the true DGP was unchanged.
The results are shown in tables 7-9. The results contained in tables 7-9 should be analyzed together with the corresponding outcomes from the previously presented cases (contained in tables 4-6 respectively).

Table 7: Size of TY test for Granger causality - no-cointegration case, misspecified parameter p

                 $\chi^2$ distribution      Bootstrap distribution
(1)  (2)  (3)    1%     5%     10%      1%     5%     10%
A1   E1   2      2.1%   10.6%  16%      0.9%   4.5%   10.2%
     E2   2      1.8%   6.5%   13.5%    0.8%   3.1%   7.1%
     E3   2      9%     19%    33%      4.5%   9.1%   18.5%
     E4   2      1.8%   9%     15.5%    0.9%   4.6%   9.5%
     E5   2      1.4%   4.6%   14%      0.7%   4.1%   9.3%
Notes: Column headings are as follows: (1) algebraic structure, (2) distribution of error term, (3) lag p.

Table 8: Size of TY test for Granger causality - case of two cointegrating vectors, misspecified parameter p

                 $\chi^2$ distribution      Bootstrap distribution
(1)  (2)  (3)    1%     5%     10%      1%     5%     10%
A2   E1   2      1.3%   6.1%   12.8%    1.2%   4.6%   9.4%
     E2   2      1.4%   7.2%   13.6%    0.8%   4.8%   9.6%
     E3   2      8.5%   20%    27%      6.1%   14%    19.7%
     E4   2      2.8%   6.8%   17.1%    0.8%   4.8%   13.4%
     E5   2      2.1%   8.4%   12.7%    1.1%   5.6%   9.7%
Notes: Column headings are as follows: (1) algebraic structure, (2) distribution of error term, (3) lag p.

Table 9: Size of TY test for Granger causality - case of one cointegrating vector, misspecified parameter p

                 $\chi^2$ distribution      Bootstrap distribution
(1)  (2)  (3)    1%     5%     10%      1%     5%     10%
A3   E1   2      1.4%   6.3%   14.1%    0.9%   4.8%   10.3%
     E2   2      3.9%   8.2%   14.8%    0.1%   1.9%   5.1%
     E3   2      7.6%   13.6%  29%      3.9%   8.3%   14.7%
     E4   2      2.8%   9.2%   17.6%    1.1%   4.4%   11.3%
     E5   2      2.2%   8.5%   13.9%    0.8%   4.6%   9.5%
Notes: Column headings are as follows: (1) algebraic structure, (2) distribution of error term, (3) lag p.

The results contained in table 7 (no-cointegration case) show that the standard approach (based on the $\chi^2$ distribution) causes even stronger over-rejection (higher rejection rates) than in the corresponding case (table 4). On the other hand, the results obtained with application of the bootstrap method belong to suitable confidence intervals in all except for one case (in comparison to the corresponding case). For the model with two cointegrating vectors (A2) the actual test size (case of the $\chi^2$ distribution) is too high in all except for 3 cases. This means that misspecification of parameter p considerably worsens the size performance of the TY test. Furthermore, the actual size of the bootstrap test was found to lie outside the confidence interval for exactly the same combinations of significance levels and error term schemes as in the corresponding case (table 5). The standard asymptotic approach was also found to cause serious over-rejection for the A3 structure in almost all cases. On the other hand, actual test size based on the bootstrap method was distorted only for the E2 (under-rejection) and E3 (over-rejection) cases. In general, size performance of the TY test worsened significantly only for the asymptotic variant, which allows us to claim that Hypothesis 3 is true. Furthermore, the results contained in tables 4-6 as well as in tables 7-9 strongly indicate that Hypothesis 4 is also true (see results obtained for the E2 and E3 cases).

Additionally, to examine the size performance of the TY test in both considered variants, causality tests were conducted for a longer sample. One should expect the standard asymptotic approach to perform relatively better in this case. Suitable tests were conducted for the sample size T = 100 and the no-cointegration model with parameter p = 1 and p = 2 (Hacker and Hatemi (2006) also considered sample sizes of T = 40 (small sample) and T = 100 (medium sample)). For comparability with previous results (obtained for T = 40), the first 40 data points were exactly the same. Once again the true value of parameter d was assumed to be known. The results are presented in table 10. For clarity it should be noted that values in parentheses denote the rejection rates obtained in a similar investigation conducted for a small sample (T = 40).

Table 10: Impact of increase of sample size on size properties of TY test for Granger causality - no-cointegration case

                 $\chi^2$ distribution          Bootstrap distribution
(1)  (2)  (3)    1%      5%       10%       1%      5%      10%
A1   E1   1      1.1%    6.2%     12%       0.9%    4.2%    9.5%
                 (1.7%)  (6.1%)   (13.2%)   (0.8%)  (4.6%)  (10.9%)
     E1   2      1.3%    5.6%     13.5%     1.1%    4.9%    10.3%
                 (2.1%)  (10.6%)  (16%)     (0.9%)  (4.5%)  (10.7%)
Notes: Column headings are as follows: (1) algebraic structure, (2) distribution of error term, (3) lag p.

The analysis of table 10 confirms that size properties of the TY test for Granger causality improve as the sample size increases. Although for the 10% significance level the actual size of the asymptotic tests still lies outside the 95% confidence interval, the increase of sample size moved the actual size closer to the nominal one. Furthermore, the actual size of bootstrap tests was again found to lie within suitable confidence intervals in all cases. On the other hand, it should be noted that for the other considered distributions of the error term (E2, E3, E4, E5) such a significant improvement of size performance was not found in the considered algebraic specification (A1). All these facts confirm that there is no significant reason for the rejection of Hypothesis 5.

Table 11: Size of TY test for Granger causality - different number of bootstrap replications in specific cointegrated systems

                 $\chi^2$ distribution      Bootstrap distribution
(1)  (2)  (3)    1%     5%     10%      1%     5%     10%      N
A2   E3   2      8.5%   20%    27%      9.1%   19.6%  24%      100
                                        5.2%   16.3%  22.1%    200
                                        6.1%   13.5%  20.1%    300
A3   E2   2      3.9%   8.2%   14.8%    0%     3%     3.4%     100
                                        0.5%   2.5%   5.5%     200
                                        0.6%   1.2%   3.5%     300
Notes: Column headings are as follows: (1) algebraic structure, (2) distribution of error term, (3) lag p.

One of the initial arbitrary decisions in every bootstrap application is the choice of the number of replications.
In previous research on similar questions this value varied significantly. Horowitz (1994) used 100 replications, Mantalos (2000) used 200, Hacker and Hatemi (2006) used 800, while Davidson and MacKinnon (1996) used 1000 replications to create the bootstrap distribution each time. Increasing the number of replications may often improve the size performance of the TY test. However, in some situations bootstrap methods are likely to fail, regardless of the number of replications used (Horowitz 1995). This paper contributes to the discussion of this problem, as it contains results of some simulations based on different numbers of bootstrap replications. The investigation covers the two specific cases in which the size distortion of the bootstrap distribution was largest and furthest from the 95% confidence intervals (namely, the high correlation and structural change cases). It should be noted that for comparability with the previously presented outcomes (obtained for 250 bootstrap replications) the same series of random numbers were used to generate the data. Therefore, the actual size of the TY test conducted with application of the $\chi^2$ distribution was unchanged. Parameter d was again assumed to be known (d = 1). The examined number of bootstrap replications is denoted by N. Table 11 contains the results of these simulations. The results contained in table 11 confirm that the increase in the number of bootstrap replications caused a decrease of actual test size for the A2 model at the 5% and 10% significance levels. However, this decrease turned out to be insufficient and the actual size still lay outside the confidence intervals in all cases. A similar effect (decrease of actual size) was found for the A3 model at the 5% significance level, but this time the size performance worsened as N increased.
Finally, it should be noted that for the A3 model the actual size was found to grow with an increase of N at the 1% significance level (relatively good performance was found for N = 200 and N = 300 replications). In summary, these outcomes provide no clear evidence of whether Hypothesis 6 is true or false. However, they do provide a strong basis for claiming that Hypothesis 4 is indeed true.

Concluding Remarks

The aim of this paper was to examine the size properties of the Toda-Yamamoto test for Granger causality in the case of a relatively small sample size. The simulation study was conducted for trivariate VAR models integrated of order one, and a variety of distributions of the error vector was also considered. Both the standard asymptotic distribution theory and the residual-based bootstrap technique were used. The results of the simulation study in the case of properly specified lag parameters indicate that the standard asymptotic approach causes significant over-rejection in almost all considered cases. The application of the residual-based bootstrap method improved the size performance of the TY test; however, in the cases of structural break and high correlation the actual size was still far from the nominal one. The misspecification of the lag parameter caused much worse performance of the TY test when asymptotic theory was applied. In general, the performance of the bootstrap method did not worsen as markedly. The results contained in this paper support the hypothesis that asymptotic distribution theory performs better for longer time series. However, except for the case of a spherical multivariate normal distribution of the error term, this type of significant improvement has not been observed.
Furthermore, test results obtained in cases of high size distortion of the bootstrap-based technique brought no clear suggestion about the relationship between the number of bootstrap replications and the actual size of the test. The outcomes contained in this article should provide useful guidance for other researchers using the considered variants of the Toda-Yamamoto test in practical applications. The presented results indicate that a bootstrap based on leveraged residuals is often an effective tool for Granger causality testing, which helps avoid over-rejection of the considered null hypothesis. However, the simulation study confirms that this method cannot be used without a second thought, since it is likely to fail for specific models.

References

Chou, P. H., and G. Zhou. 2006. Using bootstrap to test portfolio efficiency. Annals of Economics and Finance 2:217-49.
Davidson, R., and J. G. MacKinnon. 1996. The size distortion of bootstrap tests. Working Paper 936, Queen's University.
Davison, A. C., and D. V. Hinkley. 1999. Bootstrap methods and their application. Cambridge: Cambridge University Press.
Dolado, J. J., and H. Lütkepohl. 1996. Making Wald tests work for cointegrated VAR systems. Econometric Reviews 15 (4): 369-86.
Efron, B. 1979. Bootstrap methods: Another look at the jackknife. Annals of Statistics 7 (1): 1-26.
Engle, R. F., and C. W. J. Granger. 1987. Cointegration and error correction: Representation, estimation and testing. Econometrica 55 (2): 251-81.
Granger, C. W. J. 1969. Investigating causal relations by econometric models and cross-spectral methods. Econometrica 37 (3): 424-38.
Granger, C. W. J. 1988. Some recent developments in the concept of causality. Journal of Econometrics 39 (1-2): 199-211.
Granger, C. W. J., and P. Newbold. 1974. Spurious regression in econometrics. Journal of Econometrics 2 (2): 111-20.
Hacker, R. S., and A. Hatemi-J. 2006. Tests for causality between integrated variables using asymptotic and bootstrap distributions: Theory and application. Applied Economics 38 (13): 1489-500.
Hatemi-J, A. 2003. A new method to choose optimal lag order in stable and unstable VAR models. Applied Economics Letters 10 (3): 135-7.
Horowitz, J. L. 1994. Bootstrap-based critical values for the information matrix test. Journal of Econometrics 61 (2): 395-411.
Horowitz, J. L. 1995. Bootstrap methods in econometrics: Theory and numerical performance. In Advances in economics and econometrics: Theory and applications; Seventh World Congress, ed. D. Kreps and K. Wallis, 3:189-222. Cambridge: Cambridge University Press.
Hylleberg, S., R. F. Engle, C. W. J. Granger, and S. Yoo. 1990. Seasonal integration and cointegration. Journal of Econometrics 44 (2): 215-38.
Johansen, S. 1991. Estimation and hypothesis testing of cointegration vectors in Gaussian vector autoregressive models. Econometrica 59 (6): 1551-80.
Johansen, S. 1996. Likelihood based inference in cointegrated vector autoregressive models. 2nd ed. Oxford: Oxford University Press.
Johansen, S., and E. Schaumburg. 1988. Likelihood analysis of seasonal cointegration. Journal of Econometrics 88 (2): 301-39.
MacKinnon, J. G. 1992. Model specification tests and artificial regressions. Journal of Economic Literature 30 (1): 102-46.
Maddala, G. S. 1992. Introduction to econometrics. 2nd ed. New York: Maxwell Macmillan.
Mantalos, P. 2000. A graphical investigation of the size and power of the Granger-causality tests in integrated-cointegrated VAR systems. Studies in Nonlinear Dynamics & Econometrics 4 (1), article 2.
Phillips, P. C. B. 1986. Understanding the spurious regression in econometrics. Journal of Econometrics 33 (3): 311-40.
Toda, H. Y., and T. Yamamoto. 1995. Statistical inference in vector autoregressions with possibly integrated processes. Journal of Econometrics 66 (1-2): 225-50.
Development, Validity and Reliability of Perceived Service Quality in Retail Banking and its Relationship With Perceived Value and Customer Satisfaction

Aleksandra Pisnik Korda
Boris Snoj

Despite its popularity, the concept of service quality in the marketing literature is still ambiguously and vaguely defined. Several measurement scales have been proposed, but some of these take into account only the method of measurement and ignore the idea that the same instrument may not be automatically applicable in different industries or in different cultures. Therefore the purpose of this paper is twofold: first, to validate the perceived retail banking service scale in the case of a small transitional economy of Europe, and second, to research the service quality-customer satisfaction relationship and the role of perceived value within it. Content validity, face validity, construct validity, convergent validity, discriminant validity as well as nomological validity were assessed with EFA, CFA and SEM. The present research is the first attempt to measure the relationships among the concepts researched in the retail banking industry in transitional economies in Europe. Therefore, its major finding, that the perceived value variable has the potential to be a mediating variable in the relationship between perceived quality and customer satisfaction in retail banking settings, could also be of interest for other researchers in transitional economies in Europe and for researchers from other environments.

Key Words: perceived quality, perceived value, satisfaction, retail banking services
JEL Classification: M30, M31

Dr Aleksandra Pisnik Korda is Assistant Professor of Marketing at the Faculty of Business and Economics, University of Maribor, Slovenia.
Dr Boris Snoj is Professor of Marketing at the Faculty of Business and Economics, University of Maribor, Slovenia.
Managing Global Transitions 8 (2): 187-205

Introduction

The world economy is rapidly becoming intensely service-oriented, a trend reflected in the vast number of marketing research projects focused on services (Carrillat, Jaramillo, and Mulki 2007). The service industry in the US contributes more than 75% of that country's GDP and employs more than 80% of its entire workforce (Malhotra et al. 2004). In most OECD countries services now account for well over 60% of total gross value added, and expenditures for services in OECD countries clearly outperform expenditures for physical products (OECD 2009). The globalization of services marketing represents a great challenge for academic researchers, as well as practitioners (Javalgi, Martin, and Young 2006).

Perceived quality and perceived value play important roles in industries with high customer involvement, such as the banking industry (Angur, Nataraajan, and Jahera 1999). Therefore, it is important to identify dimensions of these constructs correctly and to find out how the constructs are perceived by customers (Glaveli et al. 2006). Several research projects concerning the relationship between perceived quality and customer satisfaction and loyalty have been conducted, although the majority have been implemented in developed economies, especially the US (Yavas, Benkenstein, and Stuhldreier 2004). In Europe, research projects investigating the quality of banking services for customers have been done in Greece (Athanassopoulos, Gounaris, and Stathakopoulos 2001) and Germany (Yavas et al. 2004), but few research projects have dealt with the perceived value of banking services as a central concept in more sophisticated models of relationships. Research on perceived quality and its relationship to customer satisfaction and loyalty in the banking services industry has been performed in Taiwan (e.g., Chiu, Hsieh, and Lee 2005; Chen, Chang, and Chang 2005), South Africa (Bick, Brown, and Abratt 2004), and Great Britain (Devlin 2000), and on a sample of employees in Spain (Fandos Roig et al. 2006) and Greece (Angelis, Lymperopoulus, and Dimaki 2005), but no such project has been implemented in a country in transition in Europe until now.

In the early days after Slovenia attained independence, banks were preoccupied with reconstruction of core business processes, so they have only recently started to focus on their activities with customers. The intensification of competition from foreign banks has forced domestic banks in Slovenia to pay closer attention to customer satisfaction and loyalty, which are becoming the key factors of success (Bick, Brown, and Abratt 2004).

Theoretical Background

Service Quality

The unique characteristic of services (more precisely, service elements in products) is that they are processes and not tangible things (Grönroos 2001). This characteristic is at the root of all other characteristics of service elements. The two other generic characteristics of service elements are intangibility and perishability (Snoj 1998; Grönroos 2001). However, authors usually treat services as bundles of intangible and tangible elements, and this approach is also seen in the research dealing with service quality measurement. The fundamentals of theory on service quality originate from the literature on product quality and customer satisfaction. According to the majority of authors who have explored the subject, perceived service quality is the result of customers' subjective judgment of the level of the service offering and its delivery. While researchers agree that perceived service quality is a multidimensional construct, no consensus has been reached about its generally valid, generic dimensions.
As researchers continue to debate the determinants of service quality, a few important issues remain unanswered, e.g., (a) the universality of service quality determinants across a section of services; (b) the importance and nature of operating characteristics of determinants as they together constitute the service quality; (c) whether the service characteristic is reflected in what customers expect from the delivery of a particular service (Chowdhary and Prakash 2007; Pal and Choudhury 2009). Early conceptualizations of perceived service quality (e.g., Parasuraman, Zeithaml, and Berry 1988) were based on the disconfirmation paradigm, according to which quality is the result of the comparison of expected versus perceived performance of service. Accordingly, Grönroos (2000) identified two dimensions of service quality: functional quality and technical quality. Functional quality reflects the 'how' of service performance, while technical quality defines the results of service or 'what' the customer receives from the service experience. This conceptualization is known as the Nordic model (figure 1). According to the model, customers perceive what they get out of the service process, but even more important is their perception of the way the service was delivered. The important limitation of the Nordic model is that it is relatively difficult to define the technical quality or result of some services (Kang and James 2004). In defining service quality, Grönroos (1982) also stressed the importance of the dimension of company image, which relates to customers' awareness of their previous experiences with the company and their overall perceptions of its service; this, in turn, influences their perceptions of current service quality. The proponents of the US school of service quality, who define service quality as a judgment about overall excellence, also understand service quality as a customer's comparison of expectations versus performance.
One of the contributions of this school of thought is the SERVQUAL model (figure 2). In the SERVQUAL model (Parasuraman, Zeithaml, and Berry 1988), service quality is measured by identifying the gaps between customers' expectations of the service to be rendered and their perceptions of the actual performance of the service. SERVQUAL is based on five dimensions of service:
• tangibles - the physical surroundings, represented by objects (e.g., interior design) and subjects (e.g., the appearance of employees);
• reliability - the service provider's ability to provide accurate and dependable services;
• responsiveness - a firm's willingness to assist its customers by providing fast and efficient service performance;
• assurance - features that give customers confidence (e.g., the firm's specific service knowledge and polite and trustworthy behavior from employees);
• empathy - the firm's readiness and ability to provide each customer with personal service.

Figure 2: SERVQUAL model (adapted from Parasuraman, Zeithaml, and Berry 1988)

In both the Nordic and the US schools, the contact personnel play a crucial role in customers' perception of service quality. However, Edvardsson (2005) argued that, in studying perceived service quality, authors have failed to pay enough attention to customers as 'prosumers' (producer and user) - that is, those who participate in producing the service - in the process of service development and delivery. The customer as service co-creator sees service quality as the consequence of his or her experiences with service development, delivery and use.

Operationalization and Measurement of Perceived Service Quality

The concept of quality is difficult to define (Cronin and Taylor 1992; Parasuraman, Berry, and Zeithaml 1993; Brady and Cronin 2001), and any generally valid definition is still far away (Athanassopoulos 2001). Some authors (e.g., Grönroos 2000) have even proposed leaving the assessment of perceived quality to customers themselves. The most frequently used scales in the measurement of perceived service quality are SERVQUAL (Parasuraman, Zeithaml, and Berry 1988) and SERVPERF (Cronin and Taylor 1992). Both are the result of research work from the US school of quality. SERVPERF directly measures the customers' perceptions of service performance and assumes that respondents automatically compare their perceptions of the service quality levels with their expectations of those services. The SERVPERF scale is identical to the SERVQUAL scale in its dimensions and structure. Both scales have also been used in numerous research projects concerning banking services (e.g., Athanassopoulos 1997; Angur, Nataraajan, and Jahera 1999; Lassar, Manolis, and Winsor 2000; Gounaris, Stathakopoulos, and Athanassopoulos 2003; Yavas et al. 2004; Yap and Sweeney 2007). Despite their advantages and popularity, however, both scales have deficiencies. The main empirical problem is their unstable dimensionality (Van Dyke, Kappelman, and Prybutok 1997), which could differ depending upon the service industry to which the scale was applied (Babakus and Boller 1992). The use of these scales in the hotel industry in Great Britain indicated that variables form three, and not the proposed five, dimensions (Ekinci, Dawes, and Massey 2008). Furrer et al. (in Petridou et al. 2007) warned that, because of differences in the level of social and economic development, service customers in different countries perceive the concept of service quality itself differently. Consequently, Babakus and Boller (1992) proposed that a quality measurement scale should be adapted to the specifics of an individual service industry or even an individual service, and that a general scale should not be used at all.
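The difference between the two scales can be made concrete with a minimal sketch. It assumes hypothetical mean item scores per dimension on a 7-point scale; the numbers and the function names `servqual_gaps` and `servperf_scores` are invented purely for illustration and do not come from the study. SERVQUAL scores each dimension as perception minus expectation, while SERVPERF uses the perception score alone.

```python
DIMENSIONS = ("tangibles", "reliability", "responsiveness", "assurance", "empathy")


def servqual_gaps(expectations, perceptions):
    """SERVQUAL-style gap per dimension: perception minus expectation."""
    return {d: perceptions[d] - expectations[d] for d in DIMENSIONS}


def servperf_scores(perceptions):
    """SERVPERF-style score per dimension: perception only."""
    return {d: perceptions[d] for d in DIMENSIONS}


# Hypothetical mean scores on a 7-point scale (illustrative values only).
expected = {"tangibles": 6.0, "reliability": 6.5, "responsiveness": 6.0,
            "assurance": 6.5, "empathy": 5.5}
perceived = {"tangibles": 5.5, "reliability": 6.0, "responsiveness": 6.5,
             "assurance": 6.5, "empathy": 5.0}

gaps = servqual_gaps(expected, perceived)
scores = servperf_scores(perceived)
```

A negative gap (here for tangibles, reliability and empathy) indicates that perceived performance falls short of expectations, whereas the SERVPERF scores carry no reference to expectations at all.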
The suitability of using differences (between expectations and perceptions in the SERVQUAL scale) in multivariate analyses has also been debated. Some authors (e.g., Babakus and Boller 1992) have proposed using only the perceived quality assessment (SERVPERF), which correlated better with independent variables in their research findings than did an aggregate assessment from the SERVQUAL scale.

Development of the Conceptual Model and Hypotheses

Some authors have equated the concept of perceived quality with that of perceived value, but this conflation was due to inadequate understanding of the concepts (Caruana et al. 2000). The fusion of both concepts is the so-called 'integrative approach' (Klaus 1985). With the basic definition of perceived value in mind, it is clear that the unification of these two concepts is not appropriate. Perceived service value is a function of customers' comparison of all the benefits derived from the purchase and use of a service with all the costs (sacrifices) associated with the purchase and use of the service. Therefore, many authors conclude that perceived service quality is similar to, but distinct from, perceived service value (Bolton and Drew 1991; Wang, Lo, and Yang 2004; Sanchez-Fernandez and Iniesta-Bonillo 2007). Perceived service value could be one of the important sources of a company's competitive advantage and is also an important predictor of customer satisfaction, loyalty (McDougall and Levesque 2000; Cronin, Brady, and Hult 2000), and financial performance (Khalifa 2004).

There are also similarities between the concepts of customer satisfaction and perceived service value.
Since customer satisfaction could be defined as fulfilment of customer expectations, the affinity between customer satisfaction and perceived value lies in their subjectivity and also in their use of comparison: in the case of perceived value, customers compare benefits and sacrifices, while in the case of customer satisfaction, they compare expected value with the actually delivered (perceived) value. However, authors have speculated that customer satisfaction depends on the actually delivered value of products or services (e.g., Howard and Sheth in Oliver 1997). Thus, the two concepts are different but complement one another (Woodruff and Gardial 1996; Eggert, Ulaga, and Schultz 2006). Authors have also suggested that customer satisfaction as a construct could be assessed only by current customers, while perceived value could be estimated not only by past customers but also by future customers.

The majority of authors who have contributed to the marketing literature by researching the relationships in the models of perceived service value have ascertained that higher perceived service quality leads to higher perceived service value (e.g., Sweeney, Soutar, and Johnson 1999; Teas and Agarwal 2000). Some have also found that perceived service quality is a direct predecessor of and the best predictor of perceived service value (Petrick 2004). Therefore, we speculate that the relationship between perceived quality and customer satisfaction will also be positive in the case of retail banking services.

The authors who have explored the direct relationship between perceived quality and customer satisfaction can be divided into two groups: (a) those who have explored the direct relationship between perceived quality and customer satisfaction without taking into account the mediating role of perceived value and who consider perceived quality to be the direct predecessor of customer satisfaction (e.g., Jamal and Nasser 2002; Yavas et al. 2004); and (b) those who have explored the relationship of the concepts and have included perceived value, finding that, in addition to its direct influence on perceived value, perceived quality also exerts direct and indirect influences (via perceived value) on customer satisfaction (e.g., Cronin, Brady, and Hult 2000; Chen and Chang 2005; Glaveli et al. 2006; Ladhari and Morales 2008). According to the results of previous research on the relationships between perceived quality and satisfaction, we propose the following hypotheses:

H1 The higher the perceived quality of banking services, the higher will be their perceived value.
H2 The higher the perceived quality of banking services, the higher will be customer satisfaction with these services.

Authors have also explored the direct impact of perceived quality on customer satisfaction (without taking into account the relationship between perceived quality and perceived value); however, these models produced only a partial picture (McDougall and Levesque 2000). For example, in such a case, customers assess their satisfaction with a certain product or service, but there are no data on their assessment of the benefits compared with their efforts and sacrifices. It is therefore important to include perceived value as the predecessor of customer satisfaction, because perceived quality is an important predecessor of perceived value, which, in turn, reflects on customer satisfaction and loyalty (Gallarza and Saura 2006). Previous research has indicated that higher perceived product (or service) value leads to higher levels of customer satisfaction (Moliner et al. 2007) and loyalty (Lin, Sher, and Shih 2005) and contributes to better financial performance (Ulaga 2001; Cronin, Brady, and Hult 2000).
We speculate that customer satisfaction with banking services is the consequence of their perceived value and so propose the following hypothesis:

H3 The higher the perceived value of banking services, the higher will be customer satisfaction with these services.

With the empirical exploration of these hypotheses, we attempt to show the mediating role of perceived value of banking services in the study of the relationship between perceived quality of retail banking services and customer satisfaction in Slovenia.

Methodology

The measurement instrument for the empirical study was developed in three phases. First, some of the relevant items for the questionnaire were taken from the literature. This preliminary phase also included a focus group with the purpose of developing and generating an initial pool of items. The result of this phase was a wide range of 33 service quality items, 4 perceived value items and 6 items for measuring satisfaction. Items from the original SERVPERF scale (Cronin and Taylor 1992) were used and modified to measure perceived quality, items for the measurement of perceived value were adopted from Cronin, Brady, and Hult (2000), and Oliver's (1997) scale was adopted for measurement of customer satisfaction.

In the second phase, in-depth interviews with 8 banking managers and 4 experts from the marketing field were conducted to evaluate the initial pool of items. Then the questionnaire was examined by 6 specialists (4 academics and 2 in the field of marketing research methods) to determine content validity and help avoid redundancy.

In the third phase, to test for internal consistency of the scales used in the final study and to further reduce the number of items, a pilot survey with exploratory factor analysis (more precisely, principal component analysis with Varimax rotation) was conducted on a sample of 234 retail banking customers, mostly in the Styria region of Slovenia.
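Internal consistency of a multi-item scale, as tested in the third phase, is commonly summarized with Cronbach's alpha. The following is a minimal sketch of that coefficient, not the authors' actual procedure: the response matrix is hypothetical (six respondents, four items on a 5-point scale), and the function name `cronbach_alpha` is introduced here only for illustration.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores)
    """
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item (column) across respondents.
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of each respondent's summed score.
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical 5-point Likert responses: one row per respondent, one column per item.
responses = [
    [4, 5, 4, 4],
    [3, 3, 2, 3],
    [5, 5, 5, 4],
    [2, 2, 3, 2],
    [4, 4, 4, 5],
    [3, 4, 3, 3],
]

alpha = cronbach_alpha(responses)
print(f"Cronbach's alpha = {alpha:.3f}")
```

Values closer to 1 indicate that the items vary together; items that add little to the coefficient are typical candidates for removal when the item pool is reduced.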
In the final study, the items in the questionnaire were measured on a 5-point Likert scale (from 1 = 'strongly disagree' to 5 = 'strongly agree'). From the 33 initial perceived service quality items, eleven items, with 67.4% of total variance explained, were finally chosen to measure perceived quality. Further, all four initially generated perceived value items, with 68.7% of total variance explained, were chosen, and four out of six items, with 73.9% of total variance explained, were chosen to measure satisfaction.

Data for the main research were collected from 700 retail banking customers in Slovenia in June 2007 by means of a telephone interview. A stratified sampling framework was used with random (systematic) sampling to improve the representativeness of the structure of retail banking customers with regard to the number of inhabitants in each Slovenian region. The final structure of the sample is also in accordance with the market shares of retail banks in Slovenia.

Results

Reliability and Validity of the Scales

First, we assessed the dimensionality of perceived quality by performing exploratory factor analysis (EFA) (table 1). Results showed that communalities of all items were relatively high and exceeded the value of 0.40, so a three-factor solution was proposed: core service, with items SQ1, SQ3 and SQ5; physical evidence, with items SQ6, SQ7, SQ8 and SQ9; and safety and confidence, with items SQ24, SQ27, SQ28 and SQ29. Total variance extracted was 65.82%, with 12.66% for core service, 42.60% for physical evidence and 10.55% for safety and confidence. Cronbach Alpha coefficients were relatively high and indicated good measurement reliability.

Table 1 Communalities and factor loadings of perceived quality

Item                      Comm.   Factor 1   Factor 2   Factor 3   Item of perceived quality
SQ1                       0.783              0.840                 This bank offers me a complete range of products.
SQ3                       0.828              0.865                 This bank is innovative.
SQ5                       0.702              0.811                 This bank matches my specific needs.
SQ6                       0.614   0.607                            Employees in this bank are neat in appearance.
SQ7                       0.780   0.863                            This bank has up-to-date facilities and equipment.
SQ8                       0.786   0.868                            The outdoor facilities of my bank are visually appealing.
SQ9                       0.479   0.596                            Informative materials (website, advertisements, brochures, etc.) are visually appealing.
SQ24                      0.512                         0.562      The employees in this bank are well educated and professional.
SQ27                      0.559                         0.699      In this bank my money and savings are safe.
SQ28                      0.547                         0.737      Using services at outside bank facilities (ATM, telephone banking, e-banking) is safe.
SQ29                      0.650                         0.736      Recommendations of employees in this bank are trustworthy.
Variance extracted in %           42.60      12.66      10.55
Cronbach Alpha                    0.795      0.838      0.712

KMO measure: 0.839. Total variance extracted: 65.82%.
Notes: Varimax rotation was used.

Second, confirmatory factor analysis (CFA) was performed. Two measurement models were compared: (a) a one-factor model, where perceived quality was conceptualized as uni-dimensional and where the covariance for all the items could be accounted for by a single factor and (b) a multi-factor model, where perceived quality was conceptualized as multi-dimensional and where covariation among the items could be accounted for by several restricted first-order factors. Summary statistics for both models are shown in table 2.

Table 2 Summary statistics for one-factor and multi-factor models (perceived quality)

          One-factor model   Multi-factor model*
χ²/df     266.61/44          125.5/41
RMSEA     0.099              0.094
NFI       0.92               0.97
CFI       0.93               0.97
SRMR      0.184              0.028
GFI       0.83               0.97

Notes: * Core service, safety and confidence, and physical evidence.

Concerning the perceived quality of retail banking services, the multi-factor model was found to outperform the one-factor model on absolute fit measures (χ², GFI, and RMSEA), the incremental fit measure (CFI), and the parsimonious fit measure (χ²/df). The majority of the fit indices were within the suggested interval.

In addition to Cronbach Alpha, construct reliability measures were used to assess reliabilities of the perceived quality subscales. The reliability coefficient of the three subscales ranged from 0.84 to 0.89 (table 3), which met the standard of 0.7 suggested by Nunnally (1978).

Table 3 Items, standardized loadings, construct reliabilities and average variance extracted

Dimension               Item*   Std. loading   CR      AVE
Core service            SQ1     0.800          0.867   0.687
                        SQ3     0.795
                        SQ5     0.888
Safety and confidence   SQ24    0.610          0.838   0.568
                        SQ27    0.829
                        SQ28    0.684
                        SQ29    0.864
Physical evidence       SQ6     0.873          0.883   0.653
                        SQ7     0.822
                        SQ8     0.755
                        SQ9     0.776

Notes: * Items as in table 1. CR - construct reliability, AVE - average variance extracted.

Next, construct validity of single subscales was assessed by examining convergent and discriminant validity. Evidence of convergent validity in the single constructs was determined by inspection of the variance extracted for each factor, as shown in table 3. CFA results showed that, in all cases, the average variance extracted reached the suggested value of 0.50 (Diamantopoulos and Siguaw 2000), and the t-test results of all correlations between suggested dimensions were statistically significant (table 4).

Table 4 Correlations among dimensions of the perceived quality construct

Dimension               Core service (t-value)   Physical evidence (t-value)   Safety and confidence
Core service            1.00
Physical evidence       0.87 (19.58)             1.00
Safety and confidence   0.89 (21.63)             0.90 (27.82)                  1.00

Table 5 Items, construct reliabilities and average variance extracted

Construct                      CR     AVE    Dimensions and items
Perceived value (α = 0.78)     0.77   0.53   • This bank offers me a lot of benefits.
                                             • In this bank the ratio between give and get components is very fair.
                                             • In relationship with this bank I perceive more positive than negative things.
Perceived quality (α = 0.86)   0.79   0.55   • Core service
                                             • Physical evidence
                                             • Safety and confidence
Satisfaction (α = 0.87)        0.81   0.59   • Services of this bank meet my expectations.
                                             • With this bank I have good experiences.
                                             • I am satisfied with this bank.

Global fit indices: χ²/df = 299.91/100, RMSEA = 0.052, standardized RMR = 0.04, NFI = 0.940, NNFI = 0.938, CFI = 0.955, GFI = 0.944, IFI = 0.955.

Next, discriminant validity was assessed for the subscales of perceived quality of retail banking. Several CFAs were run for each possible pair of constructs, first allowing for correlation between the two constructs and then fixing the correlation between the constructs at 1. In every case, the chi-square differences between the fixed and free solutions were significant at p