Investigating the Effectiveness of a Dynamic Integrated Approach to Teacher Professional Development

Panayiotis Antoniou*1, Leonidas Kyriakides2 and Bert Creemers3

This paper argues that research on teacher professional development could be integrated with validated theoretical models of educational effectiveness research (EER). A dynamic integrated approach (DIA) to teacher professional development is proposed. The methods and results of a study comparing the impact of the DIA and the Holistic - Reflective Approach (HA) to teacher professional development are presented. Teaching skills and teacher perceptions of teaching of 130 teachers and the achievement of their students (n=2356) were measured at the beginning and at the end of the intervention. Teachers found to be at a certain developmental stage were randomly allocated evenly into two groups. The first group employed the DIA and the second the HA. Teachers employing the DIA managed to improve their teaching skills more than teachers employing the HA. Teacher perceptions and attitudes towards teaching have not been modified due to their participation in the interventions. On the other hand, the use of DIA also had a significant impact on student achievement. Implications of findings for the use of EER for improvement purposes are drawn and suggestions for research and practice in teacher professional development are provided.

Keywords: Dynamic integrated approach, Educational effectiveness research, Stages of teaching skills, Evaluation of teacher improvement, Teacher professional development

1 *Corresponding author. Cyprus International Institute of Management (CIIM), 21 Academias Avenue, 2107, Nicosia, Cyprus pantoniou@ciim.ac.cy
2 University of Cyprus, Department of Education, Cyprus
3 University of Groningen, Faculty of Psychology, Education and Sociology, The Netherlands

Introduction

This research is in line with the current approaches of merging the findings of Educational Effectiveness Research (EER) with initiatives to improve education and particularly teacher effectiveness. Many researchers (e.g., Creemers & Reezigt, 1997; Reynolds, Hopkins & Stoll, 1993) have identified that an important constraint of the existing approaches of modelling educational effectiveness is the fact that the whole process does not contribute significantly to the improvement of teaching practice. Taking this into consideration, this study aims to contribute to further development of the framework related to the use of the dynamic model of EER (Kyriakides & Creemers, 2008) for improvement purposes.

Teacher professional development is considered an essential mechanism for deepening teachers' content knowledge and developing their teaching practices in order to teach to high standards (Borko, 2004; Day, 1999). Despite the number of studies on teacher professional development (e.g., Cohen, 1990; Desimone et al., 2002; U. S. Department of Education, 1999) the majority of these studies do not measure the impact of different approaches and programmes on student learning outcomes (Cochran-Smyth & Zeichner, 2005). While those responsible for professional development have generally assumed a strong and direct relationship between professional development and improvements in student learning, few have been able to describe the precise nature of this relationship (Guskey & Sparks, 2002). On the other hand, EER addresses questions as to what works in education and why, and refers to specific factors concerned with quality of teaching associated with student achievement.
In this context, the present paper argues that research on teacher professional development should draw from validated theoretical models of EER in order to develop teacher professional development programmes that will not only have an impact on improving teacher knowledge and skills but will ultimately raise educational standards. By establishing links between EER and research on teacher professional development, both fields could enjoy mutual benefits. In particular, research on teacher professional development could expand its research agenda by taking into consideration the impact of effective programmes on student outcomes, while at the same time EER could identify the extent to which its validated theoretical models can be used for improvement purposes. In this way, stronger links between research, policy and improvement of teaching practice could be established.

This paper presents the results of an experimental study comparing the impact of different approaches to teacher professional development upon the development of: a) teaching skills, b) teacher perceptions of teaching and c) student achievement gains in mathematics. Specifically, the study compares the holistic approach (based on teacher reflection), which is considered to be the dominant approach to teacher professional development, with the dynamic integrated approach (based on the groupings of teacher factors of the dynamic model). The methodology of the group-randomisation study, the results and the implications for the development of research and policy on teacher professional development are presented in the following sections.

The Holistic / Reflective Approach

The dominant approach in teacher professional development is focused on encouraging reflection on teaching practices, experiences, and beliefs (Golby & Viant, 2007).
As Charlene (2008) argues, motivated by the need to prepare their citizens for a knowledge-based economy, many governments are striving to improve their schools by encouraging reflection among teachers. According to Elliot (2002), the expertise of teachers and the extent to which they can improve relates to their ability to continuously question and interrogate the terms and conditions that govern their own transactions with students. In this perspective, Van Manen (2002) proposes three levels of reflectivity: technical reflection, practical reflection and critical reflection. Technical reflection is concerned with techniques and strategies for specific goals, while critical reflection examines broader ethical issues. Situated between these two types of reflection is practical reflection, which goes beyond looking at skills, strategies and rules to question the goals themselves. Emphasis is also given to approaches involving the reflective capabilities of observation, analysis, interpretation and decision-making (Schon, 1983; Zeichner, 1987), which enable teachers to critically review their teaching practice. In addition, this approach involves making use of readings of journal writings, observation notes, transcribed conversations, videotaped analyses and self-regulation (Cornford, 2002). Although reflection has been very fashionable in all sectors of teacher education for a number of years, there is little solid empirical evidence that supports the view that it results in an improvement of teaching practices (Cornford, 2002; McNamara, 1990). One would have anticipated that there would have been concerted efforts to evaluate the practical effectiveness of these various approaches to reflection by empirical methods, but this has not occurred to any appreciable degree (Cochran-Smyth & Zeichner, 2005). 
The results from the few published empirical studies that have attempted to quantify the effects of reflective thinking programmes on teacher thought and classroom performance are rather disappointing (Winitzky & Arends, 1991). Chandler et al. (1991) found reflection not to be significantly related to teaching performance. In addition, Wubbels and Korthagen (1990), comparing teachers who had graduated recently and some time before from conventional colleges and colleges implementing reflective teaching programmes, found no differences between the two groups in attitude to reflection and inclination towards innovation. While there is some evidence that the reflective approach in some studies can produce greater ability to verbalise (Stoiber, 1991), there is no clear evidence that this can be carried through to superior practical teaching performance.

Finally, defining what actually constitutes reflective practices is fraught with difficulty (see Hatton & Smith, 1995; Tom, 1985). According to Cornford (2002), the ideals or purposes of reflection in education are as manifold as the term itself: development of self-monitoring teachers, teachers as experimenters, teachers as researchers and teachers as inquirers. The above terms associated with reflective teaching have varied both in terms of their conception of the nature of reflective activity and of the content on which teachers are expected to reflect (Calderhead, 1989). As a result, it is not always clear exactly what teachers are supposed to reflect on when trying to become better teachers. The main critique of the reflective paradigm is therefore that reflective approaches lack a grounded theoretical base on which specific teaching skills could be developed. Taking this into consideration, the present paper argues that EER, and especially the dynamic model of educational effectiveness (Creemers & Kyriakides, 2008), could be used for developing an integrated approach to teacher professional development.
The Dynamic Integrated Approach

The dynamic model of educational effectiveness was developed in order to establish links between EER and improvement practices (Creemers & Kyriakides, 2006). In relation to the teacher level, the dynamic model refers to eight factors that describe teachers' instructional role and are associated with student outcomes: orientation, structuring, questioning, teaching-modelling, applications, management of time, teacher role in making the classroom a learning environment, and classroom assessment. These eight factors do not refer only to one approach of teaching, such as the direct teaching model or the new teaching approach. An integrated approach in defining quality of teaching is adopted.

The dynamic model is also based on the assumption that although there are different effectiveness factors, each factor can be defined and measured using five dimensions: frequency, focus, stage, quality and differentiation. Frequency is a quantitative way to measure the functioning of each effectiveness factor, and studies within the process-product paradigm were only concerned with this dimension. The other four dimensions examine qualitative characteristics of the functioning of the factors and help us describe the complex nature of effective teaching (for further information on the conceptual background of the teacher factors of the dynamic model and the five measurement dimensions see Creemers & Kyriakides, 2008).

Another main assumption of the model is that these factors and their dimensions may be interrelated, and the importance of grouping specific factors for explaining achievement gains has been investigated. In particular, a longitudinal study revealed that the teacher factors of the dynamic model can be grouped into five levels, which are situated in developmental order (Kyriakides, Creemers & Antoniou, 2009). Table 1 demonstrates how the 42 teaching skills emerging from the dynamic model are grouped into these five stages.

Table 1.
The five developmental stages of teaching skills included in the Dynamic Model

STAGES AND TEACHING SKILLS

Stage 1: Basic elements of direct teaching
  - Frequency management time
  - Stage management of time
  - Frequency structuring
  - Frequency application
  - Frequency assessment
  - Frequency questioning
  - Frequency teacher-student relation

Stage 2: Putting aspects of quality in direct teaching and touching on active teaching
  - Stage structuring
  - Quality application
  - Stage questioning
  - Frequency student relations
  - Focus application
  - Stage application
  - Quality of questions

Stage 3: Acquiring quality in active/direct teaching
  - Stage student relations
  - Stage teacher-student relation
  - Stage assessment
  - Frequency teaching modelling
  - Frequency orientation
  - Focus student relations
  - Quality: feedback
  - Focus questioning
  - Focus teacher-student relation
  - Quality structuring
  - Quality assessment

Stage 4: Differentiation of teaching
  - Differentiation structuring
  - Differentiation time management
  - Differentiation questioning
  - Differentiation application
  - Focus assessment
  - Differentiation assessment
  - Stage teaching modelling
  - Stage orientation

Stage 5: Achieving quality and differentiation in teaching using different approaches
  - Quality teacher-student relation
  - Quality student relations
  - Differentiation teacher-student relation
  - Differentiation student relations
  - Focus orientation
  - Quality orientation
  - Differentiation orientation
  - Quality of teaching modelling
  - Focus teaching modelling

Looking at the description of these five levels in terms of the teaching skills situated in each level, one can observe that the first three levels are mainly related to the direct and active teaching approach by moving from the basic requirements concerning quantitative characteristics of teaching routines to the more advanced requirements concerning the appropriate use of these skills as measured by the qualitative characteristics of these factors.
These skills also gradually move from the use of teacher-centred approaches to the active involvement of students in teaching and learning. The last two levels are more demanding since teachers are expected to differentiate their instruction (level 4) and also to demonstrate their ability to use the new teaching approach. Furthermore, taking student outcomes as criteria, teachers who demonstrate competencies in relation to higher levels were found to be more effective than those situated at the lower levels. This association is found for achievement in different subjects and for both cognitive and affective outcomes (Kyriakides, Creemers & Antoniou, 2009).

Specific strategies for improving effectiveness that are more comprehensive in nature may emerge by looking at the grouping of teacher factors of the dynamic model. In this context, Creemers, Kyriakides and Antoniou (in press) develop the DIA to teacher professional development. It is argued that teacher professional development should be focused on how to address specific groupings of teacher factors associated with student learning rather than on an isolated teaching factor or on the whole range of teacher factors (as implied by the Reflective Approach) without considering the professional needs of student teachers and teachers. Each grouping of factors refers to different developmental stages of teacher professional behaviour and the dimensions used to measure their functioning may help us develop programmes assisting teachers to improve their teaching skills by moving from easier to more complicated stages. The dynamic dimension of this approach is attributed to the fact that its content derives from the grouping of teaching skills included in the dynamic model, while at the same time it is differentiated to meet the specific needs and priorities of teachers who were found to be situated in each developmental stage.
Similarly, the integrated dimension of this approach is attributed to the fact that although its content refers to teaching skills that were found to be positively related to student achievement (drawn from EER) the participants are also engaged in systematic critical reflection upon these teaching skills (drawn from their experiences and perceptions).

Methods

A group randomisation study was conducted in order to compare the impact of the HA and the DIA. Information about the participants, the four phases of the study and the research measures is provided below.

Participants

A total of 130 teachers volunteered to participate in the professional development programme. Although the sample was not randomly selected, it was representative of the teacher population of Cyprus in terms of gender (χ² = 0.84, d.f. = 1, p = 0.42) and years of experience (t = 1.21, d.f. = 1835, p = 0.22). Data were also collected from all students (n=2356) of the teacher sample. The student sample was representative of the elementary school student population of Cyprus in terms of gender (χ² = 0.89, d.f. = 1, p = 0.43). Data were collected both at the beginning and at the end of the intervention. Students with missing prior attainment or background data represented less than 7% of the original sample and were therefore excluded from each analysis. In regard to the teacher sample, only seven teachers left the experimental study. These teachers were equally distributed through the two intervention groups and the stage at which they were found to belong.

Phases of the study

The four phases of the experimental study are elaborated below.

Phase 1: Initial evaluation

At the beginning of the 2008-2009 school year, the teaching skills of the participants were evaluated by external observers. Data on student achievement were collected using external written forms of assessment designed to assess knowledge and skills in mathematics as identified in the Cyprus Curriculum (Ministry of Education, 1994).
Teacher questionnaires were administered in order to collect data on teacher background characteristics and measure their perceptions of teaching. In addition, a student questionnaire was administered in order to collect information related to student background characteristics. Observation data were then analysed using the same procedure as described by Kyriakides et al. (2009) in order to classify teachers into developmental stages according to their teaching skills. Using the Rasch and the Saltus models, it was found that teachers could be classified into the same five developmental stages that had emerged from the previous study (see Table 1).

Phase 2: The formation of the two experimental groups

The teachers who were found to be at a certain developmental stage were randomly allocated into two teams of equal size. The first team employed the DIA and the second the HA. For example, the 32 teachers who proved to be at Stage 1 were randomly allocated into two experimental groups, each consisting of 16 teachers.

Phase 3: Establishment of training sessions

In this phase, the teachers of each experimental group had to attend nine sessions, as described below:

i) First Session

The first session was a common/introductory session for all of the teachers of our sample and took place before the initial evaluation (phase 1). In this session the main phases of the professional development programme were analysed. The importance of evaluating the impact of this professional development programme was stressed. It was made clear that provisions had been made to ensure the anonymity and confidentiality of the evaluation results. Finally, training on how to develop an action plan was provided.

ii) Sessions for teachers employing the DIA

At the second session, the teachers employing the DIA were assigned to four groups according to their developmental stage.
Supporting literature and material related to the teaching skills corresponding to their developmental stage were provided and the area on which each group had to concentrate their efforts for improvement was made clear. Finally, each teacher developed his/her action plan by exchanging ideas with the research team and the members of his/her group. After the second session, one session per month was scheduled until the end of the school year. This decision provided the teachers with sufficient time to implement the activities included in their action plans and also to reflect on the effectiveness of these activities in order to revise and improve their action plans. The monthly sessions were organised in groups (based on teachers' stages) and teachers were strongly encouraged to cooperate and share ideas and teaching materials, to exchange and discuss their experiences and generally to share the results of their exploration. Teachers' training was based on "active teaching" and the participating teachers had an opportunity to report teaching practices and comment on them, to identify effective and non-effective teaching practices, and to identify the significance of the effectiveness factors corresponding to their developmental stage and how these factors could be linked with effective teaching. Finally, researchers regularly visited teachers at their schools to discuss emerging issues and to provide them with support and feedback.

iii) Sessions for teachers employing the HA

The primary aim of these sessions was to enable individuals to critically evaluate their own beliefs and practice and help them to transform their experiences from a past event to an ongoing learning process. In the second session, teachers had an opportunity to undertake discussion in groups, identify a problem that they considered important in their teaching and formulate a plan of action to tackle this problem.
After the second session and the development of the teachers' initial action plans we scheduled one session per month until the end of the school year. This decision provided the teachers with sufficient time to implement the activities included in their action plans and to reflect on the effectiveness of these activities. The monthly sessions provided the teachers of each stage with an opportunity to revise and further develop their action plans, based on their own and others' experiences. The participating teachers had an opportunity to report their teaching practices and comment on them, and to identify effective and non-effective teaching practices, attitudes and beliefs. For example, the teachers were asked to reflect on what they perceived to be successes and failures in terms of effective teaching and learning. Then they were encouraged to focus on one critical incident (positive or negative) that occurred in their classrooms and to write down their story of experience. They had to describe the incident in detail (e.g., situation, people involved, feelings and reasoning), what they had learned about teaching as a result, how their perspectives had changed and the changes they had made in how they taught as a result. At each monthly meeting we encouraged the teachers within the same group to cooperate and share ideas and teaching materials, and to exchange and discuss their experiences. Finally, as with the teachers employing the DIA, during that period the research team visited the teachers at their schools to discuss emerging issues related to the implementation of their action plans in their everyday teaching.

Phase 4: Final evaluation

By the end of the school year, the teaching skills, teacher perceptions of teaching and student achievement were measured using the same procedure as in Phase 1 of the study. Then a final meeting with all of the teachers took place in order to get feedback about the programme and present the results of the study.
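
The stage-wise random allocation used in Phase 2 can be illustrated with a short sketch. It is not the procedure the research team actually ran, which the paper does not specify beyond "randomly allocated into two teams of equal size". The function name, the teacher identifiers and the seed are hypothetical; only the Stage 1 count of 32 teachers comes from the text.

```python
import random


def allocate_within_stage(teachers_by_stage, seed=None):
    """Randomly split the teachers of each developmental stage into two groups.

    teachers_by_stage maps a stage number to a list of teacher identifiers.
    If a stage has an odd number of teachers, the HA group receives the
    extra teacher (an arbitrary choice made for this sketch).
    """
    rng = random.Random(seed)
    allocation = {}
    for stage, teachers in teachers_by_stage.items():
        shuffled = list(teachers)      # copy so the input is left untouched
        rng.shuffle(shuffled)          # random order within the stage
        half = len(shuffled) // 2
        allocation[stage] = {"DIA": shuffled[:half], "HA": shuffled[half:]}
    return allocation


# Hypothetical identifiers for the 32 teachers found to be at Stage 1.
stage_1 = [f"T{i:03d}" for i in range(1, 33)]
groups = allocate_within_stage({1: stage_1}, seed=2008)
print(len(groups[1]["DIA"]), len(groups[1]["HA"]))  # 16 16
```

Randomising separately within each stage, rather than across the whole sample, keeps the two groups balanced on initial teaching skills by construction.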
Measures

Student achievement in mathematics

For each year group of students, criterion-referenced tests in mathematics were constructed in order to measure their knowledge and skills in mathematics in relation to the objectives of the national curriculum in Cyprus. The tests for different age groups were equated using IRT modelling in order to make the comparison of the test scores meaningful (see Antoniou, 2009).

Student background factors

Information was collected on two student background factors: sex (0=boys, 1=girls), and socioeconomic status (SES). Five SES variables were available: father's and mother's education level, the social status of the father's job, the social status of the mother's job and the economic situation of the family. Following the classification of occupations used by the Ministry of Finance, it was possible to classify parents' occupations into three groups with relatively similar sizes: occupations held by the working class (32%), occupations held by the middle class (39%) and occupations held by the upper-middle class (29%). Standardised values of the above five variables were calculated, resulting in the SES indicator.

Opportunity to learn

Time spent doing homework and time spent on private tuition were seen as measures of the opportunity to learn factor. Private tuition in Cyprus is common and a high percentage of students attend private lessons. Thus students were asked to report the average amount of time spent on homework and on private tuition in mathematics.

Contextual factors at teacher/classroom level

Variables concerned with the context of each classroom, such as the average score at the beginning of the intervention, the average SES score and the percentage of girls, were taken into account. The contextual factors were aggregated from the student level data. We were also able to collect data about three teacher background variables: gender, position (i.e. teacher or deputy head) and teaching experience.
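
As an illustration of how the student-level measures described above might be turned into an SES indicator and into classroom-level contextual factors, consider the following sketch. The paper does not say how the five standardised variables were combined, so averaging the z-scores is an assumption made here. The field names and the four student records are hypothetical as well.

```python
from statistics import mean, pstdev

# Hypothetical field names for the five SES variables named in the text.
SES_KEYS = ["father_edu", "mother_edu", "father_job", "mother_job", "income"]


def z_scores(values):
    """Standardise a list of numbers to mean 0 and standard deviation 1."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def ses_indicator(rows, keys=SES_KEYS):
    """Average the standardised SES variables per student (assumed rule)."""
    columns = [z_scores([row[k] for row in rows]) for k in keys]
    return [mean(col[i] for col in columns) for i in range(len(rows))]


def classroom_context(rows):
    """Aggregate student-level data to one record per classroom."""
    by_class = {}
    for row in rows:
        by_class.setdefault(row["class_id"], []).append(row)
    return {
        cid: {
            "mean_prior": mean(r["prior_score"] for r in members),
            "mean_ses": mean(r["ses"] for r in members),
            "pct_girls": 100 * mean(r["sex"] for r in members),  # sex: 0=boys, 1=girls
        }
        for cid, members in by_class.items()
    }


# Hypothetical student records, for illustration only.
students = [
    {"class_id": "A", "sex": 0, "prior_score": 0.4, "father_edu": 1,
     "mother_edu": 2, "father_job": 1, "mother_job": 1, "income": 2},
    {"class_id": "A", "sex": 1, "prior_score": -0.2, "father_edu": 3,
     "mother_edu": 3, "father_job": 2, "mother_job": 3, "income": 3},
    {"class_id": "B", "sex": 1, "prior_score": 1.0, "father_edu": 2,
     "mother_edu": 1, "father_job": 3, "mother_job": 2, "income": 1},
    {"class_id": "B", "sex": 1, "prior_score": 0.0, "father_edu": 2,
     "mother_edu": 2, "father_job": 2, "mother_job": 2, "income": 2},
]
for row, ses in zip(students, ses_indicator(students)):
    row["ses"] = ses
context = classroom_context(students)
```

Because every standardised variable has a mean of zero across the sample, the averaged indicator is also centred on zero, which makes the classroom mean SES directly readable as above or below the sample average.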
Teacher background characteristics

Information related to teacher gender (male/female), position (teacher/deputy head) and years of experience was collected. In addition, teachers were asked to indicate their future expectations (to do a postgraduate degree, to be promoted, etc.) and finally to indicate their attitudes towards teaching as a profession on a Likert scale ranging from 1 (most negative) to 7 (most positive).

Teacher perceptions of the characteristics of effective teachers

Teachers were also asked to provide information related to their perceptions of the characteristics of effective teachers. Specifically, the teachers had to indicate on a Likert scale ranging from 1 (least significant) to 5 (most significant) how they perceived the significance of several characteristics, such as being patient, having organisational skills, being able to communicate effectively with children, etc. The reliability of this section was calculated and the value of Cronbach's alpha for each subscale was found to be satisfactory, ranging from 0.75 to 0.84. Then, in order to examine the construct validity of this part of the questionnaire, a first-order Confirmatory Factor Analysis (CFA) model, designed to test the multidimensionality of a theoretical construct (Byrne, 1998), was used. Specifically, the model hypothesised that: (a) the 4 sub-scale scores could be explained by one factor; (b) each sub-scale would have a nonzero loading on this factor; and (c) measurement errors would be uncorrelated. The findings of the first order factor SEM analysis generally affirmed the theory on which this section of the questionnaire was developed. Specifically, the scaled χ² for the one factor structure (χ² = 2.3, df = 2, p = .31) did not reach statistical significance, the RMSEA was .013 and the CFI was .966, all meeting the criteria for an acceptable level of fit.
All parameter estimates were statistically significant (p < .001). Validation of the first-order factor structure related to this variable provided support for the use of a single score concerned with perceptions of the characteristics of effective teachers.

Teacher attitudes towards tasks that teachers have to perform

A Likert scale was used in which teachers had to indicate the degree to which they like performing several tasks by indicating a number from 1 (least significant) to 5 (most significant). For example, teachers were asked to demonstrate their attitudes towards lesson preparation, dealing with discipline problems, assessing students' performance, etc. In order to examine the construct validity of this part of the questionnaire, a first-order CFA model was used. Specifically, the model hypothesised that: (a) the six sub-scale scores could be explained by two factors (i.e., Direct effect on learning and Indirect effect on learning); (b) each item (i.e., sub-scale score) would have a nonzero loading on the factor it was designed to measure and zero loadings on the other factor; (c) the two factors would be uncorrelated; and (d) measurement errors would be uncorrelated. The findings of the first order factor SEM analysis generally affirmed the theory upon which this section of the questionnaire was developed. The scaled χ² for the two factor structure (χ² = 7.78, df = 5, p = .17) was not statistically significant, the RMSEA was .073 and the CFI was .972, all meeting the criteria for an acceptable level of fit. Thus a decision was made to consider the two-factor structure as reasonable and the parameter estimates were calculated.

Quality of teaching

Quality of teaching was measured through classroom observations by independent observers both at the beginning (September 2008) and at the end (May 2009) of the intervention. Two low-inference instruments and one high-inference observation instrument were used.
The instruments were designed to collect data concerning the teacher factors of the dynamic model, and their construct validity had already been tested using Structural Equation Modelling approaches (see Kyriakides & Creemers, 2008). Observations were carried out by three members of the research team, all of whom had attended a series of seminars on how to use the three instruments. During the 2008-2009 school year, the external observers visited each class four times. For each scale of the instruments the alpha reliability coefficient was higher than 0.83. Since 26% of the lessons were observed by pairs of observers, the inter-rater reliability coefficient (ρ²) was estimated and was found to be higher than 0.81.

Implementation effort

Since one of the main threats to the internal validity of experimental studies has to do with the extent to which all of the groups put the same effort into implementing the intervention, different sources of data were used to measure this variable. Specifically, we conducted content analysis of the reflective diaries that each teacher kept in order to identify the extent to which the members of each group put effort into implementing their action plans in their teaching. Moreover, the constant comparative method was used to analyse data emerging from interviews with each teacher participating in this study. These interviews were concerned with the experiences, the attitudes and the amount of time each teacher devoted to the implementation of the intervention. The analysis of the qualitative data from each source of data helped us generate ordinal data measuring the extent to which teachers of each experimental group put effort into implementing their improvement strategies and action plans. The Kolmogorov-Smirnov two sample test did not reveal statistically significant differences between the members of the two experimental groups in terms of their implementation effort (K-S Z = 1.01, p = 0.36).
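
The alpha reliability coefficients reported for the questionnaire subscales and the observation scales are instances of Cronbach's alpha. For readers unfamiliar with the statistic, a minimal sketch of the standard formula follows. The response matrix is hypothetical and is not data from the study, and the function name is chosen for illustration.

```python
from statistics import pvariance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of total score)
    """
    k = len(item_scores[0])
    item_vars = [pvariance([resp[j] for resp in item_scores]) for j in range(k)]
    total_var = pvariance([sum(resp) for resp in item_scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical responses: five teachers rating three items on a 1-5 scale.
responses = [
    [4, 5, 4],
    [3, 3, 2],
    [5, 5, 4],
    [2, 3, 3],
    [4, 4, 5],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 2))
```

Population variances are used for both the items and the total score. Using sample variances throughout would give the same alpha, because the correction factor cancels in the ratio.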
Results

Impact on teaching skills

The observational data of each period were analysed separately following the procedure described by Kyriakides et al. (2009). Specifically, the Rasch model was used in order to identify the extent to which the five dimensions of the eight teacher factors (i.e., the 44 first order factor scores) could be reducible to a common unidimensional scale. The Rasch model not only tests the unidimensionality of the scale but can also determine whether the tasks can be ordered according to the degree of their difficulty and whether at the same time the people who carry out these tasks can be ordered according to their performance in the construct under investigation. When the Rasch model was applied to the data of the baseline measure it was found that all of the teaching skills included in the dynamic model were well targeted against the persons' measures, since Rasch person estimates ranged from -3.06 to 3.12 logits and the estimates of the difficulties of teaching skills ranged from -2.93 to 3.16 logits. Moreover, the reliability of persons (i.e., teachers) and items (i.e., teaching skills) is calculated through the Rasch analysis, indicating how well the scale discriminates among teachers based on their estimated teaching skills and how well the teaching skills can be discriminated from one another on the basis of their difficulty. It was found that the separability of each scale is satisfactory (i.e., higher than 0.93). This implies that the reliability of the scale is very high and furthermore indicates that five levels could be discerned (Bond & Fox, 2001). Finally, the fitting of the Rasch model to the data was tested against alternative item response theory models and was found to be statistically preferable.

Having established the reliability of the scale, it was investigated whether teaching skills could be grouped into the five stages described in the previous section.
The procedure for detecting pattern clustering developed by Marcoulides and Drezner (1999) was used. This procedure enables us to segment the observed measurements into constituent groups (or clusters) so that the members of any one group are similar to one another, according to a selected criterion that stands for difficulty. Applying this method to segment the teaching skills on the basis of the difficulties that emerged from the Rasch model showed that they are optimally clustered into the five clusters proposed by previous research findings. The cumulative D for the five-cluster solution was 58%, whereas the sixth gap adds only 4%.

The above procedure was also employed to analyse data that emerged from the final measurement of teaching skills. The Rasch model revealed that there was no person who did not fit the model, and that all of the teaching skills were well targeted against the persons' measures since persons' scores range from -2.99 to 3.24 logits. It was also found that the difficulties of the teaching skills could be considered invariant across the two measurement periods within the measurement error (i.e., 0.10 logits). Furthermore, the indices of persons and of teaching skills separation were found to be higher than 0.94, indicating that the separability of each scale is satisfactory. Applying the clustering method mentioned above, it was found that the teaching skills could again be optimally clustered into five clusters.

By comparing the classification of teachers into different stages at the beginning and at the end of the intervention, it was found that none of the teachers of the group employing the HA managed to move from one stage to another. On the other hand, 21 of the 65 teachers employing the DIA managed to move to the next stage, whereas the other teachers remained at the same stage.
Specifically, 8 teachers of this group moved from stage one to stage two, 8 teachers of stage two managed to move to stage three and 5 teachers of stage three were found to be situated at stage four at the end of the intervention.

In order to measure the impact of the two professional development programmes on teaching skills we also compared the Rasch person estimates. This comparison reveals that the final score of teachers employing the DIA (Mean=0.36, SD=1.05) was higher than their initial score (Mean=-0.28, SD=1.01), and that this difference was statistically significant (t=4.14, df=64, p<.001). On the other hand, the final score of teachers employing the HA (Mean=-0.25, SD=1.04) was practically the same as their initial score (Mean=-0.26, SD=1.05), and the t-test for paired samples did not reveal any statistically significant progress (t=0.87, df=64, p=0.38).

Impact on teacher perceptions and attitudes

At the first stage of the analysis, an independent-samples t-test was employed to identify any statistically significant difference between the teachers of the two experimental groups both at the beginning and at the end of the interventions. No statistically significant differences could be identified between the teachers of the two experimental groups at the beginning of the interventions. Similarly, the independent-samples t-test was employed to identify statistically significant differences between the teachers of the two experimental groups at the end of the interventions. Again, no statistically significant differences could be identified. Information about the perceptions of each group before and after the innovation is presented in Appendix 1. Finally, the paired-samples t-test revealed that no statistically significant changes in perceptions could be identified either for the teachers who employed the DIA or for those who employed the HA.
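For readers less familiar with the paired-samples t-test used for the before/after comparisons above, the following minimal sketch shows how the statistic is formed from the within-teacher differences. The scores are hypothetical and are not the study's data; the function name is illustrative.

```python
import math
from statistics import mean, stdev

def paired_t(before, after):
    """Return (t, df) for a paired-samples t-test on matched scores."""
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    standard_error = stdev(diffs) / math.sqrt(n)  # SE of the mean difference
    return mean(diffs) / standard_error, n - 1

# Hypothetical scores in logits for six teachers; NOT the study's data.
before = [-0.5, -0.2, 0.1, -0.8, 0.3, -0.4]
after = [0.1, 0.2, 0.4, -0.1, 0.5, 0.3]

t, df = paired_t(before, after)
print(f"t = {t:.2f}, df = {df}")
```

Because each teacher serves as his or her own control, the test depends on the spread of the individual differences rather than on the spread of the raw scores in each period.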
Impact on student achievement

The results of the multilevel analysis conducted in order to measure the impact of each of the two approaches to teacher professional development on student achievement are presented in this part. Empty models with all possible combinations of the levels of analysis (i.e., student, teacher and school) were established and the likelihood statistics of each model were compared (Snijders & Bosker, 1999). An empty model consisting of student, teacher and school level represented the best solution. The empty model revealed that 72.3% of the total variance was situated at the student level, 18.5% of the variance was at the classroom level and 10.2% was at the school level. In subsequent steps explanatory variables at different levels were added, starting at the student level. Explanatory variables, except grouping variables, were centred as Z-scores with a mean of 0 and a standard deviation of 1. Grouping variables were entered as dummies with one of the groups as baseline (e.g., girls=0). The models presented in Table 2 were estimated without the variables that did not have a statistically significant effect at level .05.

Table 2.
Parameter estimates (and standard errors) for the analysis of student achievement in mathematics (students within classes, within schools)

| Factors | Model 0 | Model 1 | Model 2 | Model 3 | Model 4 | Model 5 | Model 6 |
|---|---|---|---|---|---|---|---|
| Fixed part (Intercept) | 5.19 (0.80) | 4.10 (0.78) | 3.80 (0.80) | 3.70 (0.90) | 2.90 (0.80) | 2.10 (0.80) | 1.90 (0.70) |
| Student Level | | | | | | | |
| Context | | | | | | | |
| Prior achievement in maths | | 0.80 (.12) | 0.79 (.12) | 0.81 (.12) | 0.80 (.11) | 0.80 (.12) | 0.80 (.11) |
| Grade 3 | | -1.20 (.40) | -1.09 (.40) | -1.08 (.40) | -1.10 (.40) | -1.07 (.40) | -1.07 (.40) |
| Grade 4 | | -0.72 (.30) | -0.66 (.30) | -0.62 (.30) | -0.63 (.30) | -0.62 (.30) | -0.62 (.29) |
| Grade 6 | | 0.65 (.30) | 0.64 (.30) | 0.64 (.30) | 0.65 (.30) | 0.66 (.30) | 0.64 (.30) |
| Sex (0=girls, 1=boys) | | 0.10 (.04) | 0.10 (.04) | 0.11 (.04) | 0.10 (.04) | 0.09 (.04) | 0.10 (.04) |
| SES | | 0.40 (.14) | 0.41 (.14) | 0.40 (.14) | 0.41 (.14) | 0.40 (.14) | 0.40 (.13) |
| Cultural Capital | | 0.19 (.08) | 0.19 (.09) | 0.20 (.08) | 0.18 (.08) | 0.18 (.08) | 0.18 (.08) |
| Opportunity to learn | | | | | | | |
| Homework | | | 0.12 (.04) | 0.12 (.04) | 0.12 (.04) | 0.12 (.04) | 0.12 (.04) |
| Private tuition (0=no, 1=yes) | | | N.S.S. | N.S.S. | N.S.S. | N.S.S. | N.S.S. |
| Classroom Level | | | | | | | |
| Context | | | | | | | |
| Average achievement in maths | | 0.40 (.10) | 0.40 (.10) | 0.40 (.10) | 0.40 (.10) | 0.40 (.10) | 0.40 (.10) |
| Average SES | | N.S.S. | N.S.S. | N.S.S. | N.S.S. | N.S.S. | N.S.S. |
| Average cultural capital | | N.S.S. | N.S.S. | N.S.S. | N.S.S. | N.S.S. | N.S.S. |
| Percentage of girls | | N.S.S. | N.S.S. | N.S.S. | N.S.S. | N.S.S. | N.S.S. |
| Teacher background | | | | | | | |
| Gender (0=male, 1=female) | | | | N.S.S. | N.S.S. | N.S.S. | N.S.S. |
| Years of experience | | | | 0.08 (.03) | N.S.S. | N.S.S. | N.S.S. |
| Position | | | | N.S.S. | N.S.S. | N.S.S. | N.S.S. |
| Teacher expectations | | | | | | | |
| Plans for postgraduate degree | | | | N.S.S. | N.S.S. | N.S.S. | N.S.S. |
| Plans for promotion to head | | | | N.S.S. | N.S.S. | N.S.S. | N.S.S. |
| Attitudes towards teaching as a profession | | | | N.S.S. | N.S.S. | N.S.S. | N.S.S. |
| Perceptions of characteristics of effective teachers | | | | | | | |
| A) Importance of knowledge | | | | N.S.S. | N.S.S. | N.S.S. | N.S.S. |
| B) Classroom management | | | | N.S.S. | N.S.S. | N.S.S. | N.S.S. |
| C) Personal traits | | | | N.S.S. | N.S.S. | N.S.S. | N.S.S. |
| D) Communication skills | | | | N.S.S. | N.S.S. | N.S.S. | N.S.S. |
| Attitudes towards tasks that teachers have to undertake | | | | | | | |
| A) Lesson preparation | | | | N.S.S. | N.S.S. | N.S.S. | N.S.S. |
| B) Teaching | | | | N.S.S. | N.S.S. | N.S.S. | N.S.S. |
| C) Assessment | | | | N.S.S. | N.S.S. | N.S.S. | N.S.S. |
| D) Homework assignment | | | | N.S.S. | N.S.S. | N.S.S. | N.S.S. |
| E) Record keeping and reporting to parents | | | | N.S.S. | N.S.S. | N.S.S. | N.S.S. |
| F) Administrative work | | | | -0.06 (.02) | -0.05 (.02) | -0.06 (.02) | -0.06 (.02) |
| Attitudes towards professional development | | | | N.S.S. | N.S.S. | N.S.S. | N.S.S. |
| Quality of teaching | | | | | | | |
| Level 1 | | | | | -0.52 (.09) | -0.51 (.09) | -0.52 (.09) |
| Level 2 | | | | | -0.24 (.09) | -0.25 (.09) | -0.25 (.09) |
| Level 4 | | | | | 0.32 (.10) | 0.32 (.10) | 0.31 (.10) |
| Experimental group (0=only reflection, 1=competence based) | | | | | | 0.24 (.08) | 0.23 (.08) |
| Teachers who managed to move to the next stage (0=no movement was observed, 1=move to the next) | | | | | | | 0.09 (.03) |
| School Level | | | | | | | |
| Context | | | | | | | |
| Average achievement in maths | | 0.09 (.04) | 0.10 (.04) | 0.08 (.04) | 0.10 (.04) | 0.09 (.04) | 0.09 (.03) |
| Average SES | | N.S.S. | N.S.S. | N.S.S. | N.S.S. | N.S.S. | N.S.S. |
| Average cultural capital | | N.S.S. | N.S.S. | N.S.S. | N.S.S. | N.S.S. | N.S.S. |
| Percentage of girls | | N.S.S. | N.S.S. | N.S.S. | N.S.S. | N.S.S. | N.S.S. |
| Variance components | | | | | | | |
| School | 10.2% | 10.0% | 9.8% | 9.5% | 9.1% | 8.5% | 8.4% |
| Class | 18.5% | 17.6% | 17.2% | 16.0% | 11.0% | 9.0% | 8.6% |
| Student | 72.3% | 49.0% | 45.0% | 44.3% | 44.1% | 44.0% | 44.0% |
| Explained | | 23.4% | 28.0% | 30.2% | 35.8% | 38.5% | 39.0% |
| Significance test | | | | | | | |
| χ² | 1213.4 | 687.3 | 650.1 | 590.1 | 520.0 | 480.5 | 460.1 |
| Reduction | | 526.1 | 37.2 | 60.0 | 70.1 | 39.5 | 20.4 |
| Degrees of freedom | | 9 | 1 | 2 | 2 | 1 | 1 |
| p-value | | .001 | .001 | .001 | .001 | .001 | .001 |

N.S.S. = No statistically significant effect at level .05.

The following observations arise from this table. In model 1 the variables related to the student context were added to the empty model (model 0). This model explained 23.4% of the variance, most of which was attributed at the student level. The χ² test revealed a significant change between the baseline model and model 1 (p<0.001).
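The model comparisons reported in Table 2 can be checked with a short likelihood-ratio calculation: the reduction in the χ² (deviance) statistic between two nested models is referred to a chi-square distribution with degrees of freedom equal to the number of added parameters. The sketch below is a minimal illustration using only the Python standard library; the helper `chi2_sf` is a hypothetical name introduced here, not part of the authors' analysis, and the input values are the "Reduction" and "Degrees of freedom" rows of Table 2.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    # Closed-form starting point: df = 2 (even case) or df = 1 (odd case).
    if df % 2 == 0:
        a, q = 1.0, math.exp(-half)
    else:
        a, q = 0.5, math.erfc(math.sqrt(half))
    # Recurrence for the regularized upper incomplete gamma function:
    # Q(a + 1, h) = Q(a, h) + h**a * exp(-h) / Gamma(a + 1)
    while a < df / 2.0:
        q += math.exp(a * math.log(half) - half - math.lgamma(a + 1.0))
        a += 1.0
    return q

# (deviance reduction, added degrees of freedom) for models 1 to 6, from Table 2.
steps = {1: (526.1, 9), 2: (37.2, 1), 3: (60.0, 2),
         4: (70.1, 2), 5: (39.5, 1), 6: (20.4, 1)}

p_values = {model: chi2_sf(reduction, df) for model, (reduction, df) in steps.items()}
for model, p in p_values.items():
    print(f"model {model}: p = {p:.2e}")
```

Under this reading of the table, every step yields a p-value below .001, which agrees with the p-value row.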
Second, all student context variables (i.e., prior achievement in maths, gender, SES, Cultural capital) had statistically significant effects on student achievement. Boys were found to have better results than the girls. Nevertheless, prior knowledge had the strongest effect in predicting student achievement at the end of the school year. In addition, prior achievement is the only contextual variable that had a consistent effect on achievement when aggregated either at the classroom or the school level.

In model 2, the explanatory variables of the student level related to the opportunity to learn were added to model 1. The amount of time students spent on doing their homework had a statistically significant effect on student achievement.

In the third model, all variables related to teacher background factors and teacher perceptions and attitudes were added to model 2. "Teacher years of experience" has a statistically significant effect on student achievement, whereas "teacher positive attitudes towards dealing with administrative work" has a negative effect on student outcomes. This model explained 30.2% of the variance and the χ² test revealed a significant change between model 2 and model 3 (p<0.001).

In the next model (i.e., model 4), the variable related to the quality of teaching was added to model 3. Quality of teaching was measured through classroom observations and teachers were assigned to four developmental stages according to their teaching skills. In order to measure the effect of each developmental stage on student outcomes, teachers at stage 3 were treated as a reference group (i.e., stage 3 = 0) and three dummy variables were entered in model 4. The results revealed that the developmental stage at which a teacher is situated has a considerably large and statistically significant effect on student achievement.
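The dummy coding described above, with stage 3 as the reference category, can be sketched as follows. The coefficients are the Model 4 estimates for the quality of teaching reported in Table 2; the function and variable names are illustrative assumptions, and the contrasts are expressed in the units of the achievement measure.

```python
# Stages entered as dummy variables; stage 3 is the omitted reference category.
DUMMY_STAGES = (1, 2, 4)

# Model 4 estimates for the quality-of-teaching dummies (Table 2).
MODEL4_COEFFICIENTS = {"level_1": -0.52, "level_2": -0.24, "level_4": 0.32}

def stage_dummies(stage):
    """Dummy-code a teacher's developmental stage (all zeros for stage 3)."""
    return {f"level_{s}": int(stage == s) for s in DUMMY_STAGES}

def stage_effect(stage):
    """Expected achievement difference relative to a stage 3 teacher."""
    dummies = stage_dummies(stage)
    return sum(MODEL4_COEFFICIENTS[name] * value for name, value in dummies.items())

for stage in (1, 2, 3, 4):
    print(f"stage {stage}: {stage_effect(stage):+.2f} relative to stage 3")
```

Read this way, the estimates order the four stages from lowest to highest expected achievement, and the implied contrast between stage 4 and stage 1 is the difference between the two coefficients.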
Specifically, we can observe that the students of teachers at stage 1 have the lowest achievement, whereas students of teachers at level 4 have higher achievement than students of the first three levels. This finding provides support to the developmental nature of the four stages, since students of teachers who were found to belong to higher levels performed better than students of teachers at lower levels. Finally, we can observe that model 4 explained 35.8% of the variance while the χ² test revealed a significant change between model 3 and model 4 (p