
Assessment can be a powerful force for supporting learning and a mechanism for individual empowerment (Broadfoot & Black, 2004). Formative assessment in particular has been prevalent in the educational discourse over the past decades, shifting the attention towards assessment practices that aid the learning and teaching process (e.g., Brookhart, 2011; Earl, 2003). This, in addition to the recognition of assessment as a key lever for promoting effective education, has led to classroom assessment being a centrepiece of various educational improvement efforts. The impact of formative assessment on student achievement has been widely documented (Black & Wiliam, 1998; Hattie & Timperley, 2007; Wiliam, Lee, Harrison, & Black, 2004), leading to the recognition of formative assessment as a determining factor of educational effectiveness at both the classroom and the school level (Teddlie & Reynolds, 2000). In addition, studies investigating teachers’ perceptions of assessment suggest that they are in favour of formative assessment, recognising its role in supporting teaching and learning (Brown, 2004; Kyriakides, 1997; Sach, 2012).

In line with international research, a series of effectiveness studies conducted in the context of Cyprus provided empirical support for the impact of formative assessment on student learning outcomes (e.g., Kyriakides, 2005; Kyriakides, Campbell, & Gagatsis, 2000; Kyriakides & Creemers, 2008). These studies have demonstrated that primary school teachers who conduct assessment for formative reasons are more effective in terms of promoting student learning outcomes (both cognitive and affective outcomes were taken into account) than those who conduct assessment for summative reasons (Kyriakides, 2005). In addition, it has been found that schools with an established policy on formative assessment are more effective than schools with no policy on assessment (Kyriakides & Creemers, 2008). In this way, formative assessment at the classroom level and school policy on assessment have been identified as factors associated with student achievement gains. However, despite research findings suggesting that Cypriot teachers hold positive attitudes towards formative assessment (Kyriakides, 1997), only a limited number of teachers actually implement such practices in their teaching (Creemers, Kyriakides, & Antoniou, 2013). This finding is in line with international research suggesting that classroom assessment practice still appears to be outcome-oriented (Earl & Katz, 2000; Herman, Osmundson, Ayala, Schneider, & Timms, 2006; Lock & Munby, 2000). In this context, a large body of research has emerged on teacher education and professional development with particular reference to assessment (e.g., Black, Harrison, Lee, Marshall, & Wiliam, 2002; Borko, Wolf, Simone, & Uchiyama, 2003; Hayward, Priestley, & Young, 2004; Marshall & Drummond, 2006; Poskitt & Taylor, 2007; Torrance & Pryor, 2001; Webb & Jones, 2009).

Taking the above into consideration, this paper supports the view that teachers’ skills in each aspect of the assessment process should be evaluated in order to develop suitable professional development programmes that address teachers’ professional needs and priorities for improvement in their assessment practice. The difficulties in effective implementation of assessment need to be identified and tackled by researchers and policy-makers if teacher assessment is to fulfil its promise (Baird, 2010). Specifically, this paper emphasises the need for the development and validation of an instrument measuring teacher assessment skills. This instrument must be in line with current conceptions of effective teaching and assessment and must also enable the identification of teachers’ specific needs in order for appropriate corrective actions to take place. In particular, this study focused on teachers’ skills in assessing students in mathematics, recognising the need for assessments that are aligned with and able to support current conceptualisations of effective mathematics instruction (Suurtamm, Koch, & Arden, 2010). Although the framework that was developed to measure assessment skills is not subject-specific, the study focused on a single subject because the impact of assessment skills on student learning outcomes in mathematics was examined.

Drawing on research on classroom assessment and teacher developmental theory (Berliner, 1994; Dall’Alba & Sandberg, 2006), this study had three main aims. Firstly, a framework measuring teachers’ skills in assessment was proposed and a teacher questionnaire based on this framework was developed. Using the Rasch model, the construct validity of the questionnaire was investigated. Secondly, the study examined whether teachers’ skills in assessment can be situated on a common scale and whether these skills can be classified into developmental stages. Thirdly, classifying teacher skills into levels of difficulty has important implications for teacher professional development, especially if this classification can be related to student achievement, since training programmes could be developed to address teacher needs and priorities for improvement in each stage. Therefore, this study also investigated whether teachers found to be situated at higher stages of assessment skills are more effective in promoting student learning outcomes in mathematics.

A framework for investigating teachers’ skills in assessment

Previous attempts to define what teachers should know and be able to do in relation to assessment have not addressed assessment skills in a systematic way (Brookhart, 2011). Nevertheless, researchers have long recognised assessment skills as a crucial element of effective teaching practice (Gullickson, 1986; Schafer, 1991). As a result, various lists outlining basic assessment competencies have been developed (e.g., American Federation of Teachers, National Council on Measurement in Education & National Education Association [AFT/NCME/NEA], 1990; Schafer, 1991; Stiggins, 2009). These lists describe assessment competencies in relation to general standards of assessment practice without providing details of the specific skills involved. In addition, these lists are not linked to a specific theoretical background and empirical evidence supporting their validity has not been provided to any significant extent (Brookhart, 2011).

Having recognised the need for a comprehensive framework based on which skills associated with classroom assessment can be defined and measured, a framework of teacher assessment skills was proposed. The proposed framework takes into account the dynamic nature of assessment and thus skills associated with each phase of the assessment process were examined. In addition, assessment skills were defined and measured in relation to teachers’ ability to use various assessment techniques in measuring different types of learning outcomes. Traditional as well as alternative assessment techniques were taken into consideration, since the literature supports the use of a combination of assessment techniques to assess student learning (Shepard, 2000; Suurtamm et al., 2010). Moreover, a measurement framework developed within the field of Educational Effectiveness Research (EER) was adopted and both quantitative and qualitative characteristics of the assessment process were taken into account. Finally, teachers’ skills in using assessment results not only for summative but also for formative purposes were taken into consideration. Each aspect of the framework is briefly described below.

Main phases of the assessment process

Classroom assessment is frequently presented in the literature as a cycle subdivided into a number of phases (e.g., Birenbaum, 2007; Bright & Joyner, 1998; Calfee & Masuda, 1997), the most common of which are planning, gathering and interpreting evidence, and using the results. In addition, other important and distinctive aspects of the process are discussed in the literature, such as the construction of assessment tools (Brookhart, 1997; De Lange, 1993), assessment administration (Anderson, 2003; Shepard, 2007), recording of assessment information (Goldhaber & Smith, 2002; Kroeger & Cardy, 2006; Schmoker, 2006) and communicating assessment results (Anderson, 2003; Stiggins, 2004). In order to measure teachers’ assessment skills, this study took into account four phases of the assessment cycle (see Fig. 1). Even though the main phases of the assessment process were considered as one of the three aspects on the basis of which the framework was developed, this does not imply a view of assessment as a step-by-step model that is ‘done’ by the teacher. On the contrary, the framework was based on current thinking in assessment that views it as an ongoing, iterative, dynamic process that engages both teacher and learner (Shepard, 2000; Gardner, Harlen, Hayward, Stobart, & Montgomery, 2010; Wiliam et al., 2004). The literature also highlights the dynamic relationship between the various phases of the assessment process (Birenbaum, 2007; Black & Wiliam, 2009). Without neglecting the sequential character of the four phases involved in the process of the design and implementation of assessment, this study considered all phases as interrelated and interchangeable. The division of the assessment process into particular phases was done to make sure that each aspect of assessment practice was taken into account in measuring teacher skills. Specifically, these phases were based on the assumption that effective teachers should make sure that:

a. appropriate assessment instruments are used to collect valid and reliable data

b. appropriate procedures in administering these instruments are followed

c. data emerging from assessment are analysed and recorded in an efficient way and without losing important information

d. assessment results are reported to parents and students in order to help them take decisions on how to promote student learning outcomes.

Planning and construction of assessment tools

This phase included skills referring to the planning and design of assessment as well as the construction of the assessment tools, as these are recognised in the literature. Therefore, the skills included cover decisions concerning the purpose that each assessment mechanism aims to serve (Brookhart, 2003; Gipps, 1994; Pellegrino, Chudowsky, & Glaser, 2001; Torrance & Pryor, 1998), the definition of learning goals against which a student will be assessed (Herman et al., 2006; Sadler, 1989) as well as the selection and/or the development of quality assessment tools by means of which the purpose and goals of the assessment will be achieved (Green & Mantz, 2002; Shepard, 2000).

Administration of assessment instruments

The second phase included skills associated with the implementation of assessment. The skills included refer to decisions concerning the timing of an assessment, assessment’s link to instruction, and the teachers’ role during assessment administration (Anderson, 2003; Black & Wiliam, 1998; Shepard, 2007).

Recording and analysing data

This phase refers to skills associated with the documentation of assessment results (Goldhaber & Smith, 2002; Kroeger & Cardy, 2006; Schmoker, 2006) and eliciting information (Duschl & Gitomer, 1997; Schafer, 1991; Schmoker, 2006) as well as how this information is used (Stiggins & DuFour, 2009).

Reporting results to students and parents

The last phase refers to skills related to the communication of assessment results to intended users. Therefore the skills included in this phase refer to decisions concerning the purpose of reporting (Guskey & Bailey, 2001; Harlen & James, 1997), the audience to whom results are reported (Stiggins, 2004) and the instruments used to report data (Guskey & Bailey, 2001) as well as the quality of teacher communication with parents and students (Stiggins, 2004).

Assessment techniques

Assessment techniques play an important role in ensuring the quality and effectiveness of assessment since they usually have an influence on how and what students learn. Choosing an assessment technique depends on the target to be assessed since student achievement in relation to certain targets can be more appropriately measured by using specific techniques (Stiggins, 1992). For example, valid assessment of students’ skills in oral communication requires the use of different oral assessment techniques rather than the use of written tests. In addition, the use of a variety of techniques allows students to demonstrate different types of learning. This holds true especially in the case of mathematics since current views of effective mathematical instruction place emphasis on the complexity of mathematics (Boaler, 2008) and require teachers to be able to use a variety of techniques to assess students’ conceptual understanding as well as their problem-solving and reasoning abilities (Suurtamm et al., 2010). Thus, given the development of alternative assessment methods as well as the reconceptualisation of existing traditional methods (Green & Mantz, 2002; Shepard, 2000), it was necessary to examine assessment skills in relation to the four most common types of assessment techniques: (a) written assessment, (b) oral assessment, (c) observation and (d) performance assessment. For example, with respect to written assessment, it was examined whether different types of questions (e.g., direct written questions, multiple choice questions, matching questions, extended written questions) were included in the written tests developed by each teacher. With regard to observation, the frequency of the use of formal and/or informal oral assessment to measure student achievement in mathematics was investigated.

Measurement dimensions

The dimensions used to measure teacher skills in assessment draw on methodological and theoretical developments in the area of educational effectiveness. Early studies in the field of educational effectiveness have demonstrated that quantitative characteristics of teacher assessment are associated with student achievement (see Scheerens & Bosker, 1997; Teddlie & Reynolds, 2000). However, recent studies have shown that qualitative characteristics of teacher assessment should also be taken into account (e.g., Heck & Moriyama, 2010; Kyriakides, 2005). In this context, the dynamic model of educational effectiveness was developed and a measurement framework using both quantitative and qualitative characteristics of effectiveness factors was proposed (Creemers & Kyriakides, 2008). It is important to stress that teacher assessment is included in the dynamic model as an effectiveness factor at the teacher level. Given that the dynamic model received empirical support from studies conducted in Cyprus (e.g., Creemers & Kyriakides, 2010; Kyriakides & Creemers, 2008, 2009; Kyriakides, Creemers, & Antoniou, 2009) and internationally (e.g., Panayiotou et al., 2013; Kyriakides, Archambault, & Janosz, 2013) as well as from empirical and theoretical reviews (see Heck & Moriyama, 2010; Hofman, Hofman, & Gray, 2010; Sammons, 2009; Scheerens, 2013), it was considered relevant to take into account the framework proposed by this model in measuring assessment skills.

The following five dimensions used in the dynamic model to measure the functioning of each characteristic of effective teachers were used: (a) frequency, (b) focus, (c) stage, (d) quality and (e) differentiation. These dimensions help us describe more fully the functioning of each characteristic of effective teachers. Specifically, frequency is a quantitative way to measure the functioning of each effectiveness characteristic, whereas the other four dimensions examine qualitative aspects of the characteristics. The dimensions are not only important from a measurement perspective, but also, and even more so, from a theoretical point of view. Actions of teachers associated with each characteristic can be understood from different perspectives, and not only by placing emphasis on the number of cases or the duration of the actions which occur in teaching (Creemers et al., 2013). In addition, the use of these measurement dimensions may help us develop strategies for improving teaching and assessment since the feedback given to teachers could refer not only to quantitative, but also to qualitative characteristics of their teaching and assessment practice. A brief description of the five dimensions is given below. The importance of taking each dimension into account is also illustrated below by explaining how the assessment factor included in the dynamic model is measured.

Frequency is measured by taking into account the number of assessment tasks that teachers administer to their students as well as how often assessment takes place. This measurement dimension helps us identify the importance attached to assessment by the teacher. The remaining four dimensions examine qualitative characteristics of classroom assessment. Specifically, focus is measured by looking at the ability of a teacher to use different ways of measuring student skills rather than using only one technique (Rao, Collins, & DiCarlo, 2002). It is also important to examine whether the teacher uses the information that she/he collects for more than one purpose (e.g., identifying needs of students, conducting self-evaluation, adapting his/her long-term planning, using evaluation tasks as a starting point for teaching) (Black & Wiliam, 1998). The stage dimension is measured by investigating the time at which the assessment tasks take place (e.g., at the beginning, during and at the end of a lesson/unit of lessons) and the time lapse between collecting information, recording results, reporting results to students and parents, and interpreting and using them for planning lessons. Quality is measured by looking at the properties of the evaluation instruments used by the teacher, such as the different forms of validity, the internal and external reliability, the practicality and the extent to which the instruments cover the teaching content (Cronbach, 1990). The type of feedback that the teacher gives to his/her students and the way students utilise teacher feedback are also examined. Finally, differentiation is examined in relation to the extent to which teachers use different techniques for measuring student needs and/or different ways to provide feedback to different groups of students by taking into account their background and personal characteristics.

Fig. 2 shows the theoretical framework that was used in measuring teacher assessment skills. Specifically, each of the four assessment phases was defined on the basis of assessment knowledge and skills involved across the five dimensions of the dynamic model and in relation to the four most common assessment techniques.

Methodology

By taking into account the theoretical framework, a teacher questionnaire was developed and administered to a representative sample of 10 per cent of Cypriot primary teachers at the beginning of the school year 2010–2011 (the questionnaire is available upon request from the first author). Of the 240 teachers approached, 178 responded, a response rate of 74.2 per cent. The teacher sample was found to be representative of the teacher population of Cyprus in terms of gender (χ² = 0.81, d.f. = 1, p = 0.42) and years of experience (t = 1.21, d.f. = 2578, p = 0.22).

The questionnaire consisted of 87 items, designed to measure teachers’ assessment skills in mathematics across the three aspects of the framework presented in Fig. 2 (i.e., phases of assessment, techniques of assessment, measurement dimensions). The questionnaire consisted of five parts, and a five-point Likert scale was used to measure teachers’ perceptions of their skills in mathematics. In the first part, teachers were asked to provide information relating to their background characteristics (i.e., gender, position, and years of experience). In the next four parts, teachers were asked to indicate the extent to which they behave in a certain way during mathematics teaching in their classroom. Each part addressed a different assessment technique. The final part specifically addressed the recording of data and the reporting of results. Each assessment technique was examined in relation to the four aspects of the assessment process (construction, administration, recording and reporting), and for each aspect of the assessment process each of the five dimensions (frequency, focus, stage, quality and differentiation) was also measured. For example, when examining teachers’ skills in oral assessment, an item concerned with the stage dimension in relation to the aspect of reporting asked teachers to indicate the period of reporting oral assessment results to students. Similarly, in an item examining the quality dimension of the construction of written assessment, teachers were asked to indicate whether they included process questions in their written tests, whereas another item asked whether a specification table was established in order to develop a written test.
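The item design described above can be pictured as a three-way grid of techniques, phases and dimensions. The following is a minimal sketch of that grid, assuming the labels given in the text; the names `TECHNIQUES`, `PHASES`, `DIMENSIONS` and `framework_cells` are hypothetical, and the source does not state how the 87 items are distributed over the cells, so no item-to-cell mapping is attempted.

```python
from itertools import product

# Labels are taken from the framework description; the grid itself is an
# illustrative assumption, not the authors' actual item blueprint.
TECHNIQUES = ("written", "oral", "observation", "performance")
PHASES = ("construction", "administration", "recording", "reporting")
DIMENSIONS = ("frequency", "focus", "stage", "quality", "differentiation")


def framework_cells():
    """Return every (technique, phase, dimension) combination."""
    return list(product(TECHNIQUES, PHASES, DIMENSIONS))


cells = framework_cells()
print(len(cells))  # 4 techniques * 4 phases * 5 dimensions
print(cells[0])
```

Under this reading, the oral-assessment item on the period of reporting results would sit in the cell `("oral", "reporting", "stage")`.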

Since the proposed framework refers to all four phases of assessment, it was not practically possible to measure teachers’ skills in assessment by using external observation. More specifically, observation of teacher behaviour in the classroom could not provide information relating to teachers’ skills in assessment tool construction, recording and reporting of data since these tasks may take place outside the classroom. Moreover, to measure teachers’ skills in administering assessment tasks, it would have been necessary to observe a large number of lessons per teacher, especially since a significant percentage of Cypriot teachers provide assessment tasks only at the end of a unit or series of lessons (see Kyriakides, 2005). Although the limitations of collecting data through teacher self-reports are acknowledged, it was not feasible to conduct classroom observations on a large scale so as to ensure the generalisability of the data. Nevertheless, it was considered important to examine the internal validity of the study. For this reason, teachers participating in the survey were asked to indicate whether they were willing to give an interview. We then randomly chose eight of them and conducted semi-structured interviews. The constant comparative method (Maykut & Morehouse, 1994) was used in order to analyse the data emerging from the interviews. Initially, ‘within-case analysis’ (Denzin & Lincoln, 1998) of each teacher’s responses in the interview was conducted without having access to his/her responses to the questionnaire. After creating the profile of each interviewee, it was possible to match teachers’ responses from the interviews with the questionnaire data. This procedure provided support to the internal validity of the study.
In particular, consistency was identified between the way teachers responded to the two research instruments (i.e., questionnaire and interview). For example, teacher 6 circled number 2 (i.e., rarely) on the Likert scale for statement B7 of the questionnaire: “Before creating a test, I write down the objectives I want to assess and further indicate which exercises of the test correspond to each objective”. The same teacher during the interview stated: “I usually use ready-made tests to assess my students. I have a test for each unit and I use it every year. I do not really match the exercises with the unit’s objectives but I think the exercises match the content taught, since the content of each unit has not changed”. Likewise, teacher 9 stated: “During teacher–parent meetings, I always report test results to parents. It is important that they are aware of their child’s performance and how their child is doing in comparison to the rest of the class. I focus on the student’s test grades since they provide a clear picture of what the student can do”, her statement being consistent with her response to items E8, E13 and E18 of the questionnaire. For each teacher, the responses to all questionnaire items were compared to his/her responses to the interview questions (see Christoforidou, 2013). The few cases for which matching was not possible were not related to a particular teacher or a particular item.

Furthermore, we searched for the extent to which measures of teacher skills in assessment that emerged from the teacher questionnaire were associated with the effectiveness status of the teacher sample, as elaborated below.

In order to test the impact that teachers’ skills in assessment have on student learning outcomes, we drew on data from an effectiveness study which investigated the effectiveness status of the same teacher sample in teaching mathematics (Christoforidou, 2013). Specifically, written tests were administered to all students (n = 2358) of the teacher sample (n = 178) at the beginning and at the end of school year 2010–2011. Given that the test administered to Grade 6 students at the end of the school year was obviously more difficult than the test administered to Grade 2 students at the beginning of the school year, it was considered necessary to make the scores comparable. Equating was done using Item Response Theory (IRT) modelling. The method of equating follows the same procedure as that used in the Programme for International Student Assessment (PISA) studies. Estimation was made by using the extended logistic model of Rasch (see Appendix A for more information about this model), which revealed that each scale had satisfactory psychometric properties. Thus, for each assessment period, achievement in mathematics was estimated by calculating the Rasch person estimates (Christoforidou, 2013).

Information on student background factors (i.e., SES, age, gender) was also collected from school records. Five SES variables were available: father’s and mother’s education level (i.e., graduate of a primary school, graduate of secondary school or graduate of a college/university), the social status of father’s job, the social status of mother’s job and the economic situation of the family. Following the classification of occupations used by the Ministry of Finance, it was possible to classify parents’ occupations into three groups of relatively similar sizes: working-class occupations (32%), middle-class occupations (39%) and upper-middle-class occupations (29%). Using structural equation modelling techniques, a first-order factor model was established. This model was found to fit the data (i.e., χ² = 9.4, d.f. = 5, p = 0.094; CFI = 0.961; RMSEA = 0.064) and thereby an indicator of SES emerged from this model.

The extended logistic model of Rasch (see more on this model in Appendix A) was applied to the whole sample of teachers and all 87 measures concerned with their assessment skills, using the computer program Quest (Adams & Khoo, 1996). Fig. 3 illustrates the scale for the 87 measures of assessment skills, with item difficulties and teacher measures calibrated on the same scale. For the sake of brevity, the item threshold values are not presented in this figure but these values were found to be ordered from low to high, indicating that the teachers answered consistently with the ordered response format of our Likert scale. Moreover, the threshold distances range from 1.7 to 2.5 logits. Fig. 3 also shows that the 87 items of the questionnaire measuring teacher assessment skills had a good fit with the measurement model, indicating a strong agreement among the 178 teachers located at different positions on the scale, across all 87 items. Moreover, the questionnaire items were well targeted against the teachers’ measures since teachers’ scores range from −3.14 to 3.11 logits and item difficulties range from −3.11 to 3.34 logits. Furthermore, Table 1 provides a summary of the scale statistics. Reliability was calculated by the Item Separation Index and the Person Separation Index. Separation indices represent the proportion of the observed variance considered to be true. A value of 1 represents high separability in which errors are low and item difficulties and students’ measures are well separated along the scale (Wright & Masters, 1981). We can observe that the indices of cases and item separation were higher than 0.92, indicating that the separability of the scale was satisfactory (Wright, 1985). In addition, the infit mean squares and the outfit mean squares were found to be near one and the values of the infit t-scores and the outfit t-scores were approximately zero.
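To make the threshold and separation statements concrete, the following is a minimal sketch, assuming the rating-scale form of the polytomous Rasch model (a common reading of the "extended logistic model"; the exact specification used in the study is given in Appendix A and may differ). The function names, the threshold values and the example measures are hypothetical and are not taken from the study's data.

```python
import math
from statistics import pvariance


def category_probabilities(theta, delta, taus):
    """Category probabilities under a rating-scale Rasch model (assumed form).

    theta: person measure, delta: item difficulty, taus: ordered threshold
    offsets (one per step between adjacent categories), all in logits.
    Returns the probability of each category 0..len(taus).
    """
    cumulative = [0.0]  # empty sum for the lowest category
    for tau in taus:
        cumulative.append(cumulative[-1] + (theta - delta - tau))
    peak = max(cumulative)  # subtract the maximum for numerical stability
    weights = [math.exp(c - peak) for c in cumulative]
    total = sum(weights)
    return [w / total for w in weights]


def separation_reliability(measures, standard_errors):
    """Proportion of observed variance in the measures not attributed to error."""
    observed = pvariance(measures)
    error = sum(se ** 2 for se in standard_errors) / len(standard_errors)
    return (observed - error) / observed


# Hypothetical five-point item: four ordered thresholds, 2.0 logits apart.
taus = [-3.0, -1.0, 1.0, 3.0]
probs = category_probabilities(theta=0.0, delta=0.0, taus=taus)
print([round(p, 3) for p in probs])

# Hypothetical measures and standard errors, for illustration only.
print(separation_reliability([-2.0, -1.0, 0.0, 1.0, 2.0], [0.4] * 5))
```

With ordered thresholds, each response category is the most probable one somewhere along the scale, which is the sense in which ordered thresholds indicate that respondents used the Likert categories consistently.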

The results of the various approaches used to test the fit of the Rasch model with our data also revealed that there was a good fit with the model when teachers’ performance in these assessment skills was analysed. Specifically, all assessment skills were found to have item infit within the range 0.85–1.16, and item outfit within the range 0.76–1.40. All the values of infit t for both persons and assessment skills were greater than −2.00 and smaller than 2.00. Finally, the procedure proposed by Yen (1993) was also used to test for local independence, and it was found that local independence was not violated for any item.
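For readers unfamiliar with the fit statistics reported here, the sketch below shows one standard way of computing infit and outfit mean squares for a single item, assuming a rating-scale Rasch model. It is an illustration only and not the Quest implementation; the function names and example values are hypothetical, and the standardised t-scores are not computed.

```python
import math


def category_probabilities(theta, delta, taus):
    """Category probabilities under a rating-scale Rasch model (assumed form)."""
    cumulative = [0.0]
    for tau in taus:
        cumulative.append(cumulative[-1] + (theta - delta - tau))
    peak = max(cumulative)
    weights = [math.exp(c - peak) for c in cumulative]
    total = sum(weights)
    return [w / total for w in weights]


def item_fit(responses, thetas, delta, taus):
    """Return (infit, outfit) mean squares for one item.

    responses[i] is the observed category (0-based) of person i and
    thetas[i] is that person's measure in logits.
    """
    squared_residuals, variances = [], []
    for x, theta in zip(responses, thetas):
        probs = category_probabilities(theta, delta, taus)
        expected = sum(k * p for k, p in enumerate(probs))
        variance = sum((k - expected) ** 2 * p for k, p in enumerate(probs))
        squared_residuals.append((x - expected) ** 2)
        variances.append(variance)
    # Outfit: unweighted mean of squared standardised residuals.
    outfit = sum(r / v for r, v in zip(squared_residuals, variances)) / len(responses)
    # Infit: squared residuals weighted by their model variances.
    infit = sum(squared_residuals) / sum(variances)
    return infit, outfit


# Hypothetical responses of five teachers to one five-category item.
taus = [-3.0, -1.0, 1.0, 3.0]
infit, outfit = item_fit([1, 2, 2, 3, 3], [-1.5, -0.5, 0.0, 0.5, 1.5], 0.0, taus)
print(round(infit, 2), round(outfit, 2))
```

Values near one indicate that the observed responses vary about as much as the model predicts, which is the interpretation applied to the ranges reported above.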

Using cluster analysis to specify levels of difficulty

Having established the reliability of the scale, the procedure for detecting pattern clustering in measurement designs developed by Marcoulides and Drezner (1999) (see Appendix B) was used to find out whether assessment skills are grouped into levels of difficulty that may be taken to stand for the types of behaviour involved in evaluating student achievement in mathematics, which move from relatively easy to more difficult. Applying this method to segment the assessment skills on the basis of their difficulties that emerged from the Rasch model showed that they were optimally clustered into four clusters. Specifically, the cumulative D for the four-cluster solution was 59 per cent, whereas the fifth gap added only 2 per cent. According to the literature on cluster analysis, the four-cluster solution explaining 59 per cent of the observed variance was considered satisfactory (Romesburg, 1984). A description of the four different stages/types of teacher assessment behaviour is given below.
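The idea of segmenting items along the difficulty scale can be illustrated with a much simpler stand-in: sort the difficulties and cut at the widest gaps. This is not the Marcoulides and Drezner (1999) procedure, whose details are given in Appendix B, and it does not compute their D statistic. The function name and the difficulty values below are hypothetical.

```python
def split_at_largest_gaps(difficulties, n_clusters):
    """Partition sorted difficulties at the (n_clusters - 1) widest gaps."""
    ordered = sorted(difficulties)
    gaps = [(ordered[i + 1] - ordered[i], i) for i in range(len(ordered) - 1)]
    # Positions after which a cut is made, chosen by gap width.
    widest = sorted(gaps, reverse=True)[: n_clusters - 1]
    cut_after = sorted(i for _, i in widest)
    clusters, start = [], 0
    for i in cut_after:
        clusters.append(ordered[start : i + 1])
        start = i + 1
    clusters.append(ordered[start:])
    return clusters


# Hypothetical item difficulties in logits, not the study's data.
difficulties = [-3.0, -2.6, -2.3, -1.3, -0.9, -0.6, 0.3, 0.9, 1.6, 2.7, 3.1]
for level, items in enumerate(split_at_largest_gaps(difficulties, 4), start=1):
    print(level, items)
```

A gap-based cut of this kind only reflects spacing on the scale; the substantive interpretation of each level still depends on which skills fall into it, as in the descriptions that follow.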

Type 1: Using written tests to measure basic skills in mathematics for summative reasons (−3.10 up to −2.20 logits).

The assessment skills included in this stage revealed that teachers (n = 56) demonstrating this type of behaviour use everyday assessment routines. Type 1 teachers enrich or alter ready-made written tests and use a variety of types of written questions to assess students’ performance. However, they do not use oral assessment and/or observation in a systematic way to assess their students’ performance. Records are kept only in relation to written assessment results whereas results are reported to parents only for summative purposes. Finally, Type 1 teachers appear to check homework consistently.

Type 2: Using different techniques of assessment to measure basic skills in mathematics (−1.40 up to −0.50 logits).

The assessment skills included in this stage revealed that teachers (n = 48) demonstrating this type of behaviour are able to use the various techniques of assessment in an appropriate way in order to measure basic skills in mathematics. Specifically, Type 2 teachers create a specification table before developing their written tests. In this way, they try to ensure that their tests are representative of what has been taught in the classroom. They also include test items which measure the student’s ability to give a correct answer to a task and items which investigate the process used by each student in his/her attempt to find an answer to a problem (i.e., process questions are included). In designing test items, they also take into consideration their students’ abilities. In addition, they reported that they offer clarification comments to students during assessment administration and that oral assessment and observation are planned in advance. Furthermore, teachers in this stage move beyond homework checking and use homework information to assess the basic skills of their students in mathematics. With regard to the recording of assessment data, they use descriptive comments to give feedback to their students. Finally, they report to parents on their students’ assessment results.

Type 3: Using assessment techniques to measure more complex educational objectives for formative reasons (0.20 up to 1.95 logits).

Teachers demonstrating this type of behaviour (n = 47) are able to use assessment techniques to measure more complex educational objectives in mathematics, such as students’ ability to communicate by using mathematics. Thus, observation is used in a systematic way, by setting specific goals and creating observation tools related to these goals. Data deriving from all assessment techniques, not only from written assessment (as in Type 2 teachers), are recorded, and records take the form of goal- and/or exercise-specific documentation. In addition, reporting is done for formative reasons, and teachers in this stage report assessment information not only to parents but to their students as well. Finally, group assessment is used in a systematic way and is primarily concerned with each student’s contribution to the team work rather than with the team’s overall performance.

Type 4: Differentiation in assessment: Applying assessment in and for different occasions and students (2.60 up to 3.35 logits).

Based on the assessment skills included in this type of behaviour, it appears that Type 4 teachers (n = 27) are able to differentiate assessment procedures and tools based on their students’ needs. Therefore teachers in this stage do not always use the same written tests to measure the achievement of different groups of students and they are more flexible during the administration process (e.g., they give extra tasks to those who finish earlier and more time to slow learners). They also differentiate reporting of assessment information to both parents and students (e.g., reporting is done more frequently about students who need it most; they use different forms/language that are in line with the educational level of parents) and pursue teacher-parent communication, especially with parents who rarely or never visit the school.

To what extent teacher assessment skills can be attributed to their background characteristics

The last part of this section investigates the extent to which teacher background characteristics are associated with teachers’ assessment skills. Initially, one-way analysis of variance was conducted to find out whether the school context had any effect on teacher responses to the questionnaire measuring their assessment skills. By taking into account the Rasch person/teacher estimate, which refers to the assessment skills of each teacher, it was found that the between-group variance was no higher than the within-group (i.e., teachers within the same school) variance (F = 0.831, p = 0.67). As a consequence, a single-level regression analysis was conducted to find out whether the Rasch person/teacher estimate of assessment skills could be attributed to any of the three background factors (i.e., years of experience, gender, position). Thus the Rasch score for teachers was treated as the dependent variable, whereas the years of teaching experience, as well as the two dummy variables measuring gender (0 = male, 1 = female) and the post held by each teacher in our sample (0 = teacher, 1 = deputy head), were treated as independent variables. The best-fitting model explained a relatively small percentage of the variance in the score for assessment skills (16%), and the standardised equation that emerged revealed that the effect of the dummy variable ‘post’ was larger (0.34) than that of years of experience (0.22). Gender was not found to be associated with teachers’ skills in assessment. It should be acknowledged that it was not possible to gather information on other teacher background variables, such as their teaching qualifications and their training in assessment, which may also be associated with their skills in assessment.
However, the variables ‘years of experience’ and ‘post’ were found to be associated not only with the Rasch score measuring teachers’ skills in assessment, but also with the classification of teachers into the four stages of assessment. Specifically, the Mann–Whitney test revealed that teachers who had been promoted to the post of deputy head were situated at higher stages of assessment skills (Z = 2.02, p = 0.043). Moreover, the Spearman correlation coefficient (r = 0.23, n = 178, p < 0.001) revealed a statistically significant positive relation between the stage at which each teacher was found to be situated and his/her years of experience.
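As an illustration of the rank-based correlation reported above, the following is a minimal sketch of how a Spearman coefficient can be computed between an ordinal stage (1 to 4) and years of experience. It is not the authors' analysis: the function names (`average_ranks`, `pearson`, `spearman`) and the eight data points are hypothetical and chosen only to show the mechanics. Because many teachers share the same stage, tied values are given the mean of their rank positions, and the coefficient is then the Pearson correlation of the two rank vectors.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value is tied with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman coefficient: Pearson correlation of the tie-aware ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data for illustration only (NOT taken from the study):
# each position is one teacher's stage and years of teaching experience.
stages = [1, 1, 2, 2, 3, 3, 4, 4]
years = [3, 12, 5, 9, 8, 20, 15, 25]

rho = spearman(stages, years)
print(f"Spearman coefficient for the illustrative data: {rho:.3f}")
```

A positive value indicates that teachers at higher stages tend to have more years of experience. The sketch does not compute a significance level; that would require a further step, such as a t approximation or a permutation test.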

The added value of classifying teachers into stages of assessment skills: explaining variation in student mathematics achievement

This section examines the extent to which the classification of teachers into the four stages explains variation in achievement in mathematics. Due to the hierarchical structure of the data (students within classes within schools), multilevel analysis was carried out using MLwiN. The first step of the multilevel analysis of student achievement in mathematics at the end of the school year was to determine the variance at individual, class and school level without explanatory variables (i.e., baseline model). The empty model revealed that 74.3 per cent of the total variance was situated at the student level, 16.7 per cent of the variance was at the classroom level and 9.0 per cent was
