October 14, 2010
Notes on APP staff meeting 13/10/10
APP was introduced to our school last year primarily in a 'top down' manner (it came from senior management without preamble or discussion among staff) with some dissemination by subject leaders who had had brief training in order to pass on the procedural information to the rest of the staff.
In this session we were told how we were to continue with using APP, adding reading and writing to the maths we were doing last year. Teachers are to focus on 3 children per subject for their class and acquire a body of evidence for levelling those children. During the meeting we were provided with some written work from a Year 2 child and instructed to use the APP guidelines to derive a level for that child.
Issues arising in the meeting
During the meeting, several issues arose, as bulleted below:
We are using the system for mainly summative purposes – that is we are to derive levels which are to be used for all our data, including 'high stakes' purposes such as judgements of teachers, cohorts and the school
Some mention is made of using APP assessments to help plan next steps (although little mention of feedback to pupils themselves), suggesting a confusion between formative and summative purposes
Levels must be generated each term
All children are to be given a 'teacher assessed' level but evidence need only be collected for 3 in each subject
Individual teachers are to make judgements about their pupils with some advice that we should work with year group partners
Interpretation of particular statements varied widely, e.g. 'with support' (support to keep going, support because it is work done during lesson, support because the work is done shortly after lesson)
In English writing, it is a requirement that 50% of the writing evidence must be in books other than Literacy books
Overall judgements about level are dictated by the number of assessment focuses (AFs) that have been ticked, with the required number of AFs stated in the guidance
There was no agreement about how much of the box in an AF needed to be ticked to constitute the level or the achievement of the AF
Teachers working in small groups or pairs came to significantly different decisions about the level of the work they were looking at (from a high 2 to nearly a low 4)
If the system is for mainly summative purposes, then a serious question arises about the use of this data to judge teachers' performance, since it is very difficult to achieve reliability and comparability by assessing in this way. Individual teachers are required to gather evidence and to perform the assessment themselves. There is no opportunity for teachers to discuss the children, and the evidence for their levels, at length. Furthermore, the summative assessments for the rest of the cohort will not be backed up by any evidence; it will be taken as read that if the judgement about 3 children is 'accurate' then the rest can be assumed. It is difficult to see how this is a convincing argument, especially as there is no reliability in a system which requires each teacher to make a summative judgement of work based on criteria referencing. The recent SATs writing problems are a prime example of the likelihood of disagreement between individuals about judgements based on criteria. Is it acceptable or ethical to use data derived in this way to make judgements about staff and to use that in performance management?
Another question to consider is: if we are using APP for summative purposes, then why are levels to be derived 3 times a year, rather than at the end of the teaching period? It might be suggested that this is to track the progress of the children, as there is no requirement to report data to any outside agencies until the end of the year. Since the children can only cover a certain amount of material before the end of the year, the criteria cannot be fully addressed until that material has been covered.
Staff found it very difficult to achieve consensus on many aspects of the guidelines, including some fundamental statements, as bulleted above. They also failed to agree on a level for the work presented, but this was not addressed as an issue. Staff were subsequently informed that the pupil was 'a secure 3' without any empirical evidence being presented to support that statement.
It seems that APP should have an 'assessment for learning' purpose – that is, it is intended to be used primarily for formative assessment – in which case it needs to be applied to all the pupils rather than a selected 3. As such it has some obvious usefulness and can be a powerful tool. Criteria within the AFs provide some clarification of next steps in learning and can be helpful in guiding and motivating pupils as to what they hope to achieve. Used in this way, reliability is less of an issue. Problems arise when it is used in a 'high stakes' environment and no acknowledgement is made either of its unavoidable unreliability in determining levels or of the unmanageability of attempting to achieve reliability.
The images below represent annotation performed during the meeting. We identified evidence for the various AFs in the writing APP guidelines at the level we judged the work to be at.