Lecture 1: ASSESSMENT LITERACY
Presentation goals
By the end of this unit, you should be able to:
• define the fundamental terms and concepts in second language assessment
• explain the relationship between teaching, learning, assessment and testing
• identify and explain the various purposes and types of assessment in relation to second language teaching/learning models and goals
• relate assessment to communicative language teaching
• classify the types of assessment you use
Introduction to Assessment Literacy
Definitions of ‘literacy’
What do we mean/understand by the word ‘literacy’?
lit·er·a·cy n.
1. The condition or quality of being literate, especially the ability to read and write.
2. The condition or quality of being knowledgeable in a particular subject or field: cultural literacy; academic literacy.
Definitions of ‘academic literacy’
• the ability to read and write effectively within the college context in order to proceed from one level to another
• the ability to read and write within the academic context with independence, understanding and a level of engagement with learning
• familiarity with a variety of discourses, each with their own conventions
• familiarity with the methods of inquiry of specific disciplines
An age of literacy (or literacies)?
• ‘academic literacy/ies’
• ‘computer literacy’
‘Assessment literacy’ … ‘a basic grasp of numbers and measurement’ and the ‘interpretation skills to make sensible life decisions’. But why is it necessary or important? And who needs it?
Stakeholders Growing numbers of people involved in language testing and assessment – test takers – teachers and teacher trainers – university staff, e.g. tutors & admissions officers – government agencies and bureaucrats – policymakers – general public (especially parents)
‘Language testing has become big business…’ (Spolsky 2008, p. 297)
But… … how ‘assessment literate’ are all those people nowadays involved in language testing and assessment across the many different contexts identified?
Assessment literacy involves…
• an understanding of the principles of sound assessment
• the know-how required to assess learners effectively and maximise learning
• the ability to identify and evaluate appropriate assessments for specific purposes
• the ability to analyse empirical data to improve one’s own instructional and assessment practices
• the knowledge and understanding to interpret and apply assessment results in appropriate ways
• the wisdom to integrate assessment and its outcomes into the overall pedagogic/decision-making process
What makes assessment necessary?
• Teachers want to know what parts of the course cause difficulties and require special attention, and what progress students have made
• Students want to know their strengths and weaknesses
• Parents want to know how their children are doing
PROSET - TEMPUS
What makes assessment necessary?
• Universities want to choose the best applicants for enrollment
• Teachers want to know whether the students are coping with the programme
• Employers want to choose the best applicant for the job
Testing & assessment
[Diagram: TESTING shown within the broader scope of EVALUATION]
What forms of assessment to use?
• Test
• Portfolio
• Project work
• Observation
• Case study
• Interview, oral questions
• Experimental work, etc.
What is a test? Any procedure for measuring ability, knowledge, or performance. (Longman Dictionary of Language Teaching and Applied Linguistics)
What is a test? A test is a measurement instrument designed to elicit a specific sample of an individual’s behaviour, … a test necessarily quantifies characteristics of individuals according to explicit procedures. (L.F. Bachman, Fundamental Considerations in Language Testing, 1990)
Test as an instrument of assessment (measurement): WHY / WHAT / HOW
What types of tests do we use? Test purposes:
• Proficiency tests
• Achievement tests
• Placement tests
• Diagnostic tests
What types of tests do we use?
• Direct vs Indirect
• Discrete Point vs Integrated
• Objective vs Subjective
• Low stakes vs High stakes
• Norm-referenced vs Criterion-referenced
Norm-referenced testing: the bell curve
[Figure: bell curve; x-axis: Score (points), 0–100; y-axis: Number of students]
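Norm-referenced interpretation can be sketched in code: each raw score is restated relative to the group (a z-score and a percentile rank) rather than against a fixed cutoff. This is an illustrative sketch, not from the lecture; the scores are invented.

```python
# Illustrative sketch: norm-referenced scoring restates each raw score
# relative to the norm group, as a z-score and a percentile rank.
from statistics import mean, stdev

def norm_reference(scores):
    """Return a (z-score, percentile rank) pair for each raw score."""
    m, s = mean(scores), stdev(scores)
    results = []
    for x in scores:
        z = (x - m) / s
        # percentile rank: share of the norm group scoring below x
        pct = 100 * sum(1 for y in scores if y < x) / len(scores)
        results.append((round(z, 2), pct))
    return results

raw = [55, 60, 65, 70, 75, 80, 85]
for score, (z, pct) in zip(raw, norm_reference(raw)):
    print(score, z, pct)
```

A candidate at the group mean always gets z = 0, whatever the raw score: the interpretation depends entirely on who else took the test, which is exactly what distinguishes norm-referenced from criterion-referenced testing.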
The Assessment Cycle: where does a test begin?
What are the attitudes of the following stakeholders to tests, test scores and their experience of assessment?
• Students
• Language teachers
• Educational boards
• Politicians and policy-makers
The over-arching principle of test ‘utility’/‘usefulness’
• Test purpose must be clearly defined and understood
• The testing approach must be ‘fit for purpose’
• There must be an appropriate balance of ‘essential test qualities’ (such as validity and reliability)
The 5 Principles of Assessment
There are five principles that we need to consider when creating assessments:
• practicality
• reliability
• validity
• authenticity
• washback
Practicality
Often, this is a very important principle for classroom teachers to consider. Is the assessment (test)…
• too expensive to implement?
• too time-consuming to design?
• too time-consuming to implement?
• too time-consuming to score?
• Does it require too many people to implement?
If the assessment is not practical for your teaching context, it will most likely need to be revised.
Reliability An assessment that is reliable will give the same score to the same type of student regardless of when the assessment is given or who scores it. For example, if Teacher A gives a test to Johnny on Monday morning and scores him with 90/100, a reliable test will give the same result to Johnny if he had taken the test on Monday afternoon with Teacher B or if a similar student to Johnny takes the test on Tuesday with Teacher C.
Reliability Problems
An assessment can have problems with reliability if:
• it doesn’t have clear administration instructions.
• it doesn’t have clear scoring instructions.
• the test items are written in such a way that they are confusing or several answers are possible.
• the testing conditions in the classroom put students at a disadvantage (e.g., too much outside noise).
• students can do well even without knowing the information being assessed.
Can you explain why? What can the reliability solutions be?
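One common way to check scorer consistency is to correlate two raters' marks for the same set of scripts: a coefficient near 1.0 suggests the scoring instructions are working. A minimal sketch, with invented rater scores (the lecture does not prescribe this particular statistic):

```python
# Hypothetical sketch: inter-rater reliability as a Pearson correlation
# between two raters' scores for the same ten scripts (scores invented).
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation coefficient between two score lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

rater_a = [90, 72, 65, 88, 54, 77, 81, 60, 95, 70]
rater_b = [88, 75, 60, 85, 58, 74, 83, 62, 93, 68]
r = pearson(rater_a, rater_b)
print(round(r, 3))  # values close to 1.0 indicate consistent scoring
```

If the coefficient were low, the scoring instructions or rubric descriptors would be the first place to look, per the reliability problems listed above.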
Validity
Validity is concerned with whether the assessment measures what it is intended to measure. For example, a valid assessment that is supposed to measure how well a student speaks English will actually indicate the student’s skill in speaking. Is a speaking test that asks students to read a passage out loud valid? Not very, because it assesses students’ ability to read and only minimally assesses speaking (pronunciation).
Validity - Direct testing is more valid than indirect testing. - Valid classroom assessments measure only what has been taught in class. - Valid classroom assessments utilize items or activities that are similar to what students have already practiced in class. Trick questions and tricky assessments are not valid. Can you explain why?
Authenticity
Authentic assessments reflect natural uses of language. Test items are contextualized in authentic assessments (perhaps all relating to one picture or telling a story). There is a direct relationship between the students’ performance on the test/assessment and the ability to complete real-world tasks.
Washback
Washback refers to the outcomes of the assessment for the learner, the teacher, and the teaching context. Positive washback from an assessment can motivate the student to learn more, positively influence the teacher in what and how to teach, and can improve the classroom environment for more learning. Feedback after assessments improves washback. Tests or assessments that are designed to punish or fail many students do not result in positive washback.
Socio-Cognitive Approaches to Testing and Assessment
How do examination boards operationalise criterial distinctions between the tests they offer at different levels on the proficiency continuum? (Prof. Cyril Weir)
Common European Framework of Reference (CEFR)
The CEFR describes language ability on a scale of levels from A1 for beginners up to C2 for those who have mastered a language.
Construct Validity (Centre for Research in English Language Learning and Assessment)
Cognitive Validity: the extent to which the tasks we employ elicit the cognitive processing involved in task solving.
Cognitive demand at different levels
In many ways the CEFR specifications are extremely limited in their characterisation of task-solving ability at the different levels, and we need to be more explicit for testing purposes about:
• the types of tasks demanded at each of the stages
• how well calibrated the cognitive processing demands made upon candidates are in the design of the tasks
• the cognitive load imposed by relative task complexity at each stage
Lexical access
Accessing the lexical entry containing stored information about a word’s form and its meaning from the lexicon. The form includes orthographic and phonological mental representations of an item and possibly information on its morphology. The lemma includes information on word class and the syntactic structures in which the item can appear, and on the range of possible senses for the word.
Word recognition
Word recognition is concerned with matching the form of a word in a written text with a mental representation of the orthographic forms of the language.
Syntactic parsing
Once the meaning of words is accessed, the reader has to group words into phrases, and into larger units at the clause and sentence level, to understand the text message.
Establishing propositional meaning at the clause or sentence level
An abstract representation of a single unit of meaning: a mental record of the core meaning of the sentence without any of the interpretative and associative factors which the reader might bring to bear upon it.
Inferencing
Inferencing is necessary so the reader can go beyond explicitly stated ideas, as the links between ideas in a passage are often left implicit. Inferencing in this sense is a creative process whereby the brain adds information which is not stated in a text in order to impose coherence. If there were no such thing as inferencing, writing a text which includes every piece of information would be extremely cumbersome and time-consuming.
Establishing a mental representation across texts
In the real world, the reader sometimes has to combine and collate macro-propositional information from more than one text. The need to combine rhetorical and contextual information across texts would seem to place the greatest demands on processing.
Cognitive processing at A2 to C2
Context Validity
Context validity relates to the appropriateness of both the linguistic and content demands of the text to be processed, and the features of the task setting that impact on task completion.
Types of reading tested at levels A2 to C2
Contextual Parameters in Reading
• Task Setting: Text length
• Linguistic Demands: Discourse mode, Lexical resources, Structural resources, Functional resources, Nature of information
The cognitive demands imposed by relative text complexity at each stage
Text length
What kinds of information should be available about a test? WHO / WHAT / HOW
Test specification: the official statement about what the test tests and how it tests it … to be followed by test and item writers (Multilingual Glossary of Language Testing Terms, CUP, 1998)
• Test design statement
• Blueprint
• Task and item specifications
Test design statement
• the purpose
• the knowledge, skills or abilities it is intended to assess
• the resources available
• the uses of the results
• the intended impact of its use
Blueprint
• content
• methods
• scoring
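A blueprint's content/methods/scoring split can be made concrete as a small data structure. This is an illustrative sketch only; the skill, task types, and marks below are invented, not taken from any real specification.

```python
# Hypothetical sketch: a minimal classroom test blueprint, organised by
# the content / methods / scoring split described above (values invented).
blueprint = {
    "content": {
        "skill": "reading",
        "text_types": ["short notice", "narrative passage"],
        "cefr_level": "B1",
    },
    "methods": {
        "tasks": [
            {"type": "multiple choice", "items": 10},
            {"type": "short answer", "items": 5},
        ],
    },
    "scoring": {"points_per_item": 1, "pass_mark": 9, "max_score": 15},
}

# A quick consistency check: the scoring section should agree with
# the number of items the methods section actually specifies.
total_items = sum(t["items"] for t in blueprint["methods"]["tasks"])
print(total_items, blueprint["scoring"]["max_score"])  # 15 15
```

Writing the blueprint down in this machine-checkable form lets item writers verify that tasks, items, and marks stay in agreement as the specification evolves.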
Test tasks
• Prompt (a reading/listening text, an essay question, a picture to describe)
• Response (ticking a box, giving a short answer, describing a picture, etc.)
The building blocks of a test
• Instructions
• Item
• Stem
• Options
• Distractors
• Key
Responses
• Selected response
• Constructed response
• Personal response
(Brown & Hudson, 1998)
Responses: EGE
• Selected response (A1-A7, A8-A14, A15-A23)
• Short response (B1, B2, B3, B4-B10, B11-B16)
• Constructed response (C1, C2)
What do you think is the relationship between assessment, testing, and teaching?
• Distinguish ‘test’ and ‘assessment’
• Identify different types of assessment
• Identify different types of tests in terms of their purpose and format
• Analyze the assessment cycle in relation to different types of assessment
• Identify the structure and components of a test specification (design statement, blueprint, task and item specifications)
• Differentiate between cognitive processing at the A2-C2 levels of the CEFR
• How can assessment motivate learners? How can it demotivate them?
• Can the skill of reading be tested in a direct way?
• Please define the 5 principles of assessment.
Define the types of test: achievement test, criterion-referenced test, diagnostic test, direct test, discrete point test, high stakes test, indirect test, integrated test, low stakes test, norm-referenced test, objective test, subjective test, placement test, proficiency test, formative test, summative test
The building blocks of a test: define the terms:
• Task
• Instructions
• Prompt
• Response
• Item
• Stem
• Options
• Distractors
• Key
Find these building blocks in the following EGE tasks:
Task & item specifications (seminar)
• How many items are included in each task?
• What area of knowledge, skill or ability is assessed in each task?
• What should the test-taker do (instructions)?
• How are scores on each item and task determined?
EGE specification (seminar): 11 pages, 12 parts (sections), 4 tables, an appendix
Your classroom test specification/plan
• purpose
• number of tasks and items
• what these items assess
• how they are scored
• time limit
• ???

