Monday, December 5, 2011

CHAPTER 5: ASSESSING LISTENING











The assessment of listening abilities is one of the least understood and least developed, yet one of the most important, areas of language testing and assessment (Alderson & Bachman, 2001). In fact, Nunan (2002) calls listening comprehension “the poor cousin amongst the various language skills” because it is the most neglected skill area. As teachers we recognize the importance of teaching and then assessing the listening skills of our students, but, for a number of reasons, we are often unable to do this effectively.

One reason for this neglect is the limited availability of culturally appropriate listening materials suitable for EFL/ESL contexts. The biggest challenges for teaching and assessing listening comprehension center on the production of listening materials. Indeed, listening comprehension is often avoided because of the time, effort and expense required to develop, rehearse, record and produce high-quality audio tapes or CDs.

Approaches to Listening Assessment




Buck (2001) has identified three major approaches to the assessment of listening abilities: discrete point, integrative and communicative approaches.

The discrete-point approach

This approach became popular during the early 1960s with the advent of the Audiolingual Method. It identified and isolated listening into separate elements. Question types utilized in this approach included phonemic discrimination, paraphrase recognition and response evaluation. An example of phonemic discrimination is assessing students by their ability to distinguish minimal pairs like ship/sheep. Paraphrase recognition is a format that requires students to listen to a statement and then select the option closest in meaning to it. Response evaluation is an objective format that presents students with questions followed by four response options. The rationale for the discrete-point approach stemmed from two beliefs: first, that it is important to be able to isolate one element of language from a continuous stream of speech; and second, that spoken language is the same as written language, only presented orally.


The integrative approach

This approach, which emerged in the early 1970s, called for integrative testing. The underlying rationale for this approach is best explained by Oller (1979:37), who stated, “whereas discrete items attempt to test knowledge of language one bit at a time, integrative tests attempt to assess a learner’s capacity to use many bits at the same time.” Proponents of the integrative approach to listening assessment believed that the whole of language is greater than the sum of its parts. Common question types in this approach were dictation and cloze.




The communicative approach


This approach arose at approximately the same time as the integrative approach, as a result of the Communicative Language Teaching movement. In this approach, the listener must be able to comprehend the message and then use it in context. Communicative question formats must be authentic in nature.

Issues in Listening Assessment

A number of issues make the assessment of listening different from the assessment of other skills. Buck (2001) has identified several issues that need to be taken into account. They are: setting, rubric, input, voiceovers, test structure, formats, timing, scoring and finding texts. Each is briefly described below and recommendations are offered.




Setting
The physical characteristics of the test setting or venue can affect the validity and/or reliability of the test. Exam rooms must have good acoustics and minimal background noise. Equipment used in test administrations should be well maintained and checked out beforehand. In addition, an AV technician should be available for any potential problems during the administration.

Rubric
Context is extremely important in the assessment of listening comprehension because test takers don’t have access to the text as they do in reading. Context can be written into the rubric, which enhances the authenticity of the task. Instructions to students should be in the students’ L1 whenever possible. However, in many teaching situations, L1 instructions are not allowed. When L2 instructions are used, they should be written at one level of difficulty lower than the actual test. Clear examples should be provided for students, and point values for questions should be included in the rubrics.

Input
Input should have a communicative purpose. In other words, the listener must have a reason for listening. Background or prior knowledge needs to be taken into account. There is a considerable body of research that suggests that background knowledge affects comprehension and test performance. In a testing situation, we must take care to ensure that students are not able to answer questions based on their background knowledge rather than on their comprehension.

Voiceovers
Anyone recording a segment for a listening test should receive training and practice beforehand. In large-scale testing, it is advisable to use a mixture of genders, accents and dialects. To be fair to all students, listening voiceovers should match the demographics of the teacher population. Other issues are the use of non-native speakers for voiceovers and the speed of delivery. Our belief is that non-native speakers of English constitute the majority of English speakers in the world. Whoever is used for listening test voiceovers, whether native or non-native speakers, should speak clearly and enunciate.
Coombe/Hubley 28

The speed of delivery
The speed of a listening test should be consistent with the level of the students and the materials used for instruction. If your institution espouses a communicative approach, then the speed of delivery for listening assessments should be native or near-native. The delivery of the test should be standard for all test takers. If live readers are used, they should practice reading the script before the test and standardize with other readers.

Test Structure
The way a test is structured depends largely on who constructs it. There are generally two schools of thought on this: the British and the American perspectives. British exam boards generally grade input from easy to difficult across a test and mix formats within a section. This means that the easier sections come first, with the more difficult sections later. American exam boards, on the other hand, usually grade question difficulty within each section of an exam and follow the 30/40/30 rule. This rule states that 30% of the questions within a test or test section are of an easy level of difficulty; 40% of the questions represent mid-range levels of difficulty; and the remaining 30% of the questions are of an advanced level of difficulty. American exam boards usually use one format within each section. The structure you use should be consistent with the external benchmarks you use in your program. It is advisable to start the test with an ‘easy’ question. This will lower students’ test anxiety by relaxing them at the outset of the test.
Within a listening test, it is important to test as wide a range of skills as possible. Questions should be ordered as they are heard in the passage and should be well spaced throughout it for good content coverage. It is recommended that no content from the first 15-20 seconds of the recording be tested, to allow students to adjust to the listening. Many teachers only include test content which is easy to test, such as dates and numbers; include some paraphrased content to challenge students.
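The 30/40/30 rule is easy to apply mechanically when planning a section. The sketch below illustrates one way to split a question count into difficulty bands; the function name and the choice to give any rounding remainder to the mid-range band are our own, not part of the rule itself.

```python
def thirty_forty_thirty(total_questions):
    """Split a question count into easy/medium/hard bands per the 30/40/30 rule.

    Any remainder left over after rounding is assigned to the mid-range band.
    """
    easy = round(total_questions * 0.30)
    hard = round(total_questions * 0.30)
    medium = total_questions - easy - hard  # remainder goes to mid-range
    return {"easy": easy, "medium": medium, "hard": hard}

print(thirty_forty_thirty(20))  # {'easy': 6, 'medium': 8, 'hard': 6}
```

For a 20-question section, this yields 6 easy, 8 mid-range and 6 difficult questions.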

Formats
Perhaps the most important piece of advice here is that students should never be exposed to a new format in a testing situation. If new formats are to be used, they should be first practiced in a teaching situation and then introduced into the testing repertoire. Objective formats like MCQs and T/F are often used because they are more reliable and easier to mark and analyze. When using these formats, make sure that the N option is dropped from T/F/N and that three response options instead of four are utilized for MCQs. Remember that with listening comprehension, memory plays a role. Since students don’t have repeated access to the text, more options add to the memory load and affect the difficulty of the task and question. Visuals are often used as part of listening comprehension assessment. When using them as input, make certain that you use clear copies that reproduce well.

Timing

The length of a listening test is generally determined by one of two things: the length of the tape or the number of repetitions of the passages. Most published listening tests do not require the proctor to attend to timing; he or she simply inserts the tape or CD into the machine, and the test is over when a pre-recorded “this is the end of the listening test” statement is heard. For teacher-produced listening tests, the timing of a test will usually be determined by how many times the test takers are permitted to hear each passage. Proficiency tests like the TOEFL usually allow the input to be heard once, whereas achievement tests usually allow it to be heard twice. Buck (2001) recommends that if you’re assessing main idea, input should be heard once, and if you’re assessing detail, input should be heard twice. According to Carroll (1972), listening tests should not exceed 30 minutes.
It is important to remember to give students time to pre-read the questions before the test and answer the questions throughout the test. If students are required to transfer their answers from the test paper to an answer sheet, extra time to do this should be built into the exam.


Scoring

The scoring of listening tests provides numerous challenges to the teacher/tester. Dichotomous scoring (questions that are either right or wrong) is easier and more reliable. However, it doesn’t lend itself to many of the communicative formats such as note-taking. Other issues are whether points are deducted for grammar or spelling mistakes or non-adherence to word counts. When more than one teacher is participating in the marking of a listening test, calibration or standardization training should be completed to ensure fairness to all students.

Finding Suitable Texts

Many teachers feel that the unavailability of suitable texts is listening comprehension’s most pressing issue. The reason for this is that creating scripts which have the characteristics of oral language is not an easy task. Some teachers simply take a reading text and ‘transform’ it into a listening script. The transformation of reading texts into listening scripts results in contrived and inauthentic listening tasks because written texts often lack the redundant features which are so important in helping us understand speech. A better strategy is to look for texts that concentrate on characteristics that are unique to listening. If you start collecting texts that have the right oral features, you can then construct tasks around them. When graphics or visuals are used as test context, teachers often find themselves ‘driven by clip art’. This occurs when teachers build a listening script around readily available clip art images.
To produce more extemporaneous listening recordings, use available programs on your computer like Sound Recorder or shareware like Audacity and PureVoice to record scripts for use as listening assessments in the classroom.

Vocabulary

Research suggests that students must know between 90% and 95% of the words in a text or script to understand it. Indeed, the level of the vocabulary that you utilize in your scripts can affect the difficulty of the task and hence students’ comprehension. If your institution employs word lists, it is recommended that you seed vocabulary from your own word lists into listening scripts whenever possible. To determine the vocabulary profile of your text or script, go to http://www.er.uqam.ca/nobel/r21270/cgi-bin/webfreqs/web_vp.cgi for Vocabulary Profiler, a very user-friendly piece of software. By simply pasting your text into the program, you will receive information about the percentage of words that come from Nation’s 1000 Word List and the Academic Word List.
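The idea behind such a profiler can be sketched in a few lines. The toy function below computes what percentage of the running words in a script appear in a given word list; the function name, the sample word list and the sample script are all invented for illustration, and real profilers use far more careful tokenization and lemmatization.

```python
def lexical_coverage(script, known_words):
    """Return the percentage of running words in a script found in a word list."""
    tokens = [w.strip(".,!?;:'\"()").lower() for w in script.split()]
    tokens = [t for t in tokens if t]
    known = sum(1 for t in tokens if t in known_words)
    return 100 * known / len(tokens)

# Hypothetical word list and script for demonstration only.
word_list = {"the", "weather", "will", "be", "sunny", "tomorrow",
             "with", "a", "chance", "of"}
script = "The weather tomorrow will be sunny, with a slight chance of showers."
print(round(lexical_coverage(script, word_list), 1))  # 83.3
```

Here coverage is only about 83%, below the 90-95% threshold, so words like “slight” and “showers” would be candidates for glossing or replacement.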

Another thing to remember about vocabulary is that ‘lexical overlap’ can affect difficulty. Lexical overlap occurs when words used in the passage are also used in the questions and response options. When words from the passage are used in the correct answer or key, the question is easier. The question becomes more difficult if lexical overlap occurs between the passage/script and the distractors. A final thought on vocabulary is that unknown vocabulary should never occur as a keyable response (the actual answer) in a listening test.
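Lexical overlap can be checked mechanically during item review. The sketch below compares the content words of a passage with those of each answer option; the function name, stop-word list and sample items are all our own invention, meant only to show the principle.

```python
def lexical_overlap(passage, option):
    """Return the set of content words an answer option shares with the passage."""
    stop = {"the", "a", "an", "of", "to", "in", "is", "was", "and"}
    words = lambda text: {w.strip(".,!?").lower() for w in text.split()} - stop
    return words(passage) & words(option)

# Hypothetical passage and options for demonstration only.
passage = "The lecture begins at nine in the main auditorium."
key = "The lecture starts at nine."
distractor = "The lecture begins in the main auditorium at ten."

print(lexical_overlap(passage, key))         # three shared words
print(lexical_overlap(passage, distractor))  # five shared words
```

In this invented item the distractor overlaps the passage more heavily than the key does, which is the pattern that makes a question harder; item writers can use a check like this to spot it before the test is administered.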
Final Recommendations for Listening Assessment
No matter what the skill area, test developers should always be guided by the cornerstones of good testing practice when constructing tests:

• Validity (Does it measure what it says it does?)
• Reliability (Are the results consistent?)
• Practicality (Is the test “teacher-friendly”?)
• Washback (Is feedback channeled to everyone concerned?)
• Authenticity (Do the tasks mirror real life contexts?)
• Transparency (Are expectations clear to students? Do students and teachers have access to information about the test/assessment?)
• Security (Are exams and item banks secure? Can they be reused?)




