|mydreamnhung08||Date: Friday, 2011-10-14, 8:05 PM | Message # 1|
|The speaking tasks will be rated by human judges. This raises at least two major concerns about a component that consists of extended constructed responses: on the one hand, the extent to which the overall score on the speaking component may lack generalizability across task types (e.g. Miller & Linn, 2000) and, on the other hand, the extent to which ratings may be influenced by subjective rater judgments. |
Overall, current research on performance-based holistic assessments (e.g. Miller & Linn, 2000) indicates that rater variance is relatively small compared to examinee-by-task variance, which suggests that what distinguishes the speaking performances of different test-takers is the quality of their responses to different task types rather than raters' bias. In terms of task types, some intriguing questions for teachers preparing students for the speaking component of the new TOEFL iBT exam are whether performance on each of the three task types (given that each of them might be tapping a somewhat distinct aspect of speaking) is comparable across the speaking section as a whole, and to what extent the speaking score might be negatively affected by the heterogeneity of the tasks.
Regrettably, there is insufficient empirical research to date to answer these questions, though some studies (e.g. Lee, 2005) have found that while the tasks are, on average, comparable in difficulty, they are not uniformly difficult for all examinees. Interestingly, Lee (2005) found that, in comparison to reading-speaking (RS) or independent speaking (IS) tasks, the percentage of examinee variance was greatest in the listening-speaking (LS) tasks, which implies that this task type distinguishes between examinees' speaking performances better than the RS or IS tasks. This tentatively suggests that, for test preparation purposes, putting more emphasis on listening-speaking skills may pay off in increasing test-takers' chances of getting a higher score on the overall speaking component of the new TOEFL iBT.
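To make the variance argument above more concrete, here is a minimal sketch of how variance components are usually summarized in a generalizability study. It assumes a fully crossed persons-by-tasks-by-raters design, and the component values are hypothetical numbers chosen for illustration only; they are not figures from Lee (2005) or Miller & Linn (2000). The function and key names are also my own.

```python
def percent_of_variance(components):
    """Express each variance component as a percentage of the total."""
    total = sum(components.values())
    return {name: 100.0 * value / total for name, value in components.items()}


def g_coefficient(components, n_tasks, n_raters):
    """Generalizability coefficient for relative decisions.

    Relative error is built from the interactions that involve persons (p):
    person-by-task, person-by-rater, and the residual.
    """
    relative_error = (
        components["pt"] / n_tasks
        + components["pr"] / n_raters
        + components["ptr_e"] / (n_tasks * n_raters)
    )
    return components["p"] / (components["p"] + relative_error)


# HYPOTHETICAL variance components (p = persons, t = tasks, r = raters).
components = {
    "p": 0.50,      # examinee (person) variance
    "t": 0.02,      # task difficulty
    "r": 0.01,      # rater severity
    "pt": 0.20,     # examinee-by-task interaction
    "pr": 0.02,     # examinee-by-rater interaction
    "tr": 0.00,     # task-by-rater interaction
    "ptr_e": 0.25,  # residual
}

shares = percent_of_variance(components)
g_two_tasks = g_coefficient(components, n_tasks=2, n_raters=2)
g_six_tasks = g_coefficient(components, n_tasks=6, n_raters=2)

print(shares)
print(round(g_two_tasks, 3), round(g_six_tasks, 3))
```

With numbers shaped like these, where the examinee-by-task component is much larger than the rater-related components, adding tasks raises the coefficient more than adding raters would, which is the pattern the research cited above describes.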
The tasks are designed to measure several aspects of speaking, but the most immediately noticeable are comprehensibility, coherence, and the ability to combine the appropriate information from at least two sources in providing a complete answer. Also, the fact that all tasks are academically situated in content calls for specific attention to the language used by college and university teachers and students in the prompts, as a valuable source of information for constructing a response. Unfortunately, there are few studies (e.g. Cutting, 1999; Biber et al., 2004) describing the linguistic characteristics of the spoken registers common to university life, which in practice leaves many EFL teachers and learners to rely mostly on their intuitions about how to approach preparation for the speaking component. Based on a relatively large corpus (over 2.7 million words) representative of the range of spoken and written registers that students encounter at U.S. universities, Biber et al. (2004), for example, found that all university spoken registers are characterized by features such as present tense verbs, first and second person pronouns, contractions, and rare use of passive constructions, which notably distinguish them from the so-called informational formal registers.
This finding reveals that classroom teaching, at least in U.S. colleges and universities, is much more interactive and less fully scripted (even in formal lectures) than the prepared discourse many EFL teachers and students might be culturally used to. At the same time, the features indicating a lower level of formality of academic discourse are well represented in the integrated speaking task prompts and, accordingly, successful responses should closely mirror these features. This becomes particularly important in light of analyses of student discourse showing that most L2 students use a more formal style of spoken language in academic settings (usually informed by the more formal nature of academic discourse in their own culture), while most L1 students speak more conversationally.
Therefore, it is evident that TOEFL test preparation should address the need for students to be able to deal with a reasonably general style of English in an academic context. Likewise, it is essential that we as teachers distinguish between EAP needs in the sense of academic English and English for communicating about academic topics because, as pointed out by Waters (1996), it is the latter, not just the former, that EAP involves.
A cursory look at the 2006 TOEFL iBT sample prompts for the speaking tasks reveals that tasks 1 and 2 are IS tasks, asking test-takers to briefly describe (in 45 seconds) their own experience with an academic event (e.g. a favorite class) or express an opinion related to some aspect of academic life (e.g. dormitory life). Tasks 3 and 4 combine two prompts: a short reading passage (90-120 words) that contextualizes a dialogue or a lecture (150-200 words) on the same topic. Test-takers are given 45 seconds to read the text, which is then followed by a question prompting a 60-second response.
It is interesting to note here that, lexically, the frequency of the vocabulary used in the reading and listening prompts is close to the word frequency typical of academic reading and speaking contexts. For example, a word frequency analysis of the reading passages, carried out using The Compleat Lexical Tutor (v.4) (Cobb, n.d., available at <http://www.lextutor.ca/>), shows that 81% to 83% of the words there belong to the 2,000 most frequently used words (West, 1953), another 9% is vocabulary that can be found in the Academic Word List (Coxhead, 2000) (e.g. facilities, technology, research, commitment, approximately, domesticate, indicator), and the rest of the vocabulary is more specialized and falls beyond these frequency bands (e.g. renovating, upgrading, mammals, herd). The listening prompts, based on short dialogues taking place in academic contexts (e.g. library, lecture hall, office), are very close in word frequency to everyday spoken language use, even though the prompts are academically situated. For example, the percentage of the 2,000 most frequently used words increases to 90% to 92%, while the percentage of academically frequent lexis decreases to 3% to 4% and is limited to a few academic words specific to the topic of discussion (e.g. a history talk) rather than the register.
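The kind of frequency-band profile described above can be sketched in a few lines. This is only an illustration of the idea, not how The Compleat Lexical Tutor is implemented: the two word sets below are tiny hypothetical stand-ins for the full General Service List and Academic Word List, the sample sentence is invented, and the matching is on exact word forms rather than word families.

```python
import re
from collections import Counter

# HYPOTHETICAL miniature word lists, for illustration only.
# A real profile would load the full high-frequency and academic lists.
HIGH_FREQUENCY_SAMPLE = {"the", "new", "library", "will", "use", "for", "and"}
ACADEMIC_SAMPLE = {"facilities", "technology", "research", "commitment"}


def lexical_profile(text, high_frequency=HIGH_FREQUENCY_SAMPLE,
                    academic=ACADEMIC_SAMPLE):
    """Return the percentage of tokens falling in each frequency band."""
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    counts = Counter()
    for token in tokens:
        if token in high_frequency:
            counts["high_frequency"] += 1
        elif token in academic:
            counts["academic"] += 1
        else:
            counts["off_list"] += 1
    bands = ("high_frequency", "academic", "off_list")
    if not tokens:
        return {band: 0.0 for band in bands}
    return {band: 100.0 * counts[band] / len(tokens) for band in bands}


# Invented sample sentence (11 tokens).
sample = ("The new library facilities will use technology "
          "for research and renovating")
profile = lexical_profile(sample)
print({band: round(share, 1) for band, share in profile.items()})
```

Running the same function over a reading passage and a listening transcript, with the full lists loaded, would give the kind of side-by-side percentages reported above.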
At the same time, the number of conversational collocations and formulaic expressions identified using MonoConc Pro (v. 2.0) software (e.g. Well/ but I mean, you know, a bunch of, another thing that) increases, which is evidence of the more conversational nature of academic spoken discourse in U.S. universities. In support of this conclusion, Biber et al. (2004) found that students seem to encounter generally the same structural linguistic features regardless of their level of study or subject matter, and that the physical mode of production (i.e. spoken or written) seems to be by far the most important predictor of linguistic variation in academic discourse. Therefore, a linguistically relevant spoken response should not only be comprehensible and coherent but should also be appropriate to the register it reflects.
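For teachers without access to a concordancer, a rough count of such formulaic expressions can be approximated as follows. This is a minimal sketch, not a reproduction of what MonoConc Pro does: the expression list is taken from the examples above, the transcript is invented, and the function name is my own.

```python
import re

# Expressions taken from the examples mentioned above.
FORMULAIC_EXPRESSIONS = ["i mean", "you know", "a bunch of", "another thing that"]


def count_expressions(text, expressions=FORMULAIC_EXPRESSIONS):
    """Count whole-word occurrences of each multiword expression."""
    # Lowercase and drop punctuation so "You know," matches "you know".
    normalized = " ".join(re.findall(r"[a-z']+", text.lower()))
    return {
        expression: len(re.findall(r"\b" + re.escape(expression) + r"\b",
                                   normalized))
        for expression in expressions
    }


# Invented transcript fragment, for illustration only.
transcript = ("Well, I mean, you know, there are a bunch of reasons. "
              "You know, another thing that matters is cost.")
counts = count_expressions(transcript)
print(counts)
```

Dividing each count by the number of words in the transcript would make prompts of different lengths comparable.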
As far as the level of difficulty of the questions following the prompts is concerned, it is usually determined by the type of information the test-takers are requested to provide. In previous large-scale assessments of adults' and children's literacy (e.g. Kirsch & Mosenthal, 1995), researchers used a 5-point scale to score the difficulty of the information variable and found that questions asking for highly concrete information (e.g. to identify a person, animal, or thing) were the easiest to answer; hence, they were assigned the lowest value. Questions that required examinees to identify an unfamiliar term or phrase for which respondents had to give an interpretation or express an opinion were assigned the highest value, because they were judged to be the most abstract and difficult. Following the same 1 to 5 scale for evaluation of difficulty, it would be safe to say that the tasks are of the highest difficulty, since they require the examinees, for instance, to provide evidence that justifies a claim, to express an opinion reflecting the belief or perspective of a character in the prompts or of the examinee herself or himself, or to give an explanation consisting of an enumeration of causes or reasons associated with an identifiable effect, outcome, or condition.
In sum, communicative competence in oral academic language requires control of a wide range of phonological and syntactic features, vocabulary, and oral genres, and the knowledge of how to use them appropriately (Butler et al., 2000). At the same time, it is important to realize that success in spoken interaction is determined by at least three factors, i.e. the nature of the tasks the interaction involves, the conditions under which the participants are required to perform, and the resources individuals bring to the interaction (Butler et al., 2000).
In the new TOEFL iBT test, the examinees are asked to demonstrate their oral communication skills across a variety of academic genres, functions, and situations. The tasks focus on the middle to upper range of ESL/EFL proficiency and aim at simulating realistic communicative situations by including integrated tasks, for example, ones involving listening and speaking, or reading, listening and speaking. This portion of the test is considered to be a very important one for assessing test-takers' speaking proficiency because it is hardly possible to test speaking apart from the other skills. In general, test-takers are called upon to speak about topics they are somewhat knowledgeable about, but they will also be expected to talk about subjects they are just learning about. This would allow for their responses to be rated based on a number of features of performance, such as accomplishment of task (in terms of discursive requirements, coherence etc.), sufficiency of response in terms of length and complexity, comprehensibility (including control of phonological and prosodic features), adequacy of grammatical resources, range and precision of vocabulary, fluency, and cohesion (e.g. Bachman, Lynch & Mason 1995; Butler et al., 2000).
Finally, there is an expectation that the introduction of an oral communication component in the new TOEFL iBT exam will have a positive effect on the ESL/EFL teaching and learning community. Because the test uses constructed-response items, which are less likely to be coachable, learners will be encouraged to learn to communicate orally (not to learn a skill simply to do well on a test) and teachers will be encouraged to teach skills integratively.