Full citation: Clifford, R. (2016). A Rationale for Criterion-Referenced Proficiency Testing. Foreign Language Annals, 49(2), 224–234.
Abstract: This article summarizes some technical issues that add to the complexity of language testing. It focuses on the criterion-referenced nature of the ACTFL Proficiency Guidelines—Speaking; and it proposes a criterion-referenced interpretation of the ACTFL guidelines for reading and listening. It then demonstrates how using criterion-referenced testing and scoring enhances the accuracy of reading and listening proficiency ratings while also providing informative feedback to learners.
Full citation: Cox, T. L. (2017). Understanding Intermediate‐Level Speakers’ Strengths and Weaknesses: An Examination of OPIc Tests from Korean Learners of English. Foreign Language Annals, 50(1), 84-113.
Abstract: This study profiled Intermediate-level learners in terms of their linguistic characteristics and performance on different proficiency tasks. A stratified random sample of 300 Korean learners of English with holistic ratings of Intermediate Low (IL), Intermediate Mid (IM), and Intermediate High (IH) on Oral Proficiency Interviews-computerized (OPIcs)—100 at each level—were analyzed by trained ACTFL raters to determine what was needed for the learners to progress to the next higher sublevel. The findings indicate that while ILs minimally met all the linguistic characteristics required of the Intermediate level, they needed to improve in the quantity and quality of all the linguistic characteristics they employed and improve their mastery of the types and variety of questions they could use when performing Intermediate tasks to move to the IM sublevel. In contrast, IMs demonstrated a pattern of strength when completing Intermediate tasks, but to move to the IH sublevel they needed to improve their ability to perform all Advanced-level tasks, especially in terms of accuracy when using paragraph-length discourse. Similar to the IMs, for the IHs to move to the Advanced Low sublevel, they needed to improve their accuracy with paragraph-length discourse and expand their content mastery to beyond the autobiographical.
Full citation: Tigchelaar, M., Bowles, R. P., Winke, P., & Gass, S. (2017). Assessing the Validity of ACTFL Can‐Do Statements for Spoken Proficiency: A Rasch Analysis. Foreign Language Annals, 50(3), 584-600.
Abstract: The NCSSFL-ACTFL Can-Do Statements describe what language learners can do at the various ACTFL proficiency sublevels. Unlike the European equivalent of the Can-Do Statements (the Common European Framework of Reference for Languages), few researchers have assessed the construct validity of the NCSSFL-ACTFL statements. Concerns have included whether the difficulty levels of the skills described in the statements match the statements’ assigned proficiency levels and whether each statement accurately indicates the underlying construct: language proficiency on the ACTFL subscales. This study addressed those two concerns. Undergraduate Spanish learners at an American university (N=382) self-assessed their speaking proficiency by responding to a selection of 50 NCSSFL-ACTFL Can-Do Statements. A Rasch analysis revealed 15 misfitting items that did not fit the model. The suspected reasons for the misfit were that the statements were vague, described experiences that the learners may not have had, or assessed multiple skills in a single statement. However, 35 statements fit the model. Discussed are how the NCSSFL-ACTFL Can-Do Statements can be used to self-assess proficiency and how the statements should be assessed for content validity and psychometric value.
Full citation: Thompson, G. L., Cox, T. L., & Knapp, N. (2016). Comparing the OPI and the OPIc: The Effect of Test Method on Oral Proficiency Scores and Student Preference. Foreign Language Annals, 49(1), 75-92.
Abstract: While studies have been done to rate the validity and reliability of the Oral Proficiency Interview (OPI) and Oral Proficiency Interview–Computer (OPIc) independently, a limited amount of research has analyzed the interexam reliability of these tests, and studies have yet to be conducted comparing the results of Spanish language learners who take both exams. For this study, 154 Spanish language learners of various proficiency levels were divided into two groups and administered both the OPI and OPIc within a 2-week period using a counterbalanced design. In addition, study participants took both a pre- and postsurvey that gathered data about their language learning background, familiarity with the OPI and OPIc, preparation and test-taking strategies, and evaluations of each exam. The researchers found that 54.5% of the participants received the same rating on the OPI and OPIc, with 13.6% of examinees scoring higher on the OPI and 31.8% scoring higher on the OPIc. While the results found that students scored significantly better on the OPIc, the overall effect size was quite small. The authors also found that the overwhelming majority of the participants preferred the OPI to the OPIc. This research begins to fill important gaps and provides empirical data to examine the comparability of the Spanish OPI and OPIc.
Full citation: Swender, E., Martin, C. L., Rivera‐Martinez, M., & Kagan, O. E. (2014). Exploring Oral Proficiency Profiles of Heritage Speakers of Russian and Spanish. Foreign Language Annals, 47(3), 423-446.
Abstract: This article explores the linguistic profiles of heritage speakers of Russian and Spanish. Data from the 2009–2013 ACTFL‐UCLA NHLRC Heritage Language Project included biographical information as well as speech samples that were elicited using the ACTFL Oral Proficiency Interview–computer and were rated according to the ACTFL Proficiency Guidelines 2012–Speaking by certified testers. The goal of the study was to better understand the multiple linguistic, educational, and experiential factors that contributed to the speaking proficiency of these heritage speakers as well as how those features affected the tasks and contexts in which the speakers could more appropriately communicate in the language. The data illuminate the linguistic strengths and weaknesses of speakers within certain ranges and highlight those language features that prevented the participants from being rated at the next higher level. The authors discuss implications for teaching and learning and make recommendations for both heritage speakers and their instructors.
Full citation: Di Silvio, F., Donovan, A., & Malone, M. E. (2014). The Effect of Study Abroad Homestay Placements: Participant Perspectives and Oral Proficiency Gains. Foreign Language Annals, 47(1), 168-188.
Abstract: Although the study abroad homestay context is commonly considered the ideal environment for language learning, host‐student interactions may be limited. The present study explored how language development of students of Spanish, Mandarin, and Russian related to student and host family perspectives on the homestay experience. The study used pretest and posttest Simulated Oral Proficiency Interviews to investigate student oral proficiency gains and surveys to examine beliefs of these students (n=152) and their hosts (n=87). Students and families were generally positive about the homestay, with significant variation based on language. A significant relationship was found between students’ oral proficiency gains and their being glad to have lived with a host family. Significant correlations were also found between students’ language learning satisfaction and their satisfaction with the homestay.
Full citation: Kissau, S. (2014). The Impact of the Oral Proficiency Interview on One Foreign Language Teacher Education Program. Foreign Language Annals, 47(3), 527-545.
Abstract: The Oral Proficiency Interview (OPI) has been increasingly used in academia. However, while multiple studies have documented the growth in OPI implementation across the United States and the proficiency rates of its completers, few have focused specifically on foreign language teacher candidates, and even fewer have investigated the impact that this proficiency assessment may have on language teacher training programs. To better understand the impact of the OPI on foreign language teacher education programs and help guide programmatic decision making, a case study was conducted of one such program that recently implemented the OPI as part of its licensure requirements. The results confirmed earlier research with respect to expected proficiency outcomes of foreign language teacher candidates. The results also suggested that the OPI requirement did not negatively affect program enrollment, nor did teacher trainees negatively perceive the OPI requirement. Finally, the study provided evidence of the positive impact the OPI may have on a foreign language teacher education program. Recommended practices for implementing the OPI in teacher training programs and ways to support foreign language teacher candidates who must complete the assessment are discussed.
Full citation: Glisan, E. W., Swender, E., & Surface, E. A. (2013). Oral Proficiency Standards and Foreign Language Teacher Candidates: Current Findings and Future Research Directions. Foreign Language Annals, 46(2), 264-289.
Abstract: The renewed national focus on teacher quality and effectiveness has resulted in more rigorous standards that describe the knowledge and skills required of teacher candidates across all disciplines. In the area of foreign languages, three sets of professional standards address the oral proficiency of teachers in the target languages they teach across the career continuum. For teacher candidates, the ACTFL/NCATE Program Standards for the Preparation of Foreign Language Teachers (2002) establish minimum oral proficiency levels based on the ACTFL Proficiency Guidelines—Speaking (2012). Utilizing ACTFL Oral Proficiency Interview (OPI) data, this study examines to what extent candidates are attaining the ACTFL/NCATE Oral Proficiency Standard of Advanced Low in most languages or Intermediate High in Arabic, Chinese, Japanese, and Korean. Findings indicate that 54.8% of candidates attained the required standard between 2006 and 2012 and that significant differences emerged for language, year tested, and university program results. Further research that takes into account additional contextual information about candidates and programs will inform continuing professional dialogue about the oral proficiency of teacher candidates entering the profession.
Full citation: Dierdorff, E. C., & Surface, E. A. (2003). Reliability and the ACTFL Oral Proficiency Interview: Reporting Indices of Interrater Consistency and Agreement for 19 Languages. Foreign Language Annals, 36(4), 507-519.
Abstract: The reliability of the ACTFL Oral Proficiency Interview (OPI) has not been reported since ACTFL revised its speaking proficiency guidelines in 1999. Reliability data for assessments should be reported periodically to provide users with enough information to evaluate the psychometric characteristics of the assessment. This study provided the most comprehensive analysis of ACTFL OPI reliability to date, reporting interrater consistency and agreement data for 19 different languages. Overall, the interrater reliability of the ACTFL OPI was found to be very high. These results demonstrate the importance of using an OPI assessment program that has a well-designed interview process, a well-articulated set of criteria for proficiency determination, a solid rater training program, and an experienced cadre of testers. Based on the data reported, educators and employers who use the ACTFL OPI can expect reliable results and use the scores generated from the testing process with increased confidence. Recommendations for future research are discussed.
Full citation: Thompson, I. (1995). A Study of Interrater Reliability of the ACTFL Oral Proficiency Interview in Five European Languages: Data from ESL, French, German, Russian, and Spanish. Foreign Language Annals, 28(3), 407-422.
Abstract: The widespread use of the Oral Proficiency Interview (OPI) throughout the government, the academic community, and increasingly the business world, calls for an extensive program of research concerning theoretical and practical issues associated with the assessment of speaking proficiency in general, and the use of the OPI in particular. The present study, based on 795 double‐rated oral proficiency interviews, was designed to consider the following questions: (1) What is the interrater reliability of ACTFL‐certified testers in five European languages: ESL, French, German, Russian, and Spanish? (2) What is the relationship between interviewer‐assigned ratings and second ratings based on audio replay of the interviews? (3) Does interrater reliability vary as a function of proficiency level? (4) Do different languages exhibit different patterns of interrater agreement across levels? (5) Are interrater disagreements confined mostly to the same main proficiency level? With regard to the above questions, results show: (1) Interrater reliability for all languages in this study was significant both when Pearson's r and Cohen's modified kappa were used. (2) When second‐raters disagreed with interviewer‐assigned ratings, they were three times as likely to assign scores that were lower rather than higher. (3) Some levels of performance are harder to rate than others. (4) The five languages exhibited different patterns of interrater agreement across levels. (5) Crossing of major borders was very frequent, and was dependent on the proficiency level. As a result of these findings, several practical steps are suggested in order to improve interrater reliability.
Full citation: Dandonoli, P. & Henning, G. (1990). An Investigation of the Construct Validity of the ACTFL Proficiency Guidelines and Oral Interview Procedure. Foreign Language Annals, 23(1), 11-22.
Abstract: This article reports on the results of research conducted by ACTFL on the construct validity of the ACTFL Proficiency Guidelines and oral interview procedure. A multitrait-multimethod validation study formed the basis of the research design and analysis, which included tests of speaking, writing, listening, and reading in French and English as a Second Language. Results from Rasch analyses are also reported. In general, the results provide strong support for the use of the Guidelines as a foundation for the development of proficiency tests and for the reliability and validity of the Oral Proficiency Interview. The paper includes a detailed description of the research methodology, instrumentation, data analyses, and results. A discussion of the results and suggestions for further research are also included.
Additional Bibliography on Oral Proficiency:
Oral Proficiency Testing Bibliography (https://aelrc.georgetown.edu/)
Full citation: Bernhardt, E., Molitoris, J., Romeo, K., Lin, N., & Valderrama, P. (2015). Designing and Sustaining a Foreign Language Writing Proficiency Assessment Program at the Postsecondary Level. Foreign Language Annals, 48(3), 329-349.
Abstract: Writing in postsecondary foreign language contexts in North America has received far less attention in the curriculum than the development of oral proficiency. This article describes one institution’s process of confronting the challenges not only of recognizing the contribution of writing to students’ overall linguistic development, but also of implementing a program-wide process of assessing writing proficiency. The article reports writing proficiency ratings that were collected over a 5-year period for more than 4,000 learners in 10 languages, poses questions regarding the proficiency levels that postsecondary learners achieved across 2 years of foreign language instruction, and relates writing proficiency scores to Simulated Oral Proficiency Interview ratings for a subset of students. The article also articulates the crucial relationship between professional development and writing as well as the role of technology in collecting and assessing writing samples.
Full citation: Tschirner, E. (2016). Listening and Reading Proficiency Levels of College Students. Foreign Language Annals, 49(2), 201-223.
Abstract: This article examines listening and reading proficiency levels of U.S. college foreign language students at major milestones throughout their undergraduate career. Data were collected from more than 3,000 participants studying seven languages at 21 universities and colleges across the United States. The results show that while listening proficiency appears to develop more slowly than other domains, Advanced levels of reading proficiency appear to be attainable for college majors at graduation. The article examines the relationship between listening and reading proficiency and suggests reasons for the apparent disconnect between listening and reading, particularly for some languages and at lower proficiency levels.
Full citation: Cox, T. L., & Clifford, R. (2014). Empirical Validation of Listening Proficiency Guidelines. Foreign Language Annals, 47(3), 379-403.
Abstract: Because listening has received little attention and the validation of ability scales describing multidimensional skills is always challenging, this study applied a multistage, criterion‐referenced approach that used a framework of aligned audio passages and listening tasks to explore the validity of the ACTFL and related listening proficiency guidelines. Rasch measurement and statistical analyses of data generated in seven separate language studies resulted in significant differences in listening difficulty between the proficiency levels tested and confirmed the validity of the ACTFL proficiency assessment for listening.
Full citation: Hacking, J. F., & Tschirner, E. (2017). The Contribution of Vocabulary Knowledge to Reading Proficiency: The Case of College Russian. Foreign Language Annals, 50(3), 500-518.
Abstract: Literacy development in a second language (L2) is a key goal of college foreign language study; language programs aspire to graduate students with strong L2 reading ability. Research has shown a strong correlation between L2 vocabulary knowledge and L2 reading proficiency; however, much of this research has focused on English as a second language. Research among other language learning populations is needed to offer empirically grounded suggestions on how best to achieve high levels of L2 literacy. This article presents data on the reading proficiency and vocabulary knowledge of 48 native American-English-speaking college-level learners of Russian to determine whether there are identifiable lexical thresholds associated with moving from one level of reading proficiency to the next. Participants completed the ACTFL Reading Proficiency Test rated according to the ACTFL Proficiency Guidelines 2012—Reading and the Russian Vocabulary Levels Test, which measured knowledge of the 5,000 most frequent words in Russian. Results showed that there are statistically significant lexical minimums associated with different levels of reading proficiency. These findings suggest the utility of similarly designed studies for other languages and are discussed in terms of implications for the role of developing vocabulary knowledge in an undergraduate curriculum.
Full citation: Clifford, R. & Cox, T. L. (2013). Empirical validation of reading proficiency guidelines. Foreign Language Annals, 46(1), 45-61.
Abstract: The validation of ability scales describing multidimensional skills is always challenging, but not impossible. This study applies a multistage, criterion‐referenced approach that uses a framework of aligned texts and reading tasks to explore the validity of the ACTFL and related reading proficiency guidelines. Rasch measurement and statistical analyses of data generated in three separate language studies confirm a significant difference in reading difficulty between the proficiency levels tested.
Full citation: Edwards, A. L. (1996). Reading Proficiency Assessment and the ILR/ACTFL Text Typology: A Reevaluation. The Modern Language Journal, 80(3), 350-361.
Abstract: Two previous investigations into the validity of the Interagency Language Roundtable (ILR)/American Council on the Teaching of Foreign Languages (ACTFL) text typology concluded that the model did not accurately predict test performance for foreign language (FL) readers. However, these studies suffered from serious flaws in design and implementation that may have led to erroneous conclusions. The present study considered the validity of the pragmatic approach to text difficulty that was put forward by Child in 1987—the text typology underlying the ILR/ACTFL proficiency guidelines. The following question guided this research: Does the Child discourse-type hierarchy predict text difficulty for L2 readers? Test data were collected from 62 U.S. Government employees having some previously demonstrated French proficiency. Nine authentic French texts and a combination of testing methods were employed. The results suggested that the Child text hierarchy may indeed provide a sound basis for the development of FL reading tests when it is applied by trained raters and when such tests include an adequate sample of passages at each level to be tested.
Edited Volumes and Reviews
ACTFL and the Common European Framework of Reference for Languages (CEFR)
Full citation: Tschirner, E. (2012). Aligning Frameworks of Reference in Language Testing: The ACTFL Proficiency Guidelines and the Common European Framework of Reference for Languages. Stauffenburg Verlag.
Full citation: Papageorgiou, S. (2014). Book review: Aligning Frameworks of Reference in Language Testing: The ACTFL Proficiency Guidelines and the Common European Framework of Reference for Languages.
Full citation: Piccardo, E. (2014). Aligning Frameworks of Reference in Language Testing: The ACTFL Proficiency Guidelines and the Common European Framework of Reference for Languages by Erwin Tschirner (ed.). The Canadian Modern Language Review/La revue canadienne des langues vivantes, 70(2), 268-271.