Research Synthesis
CALL Comparison Studies by Language Skills/Knowledge

These are short syntheses of the comparison studies for seven skill/knowledge areas (grammar, pronunciation, reading, vocabulary, writing, communication, and integrated skills), based on the references in the database.

CALL Comparison Studies on Second/Foreign Language Grammar (1985-2006)
Seven studies compared the effectiveness of grammar teaching using CALL and other methods (McEnery, 1995; Nagata, 1996; Niwa & Aoi, 1990; Raschio, 1990; Teichert, 1985; Torlakovic & Deugo, 2004; Vinther, 2004). Four studies looked at learners of English as a foreign language (McEnery, Niwa & Aoi, Vinther) or second language (Torlakovic & Deugo), one at learners of Japanese (Nagata), and two at learners of Spanish (Raschio, Teichert). The number of participants ranged from thirty or fewer (17-McEnery, 21-Torlakovic & Deugo, 26-Nagata, 30-Niwa & Aoi) to more than forty (45-Teichert, 62-Raschio, and 91-Vinther). All studies investigated undergraduate students in the college setting, where participants interacted with a CALL program usually developed by the author of the study. For example, the programs contained custom-made parsers (Nagata, Vinther) or grammar tutorials (Raschio). While the proficiency level differed among participants, all seven studies looked at a component of grammatical accuracy (e.g., use of adverbs, pronouns, particles, infinitives, or gerunds) at the sentence level.

The majority of studies used random assignment of participants, while non-random assignment was employed by Niwa & Aoi and Teichert. The participants interacted with a CALL program for periods ranging from 2 to 10 weeks, for at least 1 hour/class period a week, except for the subjects in Niwa & Aoi’s study, who interacted with the program only three times in six months. The participants were divided into CALL (experimental) and traditional (non-CALL) groups.

The participants’ performance on grammar tasks was higher for the CALL groups in all but two studies, which found either no significant difference between groups (Raschio) or mixed results (better performance for the CALL group on the last two out of three experiments in Niwa & Aoi). The results of Raschio and Niwa & Aoi could be explained by the setup of each study. For example, the choice of measurement tools might have influenced the study outcomes in Raschio. In the case of Niwa & Aoi, the research design did not take into account the participants’ need for computer training, which may explain why the participants in the CAI group scored lower than non-CAI participants in the first experiment. After training was administered before the second experiment, the CAI group outperformed the non-CAI group on subsequent tasks.

Nagata, Niwa & Aoi, and Torlakovic & Deugo connected better student performance to the immediate feedback that CALL programs allowed for. Several studies also reported positive participant attitudes towards the CALL task (McEnery, Nagata, Niwa & Aoi, Raschio). Finally, two of the studies (Raschio, Vinther) used mixed methods and incorporated qualitative measures, which helped the authors obtain the participants’ point of view as well as suggestions for improving the CALL program used in the study.


CALL Comparison Studies on Second/Foreign Language Pronunciation (1992-1999)
The database contains three studies on CALL pronunciation training (Ostrom, 1997; Stenson et al, 1992; Taniguchi & Abberton, 1999). Stenson et al had 36 participants, while the other two studies examined smaller groups (12-Taniguchi & Abberton, 13-Ostrom) in the college setting. These learners used commercially available software to study English pronunciation. Two of the studies were done with speakers of Asian languages (Thai and Japanese).

The students were not assigned randomly except in Taniguchi & Abberton. The pronunciation training lasted 10 minutes (Ostrom) or 60 minutes per week (Taniguchi & Abberton). The complete training in Stenson et al was 80 minutes per student per semester. The CALL programs displayed visual feedback on the production of suprasegmentals (intonation and pitch).

Overall, the studies reported improvements in all groups that received pronunciation training. No statistically significant difference was found between training given by a human instructor with a computer and training given without one (Ostrom, Stenson et al). Taniguchi & Abberton found better performance of the CALL group in terms of nativeness of pronunciation. These findings suggest that CALL pronunciation training is as effective as non-computer training, if not better.

CALL Comparison Studies on Second/Foreign Language Reading (1987-2004)
All reading comparison studies (Chiappone, 2004; Dreyer & Nel, 2003; Hong, 1997; Kleinmann, 1987; Leffa, 1992; Son, 2003; Stoehr, 2000) looked at reading comprehension of college students except for Chiappone’s study, where the participants were second-grade elementary students. The number of participants in these seven studies ranged from 20 in Hong and in Leffa to 145 in Stoehr. The subjects in most of the studies were learning English, except for three studies with learners of Chinese (Hong), Korean (Son) and German (Stoehr).

All of the studies examined reading comprehension after participants used computer reading programs (Chiappone, Hong, Kleinmann, Son, Stoehr) or online texts (Dreyer & Nel). To help learners comprehend materials, reading programs contained hypertext, online glossaries, paraphrases of sentences, definitions of words, or grammatical explanations. Participants, who were randomly (4 studies) or non-randomly (3 studies) selected, received CALL instruction in three formats: one-shot experiments lasting less than two hours (Chiappone, Hong, Stoehr), CALL instruction in addition to their traditional classes (Kleinmann, Son), or a blended learning format over the course of the semester (Dreyer & Nel). Non-CALL groups used paper-based reading materials, usually with a paper glossary or a dictionary.

The performance of non-CALL groups was either significantly lower than that of CALL groups (Dreyer & Nel, Leffa, Stoehr) or not significantly different (Chiappone, Kleinmann). The finding of no significant difference could be attributed to the use of drill-and-practice programs, which did not allow for communicative interaction or for practicing different types of reading strategies (Kleinmann). Since reading comparison research has looked at the effectiveness of CALL software vs. paper-based materials, as the majority of the studies synthesized here show, future studies could continue to explore student reading performance on unsimplified online texts that allow for the use of help tools which make those texts comprehensible (multimedia dictionaries, glossaries, and different types of grammar and cultural explanations).


CALL Comparison Studies on Second/Foreign Language Vocabulary (1985-2004)
The following eleven vocabulary comparison studies are in the database: Aust et al, 1993; Bowles, 2004; Duquette et al, 1998; de la Fuente, 2003; Hamerstrom et al, 1985; Kang, 1992, 1995; Kanselaar, 1993; McCreesh, 1986; Terhune & Moore, 1991; and Tozcu & Coady, 2004. The studies can be divided into three groups based on the language learners they investigate: learners of English (Kang 1992, 1995, Kanselaar, McCreesh, Terhune & Moore, Tozcu & Coady), learners of Spanish (Aust et al, de la Fuente, Bowles), and learners of French (Duquette et al, Hamerstrom et al). The learners in these studies are at different levels of language proficiency, and their numbers vary from only 9 (McCreesh) and around 30 (Hamerstrom et al; Kang, 1992; Kanselaar) to more than 60 (Terhune & Moore; Duquette et al; Aust et al; Kang, 1995). These studies examine learners in a number of settings: kindergarten (Kang, 1992), primary (Kang, 1995), secondary (Kanselaar, Hamerstrom et al), post-secondary (Duquette et al) and college (Aust et al, Bowles, de la Fuente, Terhune & Moore, Tozcu & Coady).

The majority of studies use CALL vocabulary programs created by the researchers, in the form of HyperCard applications (Aust et al, Kang 1992, 1995, Terhune & Moore), tutorials (Kanselaar, Tozcu & Coady), drill-and-practice (McCreesh) and multimedia applications (Duquette et al). These CALL programs were used just for the purpose of research without any integration into regular instruction (5 studies), in addition to regular instruction (4 studies), or as an integral part of instruction (1 study). The groups of learners who did not use these vocabulary programs studied the same texts and words on paper, in their workbooks, or using paper dictionaries. In the majority of studies, the participants were randomly assigned to their groups (7 studies), while in three studies they were non-randomly assigned (Hamerstrom, Kanselaar, McCreesh), paired on their proficiency level (Kanselaar) or exposed to both CALL and non-CALL instruction (McCreesh).

Overall, the performance of CALL groups on vocabulary knowledge, recognition and production tasks was not significantly different from the performance of non-CALL groups in Aust et al, Bowles, Duquette et al, de la Fuente, Hamerstrom et al, Kang (1995), and Kanselaar. Better performance of the CALL group was found by Terhune & Moore, but these authors did not report statistics. The positive difference favoring the CALL group in this study can be attributed to the CALL pedagogy that informed the design of the program, which presented the words in the context of authentic newspaper articles while providing visual and aural annotations. Only Tozcu & Coady found statistically significantly higher performance of the CALL group, which used the program as a part of blended learning instruction, while McCreesh found that the non-CALL group outperformed the CALL group on learning phrasal verbs, a finding contrary to those of the other studies. A possible explanation for this result lies in the small number of participants (only 9), the short time they used the program (3 hours over 4 weeks) and the participants’ lack of familiarity with computers.

In sum, it appears that CALL vocabulary instruction may be as effective as traditional instruction but that the many pedagogical factors in the design and use of vocabulary programs make it difficult to generalize. Future research should investigate how to fully integrate CALL vocabulary instruction into the classroom.


CALL Comparison Studies on Second/Foreign Language Writing (1992-2005)
Writing comparison studies represent the largest group with nineteen studies: Al-Jarf, 2002; Bogard, 1999; Braine, 1997; Brickman, 2005; Cahill & Catanzaro, 1997; Chen, 2005; Chuo, 2004; Felix & Lawson, 1996; Florez-Estrada, 1995; Ghaleb, 1993; Gonzalez Mendez, 2005; Ittzes, 1997; Lam & Pennington, 1995; Levine et al, 1999; Liou, 1997; Liou et al, 1992; Odenthal, 1992; Spelman, 2002; Sullivan & Pratt, 1996. Fourteen of these 19 studies investigate learners of English in both the ESL (8 studies) and EFL (6 studies) settings. The other 5 studies (Bogard, Cahill & Catanzaro, Felix & Lawson, Florez-Estrada, Ittzes) look at learners of German and Spanish. The number of participants varied greatly, ranging from fewer than twenty (Felix & Lawson, Lam & Pennington) to more than 100 (Al-Jarf, Chuo, Gonzalez Mendez, Odenthal). Sixteen studies investigated undergraduate university students and only 3 looked at high school students (Bogard, Lam & Pennington, Odenthal).

The majority of authors used intact classes or groups without random assignment (15 studies), while only 4 studies randomized participants (Bogard, Felix & Lawson, Liou 1997, Lam & Pennington). The studies covered a wide range of computer technologies, from computer applications such as word processors (Word, ClarisWorks) and browsers (Netscape) that are not designed specifically for language learning to CALL writing programs (Daedalus, Timbuktu), course management systems (Blackboard, WebCT), CMC programs (InterChange) and the web (web quest, web texts). Similarly, CALL was administered in a variety of forms: as a class component (Al-Jarf, Bogard, Felix & Lawson, Ittzes, Liou 1997, Liou et al 1992, Spelman), blended into the regular class (Braine, Chuo, Florez-Estrada, Ghaleb, Lam & Pennington, Sullivan & Pratt), or as a stand-alone course (Brickman, Cahill & Catanzaro, Levine et al, Odenthal). Despite such a variety of technologies and their applications, the majority of the studies looked at writing quality (Al-Jarf, Bogard, Braine, Brickman, Cahill & Catanzaro, Florez-Estrada, Ghaleb, Ittzes, Lam & Pennington, Odenthal, Sullivan & Pratt), which was sometimes measured through grammatical and lexical accuracy. The other common variable was student attitudes towards the CALL aspect and writing instruction (Al-Jarf, Chuo, Felix & Lawson, Ittzes, Levine et al, Liou 1997, Odenthal, Spelman, Sullivan & Pratt).

Overall, as many studies found significantly better performance of CALL groups as found no statistically significant difference between groups. To determine more precisely the factors that cause groups to perform the same or differently, a more detailed look into the context of each individual study is necessary.


CALL Comparison Studies on Communication in Second/Foreign Language (1995-2006)

There are fifteen studies in the database on communication in a second/foreign language: Abrams, 2001; Bearden, 2004; Bohlke, 2003; Colburn, 2002; Coniam & Wong, 2004; Fitze, 2006; Fernandez-Garcia & Arbelaiz, 2003; Ibarz & Monaghan, 2000; Kern, 1995; Lai & Zhao, 2006; Patterson, 2001; Payne & Whitney, 2002; Salaberry, 2000; Yildiz, 2004; Vandergriff, 2006. As can be seen, most of the studies were conducted after 2000, a period of rapid development of synchronous computer-mediated communication over the Internet. Also, unlike comparison studies covering other skills, most of the communication studies have a second-language acquisition aspect and investigate interaction, focus on form, noticing, output, and task type. Except for Coniam & Wong, who looked at 7th through 10th grade students, all authors examined higher education students: learners of English (Yildiz, Lai & Zhao, Fitze), Spanish (Bearden, Fernandez-Garcia & Arbelaiz, Ibarz & Monaghan, Patterson, Payne & Whitney, Salaberry), German (Abrams, Bohlke, Vandergriff), and French (Colburn, Kern). The number of participants varied from 4-Salaberry and 5-Yildiz to 46-Abrams and 58-Payne & Whitney.

All but two studies (Ibarz & Monaghan, Yildiz) investigated synchronous written communication through freeware chat programs such as ICQ (Coniam & Wong), IRC (Colburn, Bearden), OTChat (Bohlke), Chatnet (Fernandez-Garcia & Arbelaiz), and YahooMessenger (Lai & Zhao), as part of course management systems such as WebCT (Fitze), or through other CALL programs such as Daedalus (Kern, Patterson, Vandergriff). Ibarz & Monaghan examined asynchronous written communication through e-mail, while Yildiz looked at this type of communication within a course management system. The assignment of participants was always non-random, and many studies used a within-subjects design so that students performed both CALL and non-CALL tasks. The CMC CALL tasks subjects worked on involved preparation for in-class discussions (Abrams, Kern), preparation for assignments (Abrams, Fitze) and chatting about the target culture (Bohlke, Payne & Whitney). Moreover, participants worked on information gap and information exchange tasks (Bearden, Lai & Zhao) as well as consensus-building tasks (Vandergriff), which were chosen to encourage participation, especially since many of the studies looked at the amount (Bearden, Bohlke, Kern, Patterson) and distribution of participation (Yildiz, Lai & Zhao) as well as participant roles (Abrams).

Most of the studies (9) reported differences between CALL and non-CALL groups on quantity of production (Bearden, Kern, Patterson), participant roles (Abrams), distribution of participation (Bohlke, Fitze), and noticing (Lai & Zhao), but only two found statistically significant differences between groups (Bearden, Payne & Whitney). The small number of studies with significant results could be partly due to the fact that some studies did not report statistics while some used mixed methods. A smaller number of studies found no difference between groups. For example, Coniam & Wong found no difference in the improvement of grammatical accuracy through online chat, and Vandergriff found no statistically significant difference in the use of reception strategies between groups. In conclusion, it appears that computer-mediated communication has features different from those of face-to-face communication. Future studies should explore how those features can best be used to promote language learning.

CALL Comparison Studies on Integrated Second/Foreign Language Skills (1984-2006)
Eighteen studies covering integrated skills represent one of the largest groups of studies in the database: Adair-Hauck et al, 2000; Al-Juhani, 1991; Cartez-Enriquez et al, 2004; Chenoweth & Murday, 2003; Chenoweth et al, 2006; Echavez-Solano, 2003; Garcia & Arias, 2000; Green & Youngs, 2001; Kettemann, 1995; Kim, 1993; King, 1985; Klassen & Milton, 1999; Kunz, 1997; Mellgren, 1984; Petersen, 1990; Scida & Saury, 2006; Smith, 1990; Troia, 2004. A study was placed in this group if it investigated three or more language skills. These studies generally involved large numbers of participants, with the smallest study having only twenty participants (Chenoweth & Murday) and quite a few having over a hundred (Kunz, Echavez-Solano, Smith). King, Chenoweth et al, and Kettemann, with 235, 365 and 527 participants respectively, have the largest numbers of participants of all the studies in the database because they followed several intact classes over several semesters (King, Chenoweth et al) or were conducted in a number of public secondary schools in Austria (Kettemann). The studies deal with students of English as a second language (Kettemann, Kim, King, Petersen, Garcia & Arias, Troia), English as a foreign language (Al-Juhani, Cartez-Enriquez et al, Klassen & Milton), French (Adair-Hauck et al, Chenoweth & Murday, Chenoweth et al, Green & Youngs), German (Kunz, Green & Youngs), and Spanish (Echavez-Solano, Mellgren, Scida & Saury, Chenoweth et al, Smith). As is the case with other areas, the most common setting is college, followed by secondary (4 studies), primary (1 study), kindergarten, primary and secondary (1 study), and adult literacy (1 study).

The technology used was predominantly CALL software (both commercially available and custom-made), web (electronic texts and resources) and course management systems such as WebCT. The studies using course management systems to deliver language classes are blended learning studies in which students meet face-to-face with the instructor but also complete an on-line course component (Chenoweth & Murday, Chenoweth et al, Echavez-Solano, Scida & Saury). The variables examined in this group of studies were participants’ performance on tests of language skills/knowledge (listening, reading, writing, vocabulary, grammar, speaking) in addition to student attitudes towards CALL instruction, cultural knowledge, and cognitive styles and strategies among others.

The findings show a lack of significant difference between the performance of CALL and non-CALL groups in 9 studies (Adair-Hauck et al, Chenoweth & Murday, Chenoweth et al, Echavez-Solano, Green & Youngs, Kim, Klassen & Milton, Mellgren, Troia) and significantly better performance for the experimental group in 5 studies (Al-Juhani, Kettemann, Kunz, Petersen, Smith). Overall, it can be concluded that in the majority of cases both CALL and non-CALL groups perform equally well and that CALL instruction does not disadvantage students relative to their counterparts in face-to-face classrooms when it comes to the development of language skills/knowledge.