Input vs. intake in formative assessment and explicit grammar teaching

The relevance of explicit grammar instruction in foreign language classrooms has been widely discussed, but there is no consensus regarding the best approach or how much time should be spent on explicit grammar teaching. This paper presents the results of three studies which focus on students' knowledge of explicit grammar, their understanding of metalinguistic terminology, and their ability to correct agreement errors in their texts in response to formative assessment. The first study tested the effect of different types of formative feedback on the improvement in agreement marking accuracy. As no statistically significant differences were found, two follow-up case studies were conducted to investigate possible causes of the observed lack of effect. The first case study tested the effect of formative assessment in a process writing task, but only a limited effect was found. The second case study tested explicit grammar knowledge and understanding of metalinguistic terminology in an inverted classroom setting. The results suggested that the understanding of metalinguistic terminology was rather low and that explicit grammar knowledge varied considerably. The students found the task difficult. The Norwegian English language curriculum gives teachers the freedom to choose their methods and only sets requirements for student outcomes. Together, these three studies show that there is a need for a discussion of the relevance, methods, and extent of explicit grammar teaching and the use of metalinguistic terminology in formative assessment in English language classrooms in Norway.


Introduction
The recommended approaches to instructed second language acquisition have undergone a development from heavily drill-based to heavily communication-based over the past few decades (Burner, Carlsen, & Kverndokken, 2019; Drew & Sørheim, 2016; Munden & Sandhaug, 2017).
Nevertheless, despite the focus on communication in second and foreign language classrooms, some formal grammar instruction is usually present. The concrete form and frequency of grammar instruction are often under-communicated in national curricula, which results in great variation between and within institutions. While teacher autonomy to choose the methods and tools in the classroom is extremely important, it is also essential to scrutinize the results of various approaches to grammar instruction and feedback practices in order to provide research-based advice to teachers and teacher educators.
It is often argued by the proponents of communicative approaches to language instruction that communicative competence should be a more important aim in language classrooms than explicit grammar knowledge. While communication is indisputably the core function of language, second and foreign language students often need a certain level of metalinguistic knowledge and explicit grammar understanding in order to become independent language users.
One of the areas where this knowledge is essential is in receiving feedback during the learning process. Teacher feedback often focuses on grammar and vocabulary corrections, and it is usually formulated using metalinguistic terminology. If such feedback is not comprehensible for the students, it cannot have an effect on their future linguistic behavior (Burner, 2016a). This paper examines the role of explicit grammar knowledge and metalinguistic understanding in formative feedback practices. Agreement errors were chosen as the focus of the feedback because they are very frequent in student writing and they are often labeled as feedback resistant by the teachers. Even though subject-verb agreement is communicatively redundant in English (Trudgill, 2002), correct agreement use often serves as a marker of prestige or an in-group marker of the educated and well-spoken (Widdowson, 1994), and it is therefore important that the students are able to use the appropriate verb forms. Simultaneously, because these errors rarely impair communication, the feedback must be fairly explicit, i.e. the teacher must use metalinguistic terminology to explain what is wrong on the level of form, not meaning.
In the first part of the paper, the requirements of the Norwegian English language curriculum are briefly reviewed followed by an overview of the types of formative assessment commonly used in language learning. Three feedback approaches on written work are discussed in more detail and applied by three different teachers teaching English in the first year of high school. The impact of the feedback type on the subject-verb agreement marking accuracy in the writing of Norwegian high school students is evaluated in the second part of the paper. Based on the results, metalinguistic knowledge of the students is further examined in two follow-up case studies focusing on process writing and on using inverted classroom in agreement instruction. It is concluded that in order to achieve higher efficiency of formative feedback in second and foreign language instruction, metalinguistic knowledge should be targeted more in the language learning process.

The Norwegian English language curriculum
The Norwegian English language curriculum has generally followed the world-wide trends in foreign language instruction. Since English was introduced as a compulsory subject in schools in the 1959 curriculum, the main focus of the instruction has gradually shifted from heavily corrective grammar-translation and audio-lingual methods to communicative approaches with a focus on the development of the learners' own learning strategies (cf. Drew & Sørheim, 2016). Communicative approaches to language instruction are expected to result in higher communicative competence and fluency among the learners. However, several studies have shown that the popularity of communicative teaching also leads to lower accuracy in language production among the students (Ellis, 2011; Granger & Tribble, 1998; Loewen, 2015). It can be argued that the goal of foreign language instruction is to enable the students to communicate with other speakers of the language and not necessarily to aim at the native speaker norm. Nevertheless, the instructions for the evaluators of the final examinations in the English language in Norway specify three equally weighted areas of evaluation: content, structure, and language (Norwegian Department of Education, 2012). This suggests that communicative competence without accuracy in the formal aspects of the language is not enough for the students to achieve good results in the exam.
It is also noteworthy that despite the focus on accuracy in the evaluation routines, all reference to explicit grammar or metalinguistic knowledge has gradually disappeared from the competence aims of the Norwegian English language curriculum. While the curriculum published in 1987, "Mønsterplan 87", recognized the role of "increased understanding of grammar and the formal foundation it provides" as useful and necessary (Kirke- og undervisningsdepartementet, 1987, pp. 205, 208, my translation), this reference to metalinguistic knowledge disappeared in the curriculum published in 1997, "Læreplan 97". Similarly, the current curriculum, "Knowledge Promotion" (LK06), originally published in 2006, required the students to be able to "use basic terminology to describe grammar and text structure", but in the 2013 update of the curriculum, this sentence was removed (Norwegian Department of Education, 2006, 2013). It thus seems that the references to explicit knowledge of grammar and metalinguistic knowledge in general have been replaced by purely communicative aims in the Norwegian English language curriculum during the last three decades, despite the requirement for language accuracy in the examiner instructions. This can, in the worst case, lead to a misalignment between the curriculum and the evaluation criteria. These developments are reflected in the new curriculum, "Fagfornyelsen 2020" (Norwegian Department of Education, 2019), which marks a return to explicit knowledge of grammar when it requires the students to "use the knowledge of word classes and sentence structure in their work with oral and written texts" (my translation). However, the students tested in the studies described in this paper were all taught under the 2013 version of the curriculum.
It is also noteworthy that the current curriculum (LK06) requires the students to be independent learners and achieve a high level of control of their own learning strategies and processes. The curriculum calls for an ability to "evaluate and use different situations, working methods and learning strategies to further develop one's English-language skills; evaluate own progress in learning English; evaluate different digital resources and other aids critically and independently, and use them in own language learning" (Norwegian Department of Education, 2013, p. 10). However, it is difficult to achieve these goals without a certain level of abstract metalinguistic knowledge, such as familiarity with the terminology used in dictionaries and grammar books and some knowledge of the concepts it refers to. Furthermore, the teachers are required to utilize formative assessment to promote learning among the students. According to the Elementary and Comprehensive Education Act, the students "need to understand what they are supposed to learn and what is expected of them," "they should receive feedback about the quality of their work," and "receive advice on how to improve" (Opplæringslova, 1998, §3-1, §3-11, my translation). In order for the formative assessment to have an effect, the feedback must be comprehensible for the students (Burner, 2016b). If the students do not have some degree of explicit grammar knowledge, or if they are not familiar with the metalinguistic terminology, they cannot use the teacher's comments to improve their language. It is thus possible that the teachers are expected to infer a requirement for explicit grammar teaching and for teaching metalinguistic terminology from these provisions. However, since neither is explicitly mentioned in the curriculum, it is not certain that these requirements are understood as intended.

Writing and feedback on written work
Even though written communication is only one of the four focus areas of the English language curriculum, research shows that substantial classroom time is spent doing written tasks in the language classes in Norway (Burner, 2016a; Sandvik, 2012). However, despite the time invested in writing and written activities in the classroom and home assignments, a holistic approach to the process of developing and improving one's own texts is rarely present. It is possible that second and foreign language teachers assume that general writing skills transfer to some extent from first language instruction and thus choose not to spend time on them. Instead, writing is often seen as a way to learn grammar or improve vocabulary (Sandvik, 2012, p. 156).
Another issue is that the students often express that they do not receive sufficient supervision during the writing process (Burner, 2016a, p. 627). Havnes et al. investigated feedback practices in different subjects in six Norwegian high schools and found that there are only "sporadic and individual initiatives in actively attending to feedback" (Havnes, Smith, Dysthe, & Ludvigsen, 2012, p. 26). This lack of systematicity may result in discouraged students who do not find the feedback useful. Horverak (2015) conducted a survey of feedback practices in English in Norwegian high schools and found that even though teachers often see the advantages of thorough formative feedback during the learning process instead of feedback only when the grade is given, they usually do not have the capacity to provide feedback twice during the writing process.
On the other hand, both teachers and students report that feedback on texts is not always followed up when it is given. Burner found that "the main reason seems to be the negative form of the feedback, but also that they do not always understand the content of the feedback" (Burner, 2016a, p. 635). Feedback is also usually given only in connection with grades, and purely formative feedback is rare (Havnes et al., 2012, p. 23). It seems that if there is no requirement for the students to actively use the feedback, for example, to improve their draft before submitting the final version, they might choose not to attend to it at all.

Formative assessment as a tool for independent learners
Despite the clear requirement in the Elementary and Comprehensive Education Act to provide students with formative assessment, there are no specifications in the current curriculum regarding when or how the assessment should be carried out. This naturally leads to different practices in different schools. Shute (2008, p. 154) in her review article defines formative assessment as "information communicated to the learner that is intended to modify his or her thinking or behavior for the purpose of improving learning." Similarly, Black et al. found that "feedback functioned formatively only if the information fed back to the learner was used by the learner in improving performance" (Black, Harrison, Lee, Marshall, & Wiliam, 2004, p. 16). It is thus clear that feedback which does not lead to improved performance does not function formatively even if the intention behind it is formative. Given the results from the studies cited above, it seems that what is given as formative assessment in many Norwegian schools does not have a formative function.
As is mentioned above, the English language curriculum in Norway requires the students to be independent language learners and "use different situations, working methods and learning strategies to further develop one's language skills" (Norwegian Department of Education, 2013, p. 10). Sandvik (2012) argues that these competence aims refer to formative assessment.
Formative assessment should be a natural part of students' learning strategies because formative assessment is a tool to enhance their own learning (Sandvik, 2012, p. 155). However, this is not possible if the students have not been instructed and required to use the feedback to improve their language and they do not have some abstract knowledge of language structures, explicit grammar, and metalinguistic terminology to enable them to use the feedback.
An additional question is whether teachers have a clear concept of what formative assessment is and how it is best applied. Havnes et al.'s (2012) and Burner's (2016a) studies suggest that the concept may be understood differently by different teachers. Shute (2008) provides a review of the available literature on the effect of formative feedback on learning. She divides formative feedback practices into six main categories: no feedback, verification, correct response, try again, error flagging, and elaborated feedback (Shute, 2008, p. 160). Shute's review includes feedback in many different contexts, which means that not all her categories are appropriate as feedback on learner texts. Categories such as "try again", which provides the learners with "repeat-until-correct" feedback, or "verification", which informs the learner about the percentage of correct responses, are more suited for computerized tests than for evaluation of essays. However, three of Shute's categories are highly relevant, and often used, in language classrooms: correct response, error flagging, and elaborated feedback.
These approaches are described more closely below and each of them was used by one of the three teachers in the first study of the effect of feedback on agreement errors. For the purposes of this study, I only focus on feedback regarding language errors.
Perhaps the most often used feedback type in response to learner texts is "correct response" feedback, i.e. correcting errors directly in the text without any additional information provided. This approach is usually combined with a formative or summative evaluation at the end of the text. This approach was adopted by Teacher A in this study. She corrected all grammatical errors in the text and, in addition, provided the students with formative feedback on style and content. Teacher B used "error flagging" as her main feedback approach. She marked all grammatical errors in the text without providing the correct answers. She provided both summative and formative comments on style and content. Both these teachers' feedback as a whole can be considered formative because they suggested to the students how to improve their texts. However, the main formative focus was on style and content, not language. Teacher C attempted to give purely formative feedback on both language and content. The feedback approach could be considered "elaborated" within the sub-category "informative tutoring" (cf. Shute, 2008, p. 160). Informative tutoring feedback should contain "verification, error flagging, and strategic hints on how to proceed" but without providing the correct answer (Shute, 2008).
Teacher C did not mark any errors in the text, but wrote a short paragraph at the end with examples and suggestions on how to improve the first draft of the text before a final submission.

Data material and methodology
The data material used in this paper comes from three sources. The primary investigation into the effect of three types of formative feedback on the agreement marking accuracy uses data from a published corpus study of agreement errors in Norwegian students' writing (Garshol, 2019). A subsection of the corpus data was selected based on the availability of teachers' feedback. Three student groups were included in this selection. Students in all three groups attended the same study track (general studies), were in the same year during the data collection (first year of high school), and used the same textbook (Targets, Aschehoug, 2009), but they attended three different schools. All texts submitted to the teacher for grading during the school year were collected. Subject-verb agreement accuracy was chosen as the variable to be compared among these three groups. Agreement errors were counted in three texts distributed throughout the school year for each student (only students with at least three texts delivered during the school year were included). The aim of this part of the study was to determine whether the specific feedback practices have any influence on the overall agreement marking accuracy. The accuracy scores for each student were calculated as the number of incorrect agreement marking instances divided by the number of potential occasions to mark agreement in each text (i.e. potential occasion analysis, cf. Thewissen, 2015). These scores were then compared over time to determine whether the agreement marking accuracy improved during the school year.
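The scoring procedure can be sketched as follows. Python is used here for illustration only (the original analysis was run in R), and the function name and the counts are invented, not taken from the study's data:

```python
def agreement_error_score(errors, contexts):
    """Potential occasion analysis (cf. Thewissen, 2015): the number of
    incorrect agreement markings divided by the number of contexts in
    which agreement could have been marked."""
    if contexts == 0:
        raise ValueError("text contains no agreement contexts")
    return errors / contexts

# Invented counts for one student's three texts across the school year:
texts = [
    {"errors": 5, "contexts": 40},  # autumn
    {"errors": 4, "contexts": 38},  # winter
    {"errors": 4, "contexts": 45},  # spring
]
scores = [round(agreement_error_score(t["errors"], t["contexts"]), 3)
          for t in texts]
print(scores)  # → [0.125, 0.105, 0.089]
```

Normalizing by the number of potential agreement contexts rather than by text length makes the scores comparable across texts of different lengths; a falling sequence of scores across the year would indicate improving accuracy.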
The first follow-up case study is based on the products of a process writing task conducted in one of the classes included in the corpus. The students were asked to write a short description of the main character from a novel they had read. They were allowed to work in groups or alone. They received feedback on their first draft in which the teacher described the grammatical problems in the text without marking where the problems were (elaborated feedback without error flagging). In total, 10 groups or individuals delivered both the draft and the final product. The students were encouraged to read the comments and then try to locate and correct the errors while proof-reading their texts before submitting a second draft.
The didactic intention of the exercise was to encourage attention to form and monitoring of own production (Krashen, 1985). The experimental intention was to check whether the feedback the students receive on their writing during the school year has an effect on their agreement marking accuracy.
The second follow-up case study was conducted in a different student group. These students also attended their first year of high school, and they were enrolled in the general studies track. The students were asked to complete an inverted classroom lesson on agreement which included a video explaining the theory and an exercise testing their explicit knowledge (choosing the correct verb form in short sentences). Afterward, the students were asked whether they perceived the topic as difficult. The aim of this case study was to test whether the explicit grammar knowledge of the students and their familiarity with the metalinguistic terminology used in teacher feedback are sufficient and automated enough to benefit from formative feedback on their grammar errors.

Results
There were 22 students in group A (correct response feedback), 19 students in group B (error flagging feedback), and 20 students in group C (elaborated feedback). Two of the groups (A and B) show similar agreement error scores throughout the school year, while the third group (C) has a significantly higher mean error score throughout the school year. However, when the differences in scores between the beginning of the school year and the end of the school year are compared, there are no significant differences (tested with a Kruskal-Wallis test run in R; Field, Miles, & Field, 2012, pp. 674-686).

The students from class C also participated in a process writing experiment in which the effect of elaborated feedback without error correction or flagging was tested. As is mentioned above, they submitted two versions of the same text with the teacher's comments on the first draft. There were 10 groups and individuals who submitted both texts and could thus be included in the analysis. Even though the difference in agreement marking accuracy between the draft and the final text is statistically significant (t-test run in R: M(draft) = 0.106, M(final) = 0.058, t(9) = 3.326, p < 0.05), this is mainly due to the students delivering longer texts as their final products.
If the final text is longer than the draft and the added sentences do not contain agreement errors, the error score goes down even if no previous errors are corrected. In fact, only four groups found and corrected more than one error in their draft, two groups corrected one error, and three groups did not correct any agreement errors (one group did not have any agreement errors in their first draft). It is important to note that this task was not graded, so the students may not have been as motivated to improve their draft as with a graded essay. Figure 1 provides a visualization of the results.
1 The error scores in groups B and C are normally distributed, but the distribution in group A is not normal.

Figure 1. Agreement error scores in a process writing task.
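The arithmetic behind the caveat about longer final texts is simple; the counts below are invented purely for illustration:

```python
# Error score = agreement errors / potential agreement contexts.
# Draft: 4 agreement errors in 40 agreement contexts.
draft_score = 4 / 40    # 0.100
# Final version: the same 4 errors remain uncorrected, but the added
# sentences contribute 20 new, error-free agreement contexts.
final_score = 4 / 60    # ~0.067
# The score improves even though not a single error was corrected:
print(draft_score > final_score)  # → True
```

This is why the statistically significant draft-to-final difference cannot be read as evidence that the students actually used the feedback to correct their errors.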
In order to further investigate whether the failure to monitor and correct one's own errors stems from a lack of explicit grammar knowledge, a lack of knowledge of the metalinguistic terminology used in the comments, or something else, a third experiment was run. There were 20 students in this group. The students were asked to watch a short video (seven minutes) describing the basic rules for agreement marking in English. After the video, they were tested on their accuracy in choosing the correct verb form in example sentences in an online exercise. They were also asked whether they considered the task difficult. The results of the agreement exercise show that many students struggled with the task. No student achieved ceiling accuracy (over 90%), and only five students managed to answer with an accuracy over 80% (mean accuracy for the group: M = 0.66, sd = 0.22). Figure 2 shows the accuracy of answers for all 20 students in this group.

Figure 2. Accuracy of answers in the agreement exercise.
Most of the students also found the topic difficult (10 students) or very difficult (four students). Five students considered the topic easy and one did not answer this question.
However, it is interesting to note that the students spent on average roughly the same amount of time on the task regardless of whether they considered it easy or difficult. The students who considered the task very difficult spent the least time; some did not even watch the full length of the video before they started with the exercise. This suggests the students might have found the task overwhelming or incomprehensible. Figure 3 provides a visualization of the time in minutes spent on the task, with the students divided into groups based on the perceived difficulty level. Based on these findings, it seems clear that the students' knowledge of the metalinguistic terminology and their ability to use their explicit grammar knowledge in monitoring their own production is neither high nor automated. Furthermore, even though the students are not confident in using their explicit grammar knowledge, they show little willingness to invest time in improving their skills.

Summary and conclusion
The current Norwegian English language curriculum does not specifically require the students to have explicit grammar knowledge or be familiar with metalinguistic terminology. However, the students are expected to be active and independent learners, and they are expected to use various tools and aids in their language learning. In order to fulfill these requirements, the students must have some degree of understanding of metalinguistic terminology. Furthermore, the Elementary and Comprehensive Education Act demands that the students should receive formative assessment as part of their learning process. This requirement entails that some level of metalinguistic knowledge must be in place for the students to understand the feedback. The feedback cannot have an effect on their language behavior if the students do not understand the terminology the teachers use in the comments.
Previous research on feedback, and specifically formative assessment, in Norwegian schools shows that feedback is often insufficient (Burner, 2016b; Havnes et al., 2012). The students admit that they do not follow up on feedback, sometimes because they do not understand it.
It has also been pointed out that feedback is often closely connected to grading and purely formative feedback is rare. Even when the students receive formative comments on their work, they rarely resubmit an improved version before the final grade is given (Havnes et al., 2012, p. 23). Even though some teachers express a wish to engage in process writing practices with their students more often, they also state that they do not have the capacity to provide formative assessment on a draft and then grade and give feedback on the same paper again after the final version has been submitted (Horverak, 2015).
The aim of this paper was to investigate whether students understand explicit grammar instruction and formative feedback well enough to use them in their own learning. More specifically, the students' ability to use the formative feedback given to them to improve their subject-verb agreement marking accuracy was tested. Three types of formative assessment were investigated, and their effects on the improvement in agreement marking accuracy over the period of one school year were compared. No statistically significant differences were found among the three samples. Furthermore, none of the three student groups showed any statistically significant improvement in their agreement marking accuracy between the first measurement in the fall and the last measurement in the spring. This suggests that despite specific references to agreement errors in the teachers' comments in all three student groups, the feedback did not have any effect on the incidence of agreement errors in the students' texts.
One of the reasons for this lack of improvement could be the fact that the collected written material was not a product of process writing tasks. All the texts were classroom tests or one-draft-only homework assignments. This means that several months would usually pass between the formative feedback being given and the next assignment being written. The students might not remember what they were supposed to focus on in the following assignment. In this sense, it might be questioned whether this type of assessment is formative according to Shute's (2008) categories, because it does not seem to enable the teachers to modify the learners' behavior. One of the groups was therefore also tested in a process writing activity with formative assessment between the draft and the final product. In this group, fewer than half of the writers were able to correct more than one of their agreement errors. However, in contrast to the first study, the students in the process writing task were not graded, neither on the draft nor on the final version of their submission. This could have affected their motivation to invest time in improving their language.
In order to check whether the lack of improvement in agreement marking accuracy could be caused by a lack of understanding of the feedback, a separate group of students was tested on their understanding of metalinguistic terminology and their explicit grammar knowledge. The students achieved on average 66% accuracy in the task. However, the individual answers ranged from 11% to 89% correct. In addition, the majority of the students found the task difficult or very difficult. This suggests that the students are not used to working with metalinguistic terminology and may not understand the feedback they normally receive from their teachers. It is also important to note that the students did not seem motivated to improve their explicit knowledge or their metalinguistic understanding in the tasks used. A possible explanation for this observation might be the lack of grades in the two case studies. Havnes et al. (2012) point out that feedback is normally tied to grading in schools, and the students might ignore feedback which does not follow this pattern. A follow-up study would be required to clarify whether the lack of grades influences the diligence with which the students approach feedback.