Abstract
A cohort of preservice mathematics students was exposed to a teaching strategy based on retrieval practice and test-potentiated learning. The aim of the study was to determine how study participants with high and low prior topic knowledge compare in terms of their procedural fluency and conceptual understanding after exposure to the teaching strategy. A pretest and posttest repeated measures design was employed to compare within groups. A revised taxonomy table based on Bloom’s taxonomy was utilised to categorise test items. Findings indicate significant differences between pretest and posttest scores within groups. Results from the independent samples t-test show a significant difference between the two groups. Outcomes confirm that the benefits of retrieval practice are greatest for unfamiliar content. Findings indicate that for low prior topic knowledge students, procedural fluency is enhanced and retained more than conceptual understanding, whereas for the high prior topic knowledge students it was the reverse. The strategy was not as effective for improving conceptual understanding.
Contribution: How a teaching strategy based on retrieval practice and test-potentiated learning affects the mathematical competencies of procedural fluency and conceptual understanding has not been researched. There is also a dearth of studies that investigate how retrieval practice and test-potentiated learning affect research participants with different levels of prior knowledge. This study therefore contributes to an understanding of how retrieval practice and test-potentiated learning can be utilised to improve the learning and teaching of mathematics at the school level.
Keywords: preservice mathematics teachers; retrieval practice; test-potentiated learning; prior topic knowledge
Introduction
In South Africa, high school students are required to study either Mathematics or Mathematical Literacy from Grades 10 through 12. The Curriculum and Assessment Policy Statement (CAPS) (DBE, 2011, p. 8) defines Mathematical Literacy as a subject that does not concentrate on abstract mathematical ideas but is grounded in basic mathematical content. It is emphasised that calculations in the subject should only require a basic four-function calculator. Furthermore, it is required that all content should be context based since the focus of Mathematical Literacy is to make sense of real-life contexts. The development of procedural knowledge is therefore not a priority in Mathematical Literacy. Consequently, in this subject the development of the skill of algebraic manipulation does not enjoy a high priority. Given that the emphasis is on basic mathematical content without a focus on the growth of mathematical conceptual comprehension, it is anticipated that students of Mathematical Literacy will not possess the same degree of conceptual and procedural understanding (pertaining to the topics of the high school curriculum in South Africa) as those who pursued Mathematics. Accordingly, the expectation is that Mathematical Literacy students do not have well-developed conceptual understanding and procedural knowledge of the mathematical topics that form part of school-level mathematics. Students who enter university mathematics content courses with Mathematical Literacy therefore are perceived to have low prior knowledge as compared to students who enter with Mathematics.
At some universities in South Africa, entrance criteria for preservice mathematics teachers allow students to enter with Mathematical Literacy. Upon graduation, these students are permitted to teach Mathematics in the Senior Phase (Grades 7 to 9). To provide high-quality education, teachers at this stage require a robust understanding of both the procedures and concepts related to the mathematical topics covered in Grades 7 through 9. Since Mathematical Literacy students, for example, are not exposed to algebraic concepts beyond Grade 9, there is a high possibility that their understanding of these concepts is limited. It is therefore crucial that the curricula of university-level mathematics courses include in their objectives the development of reasoning, and procedural and conceptual knowledge of school-level topics. Educators of these prospective mathematics teachers accordingly need to devise teaching methods that would boost mathematical reasoning, procedural and conceptual understanding, as well as memory retention.
Literature review
Retention of knowledge is of central importance in mathematics learning. It is therefore incumbent on mathematics educators to expose students to learning and teaching strategies that enhance retention. Studies have indicated that retrieval practice is a learning technique that improves the retention of knowledge (for instance, Abbott, 1909; Glover, 1989; Roediger & Karpicke, 2006). The idea that the act of remembering information aids its retention has been recognised for quite some time (Abbott, 1909; Roediger & Butler, 2011). This process of recalling information from memory without relying on external resources (such as during exams or while studying) is referred to as retrieval practice.
Retrieval practice can also have an implicit or facilitating effect on learning. Evidence suggests that efforts to recall test items can enhance subsequent restudy of those items, even if the initial retrieval attempt was unsuccessful and no feedback was provided (Arnold & McDermott, 2013; Grimaldi & Karpicke, 2012; Izawa, 1966; Wissman & Rawson, 2018).
Izawa (1966) was the first researcher to identify this phenomenon (known as test-potentiated learning). Arnold and McDermott (2013) were subsequently the first to identify the intermediary impact of testing on restudying. They argue that test items that were successfully recalled in the test and after restudying could have gained advantages from both the effect of testing and the enhancement of learning potentiated by testing. In contrast, items that were not successfully recalled in the initial test but were successfully retrieved in a subsequent test (after restudying) could have only gained from the enhancing effect of the initial test. Research findings indicate that conducting more practice tests, as opposed to fewer, amplifies the effects of restudying (also known as test-potentiated learning) (Wissman & Rawson, 2018). In the current study, there was no effort made to differentiate between the indirect and direct impacts of retrieval practice. Put differently, the impacts of both the testing effect and test-potentiated learning were assessed collectively. This is because in real-world mathematics education scenarios, the same material is often tested multiple times (through tests and examinations), and as a result, student performance is typically influenced by both the testing effect and test-potentiated learning.
Research evidence suggests that certain learning strategies may work well for some students, while they may not be as beneficial for others (Dunlosky et al., 2013; Xiaofeng et al., 2016).
Consequently, it is important to know how individual differences such as fluid intelligence, working memory capacity and level of prior topic knowledge influence the effectiveness of memory-enhancing strategies such as retrieval practice. The primary focus of this study is to evaluate the efficacy of retrieval practice and test-potentiated learning as strategies for enhancing memory, taking into account the influence of preexisting topic knowledge. The research question for this study is: How do students with high and low preexisting topic knowledge differ in their mathematical competencies, specifically procedural fluency and conceptual understanding, after being exposed to a teaching method that incorporates retrieval practice and test-potentiated learning? Data for the study were provided by research done for a doctoral thesis (May, 2017).
Prior topic knowledge and learning
Prior knowledge plays a major role in learning in practically all learning areas. It is therefore important for educators to understand how different learning and teaching strategies are affected by levels of prior knowledge. There is evidence that effectiveness of learning strategies is moderated by level of prior knowledge. The argument is made that the same learning materials and teaching strategy could yield different learning outcomes for students, depending on their varying levels of prior knowledge on the topic (Xiaofeng et al., 2016). Furthermore, students who already have a substantial understanding of the topic are more capable of grasping new content (Dunlosky et al., 2013), and they can retain more information compared to those with less preexisting knowledge (Thompson & Zamboanga, 2003).
Prior topic knowledge delays forgetting of novel subject material since the new material can be integrated into a preexisting knowledge framework (Dunlosky et al., 2013). Because the material is integrated into existing knowledge structures with different information cues, more hooks for retrieval are provided and hence retrieval is enhanced. Generally, it has been found that greater preexisting knowledge in a domain enhances student learning and achievement (Thompson & Zamboanga, 2004). The literature, however, is silent on which types of knowledge are retained better and understood better. Understanding this is important for the design of effective teaching strategies.
Prior topic knowledge and retrieval practice
Studies indicate that the practice of retrieval not only improves the retention of knowledge, but also aids in learning (Butler, 2010; Chan et al., 2018; Karpicke, 2012). It is therefore also important for mathematics educators to understand the relationship between level of prior topic knowledge and effectiveness of retrieval practice. This is because an essential goal of teaching is the enhancement of knowledge. A number of studies set out to investigate how levels of prior topic knowledge influence the effectiveness of retrieval practice. The following is a discussion of some of these studies.
Findings of some studies indicate that levels of prior topic knowledge have little or no influence on the effectiveness of retrieval practice (e.g. Carroll et al., 2007; Cogliano et al., 2019; Xiaofeng et al., 2016), but that the benefits of retrieval practice are greatest for unfamiliar content (i.e. low prior knowledge) (Cogliano et al., 2019).
Learning materials used by some of the studies (on how level of prior knowledge influences the effectiveness of retrieval practice) included topics in educational psychology (Cogliano et al., 2019), texts from abnormal psychology (Carroll et al., 2007), and key psychology terms (Xiaofeng et al., 2016). Study participants in these studies included undergraduate educational psychology students (Cogliano et al., 2019), undergraduate and advanced psychology majors (Carroll et al., 2007), and undergraduate psychology students (Xiaofeng et al., 2016).
Outcome measures in these studies included responses to multiple-choice questions based on educational psychology course content (Cogliano et al., 2019), student responses to short-answer tests based on content from abnormal psychology (Carroll et al., 2007), and selection from a word list based on key psychology terms (Xiaofeng et al., 2016).
The fact that most of these studies were done in the domain of psychology is not surprising since most research regarding learning is done by cognitive psychologists. However, what is missing in all of the aforementioned research is how exposure to retrieval practice affects the different knowledge and reasoning types. Moreover, there is a dearth of research that utilised mathematics learning material and whose outcome measures included knowledge and reasoning type. The author is not aware of any studies that included knowledge type and reasoning in their outcome measures.
Despite the fact that retrieval practice and its effects on learning have been studied extensively, there are some areas where very few investigations have been done. An area of interest is how the impact of retrieval practice might be influenced by individual variances, such as preexisting knowledge on the topic (Dunlosky et al., 2013). Dunlosky et al. (2013) argue that this issue is significant because studies have demonstrated that learning strategies do not have a uniform effect on all students. This gap in the literature is what the present study aims to address.
Theoretical framework
Procedural fluency and conceptual understanding
This study posits that it is crucial for future teachers to master the five types of mathematical competencies defined by the National Research Council (2001), as they are fundamental to successful mathematics learning. These competencies include: conceptual understanding, procedural fluency, strategic competence, adaptive reasoning, and a productive disposition. It is recognised that these five competencies are interconnected and mutually dependent.
Conceptual understanding is defined as the grasp of mathematical concepts, operations, and relations. Procedural fluency, on the other hand, is the ability to execute procedures in a flexible, accurate, efficient, and appropriate manner (National Research Council, 2001). Studies have indicated that these two aspects, namely conceptual understanding and procedural fluency, are crucial for the development of mathematical proficiency (Hiebert & Grouws, 2007). As a result, there is a demand for research in preservice mathematics education, particularly in the South African context, that explores which teaching strategies could enhance these competencies in preservice mathematics education students for school-level mathematics topics.
There exist four theoretical perspectives regarding the relationship between procedural and conceptual knowledge (Rittle-Johnson et al., 2015; Schneider et al., 2011). Each theory is backed by certain empirical evidence, but at the same time, other evidence contradicts them. The concepts-first theory posits that students initially gain conceptual knowledge in a field and then use this understanding to formulate procedures for problem-solving. Conversely, the procedures-first theory suggests that students initially learn problem-solving procedures in a field and then gradually build conceptual knowledge through repeated problem-solving (Rittle-Johnson et al., 2001).
This study aligns with the iterative model as suggested by Rittle-Johnson et al. (2001). A key tenet of this theory is that either procedural or conceptual knowledge can develop first, but it is not a rule that one type of knowledge always precedes the other. The authors maintain that it is common for a specific type of knowledge to be incomplete (for a topic). In other words, one type of knowledge might be more advanced at a certain moment, but this does not mean that the other type of knowledge is entirely lacking. As a rule, initial knowledge in any field is quite limited. The argument is that the levels of preexisting knowledge in a field determine which type of knowledge will surface first, thereby initiating the learning process.
The argument is made that conceptual knowledge, being general and abstract, can be applied to new types of problems (Schneider et al., 2011). Procedural knowledge, on the other hand, is believed to be more specific to a particular problem type, as procedures are typically practised with a specific problem type. Therefore, it is suggested that if students have some preexisting knowledge of the material to be learned, then conceptual knowledge might have a larger role in the development of procedural knowledge, and not the other way around. Students with minimal prior knowledge in a domain tend to first develop procedural knowledge, which then aids in the development of conceptual knowledge. For instance, when students are introduced to the topic of linear functions, they often first learn the procedures (which includes some initial conceptual knowledge), and then, through exposure to different problem types in this domain, they develop their conceptual understanding. Hence, according to the iterative model, improved conceptual knowledge leads to improved procedural knowledge. This improved procedural knowledge, in turn, leads to improved conceptual knowledge, and this cycle continues.
As previously stated, when students first encounter a mathematical topic, their initial understanding of the topic is usually quite limited. As a result, it is common for students to have some knowledge about a topic, but not a complete understanding. Conceptual knowledge is crucial for the creation, selection, and proper application of procedures. However, practising established procedures is believed to assist students in developing and deepening their understanding of concepts. The primary argument is that both types of knowledge are necessary for effective learning in mathematics and, over time, each type of knowledge needs to reinforce the other. Furthermore, if a student’s conceptual knowledge about a mathematical topic hasn’t yet matured, they will likely rely heavily on this conceptual knowledge when devising a solution procedure for a given mathematical task. Therefore, mathematics instruction should aim to develop both procedural and conceptual knowledge.
A student with limited prior knowledge in a mathematical field typically possesses disjointed knowledge and lacks the ability to perceive how procedures and concepts interrelate within the field. As the knowledge within a domain expands, so does the ability to merge the pieces of conceptual and procedural knowledge into a unified knowledge structure (Eds. Baroody & Dowker, 2003; Linn, 2006; Schneider & Stern, 2009; Schneider et al., 2011).
Theories commonly used to elucidate retrieval practice involve the concepts of storage and retrieval strength. Storage strength pertains to the permanency of knowledge, while retrieval strength indicates the ease or difficulty of recalling knowledge. A model grounded in these theories posits that retrieval strength has a negative correlation with increases in storage strength (Bjork & Bjork, 1992). This suggests that strenuous retrieval (low retrieval strength) bolsters storage strength and fosters long-term learning, whereas easy retrieval is linked with lower levels of storage strength. ‘Desirable difficulties’, such as extending the interval between retrieval practice sessions, are thought to exemplify instances of strenuous retrieval. In the present study intervals between retrieval practice sessions were approximately three weeks. Long retention intervals such as this introduce desirable difficulties since they require effortful retrieval.
The taxonomy table
A taxonomy table was employed to classify test and examination items based on their perceived level of difficulty. The table was initially based on Bloom’s original taxonomy. This taxonomy comprised six primary categories: knowledge, comprehension, application, analysis, synthesis, and evaluation. In this original taxonomy, the categories were arranged from the simplest to the most complex, and from the most concrete to the most abstract. The underlying assumption was that the mastery of learning objectives followed a hierarchical pattern. That is, understanding each simpler category was a necessary step before grasping the next, more complex category (Krathwohl, 2002).
In the original taxonomy, statements of learning objectives typically comprised a noun or noun phrase, and a verb or verb phrase (Krathwohl, 2002). These noun and verb elements were linked to the knowledge category in Bloom’s original taxonomy. This meant that the knowledge category had a dual nature, setting it apart from the other categories. To eliminate this inconsistency, the noun and verb elements were divided into two dimensions in Bloom’s revised taxonomy. In this revised version, the noun formed the foundation for the knowledge dimension, while the verb established the basis for the cognitive process dimension (Anderson et al., 2001). Using these two dimensions, a two-dimensional table, referred to as the taxonomy table, was created. In this table, the knowledge dimension constituted the vertical axis, while the cognitive process dimension made up the horizontal axis. The cells of the table were formed by the intersections of the knowledge and cognitive process categories.
The taxonomy table’s knowledge dimension primarily consists of four categories: Factual Knowledge, Conceptual Knowledge, Procedural Knowledge, and Metacognitive Knowledge. Conversely, the Cognitive Process dimension is mainly composed of: Remember, Understand, Apply, Analyse, Evaluate, and Create. It is crucial to note that within the taxonomy table, cognitive processes are perceived to function on knowledge structures during cognitive processing. For instance, one might refer to ‘understanding’ as it pertains to ‘conceptual knowledge’ when using the taxonomy table.
The categories of the Knowledge and the Cognitive Process labelled in the revised taxonomy table did not fully meet the requirements of the current study. As a result, modifications were made to the table to better align with the study’s needs. In my adapted version of the taxonomy table, I retained some of the Knowledge dimension categories, but eliminated most of the Cognitive Process dimension categories. The reason for this was to concentrate on the categories of cognition that are prevalent in mathematical cognition. Similarly, the knowledge categories I included are those that are most common in mathematics. In my version, subcategories are not included in the Knowledge dimension, but they are incorporated in the Cognitive Process dimension. Moreover, since this study focuses on the teaching and learning of mathematics, the knowledge categories and cognitive process categories are viewed from a mathematical perspective. The categories of my revised Knowledge dimension include: Factual Knowledge, Procedural Knowledge, Flexible Procedural Knowledge, and Conceptual Knowledge. Each category will be discussed in more detail in the following sections. Factual Knowledge encompasses understanding of terminology, the structure and syntax of symbol representation, permissible operations, and so on (Eisenhart et al., 1993).
Flexible procedural knowledge refers to profound procedural knowledge that enables a student who has it to apply the most appropriate mathematical procedures to a given familiar or new problem scenario (Star, 2005).
In the reasoning structure proposed by Lithner (2008), reasoning is characterised as the thought process necessary to formulate assertions and draw conclusions in the resolution of mathematical tasks. Within this structure, reasoning pertains to the cognitive processes, the outcome of these processes, or both. Thinking can be classified as reasoning within this framework, even if it is incorrect, as long as it is logical to the thinker.
Lithner’s (2008) framework identifies two distinct types of reasoning. Imitative Reasoning (IR) is when a student generates a solution procedure that they have memorised. Conversely, Creative Reasoning (CR) is a type of reasoning marked by flexibility and innovative approaches to solving mathematical problems (Bergqvist, 2007).
Lithner (2008) distinguishes between two primary categories of IR: Memorised Reasoning and Imitative Algorithmic Reasoning (AR). For reasoning to be classified as memorised, it must meet two conditions: the reasoning should be based on the recall of a complete answer, and the implementation strategy should involve only writing down the answer.
If a reasoning sequence is based on recalling an algorithm, it falls under Imitative Algorithmic Reasoning (AR). To be classified as AR, it must meet two conditions: the strategic choice for the reasoning should involve recalling an algorithm as a solution, and no other reasoning should be required except to implement the algorithm.
Algorithmic Reasoning is further divided into two subcategories: Familiar Algorithmic Reasoning (FAR) and Delimiting Algorithmic Reasoning (DAR). If a mathematical task is identified as a familiar type and can be solved using a known algorithm, then it requires FAR. Conversely, if a task requires the student to select from a set of algorithms, then the reasoning required is DAR.
Creative Reasoning (CR) is defined as a type of reasoning that is not restricted by fixation, but rather is marked by flexibility and innovative methods for solving mathematical problems (Bergqvist, 2007). If a task primarily requires CR, the reasoning involved is categorised as Global Creative Reasoning (GCR). On the other hand, if a task is almost solvable using IR and only needs CR to adjust an algorithm, then the reasoning required is termed Local Creative Reasoning (LCR). The taxonomy table, which is based on the aforementioned discussion, is presented in Table 1.
Method
This study employed convenience sampling to choose its research participants. The participants were a group of second-year preservice teachers from the 2014 cohort who were enrolled in a mathematics course taught by the researcher. The research was conducted over both semesters of 2014. Initially, 88 students were chosen to participate, but ultimately, only 63 students were included in the study. Data from students who missed tests and examinations were excluded. The study covered various topics, including but not limited to: analytical geometry, functions, remainder and factor theorems, Euclidean geometry, and matrices. With the exception of matrices, all of the topics are high school-level mathematics topics in South Africa.
The research participants had a wide range of mathematical knowledge and skills upon leaving school. A portion of the participants had completed Mathematics up to Grade 12. These students (n = 44) were perceived to have high prior topic knowledge, whereas students who did Mathematical Literacy up to Grade 12 were considered to have low prior topic knowledge (n = 19). The sample was composed of 30 male and 33 female participants, with an average age of 23. The statistical analysis only included students who had completed all the assessment components. Consequently, for the statistical analysis, high prior topic knowledge participants numbered 40 and low prior topic knowledge participants numbered 18.
The study’s quantitative data were derived from test and examination scores. The research followed a quasi-experimental method, meaning that the independent variable was manipulated but there was no random group assignment (Johnson & Christensen, 2012). While the quasi-experimental method is seen as providing less robust evidence of a cause-and-effect relationship between variables, it remains crucial in educational research. This is because many educational research issues are not suitable for strict experimental methods (Kerlinger, 1986).
The study utilised a design that included pretests and posttests with repeated measures. The four class tests administered per semester collectively constituted a pretest, while the examinations (two per semester) written at the end of each semester were considered a posttest. The examinations were written about three and a half months after the first test. The first test covered only the content that had been taught prior to the test, while the second test included both the content taught before it and the content covered in the first test. Similarly, the third test covered the content from the first and second tests, as well as the content taught prior to it, and so on. As a result, the content taught before the first test was tested four times, while the content taught between the first and second tests was tested three times, and so on. This indicates that the retrieval practice in the study was repeated. Given that the interval between tests was approximately three weeks, the retrieval practice was also spaced (distributed). The fact that research participants had to revisit content covered in previous tests implies a high likelihood of test-potentiated learning occurring.
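The cumulative structure of the class tests can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the content is split into four hypothetical blocks, with block k taught just before test k, and the function names `blocks_tested` and `exposure_counts` are invented for this sketch and are not part of the study.

```python
from collections import Counter

NUM_TESTS = 4  # class tests per semester, as described in the text


def blocks_tested(test_number):
    """Return the content blocks covered by a given class test.

    Assumption: block k is the content taught just before test k, and
    every test is cumulative, so test n covers blocks 1..n.
    """
    return list(range(1, test_number + 1))


def exposure_counts(num_tests=NUM_TESTS):
    """Count how many class tests examine each content block."""
    counts = Counter()
    for test in range(1, num_tests + 1):
        counts.update(blocks_tested(test))
    return dict(counts)


if __name__ == "__main__":
    for block, times in sorted(exposure_counts().items()):
        print(f"block {block}: covered by {times} class test(s)")
```

Under these assumptions the earliest content receives the most retrieval opportunities before the examination, and the most recently taught content the fewest.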
Questions from tests and examinations were sorted using a revised taxonomy table, which is grounded in Bloom’s taxonomy (Table 1). This revised taxonomy table served as the tool for measuring the knowledge and reasoning proficiency of the research participants. The classification of problems was influenced by the number of previous practice sessions, the timing, and the perceived primary knowledge and reasoning requirements of the problem. For example, a problem might initially be classified as DAR based on Flexible Procedural Knowledge (C1bii), but after students have encountered it multiple times, the reasoning requirement shifts to FAR, and the knowledge type becomes ordinary Procedural Knowledge. As a result, the new classification is FAR based on Procedural Knowledge (B1bi), a category that is perceived to be less challenging.
After the assessment pieces were marked, each participating student received a score in each category for every individual class test and examination. Subsequently, all the individual scores for each category were summed to determine a total. This summing process was carried out separately for class tests and examinations. For instance, all individual scores for the category FAR based on Procedural Knowledge (B1bi) were added together to yield a total sum for this category per student. Following this, the sums for the categories Memorised Reasoning based on Factual knowledge (A1a), Memorised Reasoning based on Procedural Knowledge (B1a), FAR based on Procedural Knowledge (B1bi), FAR based on Flexible Procedural Knowledge (C1bi), and DAR based on Flexible Procedural Knowledge (C1bii) were added sequentially to provide a grand total. This was done separately for pretests and posttests. This grand total was deemed to represent measures of the mathematical competency procedural fluency. In the statistical analysis, the grand total for the pretest was labelled as SKILLPRE and for the posttest, it was labelled as SKILLPOST.
In a similar manner, the total scores for categories such as Memorised Reasoning based on Conceptual Knowledge (D1a), FAR based on Conceptual Knowledge (D1bi), DAR based on Conceptual Knowledge (D1bii), and Local CR based on Conceptual Knowledge (D2a) were added together to yield a final sum total. As before, this was done separately for pretests and posttests. This final sum total was interpreted as a measure of the mathematical competency conceptual understanding. In the statistical analysis, the final sum total for the pretest was labelled as CONCPRE and for the posttest, it was labelled as CONCPOST.
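The two-stage summation described above (per-category totals first, then competency totals) can be sketched as follows. The category codes are those named in the text; the data layout, the function names and the marks are hypothetical and serve only as an illustration, and the conceptual list is taken as given in the text and may not be exhaustive.

```python
# Category codes as named in the text, grouped by competency.
PROCEDURAL_FLUENCY = ("A1a", "B1a", "B1bi", "C1bi", "C1bii")
CONCEPTUAL_UNDERSTANDING = ("D1a", "D1bi", "D1bii", "D2a")


def category_totals(assessments):
    """Sum per-category marks over a list of assessments.

    Each assessment is a dict mapping a category code to the marks one
    student earned in that category (a hypothetical data layout).
    """
    totals = {}
    for scores in assessments:
        for category, marks in scores.items():
            totals[category] = totals.get(category, 0) + marks
    return totals


def competency_score(totals, categories):
    """Add the category totals that make up one competency."""
    return sum(totals.get(category, 0) for category in categories)


if __name__ == "__main__":
    # Invented marks for one student across two class tests.
    class_tests = [
        {"B1bi": 6, "C1bii": 3, "D1bii": 2},
        {"A1a": 2, "B1bi": 5, "D1a": 1, "D2a": 4},
    ]
    pre = category_totals(class_tests)
    print("SKILLPRE =", competency_score(pre, PROCEDURAL_FLUENCY))
    print("CONCPRE =", competency_score(pre, CONCEPTUAL_UNDERSTANDING))
```

The same two functions would be applied to the examination scripts to obtain the posttest counterparts, SKILLPOST and CONCPOST.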
It is imperative that the difficulty level of test items in the posttest matches or exceeds that of the pretest. If this is not the case, a false measurement of improvement may result. As indicated, the difficulty level of all questions in tests and exams was determined using the taxonomy table. The researcher used the taxonomy table to set standards for tests and exams. For example, 15.55% of the test items in the first-semester pretest belonged to the category DAR based on Conceptual Knowledge (D1bii), whereas for the posttest the percentage was 34%. This category is perceived to be of a higher difficulty level. Posttest items on average were set at a higher difficulty level than pretest items to guard against the possibility of a false measurement of improvement.
The following exemplars based on the topic of matrices (linear algebra) are used to illustrate the arguments regarding item difficulty level:
An example of an item categorised as FAR based on Procedural Knowledge (perceived to be less difficult):
Given the following system of linear equations:
Write an augmented matrix for the system.
Use Gauss-Jordan elimination to solve the system of equations.
An example of a more difficult item based on the same topic, categorised as DAR based on Flexible Procedural Knowledge:
Use a system of equations to find the quadratic function f(x) = ax^{2} + bx + c that satisfies the equations: f(1); f(2) = −1; f(3) = −5
An example from the category DAR based on Conceptual Knowledge (perceived to be more difficult since it requires well-developed conceptual knowledge):
State whether the following statement is true or false. Provide a justification for your answer.
If two columns of a square matrix are the same, then the determinant of the matrix will be zero.
The teaching strategy
In this section the teaching strategy that was employed is discussed. Arguments are presented to motivate why this particular teaching strategy was employed.
Yeo and Fazio (2019) argue that the best learning approach is contingent on both the type of knowledge being acquired (whether it is stable facts or adaptable procedures) and the pertinent learning processes (whether it is schema induction or memory and fluency enhancement). The most effective learning strategy is determined by the learning goals.
These include learning to retain new information (memory building), learning a new problemsolving strategy (schema induction), learning to solve analogous problems (fluency building), and inducing a mathematical concept (schema induction).
The intention with the design of the teaching strategy was to optimise learning opportunities for participants. As mentioned previously, the second test included content that was tested in the first test, the third test included content that was tested in the first and second tests, and so on. This arrangement allowed for the inclusion of analogous problems in tests, which facilitated fluency building (i.e. enhancing procedural fluency). Testing the same content repeatedly also gave participants the opportunity to learn from previous tests (test-potentiated learning). For example, a student who could not respond correctly to a test item would be compelled to determine the correct response in preparation for the following test, since that test would include items on the same content. The presentation of analogous problems requires the repeated retrieval of schemata regarding the same procedural and conceptual knowledge. Such cognitive exercises enhance retention since memory is ‘refreshed’ with each retrieval attempt. In addition, since the problems do not contain exactly the same information, the retrieved knowledge will be stored in a different way. For example, students were exposed to the remainder and factor theorems of polynomial functions. The factor theorem they were exposed to was given as follows:
A polynomial f(x) has a factor (x – h) if and only if f(h) = 0
In classroom exercises they were required to solve problems based on the factor and remainder theorems. This included problems such as the following two examples:
Show that x – 3 is a factor of f(x) = 2x^{2} + 3x^{2} – 11x – 6
For which values of m will x – 3 be a factor of x^{3} + m^{2}x^{3} – 11x – 15m?
In a subsequent test students were presented with the problem:
Find p if k + 2 is a factor of k^{50} – p^{25}
The solution to this problem requires an understanding that, since the factor theorem is an if-and-only-if mathematical statement, it consists of two converse statements. The one statement is:
If a polynomial has a factor of x – h, then f(h) = 0
Its converse is:
If f(h) = 0, then x – h is a factor of polynomial f(x)
Both of these statements are required for the solution. That is, since k + 2 is a factor, −2 should be substituted:
f (– 2) = (– 2)^{50} – p^{25}.
Next, using the other statement, the function value should be equated to 0:
(−2)^{50} – p^{25} = 0
The remainder of the solution requires an understanding of exponential laws. Now if a student provided a correct response to the problem it would imply that the student made a connection with the implicit conceptual information provided in the factor theorem. That is, that the theorem consists of two converse statements both of which are required for the solution procedure. The next part of the solution procedure requires a connection with exponential laws and their manipulation. This in turn implies that the student cognitively connected information concerning the factor theorem in a new way. In other words, schemata concerning the factor theorem were changed (schema induction). That is, the test item has resulted in learning. This is contrary to the conventional belief that tests are only used to assess learning.
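The final step of the solution can be checked directly. The following is a minimal sketch, assuming that p is restricted to the real numbers: from (−2)^{50} – p^{25} = 0 it follows that p^{25} = 2^{50} = (2^{2})^{25}, and since an odd power is one-to-one on the reals, p = 4. The function name `f` mirrors the notation used above.

```python
def f(k, p):
    """The polynomial from the test item: f(k) = k^50 - p^25."""
    return k**50 - p**25


# Factor theorem: k + 2 is a factor exactly when f(-2) = 0.
# (-2)^50 = 2^50 = (2^2)^25, so the real solution is p = 4.
p = 4
print(f(-2, p) == 0)
```

Python integers are exact, so the check involves no rounding.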
Now let us suppose a student could not provide a solution. The student would thereby have identified a gap in their own knowledge of the solution procedure for such problems. It is highly probable that the student would then seek help to find a solution. This implies that test-potentiated learning has taken place.
Statistical analysis
IBM SPSS version 28.0 was utilised for the statistical analysis that follows. The focus of this study, as mentioned earlier, is to evaluate the differences in knowledge and reasoning types between students with high and low prior topic knowledge after they have been subjected to a teaching method that emphasises retrieval practice and test-enhanced learning.
Statistical analysis was done as follows:
Scores from pretests and posttests were compared to assess the impact of the teaching method on students’ knowledge and reasoning abilities. This was done separately for low and high prior topic knowledge participants. As previously mentioned, all class tests were collectively used as the pretest, while the end-of-semester exams were combined to form the posttest. The updated taxonomy table was employed to classify items from the tests and exams. The categories within this taxonomy table served as the primary measurement tool in this study. The dependent variable in this study is the set of scores from the pretests and posttests, while the independent variable is the teaching approach that is grounded in retrieval practice and test-enhanced learning.
Initially, descriptive statistics were employed to scrutinise the data and identify any breaches of the assumptions inherent in the statistical tests. Subsequently, a paired samples t-test was conducted to ascertain whether significant differences existed between the pretest and posttest scores. This was done separately for participants with low and high prior knowledge of the topic, and also separately for procedural fluency and conceptual understanding.
An independent samples t-test was carried out to compare the mean posttest scores of participants with low and high prior topic knowledge. This comparison was made separately for procedural fluency and conceptual understanding. The purpose was to identify whether a significant difference exists between the scores of participants with low and high prior topic knowledge after they were exposed to the teaching method.
Paired samples t-test (low prior topic knowledge)
As mentioned previously, to determine whether there is a significant difference between the pretest and posttest scores of the low prior topic knowledge participants after exposure to the teaching strategy, a paired samples test (t statistic) based on the overall mean difference (μ_{D}) was utilised.
The null hypothesis is that there is no significant difference after exposure to the teaching strategy. In other words, the mean difference of the pretest and posttest scores for the population is zero:
H_{0} : μ_{D} = 0
The alternative hypothesis is that the intervention caused the posttest scores to be higher or lower than the pretest scores. In other words, the mean difference is not zero:
H_{1} : μ_{D} ≠ 0
The level of significance is set at α = 0.05 for a two-tailed test.
The paired samples statistics for the SKILL and CONC variables (low prior topic knowledge participants) are presented in Tables 2, 3 and 4.
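For reference, the paired samples statistic has the standard form shown below. The symbols are introduced here for illustration and do not appear in the original tables: \bar{D} is the sample mean of the posttest-minus-pretest differences, s_{D} is their sample standard deviation, and n is the number of participants.

```latex
t = \frac{\bar{D}}{s_{D}/\sqrt{n}}, \qquad df = n - 1
```

Under the null hypothesis H_{0} : μ_{D} = 0, this statistic follows a t distribution with n – 1 degrees of freedom.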
TABLE 2: Paired samples statistics (low prior topic knowledge participants) (N = 18). 
TABLE 3: Paired samples test (low prior topic knowledge participants). 
TABLE 4: Paired samples effect sizes (low prior topic knowledge participants). 
A paired samples t-test was conducted to evaluate the impact on low prior topic knowledge participant scores of a teaching intervention based on retrieval practice and test-potentiated learning. There is a statistically significant increase in scores for the variable SKILL (procedural fluency) from pretest (M = 53.32, SD = 19.65) to posttest (M = 68.73, SD = 11.70), t(17) = 3.139, p = 0.006 (two-tailed). The mean increase in scores for the SKILL variable is 15.41 with a 95% confidence interval ranging from 5.05 to 25.78. The value for Cohen’s d is 0.74, which is close to a large effect size. The null hypothesis for the SKILL variable is rejected. In other words, H_{1} : μ_{D} ≠ 0.
Next we discuss the statistics for the competency conceptual understanding. There is a statistically significant increase in scores for the variable CONC (conceptual understanding) from pretest (M = 35.90, SD = 12.81) to posttest (M = 44.29, SD = 13.56), t(17) = 3.634, p = 0.002 (two-tailed). The mean increase in scores for the CONC variable is 8.39 with a 95% confidence interval ranging from 3.52 to 13.26. The value for Cohen’s d is 0.86, which is a large effect size. The null hypothesis for the CONC variable is rejected. That is, H_{1} : μ_{D} ≠ 0.
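The reported effect sizes can be cross-checked against the reported t values. The sketch below assumes that Cohen’s d for paired data is standardised by the standard deviation of the differences, in which case d = t / √n; this assumption is consistent with the values above but is not stated in the source. The function names are hypothetical.

```python
import math


def paired_effect_size(t_value, n):
    """Cohen's d for paired data, d = mean_diff / sd_diff = t / sqrt(n)."""
    return t_value / math.sqrt(n)


def sd_of_differences(mean_diff, t_value, n):
    """Recover s_D from t = mean_diff / (s_D / sqrt(n))."""
    return mean_diff * math.sqrt(n) / t_value


n_low = 18  # low prior topic knowledge participants
for label, mean_diff, t_value in [("SKILL", 15.41, 3.139), ("CONC", 8.39, 3.634)]:
    d = paired_effect_size(t_value, n_low)
    s_d = sd_of_differences(mean_diff, t_value, n_low)
    print(f"{label}: d = {d:.2f}, s_D = {s_d:.2f}")
```

With n = 18 the computed values of d round to 0.74 and 0.86, in agreement with the reported effect sizes.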
Paired samples t-test (high prior topic knowledge)
The paired samples statistics for the high prior topic knowledge group are presented in Tables 5, 6 and 7.
TABLE 5: Paired samples statistics (high prior topic knowledge) (N = 40). 
TABLE 6: Paired samples test (high prior topic knowledge). 
TABLE 7: Paired samples effect sizes (high prior topic knowledge). 
A paired samples t-test was conducted to evaluate the impact on high prior topic knowledge participant scores of a teaching intervention based on retrieval practice and test-potentiated learning. There is a statistically significant increase in scores for the SKILL variable (procedural fluency) from pretest (M = 73.38, SD = 18.56) to posttest (M = 84.71, SD = 13.29), t(39) = 5.87, p < 0.001 (two-tailed). The mean increase in scores for the SKILL variable is 11.33 with a 95% confidence interval ranging from 7.43 to 15.25. The value for Cohen’s d is 0.93, which is a large effect size. The null hypothesis for the SKILL variable is rejected. In other words, H_{1} : μ_{D} ≠ 0.
The statistics for the CONC variable are as follows. There is a statistically significant increase in scores for the variable CONC (conceptual understanding) from pretest (M = 56.36, SD = 17.05) to posttest (M = 69.03, SD = 13.78), t(39) = 7.14, p < 0.001 (two-tailed). The mean increase in scores for the CONC variable is 12.66 with a 95% confidence interval ranging from 9.08 to 16.25. The value for Cohen’s d is 1.13, which is a large effect size. The null hypothesis for the CONC variable is rejected. That is, H_{1} : μ_{D} ≠ 0.
Independent samples t-test
As indicated, an independent samples t-test was performed to compare the mean posttest scores of the low and high prior topic knowledge participants. This was done separately for the competencies procedural fluency and conceptual understanding. The statistics are shown in Tables 8 and 9.
For the independent samples t-test the null hypothesis is that there is no difference between the two groups after exposure to the teaching strategy. In other words, the difference between the posttest means of the low and high prior topic knowledge groups, here also denoted μ_{D}, is zero:
H_{0} : μ_{D} = 0
The alternative hypothesis is that the intervention caused the mean difference to be nonzero:
H_{1} : μ_{D} ≠ 0
The level of significance is set at α = 0.05 for a two-tailed test.
Levene’s test for equality of variances tests whether the variance of scores for the two groups is the same. The outcome of this test determines which of the t values in the SPSS output is the correct one to use. If the significance value for Levene’s test is larger than 0.05, the first row (equal variances assumed) is used; otherwise the second row (equal variances not assumed) is used.
An independent samples t-test was conducted to compare SKILL scores (after exposure to the teaching strategy) between low and high prior topic knowledge groups. There was a significant difference in scores for low prior topic knowledge participants (M = 68.73, SD = 11.70) and high prior topic knowledge participants (M = 84.71, SD = 13.29); t(56) = 4.39, p < 0.001 (two-tailed).
The magnitude of the differences in the means is 15.98 with a 95% confidence interval of 8.68–23.27. Therefore, the null hypothesis is rejected. That is H_{1} : μ_{D} ≠ 0.
An independent samples t-test was conducted to compare CONC scores (after exposure to the teaching strategy) between low and high prior topic knowledge groups. There was a significant difference in scores for low prior topic knowledge participants (M = 44.29, SD = 13.56) and high prior topic knowledge participants (M = 69.03, SD = 13.78); t(56) = 6.36, p < 0.001 (two-tailed). The magnitude of the differences in the means is 24.73 with a 95% confidence interval of 8.68–23.27. Consequently, the null hypothesis is rejected. In other words, H_{1} : μ_{D} ≠ 0.
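The independent samples statistic can be recomputed from the reported summary statistics. The sketch below assumes the equal-variance (pooled) form of the test and takes 2.003 as an approximate two-tailed 5% critical value of t for 56 degrees of freedom; the function name is hypothetical. Because the inputs are rounded, small discrepancies from the SPSS output are to be expected.

```python
import math


def pooled_t_from_stats(m1, s1, n1, m2, s2, n2):
    """Equal-variance independent samples t from summary statistics.

    Returns (mean difference m2 - m1, standard error, t, degrees of freedom).
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    diff = m2 - m1
    return diff, se, diff / se, df


# CONC posttest: low group (n = 18) versus high group (n = 40), as reported.
diff, se, t_value, df = pooled_t_from_stats(44.29, 13.56, 18, 69.03, 13.78, 40)

T_CRIT = 2.003  # assumed two-tailed 5% critical value of t for df = 56
ci = (diff - T_CRIT * se, diff + T_CRIT * se)
print(f"diff = {diff:.2f}, t({df}) = {t_value:.2f}, 95% CI = {ci[0]:.2f} to {ci[1]:.2f}")
```

The recomputed statistic agrees with the reported t(56) = 6.36. The interval obtained in this way is centred on the mean difference and runs from roughly 16.9 to 32.5, which can be compared with the interval given in the independent samples table.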
It is possible that the data might be conflated in terms of those topics that were new to both groups. To investigate this possibility, data for only those topics that were new to both high and low prior knowledge groups were analysed separately and were then compared with an analysis of data for the ‘familiar’ topics. A paired samples test was utilised for this purpose. It should be noted that not all the statistics are presented in the tables other than those that are relevant for the discussion. Table 10 shows the analysis for the low prior knowledge group whereas Table 11 shows the analysis for the high prior knowledge group and Table 12 shows the analysis for all participants.
TABLE 10: Low prior topic knowledge paired samples statistics. 
TABLE 11: High prior topic knowledge paired samples statistics. 
TABLE 12: All participants paired samples statistics. 
A comparison of Tables 10, 11 and 12 indicates that there was a net increase for the unfamiliar topics. Likewise, if the mean differences of the unfamiliar topics are compared to those of the familiar topics, then the increase for the unfamiliar topics was greater. Moreover, the low prior topic knowledge participants had a higher mean score than their high prior knowledge counterparts for the unfamiliar topics. This holds true for both the pretest and the posttest. However, for the familiar topics the reverse is true.
Another possible confound concerns the amount of retrieval practice for the different topics. The concern is that topics covered near the beginning of the study would enjoy more retrieval practice than topics dealt with near the end of the study. The topics at the end would therefore not be influenced as much by the retrieval practice. To investigate this issue, only data from topics that were tested two or more times were used in an additional statistical analysis. Table 13 shows the analysis for the low prior knowledge participants whereas Table 14 shows the statistics for high prior knowledge participants. It should be noted that not all the statistics are presented, only those that are relevant to the discussion.
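The filtering step can be illustrated with a short sketch. It assumes a strictly cumulative schedule in which every test re-examines all content from earlier tests; the schedule, topic names and function name are hypothetical and do not reflect the actual course.

```python
def retrieval_counts(topics_introduced_per_test):
    """Count how many tests cover each topic under cumulative testing.

    topics_introduced_per_test[i] lists the topics first examined in test i;
    every later test is assumed to re-examine them.
    """
    n_tests = len(topics_introduced_per_test)
    counts = {}
    for i, topics in enumerate(topics_introduced_per_test):
        for topic in topics:
            counts[topic] = n_tests - i
    return counts


# Hypothetical schedule of four class tests (topic names are illustrative).
schedule = [["functions"], ["polynomials"], ["matrices"], ["sequences"]]
counts = retrieval_counts(schedule)

# Keep only topics that were tested two or more times.
retained = [topic for topic, n in counts.items() if n >= 2]
print(counts)
print(retained)
```

In this illustration the topic introduced in the last test is examined only once and is therefore excluded from the additional analysis.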
TABLE 13: Low prior topic knowledge paired samples statistics (N = 18). 
TABLE 14: High prior topic knowledge paired samples statistics (N = 40). 
A comparison of the statistics of the new analysis with the previous analysis shows that for the low prior knowledge group there were slight increases for both SKILL and CONC variables. A similar scenario holds for the high prior knowledge group, the exception being a small decrease for the SKILLPOST variable. Since there is no major difference when the analysis is done without the data for which less retrieval practice was done, one can conclude that the analysis was not confounded to the extent that all previous arguments based on the original analysis must be discarded.
Discussion and conclusion
The statistical analysis clearly shows that students with low prior topic knowledge started the course from a considerably lower base (53%) in the competency of procedural fluency than their high prior topic knowledge counterparts, for whom the competency was at 73% pre-intervention (a difference of 20 percentage points). For the competency conceptual understanding the low prior topic knowledge students entered with an even lower base of 36%, compared with 56% for the high prior topic knowledge students (again a difference of 20 percentage points).
Post-intervention, procedural fluency improved for both low and high prior topic knowledge students. There was an increase from 53% to 69% (a relative increase of about 30%) for the former and an increase from 73% to 85% (a relative increase of about 16%) for the latter. For conceptual understanding there were post-intervention increases from 36% to 44% (about 22%) and from 56% to 69% (about 23%). Post-intervention, the difference between the two groups decreased to 16 percentage points for the competency procedural fluency, whereas for the competency conceptual understanding it increased to 24 percentage points.
It is not easy to attribute causality using a pretest-posttest quasi-experimental design, particularly since, as argued earlier, the independent variable was not manipulated and there was no random assignment to groups. However, if all possible rival explanations can be eliminated, then one can make a case for causality. One rival explanation for the findings could be that the difficulty level of posttest items was lower than that of pretest items. However, it was indicated previously that the percentage of more difficult items was higher for the posttest than for the pretest. Hence the improvement in scores cannot be attributed to difficulty level. Furthermore, if a student performs badly in a test, there is a chance to make up for it in the next test, whereas a student who performs badly in the exam has no second chance. It could therefore be argued that anxiety would be higher for the exam. Since the exam covers all the work and its difficulty level was higher, the expectation was that posttest scores would be lower than pretest scores. Since this was not the case, there has to be some variable to which the improvements are attributable.
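The distinction between a change in percentage points and a relative change can be made explicit with a short calculation. The sketch below uses the group means reported in the paired samples statistics; the function names are hypothetical. Because it works with the unrounded means, the relative figures differ slightly from those obtained with whole-number percentages.

```python
def point_change(before, after):
    """Absolute change in percentage points."""
    return after - before


def relative_change(before, after):
    """Change relative to the starting value, in percent."""
    return 100 * (after - before) / before


# Group means (percent) as reported in the paired samples statistics.
means = {
    "low SKILL": (53.32, 68.73),
    "high SKILL": (73.38, 84.71),
    "low CONC": (35.90, 44.29),
}
for label, (pre, post) in means.items():
    print(f"{label}: {point_change(pre, post):.2f} points, "
          f"{relative_change(pre, post):.1f}% relative")
```

For the low prior topic knowledge group, for example, procedural fluency rose by 15.41 percentage points, which corresponds to a relative increase of roughly 29%.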
Another possible rival explanation could be good studying methods. Since tests covered less material and were closer to when it was presented the expectation is that performance would be better in the pretests. Since performance in posttests was better this argument is rejected.
It can also be argued that exceptionally good lecturing caused the improvement. However, the argument can again be presented that performance in tests should be better if this was the case. Since the exams took place long after some of the lecturing sessions a major amount would be forgotten. This argument therefore is also rejected.
When the data for content that was tested only once (or not at all) were excluded, the scores improved. Performance was therefore weaker on content for which there was less retrieval practice. Since there are no plausible rival explanations for the findings, the contention is that the improvements are attributable to retrieval practice and test-potentiated learning.
The research question for the study was: how do high and low prior topic knowledge students compare in terms of the mathematical competencies of procedural fluency and conceptual understanding after exposure to a teaching strategy based on retrieval practice and test-potentiated learning? The findings show that the teaching strategy enhanced procedural fluency for both groups but that the low prior topic knowledge group showed a higher increase. This finding is consistent with the literature, namely that the benefits of retrieval practice are greatest for unfamiliar content (Cogliano et al., 2019). The Mathematical Literacy students entered the course with weakly developed procedural knowledge and hence would have found much of the course content unfamiliar.
The findings also indicate that there was significant enhancement of the competency of conceptual understanding for both groups post-intervention. The low prior topic knowledge participants, however, still had a mean (44%) below the pass cutoff, whereas the high prior topic knowledge group improved to close to 70%. The difference between the groups also increased for this competency, signifying that the low prior topic knowledge group was still at an early developmental stage for this competency.
The iterative model for the development of procedural and conceptual knowledge proposed by Rittle-Johnson et al. (2001) suggests that students with low prior knowledge in a particular field are likely to first develop procedural knowledge, which then aids in the cultivation of conceptual knowledge. The findings of this study support this proposition. That is, the findings show that the development of procedural knowledge of low prior topic knowledge students is ahead of their development of conceptual knowledge. This can be seen in the fact that the improvement in procedural fluency is nearly double that in conceptual understanding (15 versus 8 percentage points). This shows that for the low prior topic knowledge students the ability to integrate conceptual understanding and procedural fluency is at an early developmental stage. Evidence for this is the low base of conceptual understanding. The fact that it is increasing is evidence that it is developing. This aligns with existing literature which suggests that as competencies in procedural fluency and conceptual understanding improve, the capacity to merge conceptual and procedural knowledge structures into a unified knowledge structure also improves (Baroody & Dowker, 2003; Linn, 2006; Schneider & Stern, 2009; Schneider et al., 2011). Hence, although the level of development is still low, it is moving in the right direction. This was the ultimate objective of the study: the development of procedural fluency and conceptual understanding of participants.
As indicated previously, findings of some research indicate that students with different levels of prior topic knowledge can have different learning outcomes after exposure to the same learning materials and teaching strategy (Xiaofeng et al., 2016). There is also evidence that students with high prior topic knowledge can develop better understanding of new content (Dunlosky et al., 2013) and are able to remember more than individuals with less prior knowledge (Thompson & Zamboanga, 2003). Findings from the present study corroborate these findings. The low and high prior topic knowledge students had different levels of development of procedural fluency and conceptual understanding after exposure to the teaching strategy. The high post-intervention mean scores for the high prior topic knowledge students are evidence that these students are able to develop better understanding of new content and are able to remember more.
The foregoing discussion provides answers to the questions as to which type of knowledge is retained better and understood better after exposure to a teaching strategy based on retrieval practice and test-potentiated learning. The statistical results show that for low prior topic knowledge students procedural fluency is enhanced more and retained more than conceptual understanding. As indicated previously, the competency of procedural fluency is predicated on Memorised Reasoning based on Factual Knowledge, Memorised Reasoning based on Procedural Knowledge, FAR based on Procedural Knowledge, FAR based on Flexible Procedural Knowledge, and DAR based on Flexible Procedural Knowledge. Therefore, these categories of reasoning and knowledge were privileged by the teaching strategy for the low prior topic knowledge group. The findings were similar for the high prior topic knowledge group.
Likewise as indicated previously, the competency of conceptual understanding is based on Memorised Reasoning based on Conceptual Knowledge, FAR based on Conceptual Knowledge, DAR based on Conceptual Knowledge and Local CR based on Conceptual Knowledge. Although both groups showed improvements in these categories, it was much higher for the high prior topic knowledge group. The improvement for these categories for the high prior topic knowledge group was also higher than their procedural fluency categories, whereas it was the reverse for the low prior topic knowledge group (procedural fluency greater than conceptual understanding). The question is: what can be possible explanations for these findings?
Present understanding of conceptual and procedural knowledge indicates that these two types of knowledge are not held as completely separate systems (Hiebert & Lefevre, 1986; Rittle-Johnson et al., 2001). It is possible that one type of knowledge may be more developed at a certain point in time, but this does not mean that the other type of knowledge is entirely lacking (Rittle-Johnson et al., 2001). A well-developed mathematical knowledge base includes important and fundamental cognitive links between conceptual and procedural knowledge. Importantly, the formation of suitable cognitive links between conceptual and procedural knowledge is believed to aid in efficient memory storage and successful retrieval of procedures in relevant situations (Hiebert & Lefevre, 1986).
Procedural knowledge is generally believed to be more associated with a particular type of problem, as procedures are typically practised in relation to a specific problem type. Hence, if students already have some familiarity with the material to be learned, conceptual knowledge might have a greater influence on the development of procedural knowledge, and not the other way around. On the other hand, students who have little prior knowledge in a field are likely to first develop procedural knowledge, which then aids in the cultivation of conceptual knowledge (Schneider et al., 2011). This argument explains why the low prior knowledge participants of the current study showed a lower increase in the development of their conceptual knowledge. That is, these students entered with low levels of procedural knowledge and hence the development of their conceptual knowledge was slower than that of their high prior topic knowledge counterparts.
Hiebert and Lefevre (1986) propose several reasons for the more successful storage and retrieval of procedures when they are linked to conceptual knowledge. When procedures are tied to conceptual knowledge, they become part of an information network bound by semantic relationships, which are less likely to degrade over time as memory tends to last longer for meaningful connections. Since the procedures are integrated into a knowledge network, a variety of cognitive links can be employed to access and retrieve them. It is believed that conceptual knowledge also serves an executive control function, as it is used not only to oversee the selection and application of a procedure, but also to evaluate the appropriateness of the procedural result.
Familiar procedures can lessen the cognitive load involved in problem-solving, thereby improving the ability to tackle more intricate problems. The rationale for this is that automated procedures free up cognitive resources. These resources can then be used, for instance, to identify connections between new aspects of problems or to apply pertinent conceptual knowledge (Hiebert & Lefevre, 1986). This argument explains why the low prior topic knowledge participants’ improvement in conceptual knowledge is much lower than that of the high prior topic knowledge participants. Since the low prior topic knowledge students were still developing their procedural knowledge, their cognitive resources were occupied with applying their procedural knowledge in problem-solving. This effort inhibited their ability to identify relationships between novel features of problems and hence they could not provide correct solutions to the more complex problems where conceptual knowledge was required.
The theories of retrieval and storage strength (Bjork & Bjork, 1992) provide explanations for the findings based on the implemented teaching strategy. As previously stated, storage strength pertains to the durability of knowledge, while retrieval strength signifies the ease or difficulty of recalling that knowledge. Additionally, there is a negative correlation between retrieval strength and increases in storage strength. Given that the gap between tests (or retrieval practice) was roughly 3 weeks, retrieval was necessarily effortful, as the retained knowledge would have decayed significantly over this period. The effortful retrieval in turn enhanced storage strength. That is, long-term retention of knowledge was promoted.
The fact that analogous test items were presented to participants in each subsequent test meant that the same knowledge would be retrieved more than once. Moreover, since analogous items were not exactly the same, one needed to use and view stored knowledge in new ways in order to produce a solution. That is, the retrieved knowledge was changed. In other words, the participant learnt from the test. More complex problems enhanced the development of conceptual understanding. That is, these kinds of problems required connections to be made between disparate pieces of information. Making connections between pieces of knowledge implies that conceptual knowledge is enhanced. This is of cardinal importance in the learning of mathematics.
The fact that South African students struggle with mathematics is well documented. Therefore research is required to determine which teaching and learning strategies could help to alleviate the problem. This study is a contribution in this regard.
Possible confounding
For parametric techniques it is assumed that the population from which the samples are taken has normally distributed scores. For large samples (e.g. 30+) the violation of this assumption should not cause any major problems. Since the number of participants in the high prior topic knowledge group was greater than 30, it does not present a problem for them. It could, however, be a problem for the low prior topic knowledge students since they numbered only 18. The data were therefore checked for severe violations of the normality criteria. Histograms together with skewness and kurtosis parameters indicated that the normality requirement was not violated too severely.
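A check of this kind can be sketched as follows. The example computes the simple moment-based skewness and excess kurtosis, both of which are close to zero for normally distributed data; SPSS reports bias-adjusted versions of these statistics, so its values differ slightly for small samples. The scores and the function name are hypothetical and are not the study data.

```python
def skewness_and_excess_kurtosis(values):
    """Moment-based sample skewness g1 and excess kurtosis g2."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    return m3 / m2**1.5, m4 / m2**2 - 3.0


# Hypothetical scores for 18 participants (illustrative, not the study data).
scores = [31, 38, 42, 45, 47, 50, 52, 53, 55, 56, 58, 60, 62, 65, 68, 71, 75, 82]
g1, g2 = skewness_and_excess_kurtosis(scores)
print(f"skewness = {g1:.2f}, excess kurtosis = {g2:.2f}")
```

Values far from zero, read together with the histogram, would indicate a departure from normality that could matter for a group of this size.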
In quantitative research, the four primary forms of validity typically examined are internal, external, construct, and statistical conclusion. In a design that involves a single group undergoing pretesting and posttesting, factors such as history, maturation, testing, instrumentation, and regression could potentially impact internal validity.
In this research, there were no historical events (known to the investigator) that influenced the results. Maturation is defined as any physical or psychological transformations that might happen within the participants of the study, which could potentially impact their performance on the dependent variable. Given that the majority of the participants in this study were mature individuals, no maturation-related threats were anticipated.
In this research, the pretest and posttest scores were derived from multiple tests, and each subsequent test had different items. Also, since students wrote tests individually, the possibility of intergroup influence was negated. Therefore, the risk to internal validity was diminished. Since the same instrument was utilised in all parts of the study and test items were equivalent, no instrumentation validity threat is expected. The researcher therefore contends that, since no plausible rival explanation for the findings was found, the findings are attributable to the independent variable.
External validity is the extent to which a study’s results can be extrapolated to other groups, environments, and situations. Achieving external validity in quantitative research involves two steps: first, identifying the target population, and second, randomly selecting a sample from that population. However, due to various practical constraints, these steps may not always be feasible. Often in educational research, an accessible population is used in place of the target population. A random sample is then chosen from this accessible population. In this study, the students enrolled in the author’s class were chosen as the accessible population. Rather than randomly selecting from this group, the entire class was included in the study. The participation of the whole class increased the likelihood that the accessible population is a good representation of the target population.
Acknowledgements
Competing interests
The author declares that they have no financial or personal relationship(s) that may have inappropriately influenced them in writing this article.
Authors’ contributions
B.M.M. declares that they are the sole author.
Ethical considerations
Ethical clearance to conduct this study was obtained from the University of the Western Cape Senate Research Committee (No. 14/1/5). Informed consent was obtained from all participants.
Funding information
This research received no specific grant from any agency in the public, commercial or not-for-profit sectors.
Data availability
The data that support the findings of this study are available from the corresponding author, B.M.M., on reasonable request.
Disclaimer
The views and opinions expressed in this article are those of the author and do not necessarily reflect the official policy or position of any affiliated agency of the author.
References
Abbott, E.E. (1909). On the analysis of the factor of recall in the learning process. The Psychological Review: Monograph Supplements, 11(1), 159. https://doi.org/10.1037/h0093018
Anderson, L.W. (Ed.), Krathwohl, D.R. (Ed.), Airasian, P.W., Cruikshank, K.A., Mayer, R.E., Pintrich, P.R., Raths, J., & Wittrock, M.C. (2001). A taxonomy for learning, teaching, and assessing: A revision of Bloom’s Taxonomy of Educational Objectives (complete edition). Longman.
Arnold, K.M., & McDermott, K.B. (2013). Test-potentiated learning: Distinguishing between direct and indirect effects of tests. Journal of Experimental Psychology: Learning, Memory, and Cognition, 39(3), 940. https://doi.org/10.1037/a0029199
Baroody, A.J., & Dowker, A. (Eds). (2003). The development of arithmetic concepts and skills: Constructing adaptive expertise. New York, NY: Routledge.
Bergqvist, E. (2007). Types of reasoning required in university exams in mathematics. Journal of Mathematical Behavior, 26, 348–370. https://doi.org/10.1016/j.jmathb.2007.11.001
Bjork, R.A., & Bjork, E.L. (1992). A new theory of disuse and an old theory of stimulus fluctuation. In A. Healy, S. Kosslyn, & R. Shiffrin (Eds), From learning processes to cognitive processes: Essays in honor of William K. Estes (Vol. 2, pp. 35–67). Hillsdale, NJ: Erlbaum.
Butler, A.C. (2010). Repeated testing produces superior transfer of learning relative to repeated studying. Journal of Experimental Psychology: Learning, Memory, and Cognition, 36(5), 1118. https://doi.org/10.1037/a0019902
Carroll, M., Campbell-Ratcliffe, J., Murnane, H., & Perfect, T. (2007). Retrieval-induced forgetting in educational contexts: Monitoring, expertise, text integration, and test format. European Journal of Cognitive Psychology, 19(4–5), 580–606. https://doi.org/10.1080/09541440701326071
Chan, J.C., Meissner, C.A., & Davis, S.D. (2018). Retrieval potentiates new learning: A theoretical and meta-analytic review. Psychological Bulletin, 144(11), 1111–1146. https://doi.org/10.1037/bul0000166
Cogliano, M., Kardash, C.M., & Bernacki, M.L. (2019). The effects of retrieval practice and prior topic knowledge on test performance and confidence judgments. Contemporary Educational Psychology, 56, 117–129. https://doi.org/10.1016/j.cedpsych.2018.12.001
Department of Basic Education. (2011). National curriculum statements Grades R-12. Retrieved from http://www.education.gov.za
Dunlosky, J., Rawson, K.A., Marsh, E.J., Nathan, M.J., & Willingham, D.T. (2013). Improving students’ learning with effective learning techniques: Promising directions from cognitive and educational psychology. Psychological Science in the Public Interest, 14(1), 4–58. https://doi.org/10.1177/1529100612453266
Eisenhart, M., Borko, H., Underhill, R., Brown, C., Jones, D., & Agard, P. (1993). Conceptual knowledge falls through the cracks: Complexities of learning to teach mathematics for understanding. Journal for Research in Mathematics Education, 24(1), 8–40. https://doi.org/10.5951/jresematheduc.24.1.0008
Glover, J.A. (1989). The ‘testing’ phenomenon: Not gone but nearly forgotten. Journal of Educational Psychology, 81(3), 392. https://doi.org/10.1037//00220663.81.3.392
Grimaldi, P.J., & Karpicke, J.D. (2012). When and why do retrieval attempts enhance subsequent encoding? Memory & Cognition, 40(4), 505–513. https://doi.org/10.3758/s1342101101740
Hiebert, J., & Grouws, D.A. (2007). The effects of classroom mathematics teaching on students’ learning. In F.K. Lester (Ed.), Second handbook of research on mathematics teaching and learning (pp. 371–404). Information Age Publishing.
Hiebert, J., & Lefevre, P. (1986). Conceptual and procedural knowledge in mathematics: An introductory analysis. In J. Hiebert (Ed.), Conceptual and procedural knowledge: The case of mathematics (pp. 1–27). Lawrence Erlbaum Associates.
Izawa, C. (1966). Reinforcement-test sequences in paired-associate learning. Psychological Reports, 18(3), 879–919. https://doi.org/10.1037/e469452008011
Johnson, B., & Christensen, L. (2012). Educational research: Quantitative, qualitative, and mixed approaches. SAGE.
Karpicke, J.D. (2012). Retrieval-based learning: Active retrieval promotes meaningful learning. Current Directions in Psychological Science, 21(3), 157–163. https://doi.org/10.1177/0963721412443552
Kerlinger, F.N. (1986). Foundations of behavioural research (3rd ed.). New York, NY: Holt, Rinehart & Winston.
Krathwohl, D.R. (2002). A revision of Bloom’s taxonomy: An overview. Theory Into Practice, 41(4), 212–218. https://doi.org/10.1207/s15430421tip4104_2
Linn, M.C. (2006). The knowledge integration perspective on learning and instruction. In R.K. Sawyer (Ed.), The Cambridge handbook of the learning sciences (pp. 243–264). Cambridge: Cambridge University Press.
Lithner, J. (2008). A research framework for creative and imitative reasoning. Educational Studies in Mathematics, 67(3), 255–276. https://doi.org/10.1007/s1064900791042
National Research Council. (2001). Adding it up: Helping children learn mathematics. In J. Kilpatrick, J. Swafford, & B. Findell (Eds.), Mathematics learning study committee, center for education, division of behavioral and social sciences and education. National Academy Press, pp. 5–6
Rittle-Johnson, B., Schneider, M., & Star, J.R. (2015). Not a one-way street: Bidirectional relations between procedural and conceptual knowledge of mathematics. Educational Psychology Review, 27, 587–597. https://doi.org/10.1007/s106480159302x
Rittle-Johnson, B., Siegler, R.S., & Alibali, M.W. (2001). Developing conceptual understanding and procedural skill in mathematics: An iterative process. Journal of Educational Psychology, 93(2), 346–362. https://doi.org/10.1037/00220663.93.2.346
Roediger, H., & Karpicke, D. (2006). The power of testing memory: Basic research and implications for educational practice. Perspectives on Psychological Science, 1(3), 181–210.
Roediger III, H.L., & Butler, A.C. (2011). The critical role of retrieval practice in long-term retention. Trends in Cognitive Sciences, 15(1), 20–27. https://doi.org/10.1016/j.tics.2010.09.003
Schneider, M., Rittle-Johnson, B., & Star, J.R. (2011). Relations among conceptual knowledge, procedural knowledge, and procedural flexibility in two samples differing in prior knowledge. Developmental Psychology, 47(6), 1525. https://doi.org/10.1037/a0024997
Schneider, M., & Stern, E. (2009). The inverse relation of addition and subtraction: A knowledge integration perspective. Mathematical Thinking and Learning, 11, 92–101. https://doi.org/10.1080/10986060802584012
Star, J.R. (2005). Reconceptualising procedural knowledge. Journal for Research in Mathematics Education, 36(5), 404–411.
Thompson, R.A., & Zamboanga, B.L. (2003). Prior knowledge and its relevance to student achievement in introduction to psychology. Teaching of Psychology, 30(2), 96–101.
Thompson, R.A., & Zamboanga, B.L. (2004). Academic aptitude and prior knowledge as predictors of student achievement in introduction to psychology. Journal of Educational Psychology, 96(4), 778.
Wissman, K.T., & Rawson, K.A. (2018). Test-potentiated learning: Three independent replications, a disconfirmed hypothesis, and an unexpected boundary condition. Memory, 26(4), 406–414. https://doi.org/10.1080/09658211.2017.1350717
Xiaofeng, M., Xiaoe, Y., Yanru, L., & AiBao, Z. (2016). Prior knowledge level dissociates effects of retrieval practice and elaboration. Learning and Individual Differences, 51, 210–214. https://doi.org/10.1016/j.lindif.2016.09.012
Yeo, D.J., & Fazio, L.K. (2019). The optimal learning strategy depends on learning goals and processes: Retrieval practice versus worked examples. Journal of Educational Psychology, 111(1), 73. https://doi.org/10.1037/edu0000268
