A challenge encountered when monitoring mathematics teaching and learning at high school is that taxonomies such as Bloom's, and variations of this work, are not entirely adequate for providing meaningful feedback to teachers beyond very general cognitive categories that are difficult to interpret. Challenges of this nature are also encountered in the setting of examinations, where the requirement is to cover a range of skills and cognitive domains. Contestation as to the cognitive level is inevitable, as it is necessary to analyse the relationship between the problem and the learners’ background experience. The challenge in the project described in this article was to find descriptive terms that would be meaningful to teachers. The first attempt at providing explicit feedback was to apply the assessment frameworks that include a content component and a cognitive component, namely knowledge, routine procedures, complex procedures and problem solving, currently used in the South African curriculum documents. The second attempt investigated various taxonomies, including those used in international assessments and in mathematics education research, for constructs that teachers of mathematics might find meaningful. The final outcome of this investigation was to apply the dimensions required to understand a mathematical concept proposed by Usiskin (2012): the skills-algorithm, property-proof, use-application and representation-metaphor dimensions. A feature of these dimensions is that they are not hierarchical; rather, within each of the dimensions, the mathematical task may demand recall but may also demand the highest level of creativity. For our purpose, we developed a two-way matrix using Usiskin's dimensions on one axis and a variation of Bloom's revised taxonomy on the second axis.
Our findings are that this two-way matrix provides an alternative to current taxonomies, is more directly applicable to mathematics and provides the necessary coherence required when reporting test results to classroom teachers. In conclusion we discuss the limitations associated with taxonomies for mathematics.
In the current global educational climate, some degree of regulation is deemed necessary in both the curriculum document prescription and in systemic assessment (Kuiper, Nieveen & Berkvens, 2013). If teachers are to be judged by the outcomes of systemic assessments then at least the components making up the curriculum and the assessment tasks should be made explicit, so that the classroom activities may be aligned and reasoned judgments may be made by teachers concerning their classroom focus.
In the first part of this article, we propose a model for assessment that integrates both external and classroom-based educational functions. In order for this model to function optimally, there is a need for coherence in the description of educational objectives, classroom activities and assessment; we therefore need a common language across all three educational processes.
In the second part of the article, we provide an overview of the main cognitive categories in Bloom's taxonomies, both the original and revised versions, the various frameworks from the Trends in International Mathematics and Science Study (TIMSS), as well as the recent South African curricula.
Bloom's Taxonomy of Educational Objectives was initially conceptualised to assist curriculum planners to specify objectives, to enable the planning of educational experiences and to prepare evaluative devices (Bloom, Engelhart, Furst, Hill & Krathwohl, 1956, p. 2). Because the educational objectives are phrased as general cognitive processes, including activities such as remembering and recalling knowledge, thinking and problem solving, it is necessary to rephrase the particular statements in terms of the subject under consideration (Andrich, 2002). In fact, the taxonomy may be ‘validated by demonstrating its consistency with the theoretical views’ that emerge in ‘the field it attempts to order’ (Bloom et al., 1956, p. 17). The process of thinking about educational objectives, defining the objectives in terms of the mathematical tasks and relating these tasks to the teaching activities and assessment tasks is an important exercise for the policymakers, curriculum designers, test designers and teachers.
An overview of the other taxonomies in use in TIMSS and in the various South African curriculum documents provides the background to the planning, communication and feedback processes for the Grade 9-11 monitoring and evaluation project with which the authors are currently engaged. The broad question arising from the project needs is: Can the (three) essential elements, an externally designed monitoring component, a classroom-based formative assessment component and a professional development component, be logically and coherently aligned for the purpose of informing teaching and learning?
The sub-question is: How may we best design assessment frameworks (the design tool specifying the purposes, structure and content of an assessment instrument) in such a way that there is coherence from the mathematical knowledge to be taught and learned, through the design of a set of assessment instruments, to providing diagnostic and practical feedback to teachers about learner performance and needs?
The congruence of curriculum, in the sense of what subject knowledge is to be learned, pedagogy, in terms of how particular concepts and skills are to be learned, and assessment, how the two former elements of the educational experience are to be assessed, is the central concern of this article.
Teaching and learning
Good, and especially excellent, teachers cannot all teach to the same recipe. Of course, the same mathematics canon underpins their teaching and student learning, and the ultimate goal is attainment of the abstract and powerful predicative knowledge of mathematics. But the route to this end goal along a developmental path is through the operationalisation of mathematics in terms that can be grasped by young and aspirant mathematicians (see also Vergnaud, 1994). Also, because learners are able to draw from appropriate contexts the mathematical understanding underpinning the formal mathematics, the creative teacher draws on contexts pertinent to the learners and appropriate for generating mathematical understanding.
When teachers plan assessments for their classes, and even for the clusters of classes in a school, the assessment is generally geared to what the learners have been taught. The contents of the test will not be unexpected. The language will be familiar. But in the case of external assessment for qualification purposes, or national systemic studies in which school performance or teacher performance is monitored, or large-scale assessments in which many different countries are involved, the attainment of coherence of language across countries, schools and individual teachers is more difficult to achieve. Countervailing these limitations of an external systemic type of testing is the view that the outcomes of systemic-type assessment should not be the only, nor the primary, source of information for a school evaluation (Andrich, 2009).
The lynchpin of coherence here is attention to the validity of the test components, consistency across the collection sites, thus generating reliable test data, and attention to the overall validity of the assessment programme, including the purpose for which the assessment outcome is to be used (Messick, 1989). Adherence to these requirements is not easy to attain. The attainment demands clear communication about the curriculum contents and about the expected responses from learners. For example, the expectation from the teacher may be that individual concepts, with associated procedures, are acquired; in contrast, the examiner may require the learner to apply problem-solving skills to a mathematical task requiring multiple concepts. Bloom et al. (1956) refer to the need to understand the educational context of the learner in order to correctly align educational assessment and correctly categorise the cognitive levels.
Some criticism of external systemic-type testing is noted here. Schoenfeld (2007) warns that the type of assessment items generally given in tests of larger scale may often work against the type of problem solving process, extended and thorough in nature, advocated by Polya (1957). Others point to socioeconomic factors that impact on the school culture, and therefore on learning and teaching, that warrant deeper consideration (Nichols & Berliner, 2005, 2008; Usiskin, 2012; Wolk, 2012). Questions about teacher autonomy and professionalism, and about who has the professional authority to monitor professional teachers, are of paramount importance. These critiques are noted here but are not the concern of this article.
Webb (1992), in response to dissatisfaction with what he perceives as inadequate testing processes, proposes that mathematics education requires a specific assessment programme. He argues that the then-current assessment models had been based on outdated psychological models designed for purposes no longer relevant. Aligned with this view, we explore later in the article a taxonomy proposed by Usiskin (2012) that has been operationalised in the University of Chicago Schools Mathematics Project textbooks (UCSMP; see http://ucsmp.uchicago.edu/). See for example the Algebra textbook, Teacher's Edition (McConnell et al., 2002).
In answer to the critique of current assessment practices and problems experienced in practice, Bennett and Gitomer (2009) propose a model that provides articulation between three components: systemic assessment (monitoring), formative assessment (classroom-based diagnostics and classroom teaching) and professional development (see Figure 1; see also Bennett, 2010, 2011).
For the external monitoring component we propose, along with Bennett and Gitomer, that any mode of assessment should be aligned with cognitive models that are currently acknowledged as supporting learning. The implication is that when a test is designed for monitoring purposes, both the critical subject knowledge and the associated requirements from a cognitive development perspective are to be considered. Here we note that modern scientific techniques for the generation and analysis of test data may be used to provide information about the individual student, and to ensure reflection on the test instruments themselves and their constituent items. Suitably supported, these methods also permit the tracking of individual needs and performance in the classroom and provide evidence for the extent of change, progress and redress of performance for the specific child. These techniques, critical to the model, are explored elsewhere. See Dunne, Long, Craig and Venter (2012) and Long, Dunne and Mokoena (2014) for discussion on techniques for analysis and reporting of assessment results.
The classroom-based formative assessment component of the model requires that teachers be provided with information obtained through the monitoring component. This information should reflect both apparent learner proficiency and item performance characteristics. The feedback needs to be sufficiently specific to enable the teachers to reflect on how best to meet the emerging needs of the learners as detected within the assessment. We acknowledge that there will be circumstances in which this reflection will have to be accompanied by improvement of teacher mathematical skills, which is the intent of the professional development component.
The professional development component of this system of interventions should be informed by a deeper insight into the nature of the knowledge domains. In essence, the professional development component is required to build with the teachers a model of mathematical development against which teachers may gauge the progress of their learners. The intended curriculum constitutes an essential but incomplete part of this professional development function. The component also involves identifying with the education role players and re-examining the various necessary factors involved in acquiring mathematical proficiency. Teachers and decision-makers together explore the reasons for these critical factors being absent from the school classroom and develop strategies to address that absence.
In order to promote congruence at the three sites, an explicit model of conceptual development from the perspective of mathematics and of cognitive development from the perspective of learning is required (see Vergnaud, 1988). These two components have both a hierarchical trajectory and horizontal breadth encompassing both related mathematics concepts and the required cognitive engagement. In order to make explicit at any one time the breadth and depth of knowledge and the responses required of individuals, an explicit description of the particular knowledge field is required.
The purpose of Bloom's taxonomy and associated challenges


When Bloom gathered a group of assessment specialists together in the mid-20th century, his purpose was to provide the assessment community with a common language about learning goals which would facilitate communication across subject matter, persons and grade levels. In an attempt to ensure development of ‘higher mental processes’ Bloom (1994, p. 2) proposed a common framework for the setting of examinations and for the assessment of these examinations (cited in Andrich, 2002, p. 40). As noted previously, this framework was initially conceptualised as an assessment tool which could aid in the classification of items for item banking purposes.
The educational objectives explicated in the taxonomy could then be translated into behaviours that would provide evidence that the objective had been achieved (Andrich, 2002, p. 41). The aim of the common framework was to help curriculum designers ‘specify objectives so that it becomes easier to plan learning experiences and prepare evaluation devices’ (Bloom et al., 1956, p. 2).
This common language and vocabulary was to serve as a basis for determining the specific meaning of broad educational goals that informed both the local and the international community. It was also a means for ‘determining the congruence of educational objectives, activities and assessments’ (Krathwohl, 2002). The establishment of a broad base of descriptions that could describe a range of educational experience was to guard against the limitations of any curricula that had been narrowly conceptualised. For the assessment community a bank of items covering a range of question types was to provide a solution to the increasing demand for the construction of assessment items.
Bloom's original taxonomy embraced cognitive, affective and psychomotor skills. The cognitive processes included six major components: knowledge, comprehension, application, analysis, synthesis and evaluation. The affective aspect included five major components: receiving, responding, valuing, organising and characterising. There was a third component named psychomotor skills (Bloom et al., 1956). This conceptualisation of educational objectives, embracing a broader view of knowledge and the inferred cognitive responses, was groundbreaking at the time and the effect on education has been an exponential growth in taxonomy use.
It is of interest here that though knowledge is specified as a component, defining this component does not prove that straightforward. Whilst there is an element of memory involved, in that recalling facts, terms, basic concepts and answers forms part of this component, this component also embraces knowledge of specifics (terminology and specific facts), knowledge of ways and means of dealing with specifics and knowledge of the universals and abstractions in a field (principles and generalisations, theories and structures). The idea behind the taxonomy is that it does not only specify breadth but also unfolds a depth of engagement within a particular topic. In this respect the elements of the taxonomy have been regarded as hierarchical, moving from simple to complex, concrete to abstract, so creating a cumulative hierarchy of knowledge and skills (Bloom et al., 1956; Krathwohl, 2002).
Although this idea of hierarchy has been acknowledged as groundbreaking, there has been critique from a number of sources. One of these critiques is that the elements do not necessarily form a hierarchy (Usiskin, 2012, and others). Another view is that whilst the first three elements of the taxonomy are somewhat hierarchical, the last three, in contrast, can be conceptualised as distinct but parallel (Anderson & Krathwohl, 2001).^{1}
Another critique is that the act of cognition is so highly interrelated and connected across its features that any attempt to classify and confine the thinking process is bound to fail. Here we note that Bloom et al. (1956) were acutely conscious of the danger of the fragmentation arising from the use of particular focuses and advocated a degree of classification that did least violence to the construct under investigation. This critique is partly addressed by the revised Bloom's taxonomy that arranges the existing elements into two dimensions, placing types of knowledge on the vertical dimension and the cognitive process dimensions on the horizontal dimension (see Table 1) (Krathwohl, 2002). Three types and six cognitive processes permit a two-way 3 × 6 array of classifications.
TABLE 1: Revised Bloom's taxonomy manifesting two dimensions used to classify assessment items in the mathematics monitoring and evaluation project. 
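The two-way array can be made concrete with a small sketch. The following Python fragment is illustrative only and is not part of the project's tooling: it assumes that the three knowledge types are factual, conceptual and procedural, it takes the six process labels from the revised taxonomy (Krathwohl, 2002), and the helper names and item labels are hypothetical.

```python
# Illustrative sketch only: labels and helper names are assumptions,
# not the project's actual classification tool.
KNOWLEDGE_TYPES = ("factual", "conceptual", "procedural")
COGNITIVE_PROCESSES = (
    "remember", "understand", "apply", "analyse", "evaluate", "create",
)


def empty_grid():
    """Build the 3 x 6 array: one (initially empty) list of items per cell."""
    return {(k, p): [] for k in KNOWLEDGE_TYPES for p in COGNITIVE_PROCESSES}


def place_item(grid, item, knowledge_type, process):
    """Record an item in one cell; reject labels outside the two dimensions."""
    cell = (knowledge_type, process)
    if cell not in grid:
        raise ValueError(f"unknown cell: {cell}")
    grid[cell].append(item)


grid = empty_grid()
# Hypothetical placements of two assessment items.
place_item(grid, "item A", "procedural", "apply")
place_item(grid, "item B", "conceptual", "understand")

# Cells that currently hold at least one item.
occupied = {cell: items for cell, items in grid.items() if items}
print(len(grid))  # number of cells in the array
print(occupied)
```

In a representation of this kind, an item whose classification is contested may simply be placed in more than one cell.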
A further observation that results from the use of the taxonomy rather than the original conceptualisation is that for each subject area, the observable behaviours and the different levels of thinking and performance will manifest differently. The abstract nature of the taxonomy requires that for each subject area, the six levels have to be recontextualised by curriculum developers, examiners and classroom teachers who know the subject discipline (Andrich, 2002).
The reconceptualisation of the taxonomy into two dimensions makes the adaptation for the different subject knowledge domains somewhat easier. The inclusion of metacognitive strategies as a separate category in the revised Bloom's taxonomy is regarded by some as a major advance in that without metacognition, it is argued, learning cannot be claimed (Anderson et al., 2001).
The difficulty of classifying items in terms of Bloom's taxonomies
It is at this point that we reflect on two sets of three items, designed as formative assessment resources, for the formative assessment component of the project. The first set focuses on Algebra (see Worksheet 2 in Appendix 1).
Our attempts to classify the items that were originally created to cover a range of cognitive processes proved difficult. The individual items have been given a temporary home populating the cells. But how does one distinguish ‘remember X conceptual knowledge’, ‘understand X procedural knowledge’, and ‘apply factual knowledge X’ (represented in Table 1)?
We note here that the selection of a cell or cells within which to locate the item depends not only on the understanding of the mathematics involved but also on the approach that the learner may take to solving the problem. An issue arises for a class within which a particular problem has been discussed: reuse of the problem will be classified as ‘remember and apply’, whereas if the particular topic has not been dealt with in class the learner may be required to analyse the problem and apply conceptual understanding. This difficulty confirms the statement by Bloom et al. (1956) that ‘it is necessary to know or assume the nature of the examinees’ prior educational experience’ (p. 20), in order to classify test questions.
Similar conceptual and taxonomic efforts in mathematics


Similar work in mathematics education, in parallel or in conjunction with the work of Bloom, Krathwohl and colleagues, has been conducted in an attempt to achieve congruence from the curriculum, through the pedagogical domain, and into assessment.
Distinctions between types of mathematics knowledge, relational understanding and instrumental understanding by Skemp (1976) describe the theoretically distinct though practically linked constructs. He describes relational understanding as the ability to deduce specific rules and procedures from more general mathematical relations. Instrumental understanding describes the ability to apply a rule to the solution of a problem without understanding how it works. This contrast, however, refers to the learner's understanding and may be an objective for teaching, but cannot easily be distinguished in an assessment item.
The somewhat different terms conceptual knowledge and procedural knowledge are identified by Hiebert and Lefevre (1986) following Scheffler (1965). The distinction is made between conceptual knowledge in which relations are established between concepts and procedural knowledge elements which are sequential in character. Conceptual knowledge is attained by ‘the construction of relationships between pieces of information’ or by the ‘creation of relationships between existing knowledge and new information that is just entering the system’ (Hiebert & Lefevre, 1986, p. 4). Hiebert and Lefevre make a secondary distinction between primary level relationships and the reflective level constructs. The primary level refers to elements of knowledge that are at the same level of abstraction, whilst the reflective level refers to a higher level of abstraction that occurs when two pieces of knowledge initially conceived as separate pieces of knowledge are abstracted to become a principle or concept that is generalisable to other situations. These levels of abstraction align with the purpose of mathematics education expressed by Vergnaud (1988), which is to transform current operational thinking into more advanced concepts that are generalisable across varied situations.
Procedural knowledge, in Hiebert and Lefevre's (1986) definition, is described as knowing the formal language, or the ‘symbol representation system’, knowing algorithms and rules for completing tasks and procedures and knowing strategies for solving problems. In practice, the two perhaps conceptually distinct knowledge types are intricately linked and cannot be distinguished (Long, 2005; Usiskin, 2012; Vergnaud, 1988).
Subsequently, Kilpatrick, Swafford, and Findell (2001) included conceptual understanding and procedural fluency, similar in essence to the terms used by Hiebert and Lefevre (1986), as two of five strands necessary for mathematical proficiency. The other three strands are adaptive reasoning, strategic competence and a productive disposition (Kilpatrick et al., 2001, p. 141).
In essence, the Kilpatrick strands focus on features of learner activity in the mathematics classroom to which a teacher may properly attend. Whilst these strands are useful for the purpose of planning learner activity, they do not function as a taxonomy or typology for purposes of categorising curriculum knowledge or for guiding the design of a test instrument, nor as an instrument to judge teacher competence.
In an attempt to make the design of curriculum, the stating of objectives, the educational activities and the assessment thereof coherent and iteratively cyclical, Usiskin (2012) and colleagues at the UCSMP have conceptualised an elaborated view of what it means to understand mathematics, which comprises five dimensions: skills-algorithm understanding, property-proof understanding, use-application understanding, representation-metaphor understanding and history-culture understanding. This elaborated view of understanding mathematics is conceived from the learner's perspective and as such should be useful in terms of teaching and learning. This taxonomy of understanding will be discussed in connection with the project to which it was applied in a later section.
Given the above alternative distinctions made, we reflect on the process of categorising and describing items for the purpose of communication and for providing feedback to the teachers in our project. We turn to the items on Worksheet 2 and Worksheet 4 (in Appendix 1). Are these items easily classifiable as conceptual knowledge or understanding, or procedural knowledge or procedural fluency? A partial answer from Bloom et al. (1956) is that the classification depends on knowledge of or an assumption about the learners’ prior knowledge. Nevertheless, large-scale studies and national systemic programmes require guiding assessment frameworks.
Trends in International Mathematics and Science Study frameworks: A taxonomy


The TIMSS frameworks have been used to inform curricula and provide a framework against which tests may be constructed and results reported. Although there is no direct evidence, it appears that the Third International Mathematics and Science Study, as it was known in 1995, TIMSS-Repeat (1999) and the Trends in International Mathematics and Science Study 2003, 2007 and 2011 have all engaged with Bloom's taxonomy, both the original and revised versions, and with the various categorisations made in mathematics education literature. In order for the international large-scale studies to make a claim for both reliability and validity, it is essential that they make explicit the frameworks informing the design of the assessment instrument, including both the content domains and the cognitive domain. The early TIMSS studies, in 1995 and 1999, used the term performance expectations to provide the second dimension. These expectations were as follows: Knowing, Using routine procedures and Problem solving at Grade 4; Representing situations mathematically, Using more complex procedures, Generalising and Justifying at Grade 8 (Schmidt, McKnight, Valverde, Houang & Wiley, 1996; see also Table 2).
TABLE 2: Conceptual and cognitive domains: Bloom's (original and revised), TIMSS 1995/1999, 2003, 2007/2011, RNCS and CAPS. 
The categories used in TIMSS 2003 for the cognitive domain were: Knowing facts and procedures (recall, recognise or identify, compute, use tools), Using concepts (know, classify, represent, formulate, distinguish), Solving routine problems (select, model, interpret, apply, verify) and Reasoning (logical, systematic thinking, including both inductive and deductive thinking) (Mullis et al., 2003).
In 2007 and 2011, the categories changed somewhat to Knowing (recall, recognise, compute, retrieve, measure, classify or order), Applying (select, represent, model, implement, solve routine problems) and Reasoning (analyse, generalise, synthesise and integrate, justify, solve non-routine problems) (Mullis et al., 2005, 2009) (Table 2). Without going into detail, one may observe broad similarities across the TIMSS frameworks with the Bloom's taxonomies, both original and revised.
We note here that our items from Worksheet 2 and Worksheet 4 (in Appendix 1) may be allocated to TIMSS content domains fairly easily as the topics in the framework are elaborated to a fine level of detail. The difficulty still remains with assigning a cognitive domain to the items or, to phrase the challenge differently, to assign the expected response of the learner.
Both the TIMSS frameworks and the Bloom's taxonomies (both original and revised) have influenced curriculum planning in many participating countries, including South Africa, over recent decades.
RNCS and CAPS taxonomies in international perspective


In this section we comment, in relation to Bloom's taxonomy and TIMSS, on the South African curricula, the Revised National Curriculum Statement (RNCS) introduced in 2002 (Department of Education, 2002), though only fully implemented some 5 years later, and the Curriculum and Assessment Policy Statement (CAPS) (Department of Basic Education, 2011), introduced in 2011 and implemented from 2012 to 2014.
The categorisations – knowledge, routine procedures, complex procedures, and problem solving – in CAPS (DBE, 2011) are similar to the TIMSS 1995 and 1999 categories. The earlier RNCS curriculum used the same content categories, but had more elaborated cognitive dimensions, which were more akin to the Bloom's categories and the TIMSS 2007 categories. Table 3 provides a summary with the RNCS and CAPS categories somewhat roughly aligned.
TABLE 3: Curriculum framework for test design purposes with CAPS percentages for Grade 9. 
Applying the matrix of content domain categories (mathematical concepts) and the cognitive domain categories (the responses expected from individuals) allows test designers to cover the broad range of knowledge requirements expected by the curriculum. Of course such a matrix of content and learner activity level inevitably sets up artificial distinctions between subject topics and between the responses expected.^{2} There are likely to be many occasions when the test designer will be in a quandary as to which category to assign a particular item. Table 3 provides an example of a standard framework providing content and cognitive domains. The cells would then be populated according to the curriculum requirements, for example as laid out in CAPS.
The requirement is for the designer to populate the tabular framework with suitable types and numbers of items for each cell in the matrix. In the case of our Worksheet 2 and Worksheet 4 items, they may all be allocated to the category Application, which includes routine procedures and complex procedures (Table 3), although some may argue that the items all belong in the Knowledge category.
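The arithmetic behind populating such a framework can be sketched briefly. The Python fragment below is a minimal illustration under stated assumptions: the cognitive categories are those named in CAPS, but the percentage weights and the content domain labels are hypothetical placeholders, and the function name `blueprint` is our own. The actual CAPS percentages for Grade 9 are those reported in Table 3 and would need to be substituted.

```python
from fractions import Fraction

# Cognitive categories as named in CAPS; the weights are PLACEHOLDERS,
# not the CAPS percentages (see Table 3 for those).
COGNITIVE_WEIGHTS = {
    "knowledge": 30,
    "routine procedures": 40,
    "complex procedures": 20,
    "problem solving": 10,
}

# Hypothetical content domains and weights, for illustration only.
CONTENT_WEIGHTS = {
    "content domain A": 50,
    "content domain B": 30,
    "content domain C": 20,
}


def blueprint(total_marks, content_weights, cognitive_weights):
    """Return the target marks for each (content, cognitive) cell.

    Weights are whole-number percentages; exact fractions are used so
    that the cell targets always sum to the test total.
    """
    if sum(content_weights.values()) != 100:
        raise ValueError("content weights must sum to 100")
    if sum(cognitive_weights.values()) != 100:
        raise ValueError("cognitive weights must sum to 100")
    return {
        (content, cognitive): Fraction(total_marks * cw * gw, 10000)
        for content, cw in content_weights.items()
        for cognitive, gw in cognitive_weights.items()
    }


plan = blueprint(100, CONTENT_WEIGHTS, COGNITIVE_WEIGHTS)
for cell, marks in plan.items():
    print(cell, float(marks))
```

The targets produced in this way are only a starting point: as noted above, the designer still has to decide to which cell a particular item belongs, and that decision remains a matter of judgement.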
The authors acknowledge that the depth of description in these taxonomies and frameworks has not been provided in this article. They are listed rather to show how there are similarities and differences across these frameworks which then point to the complexity of constructing such taxonomies.
It is necessary for the taxonomy and the various frameworks to be transformed into subject-specific descriptions (Andrich, 2002; Van Wyke & Andrich, 2006). The TIMSS descriptions have achieved this requirement (see Mullis et al., 2003, 2005, 2009). An interesting divergence to be explored is that whilst Bloom's original and revised taxonomies claim a hierarchy of cognitive processes, the TIMSS framework claims only minimal hierarchy of cognitive domains with a range of difficulty within each cognitive domain (Mullis et al., 2003, p. 32).
Challenges in 21st century assessment


The challenge to test developers in the 21st century is to achieve some congruence between tests used for monitoring or summative purposes, for the active classroom and classroom-based assessment. We also propose that in addition to the alignment required for these modes of assessment, there is also critical engagement in a professional development cycle. The congruence of educational objectives, teaching and learning activities and assessment envisaged by Bloom (see Krathwohl, 2002) is difficult to achieve. However, given the importance of aligning assessment practices with classroom practices, it is necessary to have a framework that is explicit and is in some respects common to both settings.
The current monitoring and evaluation project under consideration in this article has an external monitoring component: there is also in the design a feedback component provided to teachers in the interest of improving teaching and learning. The model for this project based on the work of Bennett and Gitomer (2009), and Bennett (2010, 2011), has been explained earlier in the article. The content of the assessment programme requires reviewing and making decisions about substantive mathematics knowledge.^{3} In addition, the fact that there is feedback to the teachers means that there should be a common conceptual language and some congruence of expectations across all three sites, the curriculum, classroom teaching and external assessment.
In this project the problem emerged of making explicit the content of the curriculum framework for learners in Grade 9. (The project also encompassed Grades 8, 10 and 11; the focus in this article is Grade 9.) The research team believed that making the framework explicit would serve three purposes: firstly, it would provide some direction to the constructors of the test items and provide an overview of the test; secondly, the explicit descriptions could provide feedback for teachers; thirdly, in the interests of democratic participation, the design of the test would be transparent (see Appendix 2 and Appendix 3).
The broad question, as stated earlier, is: Can the three essential elements, a monitoring component, a formative assessment component and a professional development component be logically and coherently aligned for the purpose of informing teaching and learning?
The sub-question is: How may we best design assessment frameworks (the design tool specifying the purposes, structure and content of an assessment instrument) in such a way that there is full coherence from the mathematical knowledge to be taught and learned, through the set of assessment instruments, to the provision of diagnostic and practical feedback to teachers about learner performance and needs?
In the early phases of the project, we explored alternatives to the current practice, which draws on Bloom's taxonomy and variations as described in the national curriculum documents. In particular, we examined the function of taxonomies in guiding a monitoring and developmental process.
There is the obvious difficulty of assigning mathematics items to one particular mathematics category, but when assigning an item to a cognitive category, the problem is one of presuming how the learner will respond to the item. Whether an item is categorised as knowledge, routine procedures or complex procedures and problem solving (as in the CAPS, Table 2) depends on the level of knowledge acquired by the learner; this level relates directly to what has been taught (Bloom et al., 1956; Usiskin, 2012). A possible solution to this predicament of interpretation is to limit the categories to mathematics components rather than attempting to second-guess how the generic learner will respond. This approach, focusing on the mathematical content of the question, may circumvent the difficulty test designers have in manipulating the mathematics to fit a cognitive category. A second criterion for the selection of a taxonomy, or criteria for categorising test items, is for the categories to align with teaching and learning. The question to consider here is whether or not feedback from a particular category will provide information to the teacher about needs and interventions.
The first approach we took in this project was to describe in detail what we expected of the learner responding to the item. This approach is exemplified in the patterns, functions and algebra component of a Grade 9 test illustrated in the table in Appendix 2. The cognitive requirements form the horizontal headings across the table.
The purpose of making this content aspect explicit was to inform teachers of the contents of the test so that they could make a reasoned judgement about the performance of their classes in relation to the test during this external monitoring programme. In other words, if the teacher knows that Item X covered probability, and she also knows that she made a judgement call to leave probability out of the Grade 9 work plan with a view to having an intense focus in Grade 10, she would understand her students' lack of performance in this section.
A second approach categorised the items in terms of the dimensions of understanding identified by Usiskin (2012; see also Appendix 3). Using three criteria for a useful taxonomy, that is, firstly to stay true to the mathematics, secondly to guide a balanced assessment and thirdly to provide useful feedback to teachers, we thought to explore the potential of the five dimensions of understanding proposed by Usiskin, which are also operationalised in the UCSMP high school textbooks. He proposes that for a full understanding of concepts, five dimensions are necessary:

The skills-algorithm dimension of understanding deals with the procedures and algorithms required to achieve answers. This dimension includes the understanding of procedures and algorithms, which Usiskin (2012) and others assert is much deeper than what has been called procedural understanding or procedural fluency (see Kilpatrick et al., 2001; Long, 2005). The understanding and ability to carry out a skill invariably involves at base the understanding of the associated concept and requires all sorts of skills. This dimension of understanding mathematics concepts is what is mostly addressed in school classrooms and found in systemic type tests.

The property-proof understanding of concepts deals with the principles underlying, for example, the number system and operations. It may be argued that a procedure is only really understood when one can identify the mathematical properties that underlie the procedure. Knowledge of the properties and being able to ‘prove’ that the procedure works enables one to more confidently generalise the procedure to other problems. Here we may contrast conceptual understanding with procedural understanding, although as argued previously this distinction has to be qualified.

The use-application understanding of mathematics deals with the applications of mathematics in real situations. A person may understand how to perform some procedure and may know why the method works, but cannot fully understand unless they know when, why and how to use the skill and procedure in applications. Applications are not necessarily higher order thinking, but rather a different type of thinking according to Usiskin (2012).

Usiskin (2012) avers that the three types of understanding previously described do not give a complete picture: to fully understand a concept a person must be able to represent the concept in different ways. The representation-metaphor understanding refers to the pictures, graphs or objects that illustrate concepts and that can be used interchangeably with symbolic representation. Such analogies may need to be in one or more of verbal, figural, graphical or tabular modes and may need to invoke more than a linear ordering or more than a single static dimension. They may also require a location in time.

The fifth is the history-culture dimension. Whilst this theme is an important dimension of understanding, it cannot easily be tested where responses require only short answers. It reflects a sense of the interrelatedness of mathematical content and its embedding in the social fabric of experience. Some key consequences of this dimension of understanding include an appreciation of the utility and creativity associated with mathematical thinking and problem solving at the level of the learner and an insight into the proximity of mathematics. We suggest it has motivational consequences.
In applying this revised taxonomy we faced two dilemmas. The first was that we had to conform in some degree to the status quo. The CAPS document, the legal framework guiding teachers in their everyday teaching and assessment, requires strict adherence. In that document the four levels applied are Knowledge, Routine procedures, Complex procedures and Problem solving (DBE, 2011). We generally use three categories, Knowledge, Applications and Problem solving.
The second dilemma was where to include problem solving. Usiskin's (2012) focus is on understanding a concept. Does problem solving form part of the dimensions of understanding? Could we, for example, place problem solving in the use-application category, or should it have a category of its own?
The process ‘problem solving’ has many different interpretations. In some sectors problem solving means a ‘word sum’; to others the term means encountering a problem never seen before by the learner cohort. This salient but inherently unverifiable definition is very difficult in practice because a teacher or test designer may never know whether a learner has seen a particular problem type previously or not. The good teacher, enthusiastic parent or grandparent and the Internet could all have a part to play in rendering a really good problem routine, in that it becomes something the child has seen and perhaps solved before. Problem solving according to Polya (1957) has distinct phases. The problem solver when confronted with a problem they have not seen previously needs to firstly understand the problem, then think about the strategy to use, then ‘generate a relevant and appropriate easier related problem’, then ‘solve the related problem’, and finally ‘figure out how to exploit the solution or method to solve the original problem’ (Schoenfeld, 2007, p. 66).
Taking this process seriously means that problem solving is not possible in a standard testing situation. We have to acknowledge here that our tests are omitting a very significant part of mathematics. In fact, Schoenfeld (2007) asserts that the types of questions and answers common in many mathematics classrooms work against the generation of good problem solvers in those classrooms. In the case of problem solving we have compromised and included the notion of problem solving as a separate category, although knowing that the items allocated to that category are only shadows of what Polya would describe as a real problem.
So, in Table 4, we have assigned the items from Worksheet 2 and Worksheet 4 (see Appendix 1) to one of the dimensions, knowing that the allocation to another single dimension may be argued and that a single item may well span two categories, true to the nature of mathematics applications. Each of these four dimensions of understanding, skills-algorithm, property-proof, use-application and representation-metaphor, has aspects that can be memorised; they also have potential for the highest level of creative thinking, for example the invention of a new algorithm (Usiskin, 2012). Each of the dimensions is relatively independent of the others. Each of the understandings has proponents who teach mathematics largely from that single perspective. Usiskin (2012) claims, however, that the understanding of mathematics is multidimensional, with each of these dimensions contributing some elements of the notion of understanding.
TABLE 4: Dimensions of understanding, levels of processing, and possible weightings. 
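The two-way layout named in the Table 4 caption can be pictured as a simple tally of items per cell. What follows is a minimal sketch under stated assumptions, not the project's actual tooling: the item labels, the level names (`recall`, `apply`, `create`) and the example allocations are hypothetical placeholders; only the four dimension names are taken from Usiskin (2012).

```python
from collections import Counter

# Usiskin's (2012) four testable dimensions of understanding.
DIMENSIONS = ("skills-algorithm", "property-proof",
              "use-application", "representation-metaphor")
# Hypothetical levels of processing; Table 4 may label them differently.
LEVELS = ("recall", "apply", "create")


def tally_items(allocations):
    """Count items per (dimension, level) cell of the two-way matrix."""
    counts = Counter()
    for item, (dimension, level) in allocations.items():
        if dimension not in DIMENSIONS or level not in LEVELS:
            raise ValueError(f"unknown cell for {item}: {(dimension, level)}")
        counts[(dimension, level)] += 1
    # Build the full grid so that empty cells are reported as 0.
    return {d: {lv: counts[(d, lv)] for lv in LEVELS} for d in DIMENSIONS}


# Hypothetical allocation of four items, for illustration only.
example_allocations = {
    "item_1": ("skills-algorithm", "recall"),
    "item_2": ("skills-algorithm", "apply"),
    "item_3": ("representation-metaphor", "apply"),
    "item_4": ("use-application", "create"),
}

matrix = tally_items(example_allocations)
for dimension, row in matrix.items():
    print(dimension, row)
```

Reporting empty cells explicitly is a deliberate choice in this sketch: a row of zeros makes it visible when a draft test has no items for a dimension.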
We argue that this taxonomy of dimensions provides, firstly, a necessarily mathematics-specific taxonomy and, secondly, that these dimensions support good teaching practice and that therefore feedback to teachers in terms of these dimensions may be helpful.
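One way to picture feedback expressed in terms of the dimensions is a class-level summary of the proportion of correct responses per dimension. The sketch below is illustrative only: the function name `feedback_by_dimension`, the item labels and the 0/1 scores are all hypothetical, and dichotomous scoring is an assumption made for simplicity rather than a description of the project's marking.

```python
def feedback_by_dimension(item_dimension, class_scores):
    """Return the proportion of correct responses per dimension.

    item_dimension: item label -> dimension name
    class_scores:   item label -> list of 0/1 scores, one per learner
    """
    totals = {}
    for item, scores in class_scores.items():
        dimension = item_dimension[item]
        correct, attempts = totals.get(dimension, (0, 0))
        totals[dimension] = (correct + sum(scores), attempts + len(scores))
    # Skip dimensions with no recorded responses to avoid dividing by zero.
    return {d: correct / attempts
            for d, (correct, attempts) in totals.items() if attempts}


# Hypothetical data: three items answered by a class of four learners.
item_dimension = {
    "q1": "skills-algorithm",
    "q2": "skills-algorithm",
    "q3": "representation-metaphor",
}
class_scores = {
    "q1": [1, 1, 0, 1],
    "q2": [1, 0, 0, 1],
    "q3": [0, 0, 1, 0],
}

report = feedback_by_dimension(item_dimension, class_scores)
for dimension, proportion in report.items():
    print(f"{dimension}: {proportion:.0%} correct")
```

A summary of this kind says nothing about why a dimension is weak; it only points the teacher to where a closer look at the items may be worthwhile.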
An interesting observation is that by including an additional, somewhat hierarchical dimension the taxonomy becomes three-dimensional: the mathematical knowledge as listed in the curriculum, the dimensions of understanding and the levels of complexity involved.
Note that the explicit weightings in Table 4 are aligned to the South African curriculum documents and would differ depending on the content domain and on the constitution of the class and their aspirations. Having a class of aspiring engineers may warrant more emphasis on problem solving and the creative application of mathematics, whilst also not neglecting routine algorithms that are an important component of the engineer's tool box.
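To illustrate how a set of weightings could be turned into a mark allocation for a test of a given length, the sketch below splits a total in proportion to the weights. The weights and the total of 50 marks are hypothetical and are not the values in Table 4; the largest-remainder rounding rule is likewise an assumption chosen here so that the parts always add up to the total.

```python
def allocate_marks(weights, total_marks):
    """Split total_marks across categories in proportion to weights.

    Uses the largest-remainder method so the parts sum to total_marks.
    """
    weight_sum = sum(weights.values())
    exact = {k: total_marks * w / weight_sum for k, w in weights.items()}
    marks = {k: int(v) for k, v in exact.items()}  # round each share down
    leftover = total_marks - sum(marks.values())
    # Hand the remaining marks to the largest fractional parts.
    by_remainder = sorted(exact, key=lambda k: exact[k] - marks[k],
                          reverse=True)
    for k in by_remainder[:leftover]:
        marks[k] += 1
    return marks


# Hypothetical weightings (percentages), for illustration only.
weights = {
    "skills-algorithm": 40,
    "property-proof": 20,
    "use-application": 25,
    "representation-metaphor": 15,
}

marks = allocate_marks(weights, total_marks=50)
print(marks, "total:", sum(marks.values()))
```

Changing the weights, for example to give more emphasis to use-application for a class of aspiring engineers, changes the allocation without altering the structure of the framework.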
At the heart of the matter for the curriculum designer, the teacher and the assessment specialist is an advanced understanding of mathematics that takes into account the interconnections between the current school mathematics topics, the connections to the earlier concepts and the progression in subsequent years to more advanced topics (Usiskin, Peressini, Marchisotto & Stanley, 2003). Also required is the exploration of alternate definitions, the linking between concepts, knowledge of a wide range of applications and alternate ways of approaching problems (Usiskin et al., 2003). This background knowledge informs the designer in any systemic testing programme. In a model such as the one envisaged by Bennett & Gitomer (2009), there is the potential that attention be paid to the critical areas of mathematics and that these areas are aligned with the classroom.
The fact that we are constrained by existing test programmes, which serve some purpose in the current system, implies that we need to find ways of progressively adapting the existing requirements. We must simultaneously bear in mind both that administrators and teachers are change weary and that any changes need to be thoroughly debated and explicit consensus reached about the role and forms of systemic tests.
We note here that we have developed formative assessment resources (Worksheet 2 and Worksheet 4 in Appendix 1) that are linked to the monitoring component, and are designed in sets with each of the items covering different dimensions and ranging in difficulty. The purpose of these products is for teacher use in the classroom so that the teacher does not have to rely only on external monitoring for feedback about teaching and learning, but will also have useful resources at his or her disposal. Monitoring and accountability purposes can accommodate time lags that intervention strategies cannot afford.
As has already been observed, professional development that does not relate to the classroom experience may not be useful. In addition, systemic assessment that gives no thought to its diagnostic relevance in the classroom must be questioned. The dilemma here, as with the levels advocated by Bloom, is how to operationalise these levels or components of understanding, in such a way that they manifest evidence that the objectives of the curriculum have been met or that learner proficiency is being developed and exhibited. Bloom's levels or TIMSS cognitive domains convey very little in themselves unless they can be interpreted for a specific mathematical context (as they have been in the TIMSS frameworks). For these systems of categories to be useful, they have to be further elaborated by the subject specialist. The Usiskin taxonomy serves this purpose.
Devices such as Table 4 guide the designer of an assessment programme towards appropriate balance and coverage of the curriculum and attempt to cover different types of cognitive engagement for the context of a specific grade. However, further mathematical insight is required to populate such a multidimensional framework with appropriate items. These insights include the apparent difficulty level of items within each cell in the table. A norm-referenced instrument can emerge, suitable for diagnostic and intervention purposes, which will require some form of marking memo. Such an instrument can be valuable in every classroom but perhaps at different times and stages to suit the progress of the learners in each context. This variability suggests the importance of collaborative projects that construct comparable assessment instruments using a common design framework across the targeted curriculum and then share access to the resulting variety of classroom-focused instruments.
Any additional criterion-referencing, as may be desired for adjudicating individual learner attainment in a classroom summative assessment or in a systemic testing programme, will require some external specification of explicit outcome criteria for various levels of performance quality. These criteria require judgments about the extent to which each of the constituent items is an indicator of the required performance levels. These judgments should also be explicitly recorded and may influence memo mark allocations. The related matter of conditions for the legitimacy of addition of marks to establish a single overall performance total is a separate, non-trivial issue that is not discussed further in this article.
The challenge presented to the mathematics education community by Vergnaud (1994) is that the analysis of concepts and processes must be from a mathematical perspective. He asserts that no linguistic or logical system or natural language description, or levels of abstraction, such as Bloom's taxonomy, can provide the ‘concepts sufficient to conceptualise the [mathematical] world and help us meet the situations and problems that we experience’ (Vergnaud, 1994, p. 42).
It is the precision of symbolic representation and well-defined concepts in mathematics that conveys both the essential aspects of the mathematical situation and the schemes used by the learner of mathematics. This somewhat radical stance challenges educational researchers and practitioners, whilst being pragmatic in the current policy environment, to keep in mind the essential mathematics.
A related challenge is to maintain the distinction between a learning environment, which requires extensive investigation and engagement with meaningful contexts, and an external assessment programme, which inevitably focuses on the outcomes of a process. Short-circuiting the learning process with obsessive testing may be counterproductive. Here we are reminded of two modes of evaluation, that of the connoisseur, a genuine appreciation of the art of teaching, and that of the critic, an inspector who moves in with a checklist (Eisner, 1998). Bloom et al.'s (1956) aim in formulating the taxonomy of educational objectives was to extend the repertoire of teaching through engagement with the taxonomy. We envisage that the ideas expressed in this article will provide the impetus for further discussion.
Acknowledgements
The ideas in this article have been generated whilst working on a project conducted by the Centre for Evaluation and Assessment at the University of Pretoria. The project was funded by the Michael and Susan Dell Foundation. The theoretical developments have been the work of the authors.
Competing interests
The authors declare that they have no financial or personal relationship(s) that may have inappropriately influenced them in writing this article.
Authors’ contribution
C.L. (University of Pretoria) conceptualised the article, with major contributions from T.D. (University of Cape Town) and H.d.K. (independent consultant). Most of the writing has been the responsibility of the first author, with critical review and insights into the curriculum provided by the co-authors. All three authors have been involved in the monitoring and evaluation project, and have contributed to the conceptualisation of the products, that is, the frameworks and tables.
References
Anderson, L., & Krathwohl, D.R. (2001). Taxonomy for learning, teaching and assessment. A revision of Bloom's taxonomy of educational objectives. New York, NY: Longman.
Anderson, L.W., Krathwohl, D.R., Airasian, P.W., Cruikshank, K.A., Mayer, R.E., Pintrich, P.R., et al. (2001). A taxonomy for learning, teaching and assessing: A revision of Bloom's educational objectives (abridged edition). New York, NY: Longman.
Andrich, D. (2002). A framework relating outcomes based education and the taxonomy of educational objectives. Studies in Educational Evaluation, 28, 35–59. http://dx.doi.org/10.1016/S0191491X(02)000111
Andrich, D. (2009). Review of the curriculum framework for curriculum, assessment and reporting purposes in Western Australian schools, with particular reference to years Kindergarten to Year 10. Perth: University of Western Australia.
Bennett, R. (2010). Cognitively based assessment of, for, and as learning (CBAL): A preliminary theory of action for summative and formative assessment. Measurement, 8, 70–91.
Bennett, R.E. (2011). Formative assessment: A critical review. Assessment in Education: Principles, Policy & Practice, 18(1), 5–25. http://dx.doi.org/10.1080/0969594X.2010.51367
Bennett, R.E., & Gitomer, D.H. (2009). Transforming K-12 assessment: Integrating accountability testing, formative assessment and professional development. In C. Wyatt-Smith, & J.J. Cumming (Eds.), Educational assessment in the 21st century (pp. 43–62). Dordrecht: Springer. http://dx.doi.org/10.1007/9781402099649_3
Bloom, B.S., Engelhart, M.D., Furst, E.J., Hill, W.H., & Krathwohl, D.R. (1956). Taxonomy of educational objectives: The classification of educational goals. Handbook I: Cognitive domain. New York, NY: David McKay Company.
Department of Basic Education. (2011). Curriculum and assessment policy statement Grades R-12. Mathematics. Pretoria: DOE.
Department of Education. (2002). Revised national curriculum statement Grades R-9 (Schools) Mathematics. Pretoria: DOE.
Dunne, T., Long, C., Craig, T., & Venter, E. (2012). Meeting the requirements of both classroombased and systemic assessment of mathematics proficiency: The potential of Rasch measurement theory. Pythagoras, 33(3), Art. #19, 16 pages. http://dx.doi.org/10.4102/pythagoras.v33i3.19
Eisner, E.W. (1998). The enlightened eye: Qualitative inquiry and the enhancement of educational practice. Upper Saddle River, NJ: Merrill.
Hiebert, J., & Lefevre, P. (1986). Conceptual and procedural knowledge in mathematics: An introductory analysis. In J. Hiebert (Ed.), Conceptual and procedural knowledge: The case of mathematics (pp. 1–27). Hillsdale, NJ: Erlbaum.
Kilpatrick, J., Swafford, J., & Findell, B. (Eds.). (2001). Adding it up: Helping children learn mathematics. Washington, DC: National Academy Press.
Krathwohl, D.R. (2002). A revision of Bloom's taxonomy: An overview. Theory into Practice, 41(4), 212–218. http://dx.doi.org/10.1207/s15430421tip4104_2
Kuiper, W., Nieveen, N., & Berkvens, J. (2013). Curriculum regulation and freedom in the Netherlands – A puzzling paradox. In W. Kuiper, & J. Berkvens (Eds.), Balancing curriculum freedom and regulation across Europe. CIDREE Yearbook 2013 (pp. 139–162). Enschede: SLO & Jan Berkvens.
Long, C. (2005). Maths concepts in teaching: Procedural and conceptual knowledge. Pythagoras, 62, 59–65. http://dx.doi.org/10.4102/pythagoras.v0i62.115
Long, C. (2011). Mathematical, cognitive and didactic elements of the multiplicative conceptual field investigated within a Rasch assessment and measurement framework. Unpublished doctoral dissertation. Faculty of Humanities, University of Cape Town, Cape Town, South Africa. Available from http://hdl.handle.net/%2011180/1521
Long, C., Dunne, T., & Mokoena, G. (2014). A model of assessment: Integrating external monitoring with classroom practice. Perspectives in Education, 32(1), 158–178.
McConnell, J.W., Brown, S., Usiskin, Z., Senk, S.L., Widerski, T., Anderson, S., et al. (2002). Algebra, teacher's edition. Glenview, IL: Prentice Hall.
Messick, S. (1989). Meaning and values in test validation: The science and ethics of assessment. Educational Researcher, 18(2), 5–11. http://dx.doi.org/10.3102/0013189X018002005
Mullis, I.V.S., Martin, M.O., Smith, T.A., Garden, R.A., Gregory, K.D., Gonzalez, E.J., et al. (2003). TIMSS assessment frameworks and specifications 2003. Chestnut Hill, MA: Boston College.
Mullis, I.V.S., Martin, M.O., Ruddock, G.J., O'Sullivan, C.Y., Arora, A., & Erberber, E. (2005). TIMSS 2007 assessment frameworks. Chestnut Hill, MA: TIMSS & PIRLS International Study Center, Boston College.
Mullis, I.V.S., Martin, M.O., Ruddock, G.J., O'Sullivan, C.Y., & Preuschoff, C. (2009). TIMSS 2011 assessment frameworks: TIMSS & PIRLS. Chestnut Hill, MA: International Study Center, Lynch School of Education, Boston College.
Nichols, S., & Berliner, D. (2005). The inevitable corruption of indicators and educators through high stakes testing. Tempe, AZ: Education Policy Studies Laboratory, Arizona State University.
Nichols, S., & Berliner, D. (2008). Why has high stakes testing slipped so easily into contemporary American life? Phi Delta Kappan, 89(9), 672–676. http://dx.doi.org/10.1177/003172170808900913
Polya, G. (1957). How to solve it: A new aspect of mathematical method. Princeton, NJ: Princeton University Press.
Scheffler, I. (1965). The conditions of knowledge. Glenview, IL: Scott, Foresman & Company.
Schmidt, W., McKnight, C., Valverde, G., Houang, R., & Wiley, D. (1996). Many visions, many aims. Dordrecht: Kluwer Academic Publishers.
Schoenfeld, A.H. (2007). What is mathematical proficiency and how can it be assessed? In A.H. Schoenfeld (Ed.), Assessing mathematical proficiency (pp. 9–73). Mathematical Sciences Research Institute Publications, Vol. 53. New York, NY: Cambridge University Press. Available from http://library.msri.org/books/Book53/contents.html
Skemp, R. (1976). Relational understanding and instrumental understanding. Mathematics Teaching, 77, 20–26.
Usiskin, Z. (2012, July). What does it mean to understand school mathematics? Paper presented at the 12th International Congress on Mathematical Education, COEX, Seoul, Korea.
Usiskin, Z., Peressini, A., Marchisotto, E., & Stanley, D. (2003). Mathematics for high school teachers: An advanced perspective. Upper Saddle River, NJ: Prentice Hall.
Van Wyke, J., & Andrich, D. (2006). A typology of polytomously scored items disclosed by the Rasch model: Implications for constructing a continuum of achievement. Perth: Murdoch University.
Vergnaud, G. (1988). Multiplicative structures. In J. Hiebert, & M. Behr (Eds.), Number concepts and operations in the middle grades (pp. 141–161). Hillsdale, NJ: National Council of Teachers of Mathematics.
Vergnaud, G. (1994). Multiplicative conceptual field: What and why? In G. Harel, & J. Confrey (Eds.), The development of multiplicative reasoning in the learning of mathematics (pp. 41–59). Albany, NY: State University of New York.
Webb, N.L. (1992). Assessment of students’ knowledge of mathematics: Steps toward a theory. In D.A. Grouws (Ed.), NCTM handbook of research on mathematics teaching and learning. (pp. 661–683). New York, NY: Macmillan Publishing Company.
Wolk, R.A. (2012). Common Core vs. Common Sense. Education Week, 32(13), 35–40.
Appendix 1
Worksheet 2: Grade 9: Patterns, functions and algebra
Determine the general term for this pattern:
Factorise fully: 20x^{2} – 45y^{4}
A, B and C show different representations of a linear function. Which one of the following three representations does not represent the same linear function?
Worksheet 4: Grade 9: Measurement
The area of a square is 4 m^{2}. Calculate the area of the shape if one side of the original square is doubled.
Calculate the volume of the cylinder. Use 22/7 as an approximation for π.
An airplane flies 300 km due north. However the pilot ignored the constant side wind which took him off course. The flight path is shown in the figure below. How far is he from his original destination?
TABLE 1−A2: Adapted cognitive domain categories (Grade 9). 
TABLE 1−A3: Items allocated to Usiskin's dimensions (Grade 9). 
1. Interesting connection here with the Van Hiele levels.
2. See Long (2011, p. 234) for a detailed discussion.
3. In the early stages of the project these tasks were performed by the research team. In later stages of the project this task became a joint function of both researchers and teachers.
