Abstract
There is an articulation gap for many students between the literacy practices developed at school and those demanded by higher education. While the school sector is often well attuned to the school-leaving assessments, it may not be as aware of the implicit quantitative literacy (QL) demands placed on students in higher education. The National Benchmark Test (NBT) in QL provides diagnostic information to inform teaching and learning. The performance of a large sample of school-leavers who wrote the NBT QL test was investigated (1) to demonstrate how school-leavers performed on this QL test, (2) to explore the relationship between performance on this test and on cognate school-leaving subjects and (3) to provide school teachers and curriculum advisors with a sense of the QL demands made on their students. Descriptive statistics were used to describe performance and linear regression to explore the relationships between performance in the NBT QL test and on the school subjects Mathematics and Mathematical Literacy. Only 13% of the NBT QL scores in the sample were classified as proficient and the majority of school-leavers would need support to cope with the QL demands of higher education. The results in neither Mathematics nor Mathematical Literacy were good predictors of performance on the NBT QL test. Examination of performance on selected individual items revealed that many students have difficulty with quantitative language and with interpreting data in tables. Given that QL is bound to context, it is important that teachers develop QL practices within their disciplinary contexts.
Introduction
Many students leaving the school system in South Africa, while prepared for leaving school, are to some extent unprepared for higher education. There is therefore great inequality in the experiences of and outcomes for students in the higher education sector. According to Scott, Yeld and Hendry (2007, p. 42) ‘the educational factor to which poor performance is perhaps most commonly ascribed across the higher education sector is student underpreparedness for standard undergraduate programmes’. This ‘underpreparedness’ is to a large extent the result of difficulties experienced in the area of academic literacies:
The real key to whether a student will pass or fail relates to the literacy practices she brings with her to the University from her school and home environments, and the extent to which these have commonalities with the literacy practices of her chosen discipline. (McKenna, 2009, p. 8)
The importance of quantitative literacy (QL) for higher education is widely recognised (see for example, Steen, 2004) and there is also an increasing awareness that many academic disciplines make complex quantitative demands that are often very different from those that are the focus of traditional mathematics courses. This ability to work with quantitative information in academic contexts is one of the academic literacies and presents particular difficulties for many students. Results from the National Benchmark Tests Project (NBTP) illustrate this clearly. For example, in 2014, of the 76 693 prospective applicants to higher education who wrote the NBTP QL test, only 11% performed at the ‘Proficient’ QL level. The remaining 89% were expected to experience academic challenges arising from their low levels of QL proficiency. Just over 40% performed at the ‘Basic’ level, which means that they would be severely challenged academically in higher education (Centre for Educational Testing for Access and Placement, 2015, p. 26).
One of the goals of the NBTP is to provide lecturers and curriculum developers in higher education with information about the capabilities of students ‘to inform the nature of foundation courses and curriculum responsiveness’ (Griesel, 2006, p. 4). This is because the higher education sector recognises that university teaching needs to address the ‘articulation gap’ (Scott et al., 2007, p. 42) between the level of many students’ quantitative (and other) literacies and the demands of higher education. However, it is recommended that teachers and curriculum advisors in schools also make use of the diagnostic information provided by the NBTP, in order that they can better prepare students for higher education.
A second goal of the NBTP is to ‘assess the relationship between higher education entry level requirements and school-level exit outcomes’ (Griesel, 2006, p. 4). The National Senior Certificate (NSC) and National Benchmark Test (NBT) assessments have two complementary purposes. While the NSC largely determines whether school-leavers are ready to leave the school system, the NBTs are based on the assumption that prospective higher education students are ready to leave the school system with a higher education pass, and the tests attempt to determine to what extent these prospective students are ready for the demands of higher education. NSC candidates have to take either Mathematics or Mathematical Literacy as subjects and both of these subjects are cognate to the NBT QL domain, but there are substantial differences between them. The nature of these NSC subjects is probably widely understood, but the nature of QL for higher education (the NBT QL domain) may be less familiar. Examples of the QL demands of higher education curricula are outlined by Frith and Prince (2009), Frith and Gunston (2011) and Prince and Simpson (2016). The relationship between performance in the NBT QL test and subsequent higher education performance has been investigated by Prince (2016).
Since QL involves dealing with and communicating quantitative information in various academic contexts, it is not only in the school subjects of Mathematical Literacy and Mathematics that QL is developed. Teachers in many other subjects, for example Geography, the sciences or Economics, have a very important role to play in developing this vital literacy by, for example, expecting learners to interpret a variety of representations of quantitative information, by using (and expecting learners to use) correct and precise language when expressing quantitative ideas and by assessing learners’ ability to practise and express quantitative reasoning in context.
In this article, we provide an overview of the results of the NBTP QL test written by a large sample of successful school-leavers across South Africa who were intending to apply to enter higher education institutions in 2015. These learners all achieved NSC results that would allow them to enter higher education. We also report on the associations between performance on the NBT QL test and the NSC subjects Mathematics and Mathematical Literacy. We further analyse the performance of these students on subgroups of items and on a selected number of key items. This analysis is not intended to give a comprehensive exposition of the QL competencies relevant to higher education, but is intended to alert school teachers and curriculum specialists to the requirements of higher education and to provide some examples of areas where lack of student competence will have detrimental effects on their success in higher education, if not addressed.
Quantitative literacy assessment in the NBTP
Higher Education South Africa commissioned the National Benchmark Tests Project in 2005 with its main aim being to assess the academic proficiency of prospective students wishing to enter higher education. The tests assess proficiency in QL, as well as academic literacy (AL) for all students and mathematics (MAT) for those students intending to enter courses or programmes that have a significant mathematical component. These tests are designed by academics in higher education to provide information complementary to that provided by the NSC, to assist with selection and placement of students into appropriate courses and programmes. Another function of the NBTP is to provide information to inform curriculum development in higher education.
The NBTs provide a measure of students’ readiness for higher education and the competencies that are assessed in the NBTs are regarded as key areas in which students entering higher education should have minimum levels of proficiency. The content and competencies assessed in each domain are described by Griesel (2006). These tests are ‘constructed to provide information about the level of a test-taker’s performance in relation to clearly defined domains of content and/or behaviours (e.g. reading, writing, mathematics) that requires mastery’ (Foxcroft, 2006, p. 9). Minimum benchmark scores for three different proficiency levels are established through a rigorous standard setting process. The proficiency levels are thus defined in terms of the percentage of the content specified in the test construct that a student has mastered. This is done according to the judgement by higher education academics as to the requirements of their disciplines, with reference to each item in the test. Associated with each proficiency level there are recommendations for the kind of educational provision that is appropriate for a student whose performance is at that level.
In particular, the NBTP QL test aims to measure the levels of proficiency in QL of school-leavers who are aspiring to enter higher education. The construct informing the design of this test is outlined by Frith and Prince (2006, p. 28). This test is written during or at the end of their Grade 12 year, which is considered to be a time when students could realistically be expected to be ready for the QL demands of higher education study.
In practical terms the NBTP QL test assesses students’ ability to competently interpret and reason with quantitative information that is presented in a variety of modes. For example, the test specifications include that they must understand and use a range of quantitative terms and phrases, read and interpret tables, graphs, charts, diagrams and texts and integrate information from different sources. The test also assesses the ability to apply quantitative procedures in various situations, to do simple calculations and estimations which may involve multiple steps and to formulate and apply simple formulae. Students are also required to identify trends and patterns in various situations, interpret two-dimensional representations of three-dimensional structures and to reason logically. The questions are designed to assess QL practices and do not assume that students have the knowledge of any particular school subject.
Theoretical framework for the NBTP quantitative literacy test
In this section, we will outline the theoretical considerations about the nature of QL that underpin the construct of the NBTP QL test. There has been an ongoing debate about what constitutes QL, especially in England and Australia (where it is usually referred to as ‘numeracy’) and in the United States (where it is usually called ‘quantitative literacy’). One aspect of this debate is to do with the relationship of QL with mathematics. Hughes-Hallett (2001) sums up the difference as follows:
Mathematics focuses on climbing the ladder of abstraction while quantitative literacy clings to context. … Mathematics is about general principles that can be applied in a range of contexts; quantitative literacy is about seeing every context through a quantitative lens. (p. 94)
Some authors prefer to conceptualise QL as a social practice (Street, 2005; Street & Baker, 2006), in line with many AL practitioners. In the South African school context, the subject Mathematical Literacy (which comprises the same kinds of competencies as QL) is defined as ‘a subject driven by life-related applications of mathematics’ (Department of Education, 2003, p. 9). Many authors focus on the aspect of QL that has to do with thinking critically about the use of numbers in society (Johnston, 2007) and some prefer to think of it as part of multiple academic literacies (Chapman & Lee, 1990). However, all the definitions of QL stress that it is fundamentally concerned with mathematics and statistics used in context:
At the very least then, the definitions garnered from this debate would agree that numeracy is to do with ‘using maths in context’ and that to be numerate is to have the ‘capacity to use maths effectively in context’. (Johnston, 2002, p. 4)
It follows that QL cannot be taught as a generic skill and that learned rules will not be sufficient to enable the solution of QL problems. Thus, in almost all of the questions in the NBTP QL test, test writers have to apply quantitative methods and reasoning within a realistic (mostly relevant to higher education) context. There is a wide range of both the contexts used for the items and the kinds of competencies required by them, in order that the test has both face validity and relevance for all disciplines in higher education.
The definition of QL that is the foundation of the construct of the NBTP QL test is strongly influenced by the definition of numerate behaviour underlying the assessment of numeracy in the Adult Literacy and Lifeskills (ALL) Survey (Gal, Van Groenestijn, Manly, Schmitt & Tout, 2005, p. 152) and the New Literacy Studies view of literacy as social practice (Kelly, Johnston & Baynham, 2007; Street, 2005; Street & Baker, 2006):
Quantitative literacy is the ability to manage situations or solve problems in practice, and involves responding to quantitative (mathematical and statistical) information that may be presented verbally, graphically, in tabular or symbolic form; it requires the activation of a range of enabling knowledge, behaviours and processes and it can be observed when it is expressed in the form of a communication, in written, oral or visual mode. (Frith & Prince, 2006, p. 30)
The construct informing the test design is based on the idea that each item can be described in terms of three dimensions of what it assesses: the main mathematical and statistical ideas, the underlying reasoning and behaviours (competencies) and the level of cognitive complexity. The construct does not specify the contexts for the items, but rather that there is a range of different kinds of contexts. This is necessary for face validity and because familiarity (or unfamiliarity) of the context could affect the manner in which a candidate responds to it.
Mathematics and Mathematical Literacy assessment in the National Senior Certificate
With the introduction of the NSC in 2008, one national set of Grade 12 examination question papers was introduced. After a review of the NSC curricula, the Curriculum and Assessment Policy Statement (CAPS) was introduced and implemented in 2012 in Grade 10, with this cohort being the first to write the NSC examinations based on the CAPS in 2014.
The scoring of the NSC assessments, administered by the Department of Basic Education (DBE), is norm-referenced and therefore the rating codes associated with them cannot easily be used to assess whether candidates meet a certain standard in a subject or domain. For the NSC, the final subject score is made up of the course mark and the examination mark and then the scores are ‘standardised’ or ‘normed’ to the five-year rolling average score for each subject. While a candidate may perform well compared to the norm, they may still fail to meet a particular standard in the domain being tested.
The achievement scale for NSC subjects is shown in Table 1. The descriptions against the rating codes are not benchmarks or standards, but rather descriptive categories of what a percentage score range means in terms of a candidate’s test achievement.
TABLE 1: The achievement scale for the National Senior Certificate. 
On completing the NSC, a candidate can qualify for higher certificate, diploma or degree study. The criteria in Table 2 are used to determine these entry requirements.
TABLE 2: Criteria for higher certificate, diploma and degree study. 
All NSC candidates must write the examinations for either Mathematics or Mathematical Literacy, which are both cognate with, but not the same as, QL, as can be seen from their descriptions in the NCS CAPS documents.
The NCS CAPS document for the subject Mathematics defines Mathematics as:
a language that makes use of symbols and notations for describing numerical, geometric and graphical relationships. It is a human activity that involves observing, representing and investigating patterns and qualitative relationships in physical and social phenomena and between mathematical objects themselves. (DBE, 2011a, p. 8)
The CAPS document claims that studying Mathematics will develop a student’s ability to think logically, critically and creatively and to be able to solve problems, in order to obtain a better understanding of the world around us. This focus on problem-solving and critical thinking in order to understand real-world phenomena has strong similarities with the definition of QL, but the main focus of the subject is in fact on learning the discipline of mathematics itself in order to ensure ‘access to an extended study of the mathematical sciences and a variety of career paths’.
On the other hand, the NCS CAPS for Mathematical Literacy states that the competencies it develops should:
allow individuals to make sense of, participate in and contribute to the twenty-first century world – a world characterised by numbers, numerically based arguments and data represented and misrepresented in a number of different ways. (DBE, 2011b, p. 8)
It suggests that these competencies, which include the ability to reason, solve problems, interpret information and use technology, should be developed by exposing learners to both elementary mathematical content and authentic real-life contexts. This exposure is intended to enable the learner to be a ‘self-managing person, a contributing worker and a participating citizen in a developing democracy’ and an ‘astute consumer of the mathematics reflected in the media’. The emphasis on using mathematical knowledge and skills in context is what makes this subject similar to QL, but in higher education the contexts are academic disciplinary contexts, not necessarily everyday life-related contexts, as emphasised in the NSC CAPS document for Mathematical Literacy.
This brief overview of the NSC Mathematics and Mathematical Literacy subjects and the method of assessment used for the NSC is important in order to reveal the complementary nature of the information derived from standardised benchmark or criterion-referenced assessments, such as the NBTs, and a norm-referenced assessment, such as the NSC. This complementary information about student competence is particularly useful for making decisions about whether students should be placed in extended or regular degree curriculum structures and for providing information to inform teaching and learning in higher education.
Methods
The NBT QL test results were analysed for a large sample (N = 7 464) of school-leavers from across South Africa who wrote one version of this test in 2014, indicating that they were intending to apply to higher education institutions for study in 2015. The sample only includes data for those who then went on to write the NSC and who obtained a result that allowed them to progress to some kind of higher education. Since these are therefore prospective higher education students, for the sake of brevity we will from now on refer to them as ‘students’ (see Table 5 for some demographic characteristics of the sample).
The structure, administration and scoring of the NBTP quantitative literacy test
There are 50 multiple choice items in the QL test, which are selected in accordance with the specification table (Frith & Prince, 2006, p. 32). This specifies the proportions of items that should address each of the competencies, mathematical and statistical ideas and levels of cognitive complexity deemed to be representative of the QL demands of the first year of higher education and defined in the test construct.
The NBTs are administered at test centres under standardised conditions by specially trained invigilators. The items in the QL test comprise two out of the seven sections of the NBTP Academic and Quantitative Literacy test which is administered in paper-and-pencil format. Thirty minutes is allocated for the completion of each of the 25-item sections that make up the QL test. Calculators are not used, but students are only required to calculate with simple numbers, for example, with fractions that can easily be simplified by cancellation. Many questions can be answered by estimation.
Students writing the tests record their responses on mark-reading (bubble answer) sheets which are then scanned using optical scanner technology. Responses are dichotomised (either 1 for right or 0 for wrong). The unidimensional three-parameter Item Response Theory (IRT) model (Yen & Fitzpatrick, 2006) is used to determine a student’s ability and generate a score for the candidate on a scale of 0% to 100%. Results for different versions of the QL test are linked and equated using the Stocking and Lord method (Holland & Dorans, 2006) to ensure that a candidate’s score is independent of the version of the test that they wrote.
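As an illustration only, the item response function of the standard three-parameter logistic (3PL) model can be sketched as follows. The parameter names (`a` for discrimination, `b` for difficulty, `c` for the pseudo-guessing lower asymptote) follow the usual textbook convention, and the item values used here are hypothetical; the NBTP's actual item parameters, estimation software and the transformation of ability estimates to the 0% to 100% scale are not described in this article.

```python
import math


def three_pl_probability(theta, a, b, c):
    """Probability that a candidate with ability `theta` answers an item
    correctly under the three-parameter logistic (3PL) model.

    a: discrimination, b: difficulty, c: pseudo-guessing lower asymptote.
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


# Hypothetical item (assumed values, not NBTP parameters).
item = {"a": 1.2, "b": 0.0, "c": 0.25}

for theta in (-2.0, 0.0, 2.0):
    p = three_pl_probability(theta, **item)
    print(f"ability {theta:+.1f}: P(correct) = {p:.3f}")
```

When a candidate's ability equals the item difficulty, the probability of a correct response lies halfway between the guessing parameter and 1, which is why a dichotomised response pattern carries information about ability beyond the raw number of correct answers.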
The benchmarks for the NBTP quantitative literacy test
Table 3 provides the score ranges of the three main QL proficiency categories for degree study, and the recommendations for appropriate responses by institutions for students whose scores place them in these categories. The scores defining the proficiency categories are established at standards-setting workshops by panels of diverse South African academics who teach courses relevant to the domain. This process is carried out using the ‘modified Angoff’ method (Hambleton & Pitoniak, 2006) and to date has been led by a senior psychometrician from the Educational Testing Service (ETS), Princeton, New Jersey. Thus, the proficiency categories are not described in terms of particular QL knowledge and competencies, but in terms of academic teachers’ expectations for student performance on the test items, based on their knowledge of their own curricula.
TABLE 3: National Benchmark Tests Project quantitative literacy test benchmarks for degree study, set in 2012. 
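The general idea behind an Angoff-style standard setting can be sketched as below. This is a simplified illustration of the basic technique, not the NBTP procedure: it assumes that each panellist estimates, for every item, the probability that a borderline candidate would answer correctly, and that the raw cut score is the average over panellists of these summed estimates. The ratings are invented, and the sketch does not cover the discussion rounds of a modified Angoff process or the mapping of a raw cut score onto the reported score scale.

```python
def angoff_cut_score(ratings):
    """Return the raw cut score from Angoff-style judgements.

    ratings[j][i] is panellist j's estimated probability that a
    borderline candidate answers item i correctly.
    """
    panellist_totals = [sum(item_estimates) for item_estimates in ratings]
    return sum(panellist_totals) / len(panellist_totals)


# Hypothetical judgements from two panellists on a four-item test.
ratings = [
    [0.5, 0.75, 0.25, 1.0],
    [0.5, 0.5, 0.5, 0.5],
]

raw_cut = angoff_cut_score(ratings)
percent_cut = 100.0 * raw_cut / len(ratings[0])
print(raw_cut, percent_cut)
```

The point of the sketch is that the cut score is derived from judgements about items rather than from the distribution of candidates' scores, which is what makes the resulting benchmarks criterion-referenced.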
It has also been useful to differentiate between different levels of support that would be most appropriate for students with scores in the ‘Intermediate’ category and so this level is divided into ‘Upper Intermediate’ and ‘Lower Intermediate’ bands, as shown in Table 4. This differentiation is not done through the standards-setting workshops but is made for pragmatic reasons, as the majority of scores are in the ‘Intermediate’ category.
TABLE 4: National Benchmark Tests Project quantitative literacy test degree Intermediate benchmarks and how they should be interpreted. 
Measuring performance on subgroups of items
In addition to the analysis that is routinely done to generate scores for NBT results, for this article each item in the QL test was assigned to a discrete subgroup of items based on the main competency that the item was designed to assess, as follows:
 Computing: interpreting problem statements and calculating or estimating (e.g. calculating areas or percentages).
 Knowing: recalling simple facts and applying them (e.g. mean and median of a distribution).
 Translating: identifying alternative representations (e.g. identifying the graphical representation of a relationship described verbally).
 Using data: deriving information from data representations (e.g. reading required value/s off charts or tables).
 Reasoning: reasoning and synthesising (e.g. identifying, reading off and calculating with more than one appropriate value from tables or charts; reasoning about rates).
 Extrapolating: predicting and visualising (e.g. recognising patterns or predicting terms in a sequence; visualisation in three dimensions).
Performance on each subgroup was then calculated separately so as to obtain more detailed diagnostic information about students’ competencies. For this analysis, the scores used were not generated using the three-parameter IRT model as for the overall NBT score, but merely using the percentage of the sample who answered each item correctly.
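The subgroup calculation described above can be sketched as follows. The response matrix and the assignment of items to subgroups are invented for illustration, and averaging the item percentages within a subgroup is an assumption about how the item-level figures are combined; the article states only that the percentage of the sample answering each item correctly was used.

```python
from collections import defaultdict


def item_percent_correct(responses):
    """Percentage of students who answered each item correctly.

    responses[s][i] is 1 if student s answered item i correctly, else 0.
    """
    n_students = len(responses)
    n_items = len(responses[0])
    return [
        100.0 * sum(row[i] for row in responses) / n_students
        for i in range(n_items)
    ]


def subgroup_percent_correct(responses, item_subgroup):
    """Mean item percent-correct for each competency subgroup."""
    grouped = defaultdict(list)
    for pct, label in zip(item_percent_correct(responses), item_subgroup):
        grouped[label].append(pct)
    return {label: sum(values) / len(values) for label, values in grouped.items()}


# Hypothetical data: four students, four items.
responses = [
    [1, 0, 1, 1],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [1, 0, 1, 0],
]
item_subgroup = ["Computing", "Computing", "Using data", "Reasoning"]

print(subgroup_percent_correct(responses, item_subgroup))
```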
Measuring performance on individual items
In order to examine how students responded to individual items, the proportion of students who chose each alternative answer was recorded. This was done for the whole cohort and separately for the students whose scores for the whole test were in each of the proficiency bands.
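A tally of this kind, for a single item, might look like the sketch below. The answer choices and proficiency bands are invented, and the function names are hypothetical; the sketch simply counts how often each alternative was chosen, for the whole cohort and within each band.

```python
from collections import Counter, defaultdict


def option_proportions(choices):
    """Proportion of students who chose each answer alternative."""
    counts = Counter(choices)
    total = len(choices)
    return {option: counts[option] / total for option in sorted(counts)}


def option_proportions_by_band(choices, bands):
    """Option proportions computed separately for each proficiency band."""
    by_band = defaultdict(list)
    for choice, band in zip(choices, bands):
        by_band[band].append(choice)
    return {band: option_proportions(chosen) for band, chosen in by_band.items()}


# Hypothetical responses of six students to one multiple choice item.
choices = ["A", "B", "A", "C", "A", "B"]
bands = ["Basic", "Basic", "Intermediate", "Basic", "Proficient", "Intermediate"]

print(option_proportions(choices))
print(option_proportions_by_band(choices, bands))
```

Comparing the distribution of chosen alternatives across bands shows which incorrect alternatives attract students at each proficiency level, which is the diagnostic use made of the item-level results.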
Statistical packages
The statistical package R (R Core Team, 2016) was used to do the data analysis and the R package ggplot2 (Wickham, 2009) was used to create the graphical representations.
Ethical considerations
Ethical clearance for this research was obtained from the Research Ethics Committee of the Faculty of Higher Education Development at the University of Cape Town. This included approving the consent declaration signed by prospective students writing the NBT, which allows the use of their results for research purposes and assures anonymity in the use of these data.
Results and discussion
The results are presented for a large sample of school-leavers who wrote one version of the NBTP QL test and who qualified to progress to some kind of higher education. These results are therefore not representative of all NSC candidates, but only of those who were qualified to enter higher education.
We first present some information about the characteristics of the students in the sample. We then present the overall distribution of scores on the entire QL test for the whole sample as well as for NSC Mathematics candidates and NSC Mathematical Literacy candidates separately. This will be followed by some results showing the relationship between performance on NSC Mathematics or Mathematical Literacy and on the NBT QL test. In addition, we report on the performance of the whole sample on subgroups of items in the NBT QL test, defined in terms of the competencies they assess. Finally, we discuss the performance of the students in the different proficiency bands on individual example items, in order to highlight some particular areas of difficulty they experienced.
Characteristics of the sample
Some characteristics of the students in the sample are shown in Table 5. Approximately 60% of the students were African and the majority did not have English as their home language. English home language speakers however formed the largest language group, comprising 40% of the students. There were considerably more female students than males in this sample (about 60% and 40%, respectively). The preponderance of female students is also generally observed in the larger cohorts of all NBTP test writers.
TABLE 5: Demographic characteristics of the total sample of National Benchmark Tests Project quantitative literacy test candidates (N = 7 464). 
Distributions of scores for the whole test
Table 6 gives the descriptive statistics, and Figure 1 illustrates the distributions, of the NBT QL scores for NSC Mathematics candidates, NSC Mathematical Literacy candidates and the whole sample.

FIGURE 1: Distributions of National Benchmark Test quantitative literacy scores for National Senior Certificate Mathematics, National Senior Certificate Mathematical Literacy candidates and for the whole sample. 

TABLE 6: Descriptive statistics for the distributions of the quantitative literacy performance of the whole sample (N = 7 464) and of subsets defined by National Senior Certificate mathematics subject written. 
The most obvious observation to be made is that the NBT QL scores of the NSC Mathematical Literacy candidates are considerably lower than those of the NSC Mathematics candidates. The median score for the whole sample (44%) is approximately in the middle of the ‘Lower Intermediate’ proficiency band, while the median score for NSC Mathematics candidates is within the ‘Lower Intermediate’ band, but the median for NSC Mathematical Literacy candidates is well within the ‘Basic’ band. The third quartiles of all the distributions illustrated in Figure 1 are below the top of the ‘Intermediate’ band, showing that in all cases less than 25% of the scores are in the ‘Proficient’ category.
Table 7 and Figure 2 show the same comparisons but in terms of distribution of scores according to proficiency bands defined by the benchmark scores.

FIGURE 2: Percentage distribution of National Benchmark Test quantitative literacy scores by proficiency category for the whole sample, National Senior Certificate Mathematics and Mathematical Literacy candidates. 

TABLE 7: Numbers and percentage distribution across quantitative literacy performance categories for the total sample and of subsets defined by National Senior Certificate mathematics subject written. 
The majority of the scores are in the ‘Basic’ and ‘Lower Intermediate’ bands and less than 15% of the scores are classified as ‘Proficient’. The most striking difference is seen when comparing the distributions for those students who wrote NSC Mathematics and NSC Mathematical Literacy: the proportion of scores that are ‘Basic’ is more than twice as large for Mathematical Literacy as for Mathematics candidates (over 60% and about 30%, respectively). Less than 1% of the scores for Mathematical Literacy candidates fall in the ‘Proficient’ category. From this it is clear that the NSC Mathematical Literacy subject does not prepare students for the quantitative demands of higher education study. It cannot however be concluded that the somewhat better performance on the NBT QL test of students who did NSC Mathematics can necessarily be ascribed to their having taken this subject, as QL is also developed in other subjects such as in the physical, earth and life sciences.
Comparison of performance in NSC subjects and NBTP quantitative literacy test
In this section, we will further explore the relationship between performance in the NSC mathematical subjects and in the NBT QL test. Table 8 contains the descriptive statistics for the distribution of marks achieved by the NSC Mathematics and Mathematical Literacy candidates. These distributions are illustrated in Figure 3, where they are juxtaposed with the distributions of the NBT QL scores obtained by the students in these two subsets of the sample.

FIGURE 3: Distributions of National Senior Certificate Mathematics and National Senior Certificate Mathematical Literacy scores and of National Benchmark Test quantitative literacy scores for National Senior Certificate Mathematics candidates and National Senior Certificate Mathematical Literacy candidates. 

TABLE 8: Descriptive statistics for the National Senior Certificate Mathematics and Mathematical Literacy results for the whole sample (N = 7 464). 
In Figure 3 we see that in general the results for NSC Mathematical Literacy are higher than for Mathematics, with median values of 66% and 60% respectively, with greater variability shown by the Mathematics results. As we have already seen, the comparison of the NBT QL results for the same two subsets of the sample shows the reverse, with the NBT QL scores obtained by the NSC Mathematical Literacy candidates being considerably lower than those of the NSC Mathematics candidates.
In Figure 4 and Figure 5 the relationship between the NSC Mathematics results and the NBT QL scores is explored further.

FIGURE 4: National Senior Certificate Mathematics and National Benchmark Test quantitative literacy scores (N = 6 271). 


FIGURE 5: Percentage distribution of National Benchmark Test quantitative literacy scores by proficiency band for different National Senior Certificate Mathematics levels achieved. 

Figure 4 shows a scatter plot of the NBT QL scores of the 6 271 NSC Mathematics candidates plotted against their Mathematics results. Even though the correlation coefficient is 0.63, it is clear that the NSC Mathematics result is not a good predictor of NBT QL score. There is a wide range of NBT QL scores associated with any particular NSC Mathematics result, with this range being wider the higher the Mathematics mark. For example, for students who obtained over 80% (‘Outstanding’) for Mathematics, the NBT QL scores range from less than 25% to nearly 100%. Even a low mark for Mathematics is associated with quite a large range of possible NBT QL scores. For example, for those who obtained Mathematics marks less than 30% (‘Not achieved’), the NBT QL scores range from about 15% to 75%. From the fact that most of the points lie to the right of the dashed line (made up of points where the Mathematics result and the NBT QL results are equal) we can see that in general the NBT QL scores are lower than the Mathematics results (which can also be seen by comparing the box plots in Figure 3).
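One way to see why a correlation of 0.63 still leaves a wide spread of QL scores is to note that, in a simple linear regression with one predictor, the proportion of variance explained equals the square of the correlation coefficient. The sketch below, offered as a supplementary illustration rather than as part of the original analysis, computes Pearson's r from paired scores and squares the reported value; the function names are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def variance_explained(r):
    """Share of variance explained by a one-predictor linear regression."""
    return r * r


reported_r = 0.63  # correlation reported for Figure 4
print(f"r^2 = {variance_explained(reported_r):.4f}")
```

With r = 0.63, roughly 40% of the variation in NBT QL scores is associated with the NSC Mathematics result, leaving about 60% unaccounted for, which is consistent with the wide vertical spread in the scatter plot.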
The same data (as in Figure 4) is used to produce the chart in Figure 5. Here the NSC Mathematics results are classified into the levels (1 to 7) as defined in the Curriculum Statement (DBE, 2009, p. 5) and the NBT QL scores are classified according to proficiency band. For each NSC Mathematics level the proportion of NBT QL scores in each proficiency band is illustrated. For those students who obtained a level 1 NSC Mathematics result (‘Not achieved’), just over 75% obtained a ‘Basic’ NBT QL score. This proportion decreases fairly linearly as the NSC Mathematics levels increase, so that only about 5% of the students with NSC Mathematics level 7 (‘Outstanding’) obtained NBT QL scores in the ‘Basic’ band.
However, it is most noteworthy that, even for those who obtained level 7 results for NSC Mathematics, less than 50% obtained a ‘Proficient’ NBT QL score, while the proportion of ‘Proficient’ scores for the level 6 (‘Meritorious achievement’) Mathematics results is even smaller, at just under 25%. Thus, based on the NBT QL scores, the majority of even the best NSC Mathematics performers will require some kind of additional support to cope with the quantitative demands of higher education study.
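The kind of tabulation behind Figure 5 can be sketched as follows. This is a minimal illustration, not the code used for the article: it assumes the standard seven-level NSC achievement scale (level 1 below 30%, then ten-point steps up to level 7 at 80% and above), takes the NBT QL proficiency band of each candidate as an already assigned label, and uses made-up records.

```python
from collections import Counter, defaultdict

def nsc_level(mark):
    """Map an NSC percentage mark to an achievement level from 1 to 7."""
    if mark < 30:
        return 1
    return min(7, int(mark) // 10 - 1)

def band_distribution(records):
    """records: (nsc_mark, nbt_band) pairs. Returns {level: {band: percentage}}."""
    counts = defaultdict(Counter)
    for mark, band in records:
        counts[nsc_level(mark)][band] += 1
    return {
        level: {band: 100 * n / sum(c.values()) for band, n in c.items()}
        for level, c in sorted(counts.items())
    }

# Hypothetical records, for illustration only
sample = [
    (25, "Basic"), (28, "Basic"), (22, "Basic"), (29, "Lower Intermediate"),
    (85, "Proficient"), (92, "Upper Intermediate"), (81, "Proficient"), (88, "Basic"),
]
distribution = band_distribution(sample)
for level, bands in distribution.items():
    print(level, bands)
```

Each row of the result sums to 100%, which is what allows the proportions for different NSC levels to be compared even though the levels contain different numbers of candidates.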
In Figure 6 and Figure 7 the relationship between the NSC Mathematical Literacy results and the NBT QL scores is explored (in the same way as it was for NSC Mathematics in Figure 4 and Figure 5).

FIGURE 6: National Senior Certificate Mathematical Literacy and National Benchmark Test quantitative literacy scores (N = 1 193). 


FIGURE 7: Percentage distribution of National Benchmark Test quantitative literacy scores by proficiency band for different National Senior Certificate Mathematical Literacy levels achieved. 

Figure 6 shows a scatter plot of the NBT QL scores of the 1 193 NSC Mathematical Literacy candidates plotted against their Mathematical Literacy results. Even though the correlation coefficient is 0.68, it is clear that for all but two students the NSC Mathematical Literacy result is far higher than the NBT QL score. As for the NSC Mathematics candidates, there is a fairly wide range of NBT QL scores associated with any particular NSC Mathematical Literacy result, with this range being wider the higher the Mathematical Literacy mark. For example, for students who obtained over 80% (‘Outstanding’) for Mathematical Literacy, the NBT QL scores range from about 25% to about 80%.
The proportions illustrated in Figure 7 reveal that the vast majority of NSC Mathematical Literacy candidates with results lower than level 6 have ‘Basic’ NBT QL scores, indicating that they are highly unlikely to cope with degree-level study without extensive and long-term support in QL. Given that only a minimal proportion of those with the highest NSC Mathematical Literacy marks achieved ‘Proficient’ NBT QL scores, it is safe to conclude that, practically speaking, nearly all students who have taken NSC Mathematical Literacy will need some kind of supplementary support to cope with the quantitative demands of higher education.
Performance on subgroups of NBTP quantitative literacy items
Each NBT QL item was assigned to a single subgroup based on a judgement of the main competency which the item was designed to assess. This results in six discrete subgroups of items, with the numbers of items in each subgroup as shown in Table 9. The distributions of scores for these subgroups are shown in Figure 8.

FIGURE 8: Distributions of the scores of the entire sample for each subgroup of items based on the main competency they assess. 

TABLE 9: Classification of the National Benchmark Test quantitative literacy items according to the main competence required. 
In all subgroups except ‘extrapolating’, the upper quartile is at or below the top of the ‘Upper Intermediate’ band, and the median is at or below the top of the ‘Lower Intermediate’ band, indicating that at least three-quarters of the scores are not in the ‘Proficient’ band, and at least 50% are not in the ‘Upper Intermediate’ or ‘Proficient’ bands. The subgroups on which the performance was the weakest are ‘computing’ and ‘knowing’. For ‘computing’ the median is the lowest, at 40%, and about a quarter of the scores are below 30%. This reflects students’ poor number sense (as calculators were not used in the test session and many items required estimation) as well as difficulty with interpreting problem statements, especially in some cases where the context may have been unfamiliar. The strongest performance was on the subgroup ‘extrapolating’ which contains several items that require predicting future terms in a given sequence. These presumably draw on skills that are better developed at school. The median scores for ‘reasoning’, ‘knowing’, ‘translating’ and ‘using data’ are all about 50% and near the top of the ‘Lower Intermediate’ band.
In Figure 9 the mean scores for each subgroup are compared for the NSC Mathematics and Mathematical Literacy candidates.

FIGURE 9: Mean scores (and confidence intervals) for each subgroup of items based on the main competency they assess reported separately for the National Senior Certificate Mathematics and Mathematical Literacy candidates. 

The mean scores for the subgroups for the NSC Mathematics candidates follow the same pattern that is seen in Figure 8. This is expected, since these candidates make up 84% of the total sample. As expected from the comparison of the overall scores for the NSC Mathematics and Mathematical Literacy candidates (see Figure 3) the scores are lower for the Mathematical Literacy candidates in all the subgroups. However, these mean scores deviate from the pattern in Figure 8. On the whole the mean score for each subgroup is about 15 percentage points lower, but for the ‘knowing’ subgroup the mean score is even lower, and for the ‘reasoning’ subgroup it is only 10 percentage points lower.
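The reason the pattern for the whole sample tracks that of the NSC Mathematics candidates can be illustrated with a weighted mean. The sketch below assumes that the two candidate groups (N = 6 271 and N = 1 193) together make up the whole sample, and the two subgroup means passed to the function are hypothetical values chosen to be 15 percentage points apart.

```python
N_MATHEMATICS = 6271            # candidates in Figure 4
N_MATHEMATICAL_LITERACY = 1193  # candidates in Figure 6

share_mathematics = N_MATHEMATICS / (N_MATHEMATICS + N_MATHEMATICAL_LITERACY)

def pooled_mean(mean_mathematics, mean_mathematical_literacy):
    """Mean for the combined sample, weighting each group by its size."""
    return (share_mathematics * mean_mathematics
            + (1 - share_mathematics) * mean_mathematical_literacy)

# Hypothetical subgroup means, 15 percentage points apart
overall = pooled_mean(55.0, 40.0)

print(f"Share of Mathematics candidates: {100 * share_mathematics:.0f}%")
print(f"Pooled mean: {overall:.1f}")
```

With these illustrative values the pooled mean lies within about 2.5 percentage points of the Mathematics mean, so the combined results are dominated by the larger group.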
Diagnostic information obtained from examining performance on individual items
The examples described in this section illustrate how a detailed examination of the proportions of students who chose different alternative answers in certain items can be used to gain rich insights into students’ abilities. In these items some of the alternative (incorrect) answers are obtained through applying common misconceptions or fallacious thinking, so examining the patterns of responses can be revealing. However, it is difficult to report this kind of information without also describing precisely the structure and content of the items, and for security reasons this is undesirable. Thus, we will try to illustrate how diagnostic information relevant to higher education studies is revealed, without publicising the actual test items.
Example 1: Interpreting percentage values in a table
A table from a reading that was given to first-year medical students and which contains both numbers and percentages is shown in Table 10. In a table like this, one has to identify which percentages add up to 100% in order to understand the meaning of the percentage values. Using this approach, a student can identify the ‘whole’ of which the percentage is expressing a fraction. In this way they can identify that, for example, the percentage at the top of the third column means: ‘23.1% of the homicides of people under 15 years of age were females’, not ‘23.1% of under 15 year old females were homicides’ or ‘23.1% of the female homicides were under 15’ or some other variation of the relevant phrases.
TABLE 10: Homicide and suicide by gender and age (preliminary NMSS data, first quarter 1999). 
Figure 10 shows the results for an item in the NBTP QL test that refers to a table with similar data to Table 10. It shows the percentages of students who chose each of the alternative answers in each of the proficiency categories and the choices made by the total cohort. Students were classified into proficiency categories based on their performance on the whole QL test.

FIGURE 10: For a question requiring interpretation of a percentage in a table, the proportions of students who chose each alternative answer, for the total cohort and for each proficiency band. 

Less than half of all students could identify the correct description of the meaning of a particular percentage value in the table (alternative D). Only about 60% of those scoring overall in the ‘Upper Intermediate’ category and about 40% in the ‘Lower Intermediate’ category were able to do this.
In the bottom two categories more than half of the students chose alternative A (58% and 54% in the ‘Basic’ and ‘Lower Intermediate’ categories respectively). This was equivalent to their choosing ‘23.1% of the female homicides were under 15 years of age’ in the above example, showing that they identified the correct row and column headings, but could not identify what the ‘whole’ was that the percentages were a ‘part’ of.
Students in higher education will have to interpret tables of data and percentages in many disciplines, and most lecturers will assume that they can understand these representations. The data in Figure 10 shows that many students will have trouble interpreting the language used to describe percentages and making sense of data in tables, especially when the tables include percentages.
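The step of identifying the ‘whole’ in Example 1 can be made concrete with a small sketch. The counts below are hypothetical (they are not the values in Table 10) and were chosen only so that one of the readings produces 23.1%; the function names are illustrative.

```python
# Hypothetical homicide counts by age group and gender (NOT the data in Table 10)
counts = {
    "under 15": {"male": 10, "female": 3},
    "15 to 24": {"male": 60, "female": 12},
    "25 and over": {"male": 130, "female": 25},
}

def pct_within_age_group(age, gender):
    """Percentage of the homicides in one age group that involve the given gender."""
    row = counts[age]
    return 100 * row[gender] / sum(row.values())

def pct_within_gender(age, gender):
    """Percentage of the homicides of one gender that fall in the given age group."""
    gender_total = sum(row[gender] for row in counts.values())
    return 100 * counts[age][gender] / gender_total

# 'x% of the homicides of people under 15 were females': the whole is the age group
females_among_under_15 = pct_within_age_group("under 15", "female")

# 'x% of the female homicides were under 15': the whole is all female homicides
under_15_among_females = pct_within_gender("under 15", "female")

print(round(females_among_under_15, 1), round(under_15_among_females, 1))
```

The same cell count yields two very different percentages depending on which total is taken as the whole. Checking which set of percentages adds up to 100% (here, male and female within an age group) shows which reading the table intends.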
Example 2: Converting to square units
Figure 11 shows the results for an item that required students to say how many square millimetres (mm²) there are in 2 cm² (with the fact that there are 10 mm in 1 cm given). The alternative answers were A 20, B 40, C 200 and D 400. Only one-third of all students and less than two-thirds of those with scores in the ‘Proficient’ category answered this correctly. Students with scores in the bottom two categories preferred alternative A (the answer 20), indicating that they treated all units mentioned as linear, ignoring the references to square units. Even those in the ‘Upper Intermediate’ category were almost as likely to choose alternative A as the right answer (between 35% and 40% of them). Alternative D (answer 400) was chosen by between 10% and 20% in all proficiency bands. In this case students were aware that squaring the numbers was appropriate, but applied it inappropriately to the value 2, as well as in order to convert the units. Similarly, alternative B (answer 40) was chosen by between 10% and 20% of the students with scores in the lower three proficiency bands, indicating that they were aware of the need for squaring, but not aware of which numbers to square in order to convert the units. These results indicate that students are unable to think flexibly about square units in the metric system, or that many of them do not read questions carefully (interpreting mm² as mm and cm² as cm).

FIGURE 11: For a question requiring converting to square units, the proportions of students who chose each alternative answer, for the total cohort and for each proficiency band. 

The pattern of performance on this question shows that most students are not able to do a simple conversion from linear to square units, which is a competency that they would most likely be expected to have in many scientific and technical courses in higher education.
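The arithmetic behind the four alternatives in Example 2 can be sketched as follows. The correct conversion follows directly from the item as described; the routes shown for the incorrect alternatives are plausible reconstructions based on the interpretation given above, and the route for alternative B in particular is an assumption.

```python
MM_PER_CM = 10   # given in the item
AREA_CM2 = 2     # the area to be converted

# Alternative C (correct): each square centimetre contains 10 x 10 square millimetres
correct = AREA_CM2 * MM_PER_CM ** 2

# Alternative A: units treated as linear, so the factor is not squared
linear_only = AREA_CM2 * MM_PER_CM

# Alternative D: squaring applied to the value 2 as well as to the conversion factor
squared_everything = (AREA_CM2 * MM_PER_CM) ** 2

# Alternative B (assumed route): the value 2 is squared instead of the conversion factor
squared_wrong_number = AREA_CM2 ** 2 * MM_PER_CM

print(linear_only, squared_wrong_number, correct, squared_everything)
```

Only the conversion factor is squared in the correct route, because the factor of 10 applies once to each of the two linear dimensions of the area.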
Example 3: Proportional reasoning and integrating data from different sources
Given values in a chart for the proportion of the population in several provinces that is, say, over 15 years of age, as well as the values of the total population for those provinces, one might ask which province had the greatest number of people over 15 years of age. If in addition the proportions are all similar, but the total populations are significantly different, one can conclude that the province with the largest overall population also has the largest number of people over 15 years of age.
The results for a task similar to the one just described are shown in Figure 12. Alternative A is the correct answer and alternative D represents the greatest proportion (the province with the tallest bar in the chart). The correct answer (the province with the greatest number of people over 15 years old) was chosen by only 40% of all students and by only 75% of those whose scores were ‘Proficient’. The majority of students with scores in the ‘Intermediate’ categories (and surprisingly, a slightly smaller proportion in the ‘Basic’ category) chose the largest proportion (alternative D), in effect answering the question ‘Which of the provinces had the greatest percentage of people over 15 years of age?’.

FIGURE 12: For a question requiring proportional reasoning, the proportions of students who chose each alternative answer, for the total cohort and for each proficiency band. 

The proportion of all students who could in fact answer correctly by using the proper reasoning about proportions was probably smaller than 40%, as it was possible to arrive at the correct answer by answering the question equivalent to ‘Which of the provinces had the greatest number of people?’ not ‘Which of the provinces had the greatest number of people over 15 years of age?’ (by using the data for the total populations and not checking that the proportions of people over 15 were similar in the four cases).
The pattern of performance on this task illustrates the difficulty that many people have with reasoning with proportions, understanding the distinction between absolute and relative quantities and the language used to make this distinction. These difficulties are further analysed by Frith and Lloyd (2016). It also appears that many students did not recognise that the question required them to integrate information from two different data representations.
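The distinction between the greatest number and the greatest proportion in Example 3 can be illustrated with a short sketch. The province labels, populations and percentages below are hypothetical and are not the data used in the test item.

```python
# Hypothetical provinces: similar proportions, very different total populations
provinces = {
    "P": {"population": 12_000_000, "pct_over_15": 70},
    "Q": {"population": 6_000_000, "pct_over_15": 72},
    "R": {"population": 3_000_000, "pct_over_15": 68},
    "S": {"population": 1_000_000, "pct_over_15": 74},
}

def number_over_15(province):
    """Absolute number of people over 15: a relative quantity applied to a total."""
    return province["population"] * province["pct_over_15"] / 100

# Question asked: which province has the greatest NUMBER of people over 15?
greatest_number = max(provinces, key=lambda name: number_over_15(provinces[name]))

# Question many students answered: which province has the greatest PERCENTAGE?
greatest_percentage = max(provinces, key=lambda name: provinces[name]["pct_over_15"])

print(greatest_number, greatest_percentage)
```

With these values the province with the tallest bar in a chart of percentages (S) has the smallest number of people over 15, while the most populous province (P) has the greatest number. The absolute quantity can only be obtained by combining the two data sources.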
Example 4: Calculating percentage of percentage and integrating data from different sources
Given data for the percentage distribution of the population in the provinces of South Africa and the proportion of the population that is over 15 years of age in each province (for example in a stacked bar chart), a question might ask what percentage of the total population is over 15 years of age and lives in a particular province. This requires combining the percentage of the population that lives in the given province and the percentage of that provincial population that is over 15 years old, by calculating a percentage of a percentage.
Figure 13 represents the results for a task similar to this one, where alternative A represents the correct answer, alternative B represents the percentage of the total population that is in the province, and alternative C is the percentage of the population of that province that is over 15 years. The vast majority of students (70%) selected alternative C, which means that they effectively answered the question ‘What percentage of the population in the province is over 15 years?’ rather than ‘What percentage of the total population is over 15 years and lives in the province?’. More than a quarter of even the students with scores in the ‘Proficient’ category selected this incorrect alternative.

FIGURE 13: For a question requiring calculating a percentage of a percentage, the proportions of students who chose each alternative answer, for the total cohort and for each proficiency band. 

The pattern of performance on this question illustrates the extent to which students struggle to interpret the precise language used to describe percentages and, as in the previous example, that many of them are unable to integrate information from two different data representations.
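The calculation required in Example 4 can be sketched as follows. The two percentages are hypothetical values chosen for easy arithmetic and are not taken from the test item.

```python
def percentage_of_percentage(outer_pct, inner_pct):
    """Percentage of the overall whole represented by inner_pct of outer_pct."""
    return outer_pct * inner_pct / 100

# Hypothetical values
province_share_of_total = 20    # % of the national population living in the province
over_15_share_in_province = 70  # % of that province's population that is over 15

over_15_in_province_share_of_total = percentage_of_percentage(
    province_share_of_total, over_15_share_in_province
)

print(over_15_in_province_share_of_total)
```

Here the answer is 14% of the total population, which is smaller than either of the two percentages read directly from the chart. Selecting the 70% value corresponds to changing the ‘whole’ from the total population to the population of the province.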
Conclusion
In this article, the results are presented for a large sample of school-leavers from across South Africa who wrote one version of the NBTP QL test in 2014 and who obtained NSC results that qualified them for entry into higher education in 2015. Nearly 70% of the scores for the test are in the lower two proficiency bands, with 36% in the ‘Basic’ band and 13% in the ‘Proficient’ band. This suggests that the majority of candidates aiming to enter higher education are in need of some kind of supplementary support for developing their QL, while more than a third will require extensive support. This support could be in the form of foundation courses, supplementary tutorials integrated into disciplinary curricula or online provision. Teachers and lecturers should be mindful of the assumptions they make about students’ QL competencies and should at all times attempt to make the implicit literacy practices of their disciplines more explicit.
Students who wrote NSC Mathematics perform considerably better on the NBT QL test than those who wrote NSC Mathematical Literacy. Nearly two-thirds of those who wrote NSC Mathematical Literacy have scores in the ‘Basic’ category, compared to 30% of the NSC Mathematics candidates. Less than 1% of those who wrote NSC Mathematical Literacy have a ‘Proficient’ score, indicating that in general the school subject Mathematical Literacy does not prepare students to cope with the QL demands of higher education.
Comparison of the NBT QL scores and the NSC Mathematics results reveals that the NSC Mathematics result is not a good predictor of the QL score. For any particular NSC Mathematics score there is a large range of NBT QL scores, and this range is particularly large for the higher NSC Mathematics scores. In general, the performance on the NBT QL test is lower than for NSC Mathematics. A similarly weak relationship is observed when the NBT QL scores are compared to the NSC Mathematical Literacy scores. In addition, the difference in overall performance is much greater (with a median score for the NBT QL test approximately half of the median for NSC Mathematical Literacy).
The scores on subgroups of items classified according to the main competencies they were designed to assess are also considered. Three-quarters of the candidates score below the ‘Proficient’ level in all of these subgroups except the one mainly requiring recognising patterns or predicting terms in a sequence, a competency that is probably quite well developed in school mathematics. The weakest performance is in the subgroup that requires candidates to interpret problem statements and calculate or estimate answers. This could reflect difficulties experienced with interpreting the problem statements (the majority of students are not first-language English speakers), but probably also points to students’ poor number sense and dependence on calculators, as most questions in this subgroup require estimation. This kind of analysis of the scores on subgroups of items provides diagnostic information that can inform the design of interventions to address the specific needs of students in terms of the competencies to be developed. Due to the context-bound nature of QL, teachers and lecturers need to use this information to develop their own curriculum solutions in their own disciplinary and classroom contexts. Where large numbers of students are not first-language English speakers, and the possibility exists that language difficulties contributed to poor performance, it should possibly be assumed that some form of AL intervention will also be required in order to help students cope with the language demands of their quantitative studies.
In the last part of the article, close examination of patterns of performance on examples of individual items is presented to illustrate how diagnostic information can be derived from this kind of analysis. These examples show that many students have difficulty with quantitative language, especially the language used to describe percentages and absolute and relative quantities. Students generally also have difficulties with interpreting data in tables, especially percentages, and many struggle with proportional reasoning. These are all examples of concepts and competencies essential for practising QL in many academic disciplines. In a society where many arguments and the understanding of many situations and problems draw on quantitative data, competencies like these are also essential for effective and critical citizenship. It is therefore essential to ensure that students develop these competencies in all appropriate contexts in school and also as graduate attributes in higher education.
These results stress that there is a lack of alignment between the exit level outcomes from schooling and the expectations of higher education in terms of QL. Higher education institutions need to recognise this fact and modify curricula accordingly. At the same time it would be productive if the school sector could give more attention to the development of this vital literacy, which is needed not only for higher education, but also for critical citizenship.
Acknowledgements
We thank the NBT project team at the Centre for Educational Testing for Access and Placement at the University of Cape Town who provided the opportunity to conduct this research, with the goal of contributing to the NBT project’s purpose of assessing the relationship between entry level proficiencies and school-level exit outcomes.
Competing interests
The authors declare that they have no financial or personal relationships that might have inappropriately influenced them in writing this article.
Authors’ contributions
R.P. and V.F. contributed equally to the conceptualisation and execution of this research and the production of the manuscript.
References
Centre for Educational Testing for Access and Placement. (2015). National benchmark tests project national report: 2015 intake cycle. Cape Town: Centre for Educational Testing for Access and Placement. Available from http://webcms.uct.ac.za/sites/default/files/image_tool/images/216/NBTPReport_2015.pdf
Chapman, A., & Lee, A. (1990). Rethinking literacy and numeracy. Australian Journal of Education, 34(3), 277–289. https://doi.org/10.1177/000494419003400305
Department of Basic Education. (2009). National examinations and assessment. Report on the National Senior Certificate examination results 2009. Pretoria: DBE. Available from http://www.education.gov.za/LinkClick.aspx?fileticket=l3hlVk9sypk%3d&tabid=92&portalid=0&mid=4359&forcedownload=true
Department of Basic Education. (2011a). National Curriculum Statement (NCS). Curriculum and assessment policy statement. Grades 10–12. Mathematics. Pretoria: DBE.
Department of Basic Education. (2011b). National Curriculum Statement (NCS). Curriculum and assessment policy statement. Grades 10–12. Mathematical Literacy. Pretoria: DBE.
Department of Education. (2003). National Curriculum Statement Grades 10–12 (General). Mathematical Literacy. Pretoria: DOE.
Department of Education. (2008). Higher education Act, 1997 (Act 101 of 1997). Minimum admission requirements for higher certificate, diploma and bachelor’s degree programmes requiring a National Senior Certificate. Pretoria: DOE.
Foxcroft, C. (2006). The nature of benchmark tests. In H. Griesel (Ed.), Access and entry level benchmarks, the National Benchmark Tests Project (pp. 7–16). Pretoria: Higher Education South Africa.
Frith, V., & Gunston, G. (2011). Towards understanding the quantitative literacy demands of a first-year medical curriculum. African Journal of Health Professions Education, 3(1), 19–23. Available from www.ajhpe.org.za/index.php/ajhpe/article/view/120/40
Frith, V., & Lloyd, P. (2016). Proportional reasoning ability of school-leavers aspiring to higher education in South Africa. Pythagoras, 37(1), a317. https://doi.org/10.4102/pythagoras.v37i1.317
Frith, V., & Prince, R. (2006). Quantitative literacy. In H. Griesel (Ed.), Access and entry level benchmarks, the National Benchmark Tests Project (pp. 28–34, 47–54). Pretoria: Higher Education South Africa.
Frith, V., & Prince, R. (2009). A framework for understanding the quantitative literacy demands of higher education. South African Journal of Higher Education, 23(1), 83–97. https://doi.org/10.4314/sajhe.v23i1.44804
Gal, I., Van Groenestijn, M., Manly, M., Schmitt, M.J., & Tout, D. (2005). Adult numeracy and its assessment in the ALL survey: A conceptual framework and pilot results. In T. Scott Murray, Y. Clermont, & M. Binkley (Eds.), International adult literacy survey. Measuring adult literacy and life skills: New frameworks for assessment (pp. 137–191). Ottawa: Statistics Canada. Available from http://www.statcan.gc.ca/pub/89-552-m/89-552-m2005013-eng.pdf
Griesel, H. (Ed.). (2006). Access and entry level benchmarks, the National Benchmark Tests Project. Pretoria: Higher Education South Africa.
Hambleton, R.K., & Pitoniak, M.J. (2006). Setting performance standards. In R.L. Brennan (Ed.), Educational measurement (4th edn., pp. 433–470). Westport, CT: Greenwood/Praeger.
Holland, P.W., & Dorans, N.J. (2006). Linking and equating. In R.L. Brennan (Ed.), Educational measurement (4th edn., pp. 187–220). Westport, CT: Greenwood/Praeger.
Hughes-Hallett, D. (2001). Achieving numeracy: The challenge of implementation. In L.A. Steen (Ed.), Mathematics and democracy: The case for quantitative literacy (pp. 93–98). Washington, DC: National Council on Education and the Disciplines. Available from http://www.maa.org/sites/default/files/pdf/QL/MathAndDemocracy.pdf
Johnston, B. (2002). Numeracy in the making: Twenty years of Australian adult numeracy. An investigation by the New South Wales Centre, Adult Literacy and Numeracy Australian Research Consortium. Sydney: University of Technology Sydney.
Johnston, B. (2007). Critical numeracy? In S. Kelly, B. Johnston, & K. Yasukawa (Eds.), The adult numeracy handbook. Reframing adult numeracy in Australia (pp. 50–56). Sydney: Adult Literacy and Numeracy Australian Research Consortium.
Kelly, S., Johnston, B., & Baynham, M. (2007). The concept of numeracy as social practice. In S. Kelly, B. Johnston, & K. Yasukawa (Eds.), The adult numeracy handbook. Reframing adult numeracy in Australia (pp. 35–49). Sydney: Adult Literacy and Numeracy Australian Research Consortium.
McKenna, S. (2009). Cracking the code of academic literacy: An ideological task. In C. Hutchins, & J. Garraway (Eds.), Beyond the university gates: Provision of extended curriculum programmes in South Africa. Proceedings of the January 2009 Rhodes University Foundation Seminar (pp. 8–15). Grahamstown: Rhodes University. Available from http://www.cput.ac.za/storage/services/fundani/beyond_the_university_gates.pdf#page=9
Peden, M., & Butchart, A. (1999). Trauma and injury. In N. Crisp, & A. Ntuli (Eds.), South African health review 1999 (pp. 331–344). Durban: Health Systems Trust. Available from http://www.hst.org.za/sites/default/files/chapter24_99.pdf
Prince, R. (2016). Predicting success in higher education: The value of criterion and norm-referenced assessments. Practitioner Research in Higher Education, 10(1), 22–38. Available from http://194.81.189.19/ojs/index.php/prhe/article/viewFile/323/449
Prince, R., & Simpson, Z. (2016). Quantitative literacy practices in civil engineering study: Designs for teaching and learning. In A.M. Nortvig, B.H. Sørensen, M. Misfeldt, R. Ørngreen, B.B. Allsopp, B. Henningsen, et al. (Eds.), Proceedings of the Fifth International Conference on Designs for Learning (pp. 189–204). Aalborg: Aalborg University Press. Available from http://vbn.aau.dk/files/233636459/Proceedings_of_the_5th_International_Conference_on_Designs_for_Learning.pdf
R Core Team. (2016). R: A language and environment for statistical computing. Vienna: R Foundation for Statistical Computing. Available from https://cran.r-project.org/doc/manuals/r-release/fullrefman.pdf
Scott, I., Yeld, N., & Hendry, J. (2007). Higher education monitor no. 6: A case for improving teaching and learning in South African higher education. Pretoria: The Council on Higher Education. Available from http://www.che.ac.za/sites/default/files/publications/HE_Monitor_6_ITLS_Oct2007_0.pdf
Steen, L.A. (2004). Achieving quantitative literacy: An urgent challenge for higher education. Washington, DC: The Mathematical Association of America.
Street, B. (2005). Applying new literacy studies to numeracy as social practice. In A. Rogers (Ed.), Urban literacy. Communication, identity and learning in development contexts (pp. 87–96). Hamburg: UNESCO Institute for Education.
Street, B., & Baker, D. (2006). So, what about multimodal numeracies? In K. Pahl, & J. Rowsell (Eds.), Travel notes from the new literacy studies: Case studies of practice (pp. 219–233). Clevedon, UK: Multilingual Matters Ltd.
Wickham, H. (2009). ggplot2: Elegant graphics for data analysis. New York, NY: Springer-Verlag.
Yen, W.M., & Fitzpatrick, A.R. (2006). Item response theory. In R.L. Brennan (Ed.), Educational measurement (4th edn., pp. 111–153). Westport, CT: Greenwood/Praeger.
