Including probability and statistics in the core curriculum of mathematics in South African schools has made it necessary to train teachers to teach statistics at high school level. This study concentrates on practising mathematics teachers who were students in an in-service programme. The purpose of the study was to investigate students’ success rates on different questions of a multipart task based on the normal distribution curve. The theory that I used to understand the students’ difficulties is Duval’s theory about movement within and between semiotic representation systems, called treatment transformations and conversion transformations respectively. The first two parts of the problem were unknown percentage problems and involved a treatment followed by a conversion. The third was an unknown value problem and required a conversion before the students could undertake a treatment transformation.
The findings reveal that the success rates the students achieved in treatment transformations were higher than those they achieved in conversion transformations. The study also revealed that the direction of the conversions played a role in success rates. Recognising the different challenges the two types of transformations pose requires that teachers pay particular attention to actions that involve movement between different representation systems.
Including probability and statistics in the core curriculum of mathematics in South African schools has made it necessary to train teachers to teach statistics at high school level. Although the normal distribution curve is not part of the school curriculum, it is part of a basic course in statistics that aims to equip teachers to teach probability and statistics up to Grade 12 level. This exploratory study was conducted with 290 in-service secondary school mathematics teachers who had enrolled in an in-service mathematics programme. It focuses on one multipart problem, which was part of the course summative assessment and includes ‘unknown percentage’ and ‘unknown value’ problems (Watkins, Scheaffer & Cobb, 2004). In unknown percentage problems, students first transform a given value into an associated z-score using the standardisation process. Thereafter, students identify the probability associated with the z-score and interpret the value in terms of the graph. This involves working simultaneously with properties of the standard normal distribution and the properties of particular z-table values. In unknown value problems, students have the percentage and must first identify, from a table of z-values, the z-score that corresponds to the given percentage by working with the properties of the standard normal distribution. Thereafter, they calculate the x-score by ‘unstandardising’ it, or reversing the standardising process. In this article, I refer to these in-service teachers as students because they were participants in the programme. In analysing the students’ performance, I drew on Duval’s (2006) framework for transforming semiotic representations, where he distinguishes between transformations that occur within the same system of representations (treatments) and those that involve a change of register (conversions).
The purpose of this study was, firstly, to investigate whether there were differences in students’ success rates on the two types of transformations (conversions and treatments) that are inherent in one multipart problem and, secondly, to investigate whether the direction of the conversion transformations influenced success rates. Students experience unknown value and percentage problems as challenging for many reasons, including because they involve applying and not just recalling the properties of the normal distribution curve. In this article I report one particular aspect of the challenges. I am looking at students’ proficiency in carrying out treatment and conversion transformations and investigating whether the differential engagement with these two types of transformations could account for the differences in success rates. In doing so, I do not suggest that this is the only factor that accounts for the challenges associated with these types of problems.
Some literature on the normal distribution curve
Reading and Canada (2011) think that distribution of data is a fundamental concept in its own right, but that it is complex despite its relatively straightforward definition. One can see probability distributions as even more complex and understanding the differences between data distributions and probability distributions is a key step in statistical reasoning (Cohen & Chechile, 1997). The authors comment that, despite the emphasis on hands-on data analysis and alternative methods of inference, the concept of probability distributions should be part of all introductory statistics courses. Unlike data distributions, probability distributions are formal theoretical models statisticians use to describe the likelihood of a variable taking on a value or a range of values. It is this theoretical nature that brings out contrasts between probability and data, thereby helping students develop ideas about stochasm (Cohen & Chechile, 1997, p. 2). Wilensky (1997) regards probability distributions as a key concept in probability and statistics because of their importance in understanding statistical models in scientific research and because they stand ‘at the interface between the traditional study of probability and the traditional study of statistics’ (p. 175), and therefore provide an opportunity to make strong connections between the two fields. Concern about the lack of research into students’ understanding of the normal distribution led to Pfannkuch and Reading (2006) publishing a special issue of the Statistics Education Research Journal. It focused on reasoning about distributions and provided suggested research questions that could address various aspects of reasoning about distributions, including one about the ‘difficulties that students encounter when working with analysing and interpreting distributions’ (p. 5). Bakker and Gravemeijer (2004) regard a distribution as a conceptual entity for thinking about variability in data.
Pfannkuch and Reading (2006) warn that any discussion about the nature of distributions needs to include a conceptual perspective (which clarifies the notions that underpin distributions and why they are important) and an operational perspective (which explains how distributions capture, display and manipulate specific sets of data). Reading and Reid (2006) included both perspectives in their development of a two-cycle hierarchy of reasoning about distributions, based on the application of the structure of observed learning outcomes (SOLO) taxonomy. The first cycle involved understanding key elements whilst the second, more cognitively sophisticated cycle involved using those elements. Pfaff and Weinberg (2009) believed that actively generating data before analysing them would increase understanding of the statistical concepts. One may see this as indicative of the operational perspective that Pfannkuch and Reading (2006) described. However, their study (Pfaff & Weinberg, 2009) found that, although their students actively generated data, the students’ performance in the post-activity assessments was no better than in the pre-activity assessments. Carlson and Windquist (2011), in their comment about these unexpected results, argued that Pfaff and Weinberg were correct in concluding that ‘the physical act of generating data was not sufficient to produce learning’ (p. 3). However, they disagreed with the conclusion that the authors (Pfaff & Weinberg, 2009) drew that ‘active learning approaches in general are ineffective’ (Carlson & Windquist, 2011, p. 3). North and Zewotir (2006) move beyond considering only the approach to teaching statistics. They question the content that introductory statistics courses should cover. They call for a rethink of the statistics courses for social scientists and argue for courses that focus on how to use descriptive statistics instead of focusing on calculations like those based on grouped data.
They advise that the courses should devote more time to understanding principles and developing statistical reasoning by using rich contexts (North & Zewotir, 2006). However, the situation of social scientists, who are learning how to interpret and use statistics when studying socioeconomic phenomena, is different to that of teachers who are learning how to teach statistics to school children – the context of the current study. Reading and Canada (2011) describe two studies about the statistical reasoning of elementary teachers. Both studies ‘firmly cast the teacher in the role of the learner’ (p. 229). In the current study, the teachers were also the learners in a basic course in statistics that aimed to equip them to teach probability and statistics up to Grade 12 level. The module covered aspects of statistics like central tendencies, grouped data, distributions, bivariate data, regression, probability concepts and probability distributions. A concept like the normal distribution curve is not part of the school curriculum. However, one can see it as an example of what Ball, Thames and Phelps (2008) call horizon knowledge. This is an ‘awareness of how mathematical topics are related over the span of mathematics included in the curriculum’ (p. 403) and is one of the six domains that comprise their model of mathematical knowledge for teaching. Having knowledge of the horizon can help teachers make decisions about how to teach concepts like variation, distributions and other statistical topics.
The analytic framework
A set of elementary signs, a set of rules for producing and transforming signs as well as an underlying meaning structure that derives from the relationship between the signs within the system characterise a semiotic system (Ernest, 2006). Radford (2001) has argued that using signs and tools modifies our cognitive functions. On the other hand, Ernest (2006) says that a focus on signs and sign use is the characterising feature of a semiotic perspective of mathematical activity that provides a way of conceptualising the teaching and learning of mathematics. Each semiotic system has its own specific way of working. Duval (2006) points out that the role semiotic systems of representation play is not only to designate mathematical objects or to communicate but also to work on, and with, mathematical objects. Duval asserts that two different types of transformations of semiotic representations can occur during any mathematical activity. The first type, called treatments, involves transformations from one semiotic representation to another within the same system or register (Duval, 2006, p. 110). Duval (2002) argues that the treatments that one can perform depend on the register one uses and ‘the procedures for carrying out a numerical operation depend just as much on the system of representation used for the numbers as on the mathematical properties of the operation’ (p. 111). He illustrates his argument with the fact that the algorithm for adding fractions is different for a decimal notation and a fractional notation of the same numbers (0.2 + 0.25 as opposed to 1/5 + 1/4). Furthermore, when dealing with treatments, the semiotic system eases the connection of different representations because the rules of the semiotic system link different representations of the same object. The second type, called conversions, involves changing the system but retaining the reference to the same objects (Duval, 2006, p. 112). In order to illustrate the differences between treatments and conversions further, I will use an example from transformation geometry. Consider a point A(2; 3) on the Cartesian plane with the required transformation on A being a clockwise rotation of 90° around the origin. A person can perform the transformation on A by applying the algebraic rule (x, y) → (y, −x) to get the result A′(3; −2). This transformation is an example of a treatment: it does not require a change in the system of representation because, after applying the formula, the object is still described by the same representation. A study by Bansilal and Naidoo (2012), on learners’ engagement with transformation geometry, identified a learner who considered the representation of A in a different register by identifying the location of the point A(2; 3) on the Cartesian plane before performing the rotation transformation. This movement (from the two-coordinate description of A to the location of the point A in the Cartesian plane) is an example of a conversion transformation because the register has changed but not the object (point A). Thereafter the learner worked out the resulting location of the point when he rotated it 90° about the origin by interpreting the motion within the new register. He then identified the location of the rotated point and thereafter assigned the coordinates based on its new position (Bansilal & Naidoo, 2012).
This example illustrates how it is possible for one to perform a transformation using the same representation system (a treatment) and how one could perform it using a representation from a different register. However, the second case needed a conversion transformation to move to the different register of representation before one could perform a treatment using the second system of representation. Duval gives conversions a more central role in understanding mathematics than he does to treatments and regards conversions as a cognitive threshold that is the main cause of learning difficulties in mathematics. He argues that one cannot reduce a conversion of a representation (change of register) to a treatment. Therefore, conversions account for one of the sources of incomprehension in mathematics. He believes that ‘we cannot deeply analyse and understand the problem of mathematics comprehension for most learners if we do not start by separating the two types of representation transformation’ (Duval, 2006, p. 127). Duval’s contention is that treatments command more attention in mathematics whilst conversions cause the greatest difficulties in mathematics. He argues that conversions only become relevant because we need to choose ‘the register in which the necessary treatments can be carried out most economically or most powerfully’. Another reason he suggests for using conversions is that they provide ‘a second register to serve as a support or guide for the treatments being carried out in another register’ (p. 127). The Visualiser/Analyser (VA) model of Zazkis, Dautermann and Dubinsky (1996), which specifies two elements (visualisation and analysis) as two interacting modes of thought, may help us develop an insight into the effort students require to understand conversion transformations. The model describes a series of movements between visual and analytic representations, each of which is mutually dependent in problem solving rather than unrelated opposites. 
In their model, the thinking begins with an act of visualisation, V1 (see Figure 1). It could consist of looking at some ‘picture’ and constructing mental processes or objects. The next step is an act of analysis, A1, which consists of some kind of coordination of the objects and processes constructed in step V1. This analysis can lead to new constructions. In a subsequent act of visualisation, V2, learners return to the same ‘picture’ they used in V1. However, because of the analysis in A1, the picture has changed. As learners repeat the movement between the V and A, they use each act of analysis, based on the previous act of visualisation, to produce new and richer visualisations that they then subject to more sophisticated analyses. This creates a spiral effect. In this model, the acts of analysis deepen the acts of visualisation and vice versa. It is also important to note that, according to this model, as learners repeat the horizontal motion in the model, the acts of visualisation and analysis become successively closer. At first, the passage from one to the other may represent a major mental effort. However, the two kinds of thought become gradually more interrelated and the movement between them becomes less of a concern. The VA model suggests that the repetition of these successive visual and analytic acts move closer together over time. The implication of this is that this fusion occurs when learners are able to see the properties of the object emerging from the various representations as a whole and can appreciate that the different representations of the same object emphasise different properties of the object. However, it is still one object, like seeing the object from different perspectives. At the stage when learners can see past the differences in representations and understand the connections between the properties revealed by the different registers, then conversion transformations are less likely to present barriers. 
Therefore, the VA theory suggests that it is at this stage, when the two kinds of perspectives merge, that conversion transformations become easier to carry out. On the other hand, when learners view representations from two registers as being separate and unconnected, conversion transformations would be more laborious because the learners do not appreciate the links between the properties that each representation conveys.
The study utilised an interpretive approach because the main goal of the study was to understand the students’ interpretations of reality (Cohen, Manion & Morrison, 2000) when it comes to solving problems based on the normal distribution curve.
The participants were 290 practising teachers who had enrolled in an in-service programme designed to upgrade and retrain mathematics teachers in the Further Education and Training (FET) band. The programme was for an Advanced Certificate in Education (ACE) with a Mathematics FET specialisation. The programme consisted of eight modules, four of which were specific to mathematics, two of which were generic education modules and two of which were mathematics education modules. This article focuses on one of the four mathematics modules devoted to a study of introductory probability and statistics suitable for teachers of FET mathematics. The test items were selected specifically for assessment and research purposes, and the three-part task was presented as part of a summative classroom assessment, which included questions from other sections of the module. One can regard the analysis of the students’ responses as content analysis to throw ‘additional light on the source of communication, its author, and on its intended recipients, those to whom the message is directed’ (Cohen et al., 2000, p. 165). In this case, the students’ responses are the source of the communication intended to convey their engagement with the concept. The research questions, which focused on one multipart problem based on the normal distribution, are:
• Are students more likely to succeed in completing the treatment or conversion transformations the problem requires?
• What role does the direction of the conversion transformations play in the students’ success rates?
The data analysis process involved studying the responses of the 290 students in order to understand the ‘what’, the ‘why’ and the ‘how’ that underlies the data (Henning, 2004). Dey (1993, p. 30) describes data analysis as ‘a process of resolving data into its constituent components to reveal its characteristic elements and structure’. The students’ responses were broken down into constituent parts that reflected phases of treatments and conversions. I did this to classify and make connections between the data elements (Henning, 2004, p. 128). This means presenting ‘the operations by which data are broken down, conceptualised, and put together in new ways’ (Strauss & Corbin, 1998, p. 120) in order to assess their responses in terms of movement within the same system or between different systems. The students’ responses were then sorted into categories according to their written explanations. The findings (see below) explain the specific coding, with examples.
Ethical considerations and recruitment procedures
The participants in this study were the teachers who had enrolled in the particular ACE programme. All students signed informed consent forms and agreed that their responses could be used on condition that no real names or personal details would be revealed. No student refused permission.
Reliability and validity
The test items were carefully selected after discussing them with a colleague from the United States of America (USA). I ensured that the questions were ones that the students would have encountered in their learning during the course. The language was sufficiently basic to ensure that most students would understand it. I coded the responses myself. However, discussions with an experienced statistics education researcher constituted peer debriefing to improve the credibility of the analysis. Peer debriefing occurs when researchers describe the research to peers who ask the ‘why’ and ‘so what’ questions and may suggest alternative frameworks.
The test items
The tasks used an application of the properties of the standard normal distribution as their basis. When the distribution of a variable in a set of data is approximately normal, one can use the properties of the standard normal distribution curve to make inferences about the variable under discussion. The standard normal distribution has a mean of 0 and a standard deviation of 1. One refers to the scores in the standard normal distribution as z-scores. Converting to standard units, or standardising, is the two-step process of recentring and rescaling that turns any normal distribution into the standard normal. Firstly, one recentres all the values in the normal distribution by subtracting the mean from each. This results in a distribution with a mean of 0. Thereafter, one divides all the values by the standard deviation (rescaling). This results in a distribution with a standard deviation of 1. This process of recentring and rescaling allows one to solve problems like the unknown percentage problem (Question 1 and Question 2) and unknown value problem (Question 3) (Watkins et al., 2004). Students encountered both types of questions during class discussions and assessments. In unknown percentage problems, students first transform a given value into an associated z-score by recentring and rescaling. The next step, in which students identify the probability associated with the z-score and interpret its value in terms of the graph, involves working simultaneously with properties of the standard normal distribution and the properties of particular z-table values. Unknown value problems require students first to identify the z-score from a table of z-values that corresponds to a given percentage. Thereafter, they calculate the x-score by ‘unstandardising’ it, or reversing the standardising process. The questions under scrutiny in this study are:
A university’s entrance examination scores are scaled so that they are approximately normal. The mean is about 505 and the standard deviation is about 111.
1. Find the probability that a randomly selected student has a score below 400.
2. Find the probability that a randomly selected student has a score between 450 and 600.
3. The school will offer scholarships to students scoring in the top 10%. What score should be used to decide who should be offered scholarships?
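The two-step standardising process described before the test items can be illustrated with a short sketch. The sample scores below are hypothetical values chosen only for illustration (they are not data from the study), and the function name `standardise` is an assumption. The sketch simply recentres by the mean and then rescales by the standard deviation.

```python
from statistics import mean, pstdev

def standardise(values):
    """Recentre (subtract the mean) and rescale (divide by the SD)."""
    m = mean(values)
    s = pstdev(values)  # population standard deviation
    recentred = [v - m for v in values]   # step 1: the mean becomes 0
    return [r / s for r in recentred]     # step 2: the SD becomes 1

# Hypothetical scores, for illustration only (not study data)
scores = [380, 450, 505, 560, 630]
z_scores = standardise(scores)

# The standardised values have mean 0 and standard deviation 1
# (up to floating-point rounding).
print(round(mean(z_scores), 10), round(pstdev(z_scores), 10))
```

Reversing the two steps (multiplying by the standard deviation and then adding the mean) is the ‘unstandardising’ that unknown value problems require.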
Remarks about, and solutions to, test items
Note that these types of questions were familiar to the students because part of the course was devoted to solving such problems using applications of the normal distribution curve. Defining the random variable X is important for computing the probabilities associated with the random variable. In this case, the random variable is the entrance examination scores, which have a normal distribution.
In order to solve this problem, students received a formula sheet that contained the standardisation formula
$z = \frac{x - \mu}{\sigma}$.
The students could use scientific calculators. Different statistics textbooks tabulate different standard normal curve areas for a given positive value $z_0$, like $P(0 < Z < z_0)$ or $P(Z < z_0)$ or $P(Z > z_0)$, where these are associated with the area of the corresponding sectors. In the lectures and the assessments, the z-table the students had used was $P(0 < Z < z_0)$ for positive $z_0$. In order to answer these questions, it is necessary for students to use properties that apply to the standard normal distribution, like having a mean of 0, a standard deviation of 1 and an area (under the curve) of 1. The area under the curve is the probability. The symmetry of the curve means that the area to the left of 0 is equal to the area to the right of 0. Because of symmetry at 0, $P(-z_0 < Z < 0) = P(0 < Z < z_0)$ and $P(Z < -z_0) = P(Z > z_0)$, where $z_0$ is positive and $-z_0$ is negative.
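The relationships between the different tabulation conventions and the symmetry properties can be checked numerically. The following is a minimal sketch that assumes Python's standard-library `statistics.NormalDist` stands in for a printed z-table. The helper name `table_area` and the value chosen for `z0` are illustrative assumptions.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

def table_area(z0):
    """P(0 < Z < z0) for positive z0, the convention of the course z-table."""
    return Z.cdf(z0) - 0.5

z0 = 1.0  # arbitrary positive value, for illustration only

# The other two textbook conventions follow from the table value,
# because the area on each side of 0 is 0.5.
left_tail = 0.5 + table_area(z0)    # P(Z < z0)
right_tail = 0.5 - table_area(z0)   # P(Z > z0)

# Symmetry about 0
mirror_area = Z.cdf(0) - Z.cdf(-z0)      # P(-z0 < Z < 0)
lower_tail_at_minus_z0 = Z.cdf(-z0)      # P(Z < -z0)

print(round(table_area(z0), 4), round(left_tail, 4), round(right_tail, 4))
```

Under these assumptions, `mirror_area` agrees with `table_area(z0)` and `lower_tail_at_minus_z0` agrees with `right_tail`, which are the two symmetry statements above.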
Question 1: We need to find P(x < 400) = ?
The unknown percentage problem requires students to calculate the corresponding z-score from the given x-score using the process of ‘standardising’:
$z = \frac{400 - 505}{111} \approx -0.946$.
Students then identify the percentage that corresponds to the z-score from the z-table and interpret it. Figure 2 shows the categorisation of the steps as treatments and conversions.
Table 1 presents a summary of the solution with explanatory comments and diagrams.
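As a cross-check on the steps for Question 1, the following sketch carries out the treatment (standardising) and then the conversion (reading a table value and interpreting it as an area). It assumes `statistics.NormalDist` in place of the printed z-table, and the variable names are illustrative rather than taken from the study.

```python
from statistics import NormalDist

MEAN, SD = 505, 111   # values given in the task
Z = NormalDist()      # standard normal distribution

# Treatment: standardise the x-score of 400
z = (400 - MEAN) / SD

# Conversion, part 1: the table value P(0 < Z < |z|), as read from a z-table
table_value = Z.cdf(abs(z)) - 0.5

# Conversion, part 2: interpret the table value as an area under the curve.
# The score lies below the mean, so the required left tail is 0.5 minus it.
p_below_400 = 0.5 - table_value

print(round(z, 3), round(table_value, 4), round(p_below_400, 3))
```

With a printed table, the z-score would first be rounded to two decimal places, so a hand calculation may differ slightly in the third decimal place.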
Question 2: Here we need P(450 < x < 600) = ?
Figure 3 is a diagram that explains the decomposition of the problem into treatments and conversions. Table 2 presents a summary of the solution with explanatory comments and diagrams.
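Question 2 repeats the treatment and conversion steps for two x-scores and then combines the two areas. A minimal sketch under the same assumptions as before (`statistics.NormalDist` as a stand-in for the z-table, illustrative variable names):

```python
from statistics import NormalDist

MEAN, SD = 505, 111   # values given in the task
Z = NormalDist()

# Two treatments: standardise both boundary scores
z_low = (450 - MEAN) / SD    # negative, because 450 is below the mean
z_high = (600 - MEAN) / SD   # positive, because 600 is above the mean

# Two conversions: table values P(0 < Z < |z|) for each boundary
area_left_of_mean = Z.cdf(abs(z_low)) - 0.5   # equals P(z_low < Z < 0) by symmetry
area_right_of_mean = Z.cdf(z_high) - 0.5      # P(0 < Z < z_high)

# The boundaries lie on opposite sides of the mean, so the areas are added
p_between = area_left_of_mean + area_right_of_mean

print(round(z_low, 3), round(z_high, 3), round(p_between, 3))
```

Combining the two table values correctly depends on recognising that the two z-scores lie on opposite sides of 0, which is a property of the graph rather than of the formula.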
Question 3: We need to find an x-score so that P(x > ?) = 0.1
The unknown value problem requires students first to identify the z-score from a table of z-scores that corresponds to the given percentage. The students can then calculate the x-score by ‘unstandardising’, or reversing the standardising process. Figure 4 shows the categorisation of the steps as treatments and conversions. The diagram in Figure 4 has the arrows reversed from Question 1 to show that the direction of the solution is opposite to that of Question 1. Table 3 presents a summary of the solution with explanatory comments and diagrams.
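The reversed direction of Question 3 can be sketched as follows. The conversion comes first (from the percentage to a z-score) and the treatment second (unstandardising). The sketch assumes `statistics.NormalDist.inv_cdf` in place of scanning a printed table, and the variable names are illustrative.

```python
from statistics import NormalDist

MEAN, SD = 505, 111   # values given in the task
Z = NormalDist()

top_fraction = 0.10   # scholarships go to the top 10%

# Conversion: find the z-score with 10% of the area to its right,
# that is, 90% to its left (a table value of 0.40 in the
# P(0 < Z < z0) convention).
z_cut = Z.inv_cdf(1 - top_fraction)

# Treatment: 'unstandardise' by reversing z = (x - mean) / sd
x_cut = MEAN + z_cut * SD

print(round(z_cut, 2), round(x_cut, 1))
```

A student using a printed table would work with a z-score rounded to two decimal places, so the hand-calculated cut-off score may differ slightly from the value computed here.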
One can regard the standardisation procedure as a treatment transformation because it is within the same register. The x-score is the input and the z-score is the output of the procedure. As Duval predicted, most students did not experience problems at this point. For Question 1 and Question 2, students could complete the standardisation procedure in one register with the visualisation serving only as ‘a second register to serve as a support or guide for the treatments being carried out in another register’ (Duval, 2006, p. 127). For Question 3, the situation is somewhat different: the conversion is necessary because we need to choose ‘the register in which the necessary treatments can be carried out most economically or most powerfully’ (Duval, 2006, p. 127), which permitted the unstandardising of the z-score. One could not access the z-score without a conversion operation that allows movement from the percentage value to the z-score.

FIGURE 2: Question 1 broken down in terms of conversions and treatments.



FIGURE 3: Question 2 broken down in terms of conversions and treatments.



FIGURE 4: Question 3 broken down in terms of conversions and treatments.


TABLE 1: Solution to Question 1 with explanations.

TABLE 2: Solution to Question 2 with explanations.

TABLE 3: Solutions to Question 3 with explanations.

It is necessary to distinguish between direct and inverse problems (Groetsch, 1999) in this study. A direct problem is one that asks for an output when students have the input and the process. In an inverse problem, students have the output and the problem could ask for the input or the process that led to the output. One can regard Question 1 as a direct problem and Question 2 as a two-step direct problem. One can regard Question 3 as an inverse problem for two reasons. Firstly, it consists of a conversion that takes a p-value and converts it to a z-score. The z-table is organised according to the z-scores. For a given z-value, students can read off a corresponding p-value. In Question 3, the students had a probability value and had to scan the tables until they identified a suitable z-score that corresponded to the given probability. Secondly, the formula in the formula sheet was the standardisation formula $z = \frac{x - \mu}{\sigma}$. In Question 1 and Question 2, the students used the formula in the form presented. The value x was the input and the output was z. However, for Question 3, the students had the output z and they had to calculate the input. Therefore, one can regard Question 3 as a combination of two inverse problems and as an inverse problem in the way that Groetsch (1999) described.
In order to present the analysis, the students’ responses are labelled to serve as references, for example, S17, which means the response was that of student 17. Students’ responses are labelled from S1 to S290. The students’ responses are verbatim, although the layout has been changed because of limited space.
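The difference between the direct and the inverse use of the z-table, as discussed for Question 3, can be sketched with a small generated table. The table below is built with `statistics.NormalDist` and is only an assumed stand-in for the printed table the students used. The function names `direct_lookup` and `inverse_lookup` are hypothetical.

```python
from statistics import NormalDist

Z = NormalDist()

# A z-table organised by z-score: P(0 < Z < z) for z = 0.00, 0.01, ..., 3.99
z_table = {i / 100: Z.cdf(i / 100) - 0.5 for i in range(400)}

def direct_lookup(z):
    """Direct problem: the z-score is known, read off the p-value."""
    return z_table[z]

def inverse_lookup(p):
    """Inverse problem: the p-value is known, scan every entry for the closest match."""
    return min(z_table, key=lambda z: abs(z_table[z] - p))

# Question 3: the top 10% leaves a table area of 0.40 between 0 and the cut-off
z_cut = inverse_lookup(0.40)
x_cut = 505 + z_cut * 111   # second inverse step: solve z = (x - 505) / 111 for x

print(z_cut, round(x_cut, 2))
```

The direct lookup is a single read, whereas the inverse lookup has to search the body of the table, which mirrors the extra effort described above.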
Findings for Question 1
× Blank or unrelated algorithm
Here responses were coded as blank if students made no attempt. A response was coded as an unrelated algorithm if students wrote a formula or algorithm that did not relate to the standardisation procedure. Two examples follow:
■ Partial treatments (PT)
Here responses were coded as partial treatments (PT) if students wrote the appropriate standardisation formula but did not substitute the correct values or substituted the correct values but did not compute the result correctly, for example:
■ Complete or full treatments (FT)
Here responses were coded as complete or full treatments (FT) if students completed the standardisation and arrived at the correct figure of 0.945 or if they wrote the value 0.945 as the value they would read off from the z-table. If they went on to other steps that were incorrect, then the responses were still coded as FT. For example, some students (12) did not read off a p-value from the z-table and interpreted the z-score as a probability. An example follows:
Some students continued and used the resulting ‘probability value’ (obtained as for S1) to determine a z-score in the z-table. An example follows:
Here the student used the z-score (0.946) as a probability value, found the z-score that corresponded to the ‘probability value’ and presented the z-score (1.83) as a probability (even though it was greater than 1).
■ Partial conversions (PC)
Responses were coded as partial conversions (PC) if students determined a p-value from the z-table that corresponded to the z-score, even if the value was not accurate, as long as there was a reading of a p-value from a related z-score. An example follows:
where 0.95 corresponds to a zscore of 33.65%
■ Complete or full conversions (FC)
Here responses were coded as complete or full conversions (FC) if students interpreted the pvalues of the ztable in terms of the area under the curve to provide correct (or nearly correct) answers.Each step depends on the previous step. Therefore, a student who completed an FC, would have done the PC, FT and PT steps. Table 4 shows that, of the 290 students, 223 (77%) were able to recognise
the correct standardisation formula. Only 199 (69%) were able to complete the standardisation procedure correctly.
Fiftyfive (19%) performed partial conversions and 79 (27%) completed the conversions and the question.
In order to get a clearer idea of how the students progressed from the treatment steps to the conversion steps, we can consider the cumulative totals:
• the number of students who managed partial treatments will include those who completed the treatments
• those who completed the treatments will include those who managed partial conversions
• those who managed partial conversions will include those who completed full conversions.
The bar graph in Figure 5 gives these numbers. Of the 290 students, 223 (77%) began the appropriate standardisation procedure. Of these 223 students, 199 (89%) completed the standardisation treatments and of these, 134 (67%) were able to complete the first part of the conversions. Seventy-nine (59%) of the last group were able to complete the conversions correctly.
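The cumulative totals for Question 1 can be reproduced with a short sketch. The PC (55) and FC (79) counts are reported directly. The PT-only (24) and FT-only (65) counts are not stated in the text and are derived here by subtraction from the cumulative figures 223, 199 and 134, so they should be read as an assumption of this illustration.

```python
from itertools import accumulate

# Exclusive counts for Question 1: each student is counted once,
# at the highest stage reached. PT-only and FT-only are derived values.
stages = ["PT", "FT", "PC", "FC"]
exclusive = [24, 65, 55, 79]

# Accumulate from right to left so that each stage includes all later stages.
cumulative = list(accumulate(reversed(exclusive)))[::-1]

for stage, count in zip(stages, cumulative):
    print(f"{stage}: {count}")

# Students who were blank or used an unrelated algorithm.
not_started = 290 - cumulative[0]
print(f"Not started: {not_started}")
```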
Findings for Question 2
The following codes were used for Question 2. It is not necessary to give examples of responses in all categories because they are similar to those for Question 1 except that there are two sets of treatments and conversions.
× Blank or unrelated algorithm
○ Partial treatments (PT), where students chose the appropriate standardisation formula (in one or in both cases) but did not complete both.
● Full or complete treatments (FT), where students completed the standardisation procedure in one or in both cases but completed no further correct steps.
□ Partial conversions (PC), where students read off a p-value from the z-table in one or in both cases, but did not combine the two p-values correctly, for example:
■ Full or complete conversions (FC), where students interpreted the p-values of the z-table in terms of the area under the curve to provide correct (or nearly correct) answers.
Table 5 shows that, of the 290 students, 174 (60%) started one or both standardisation procedures, whilst only 156 (54%) were able to complete one or both standardisation procedures correctly. Only 40 students (14%) completed the question correctly (two of whom had a final answer that differed slightly from the expected one).
In order to get a clearer idea of how the students progressed from the treatment steps to the conversion steps, I considered the cumulative totals from right to left:
• the number of students who managed partial treatments will include those who completed treatments
• those who completed treatments will include those who managed partial conversions
• those who managed partial conversions will include those who completed full conversions.
The bar graph in Figure 5 gives these numbers. Of the 290 students, 174 (60%) were able to recognise the correct standardisation formula, whilst only 156 (90%) of these students were able to complete it correctly once or twice. Of these 156 students, 96 (62%) completed at least the first part of the conversions once or twice (they read off the p-value for the corresponding z-score). Only 40 (42%) of these were able to complete the conversions and arrive at the correct result.
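The extra step that Question 2 demands, combining two area readings, can be sketched as below. The section does not give the x-scores, mean or standard deviation of the question, nor which region was required, so the values and the choice of the area between the two boundaries are purely hypothetical.

```python
from statistics import NormalDist


def percentage_between(x_low, x_high, mean, sd):
    """Percentage of a normal population lying between two x-scores."""
    std_normal = NormalDist()
    # Two treatments: standardise each boundary separately.
    z_low = (x_low - mean) / sd
    z_high = (x_high - mean) / sd
    # Two partial conversions: one area reading per z-score.
    p_low = std_normal.cdf(z_low)
    p_high = std_normal.cdf(z_high)
    # Completing the conversion: decide how the two areas combine.
    return 100 * (p_high - p_low)


# Hypothetical boundaries one standard deviation either side of the mean.
pct = percentage_between(90.0, 110.0, mean=100.0, sd=10.0)
print(f"{pct:.2f}% of values lie between the two x-scores")
```

A response coded PC corresponds to obtaining `p_low` and `p_high` but not carrying out the final subtraction (or combining the two areas incorrectly).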
Findings for Question 3
The following codes were used for Question 3:
× Blank or unrelated algorithm
□ Partial conversions (PC), where students either interpreted the percentage value given as the correct p-value (p = 0.4) but did not carry out any further correct steps, or interpreted the percentage as an incorrect p-value.
■ Full or complete conversions (FC), where students read off p-values in a z-table to generate a z-score, which was correct or incorrect; students who completed full conversions all continued.
○ Partial treatments (PT), where students chose the appropriate formula for unstandardising a z-score.
● Full or complete treatments (FT), where students completed the procedure for unstandardisation correctly or nearly correctly.
S133’s response was coded as almost correct, in contrast to that of S135, where the final answer was not close to the expected one.
Table 6 shows that, of the 290 students, 108 students did not respond and 34 used an irrelevant algorithm. Therefore, 142 (49%) did not even begin partial conversions. Seventy-eight (27%) tried but did not generate the correct p-value whilst 20 (7%) students completed partial conversions by correctly extracting the p-value from the given information. Three (1%) students completed the conversions and started the unstandardising treatments, whilst 47 (16%) students managed complete treatments and obtained a correct or almost correct solution (the final answer that 26 students reached differed slightly from the expected answer).
In order to get a clearer idea of how the students progressed from the conversion steps to the treatment steps, I considered the cumulative totals from right to left:
• the number of students who completed partial conversions will include those who completed full conversions
• those who completed full conversions will include those who completed partial treatments
• those who completed partial treatments will include those who completed full treatments.
The bar graph in Figure 6 gives these figures. There were 148 (51%) students who started the conversions (obtained p-values). Of these 148 students, 50 were able to complete the conversions by reading off p-values and chose the correct formula for unstandardising. That is, 34% completed the conversions (used the p-value to read off the corresponding z-score) and started treatments, whilst 47 (94%) of the 50 students were able to complete the treatments and solve the problem (the final answers of 26 students differed slightly from the expected one).
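The reversed order of steps in an unknown value problem can be sketched as follows. The p-value of 0.4 comes from the coding description for Question 3; treating it as an upper-tail area P(Z > z) is an assumption made for illustration, because the table convention used in the course is not stated here. The mean and standard deviation are hypothetical.

```python
from statistics import NormalDist


def unknown_value(upper_tail_p, mean, sd):
    """Return (z, x) such that P(X > x) = upper_tail_p for a normal population."""
    # Conversion first: from an area (p-value) back to a z-score.
    z = NormalDist().inv_cdf(1 - upper_tail_p)
    # Treatment second: unstandardise, reversing z = (x - mean) / sd.
    x = mean + z * sd
    return z, x


# Hypothetical mean and sd; p = 0.4 read as an upper-tail area (assumption).
z, x = unknown_value(0.4, mean=100.0, sd=20.0)
print(f"z = {z:.4f}, x = {x:.2f}")
```

Here the conversion is the first line of the function body, so a student who cannot move from the area to the z-score never reaches the unstandardising step.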
Performance on the three questions
Students clearly found Question 2 more challenging than Question 1. Only 40 students got Question 2 correct whilst 79 students managed to complete Question 1 correctly, almost twice as many. Furthermore, there were 67 blank responses or incorrect algorithms for Question 1 compared to 116 for Question 2, which shows that more students failed to make a correct start on Question 2 than on Question 1. Question 2 was more complex than Question 1 because it involves regions bounded by two given x-scores. Therefore, there were two sets of treatments as well as two sets of partial conversions, and completing the conversions meant that students had to take a global view of the two areas and decide how they would use them to generate the required percentages. Consequently, solving Question 2 would have been more demanding than just carrying out treatments followed by conversions, as Question 1 required.
Question 3 was challenging for the 142 (49%) students who did not start correctly. Forty-seven completed the whole question correctly or almost correctly. This was more than the 40 who completed Question 2 correctly or almost correctly but fewer than the 79 who completed Question 1 correctly or almost correctly. If one compares performance on Question 3 with that on Question 1, 67 students did not start Question 1 correctly, whereas more than twice as many (142) did not begin Question 3 correctly. There are two possible reasons for this. Firstly, the inverse nature of the question meant that the steps to the solution were reversed, which made it more complex (Bansilal, Mkhwanazi & Mahlabela, in press; Groetsch, 1999; Nathan & Koedinger, 2000). Secondly, students had to complete the conversions before the treatments. Meeting the conversions first created a bigger initial barrier than in Question 1, where the first barrier (the treatments) was smaller than the second (the conversions).
Duval’s (2006) theory maintains that conversion transformations are more difficult than treatment transformations are because they require crossing into another register of representation. Conversions are more complex because they involve movement in each of the two registers and movement across them, whilst treatments require movement in one register only.
Success rates in conversion transformations and treatment transformations
The bar graph in Figure 5 provides a visual representation of the progress of students through the stages for Question 1 and Question 2. It shows the number of students who did a PT, FT, FT PC and FT FC respectively and excludes the students who made no response or used a wrong formula. Note that, in this graph, the first set includes the second, which includes the third, which includes the fourth. The graph derives from the figures that Table 4 and Table 5 provide.

FIGURE 5: Number of students progressing at each stage to the final solution for Question 1 and Question 2.



FIGURE 6: Number of students progressing at each stage in Question 3 to the final solution.


The cumulative picture for Question 3 (see Figure 6) shows the number of students who completed a PC, FC, FC PT and FC FT respectively. The first set includes the second, which includes the third, which includes the fourth. These figures derive from the information Table 6 provides.
There are clear trends in performance on Question 1 and Question 2. Of the 290 students, 223 (77%) performed a PT on Question 1. Of these, 199 (89%) completed the treatments. Of this group, 134 (67%) went on to complete a PC and 79 (59%) of this group were successful. For Question 2, the numbers from Table 5 are 290 (original), 174 (PT), 156 (FT), 96 (PC) and 40 (FC). The flow diagrams below show these figures:
Question 1: 100% → (PT) 77% → (FT) 89% → (PC) 67% → (FC) 59%
Question 2: 100% → (PT) 60% → (FT) 90% → (PC) 62% → (FC) 42%
The attrition rate at each stage of Question 2 was higher than that for Question 1, except for the progression from partial treatments to full treatments, where 90% of students who managed partial treatments for Question 2 completed the treatments. The corresponding percentage for Question 1 was 89%. However, for all other stages, the progression rate from one to the next was higher for Question 1 than it was for Question 2. On both questions, the highest attrition rate was in the progress from PC to FC. Only 59% of students who started conversions for Question 1 completed them, whilst for Question 2 only 42% of students who started the conversions were able to complete them.
When one considers the performance on Question 3, the numbers from Table 6 are 290, 148 (PC), 50 (FC), 50 (PT), 47 (FT). The flow diagram below shows the figures:
Question 3: 100% → (PC) 51% → (FC) 34% → (PT) 100% → (FT) 94%
Here, as for Question 1 and Question 2, the highest attrition rate was in the movement from PC to FC. Only 34% of the group who started conversions were able to complete them and all of these students went on to start treatments. Thereafter, there were few challenges for this group and only three students did not complete the procedure. The treatment procedure for Question 3 was not a problem for those students who completed their conversions. Forty-seven of the 50 students (94%) who completed conversions were able to complete treatments.
The conversions were the main obstacle in Question 1 and Question 2. They were insurmountable for many: only 79 of the 199 students (39%) who completed treatments in Question 1, and 40 of the 156 (25%) in Question 2, were successful with conversions. A comparison between trends in responses across the questions supports Duval’s assertion that conversion transformations can be more complex than treatments. For Question 1 and Question 2, the percentage of students who proceeded from full treatments to full conversions was 39% and 25% respectively, whilst for Question 3 the percentage of students who proceeded from full conversions to full treatments was 94%. It is clear that, for the group as a whole, the students’ success rates in conversion transformations were lower than in treatment transformations. However, not all the students would have experienced conversions as more difficult than treatments. The movement between the two registers was not a problem for some students.
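The percentages in the flow diagrams are stage-to-stage progression rates: each cumulative count divided by the count at the previous stage. A minimal sketch that recomputes them from the cumulative counts reported above follows; the function and variable names are illustrative only.

```python
TOTAL = 290

# Cumulative counts per stage, in the order the stages occur in each question.
cumulative = {
    "Question 1": [("PT", 223), ("FT", 199), ("PC", 134), ("FC", 79)],
    "Question 2": [("PT", 174), ("FT", 156), ("PC", 96), ("FC", 40)],
    "Question 3": [("PC", 148), ("FC", 50), ("PT", 50), ("FT", 47)],
}


def progression_rates(counts, total=TOTAL):
    """Percentage of students at each stage relative to the previous stage."""
    rates = []
    previous = total
    for stage, count in counts:
        rates.append((stage, round(100 * count / previous)))
        previous = count
    return rates


for question, counts in cumulative.items():
    flow = " → ".join(f"({stage}) {rate}%" for stage, rate in progression_rates(counts))
    print(f"{question}: 100% → {flow}")
```

Because each rate is conditional on reaching the previous stage, the lowest rate in a row marks the step at which most of the remaining students dropped out.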
Direction of conversions
The direction of conversions is another factor that Duval contends affects the complexity of mathematical activities. Duval maintains that a ‘conversion in one direction can be without any cognitive link with this in the reverse direction’ (Duval, 2008, p. 47), suggesting that the direction of the conversions is important. Duval has shown that, when the original and destination registers of conversions change, students’ performances vary considerably. In one case of linear algebra, 83% of students were able to move successfully from a two-dimensional table representation of a vector to a two-dimensional graphical representation, whereas only 34% of students were able to move in the opposite direction.
The direction of the conversions seems to have been a factor that influenced the students’ success rates. Seventy-nine students completed Question 1 correctly, whilst only 47 students did so on Question 3. Of the students who started conversions for Question 1, 59% were able to complete them, whilst only 34% of the students who started conversions for Question 3 were able to do so. The reason for the lower completion rate for the conversions for Question 3 could lie in the fact that the conversion in Question 1 involved moving from the z-scores to the probability value (or area), which is the opposite direction to the conversion in Question 3 (moving from the probability value to the z-score). In addition, 94% of the students who completed conversions for Question 3 went on to complete the treatments. Therefore, the conversions were bigger hurdles. The percentage of students who proceeded from full treatments to full conversions in Question 1 was 39%. One of the factors that made Question 3 more challenging was the direction of the conversions, which was different in the two cases. Duval’s own observations about linear algebra (2008) support this. However, we need further research to help us understand why conversions in one direction were more challenging to complete than were conversions in another.
This article presented an analysis of 290 students’ responses to a three-part task using applications of the normal distribution curve. Duval’s framework was used to explain the students’ difficulties with solving the task.
Question 1 and Question 2 of the task are ‘unknown percentage problems’ and Question 3 is an example of an ‘unknown value problem’ (Watkins et al., 2004), which one can regard as an inverse problem (Groetsch, 1999). Different parts of the solutions to the questions were categorised into conversions and treatments, depending on whether the operation required students to move across a register or stay within the same register. The students’ responses were coded according to whether they performed partial treatments, complete treatments, partial conversions or complete conversions. The findings show that Question 2 was more difficult than Question 1: almost twice as many students completed Question 1 correctly compared to Question 2. It was argued that Question 2 was more challenging because students had to complete two sets of conversions and two sets of treatments. The results of these transformations had to be synthesised to produce an answer. It was also found that Question 3 was more challenging than Question 1 was. Seventy-nine students obtained correct answers for Question 1 and only 47 obtained correct, or close to correct, answers for Question 3. It was argued that one factor could be the inverse nature of Question 3, whilst Question 1 was a direct problem. The other factor could be that students needed to complete the conversion transformations for Question 3 before the treatment transformations. Furthermore, because the conversions were bigger hurdles, more students could not progress further. The students encountered the treatment transformations first in Question 1. More students succeeded with this hurdle than with the first hurdle in Question 3, allowing them to progress.
Duval’s theory that conversions are more challenging than treatments is supported by the findings in this study. When the attrition rate is examined at each stage in each of the three questions, there were clear patterns in the performance of the students. On Question 1 and Question 2, 59% and 42%, respectively, of the group that started conversions were able to complete them. This compares to approximately 90% of the group who started treatments who were able to complete at least one treatment. In addition, only 34% of the group who started conversions for Question 3 were able to complete them, whereas 94% of the group who started treatments were able to complete them. This shows that completing the conversions was harder than completing the treatments in all three of the questions. Furthermore, this study supports Duval’s (2006) examples in linear algebra that show that the direction of conversions also plays a role in the difficulty level of questions. He writes that ‘when the roles of source register and target register are inverted within a semiotic representation, the problem is radically changed for students’ and that ‘performances vary according to the pairs (source register, target register)’ (p. 122, brackets added). This was true for Question 1 and Question 3. In Question 1, if one considers the group of 134 who started the conversions, then 79 of these (or 59%) succeeded in completing the conversions when the movement was from $z_0$ to $P(Z < z_0)$. In Question 3, when the movement was from $P(Z > z_0)$ to $z_0$, the success rate was 34% (50 of the 148 who had identified some sort of p-value). This shows that the students found the second conversion more difficult. If one considers the percentages for the whole group of 290, then 79 of the 290 (or 27%) were able to complete conversions for Question 1 whilst only 50 of the 290 (or 17%) were able to complete conversions for Question 3.
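The two directions of conversion compared above can be written side by side. The symbols $\Phi$ (the standard normal cumulative distribution function) and $p$ (the given upper-tail area) are notation introduced for this illustration and are not used in the study itself.

```latex
\begin{align*}
\text{Question 1 (z-score to area):}\quad & z_0 \;\longmapsto\; P(Z < z_0) = \Phi(z_0) \\
\text{Question 3 (area to z-score):}\quad & p = P(Z > z_0) \;\longmapsto\; z_0 = \Phi^{-1}(1 - p)
\end{align*}
```

The second mapping is the inverse of the first, which is consistent with describing Question 3 as an inverse problem.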
Implications of the findings
Duval (2006) differentiated between treatments and conversions and commented that ‘we cannot deeply analyse and understand the problem of mathematics comprehension for most learners if we do not start by separating the two types of representation transformation’ (p. 127). This study has also shown that conversions and treatments in this problem pose different levels of challenge to students. Therefore, educators should note the additional challenge of moving between systems of representations. The findings suggest that educators may need to support conversion transformations more than treatment transformations to help learners to overcome the challenges.
One aspect that deserves notice is that this group of students did not receive any computer-aided instruction, nor could they work through computer simulations of normal curves, as normally happens in probability and statistics modules nowadays. If they had had some exposure, they might have had a better idea of the visual aspects of the normal distribution curve and may have been able to switch between representations more easily. Applets or other computer simulation activities could allow students to engage with the properties the different representations reveal. They could also help students to explore situations that show links between the changes in the z-scores and the changes in the area values in the different modes of representation. Drawing on Zazkis et al.’s (1996) VA model, perhaps such opportunities will help students move more effortlessly between the different registers, thus reducing the barriers related to carrying out conversion transformations.
The solutions to these questions involved coordinating two different registers, which were initially separate. However, Zazkis et al. (1996) suggest, in their VA model, that even though movement between two modes may start as distinct and separate, they eventually merge. Zazkis et al. confine their discussion to the movement between the acts of visualisation and analysis. However, we can apply it to the two registers that we have identified here to suggest that, at some point, the students will regard the combination of these two registers as one that enriches their ‘cognitive architecture’ (Duval, 2006), and which will enable them to move on to further layers of movement between more complicated registers.
Finally, this article delved into students’ engagements with the treatment and conversion transformations associated with one particular problem. Readers may want to consider whether one could look at other areas in similar ways and whether they could help to explain the students’ difficulties in those areas. It is hoped that this study will encourage other researchers to look for evidence to support or contradict these findings in other areas. Additionally, it is hoped that such further research would help to illuminate further the challenges that learners experience when they work with problems that involve moving across different registers of representation.
I acknowledge a grant from the United States Agency for International Development (USAID), administered through the non-governmental organisation Higher Education for Development, for research on the different modules in the ACE certification programme. There was no specific grant for this article. I also acknowledge the contribution of Thomas Schroeder (University at Buffalo, State University of New York [SUNY], USA), who assisted with a preliminary report on this project, sketched the normal distribution curves in the article and acted as peer debriefer during the analysis process.
Competing interests
I declare that I have no financial or personal relationship(s) that may have inappropriately influenced me when I wrote this article.
References
Bansilal, S., & Naidoo, J. (2012). Learners engaging with Transformation Geometry. South African Journal of Education, 32, 26–39. Available from http://www.sajournalofeducation.co.za/index.php/saje/article/view/452/291
Bansilal, S., Mkhwanazi, T.W., & Mahlabela, P. (in press). Mathematical Literacy teachers’ engagement with contextual tasks based on personal finance. Perspectives in Education.
Bakker, A., & Gravemeijer, K. (2004). Learning to reason about distribution. In D. Ben-Zvi, & J. Garfield (Eds.), The challenge of developing statistical literacy, reasoning and thinking (pp. 147–168). Dordrecht: Kluwer Academic Publishers.
Ball, D.L., Thames, M.H., & Phelps, G. (2008). Content knowledge for teaching: What makes it special? Journal of Teacher Education, 59(5), 389–407. http://dx.doi.org/10.1177/0022487108324554
Batanero, C., Tauber, L.M., & Sanchez, V. (2004). Students’ reasoning about the normal distribution. In D. Ben-Zvi, & J. Garfield (Eds.), The challenges of developing statistical literacy, reasoning, and thinking (pp. 257–276). Dordrecht: Kluwer. http://dx.doi.org/10.1007/1402022786_11
Carlson, K.A., & Winquist, J.R. (2011). Evaluating an active learning approach to teaching introductory statistics: A classroom workbook approach. Journal of Statistics Education, 19(1), 1–22. Available from http://www.amstat.org/publications/jse/v19n1/carlson.pdf
Cohen, L., Manion, L., & Morrison, K. (2000). Research methods in education. London: Routledge Falmer. http://dx.doi.org/10.4324/9780203224342
Cohen, S., & Chechile, R.A. (1997). Probability distributions, assessment and instructional software: Lessons learned from an evaluation of curricular software. In I. Gal, & J.B. Garfield (Eds.), The assessment challenge in statistics education (pp. 253–262). Amsterdam: IOS Press. Available from http://www.stat.auckland.ac.nz/~iase/publications/assessbk/chapter19.pdf
Dey, I. (1993). Qualitative data analysis: A user-friendly guide for social scientists. London and New York: Routledge. http://dx.doi.org/10.4324/9780203412497
Duval, R. (2002). The cognitive analysis of problems of comprehension in the learning of mathematics. Mediterranean Journal for Research in Mathematics Education, 1(2), 1–16.
Duval, R. (2006). A cognitive analysis of problems of comprehension in the learning of mathematics. Educational Studies in Mathematics, 61, 103–131. http://dx.doi.org/10.1007/s106490060400z
Duval, R. (2008). Eight problems for a semiotic approach in mathematics. In L. Radford, G. Schubring, & F. Seeger (Eds.), Semiotics in mathematics education (pp. 39–62). Rotterdam: Sense Publishers.
Ernest, P. (2006). A semiotic perspective of mathematical activity: The case of number. Educational Studies in Mathematics, 61, 67–101. http://dx.doi.org/10.1007/s1064900664237
Groetsch, C.W. (1999). Inverse problems: Activities for undergraduates. Washington, DC: Mathematical Association of America.
Henning, H. (2004). Finding your way in qualitative research. Pretoria: Van Schaik Publishers.
Nathan, M.J., & Koedinger, K.R. (2000). Teachers’ and researchers’ beliefs about the development of algebraic reasoning. Journal for Research in Mathematics Education, 31(1), 171–190.
North, D., & Zewotir, T. (2006). Teaching statistics to social science students: Making it valuable. South African Journal of Higher Education, 20(4), 503–514.
Pfaff, T.P., & Weinberg, A. (2009). Do hands-on activities increase student understanding? A case study. Journal of Statistics Education, 19(1), 1–34. Available from http://www.amstat.org/publications/jse/v17n3/pfaff.pdf
Pfannkuch, M., & Reading, C. (2006). Reasoning about distribution: A complex process. Statistics Education Research Journal, 5(2), 4–9.
Radford, L. (2001, July). On the relevance of semiotics in Mathematics Education. Paper presented to the Discussion Group on Semiotics and Mathematics Education at the 25th Conference of the International Group for the Psychology of Mathematics Education. Utrecht, Netherlands. Available from http://www.laurentian.ca/NR/rdonlyres/C81F52CE648B44DF928D5A1B0C0612C0/0/On_the_relevance.pdf
Reading, C., & Canada, D. (2011). Teachers’ knowledge of distribution. In C. Batanero, G. Burrill, & C. Reading (Eds.), Teaching statistics in school mathematics: Challenges for teaching and teacher education (pp. 223–234). Dordrecht: Springer. http://dx.doi.org/10.1007/9789400711310_23
Reading, C., & Reid, J. (2006). An emerging hierarchy of reasoning about distribution: From a variation perspective. Statistics Education Research Journal, 5(2), 46–68. Available from http://www.stat.auckland.ac.nz/~iase/serj/SERJ5(2)_Reading_Reid.pdf
Strauss, A.L., & Corbin, J. (1998). Basics of qualitative research: Techniques and procedures for developing grounded theory. Thousand Oaks, CA: Sage.
Wilensky, U. (1997). What is normal anyway? Therapy for epistemological anxiety. Educational Studies in Mathematics, 33, 171–202. http://dx.doi.org/10.1023/A:1002935313957
Watkins, A.E., Scheaffer, R.L., & Cobb, G.W. (2004). Statistics in action. Understanding a world of data. Emeryville, CA: Key Curriculum Press.
Zazkis, R., Dautermann, J., & Dubinsky, E. (1996). Using visual and analytic strategies: A study of students’ understanding of permutation and symmetry groups. Journal for Research in Mathematics Education, 27(4), 435–457. http://dx.doi.org/10.2307/749876
