Teaching statistics meaningfully at school level requires that mathematics teachers conduct classroom discussions in ways that give statistical meaning to mathematical concepts and enable learners to develop integrated statistical thinking. Key to statistical discourse are narratives about variation within and between distributions of measurements and the comparison of varying measurements to a central anchoring value. Teachers who understand the concepts and tools of statistics in an isolated and processual way cannot teach in such a connected way. Teachers’ discourses about the mean tend to be particularly processual and lead to a limited understanding of the statistical mean as a measure of centre that is used to compare variation within data sets. In this article I report on findings from an analysis of discussions about the statistical mean by a group of teachers. The findings suggest that discourses for instruction in statistics should explicitly differentiate between the everyday ‘average’ and the statistical mean, and explain the meaning of the arithmetic algorithm for the mean. I propose a narrative that logically explains the mean algorithm in order to establish the mean as an origin in a measurement of variation discourse.

This article explores the knowledge needed by teachers to enable meaningful

Thompson’s (

At school level Statistics is usually taught by mathematics teachers, whose studies may not have included courses in Statistics. Hence, the instructional discourse of Statistics tends to be restricted and mostly aimed at instruction for performing well-defined mathematical procedures, such as calculating the mean when it is asked for explicitly. In contrast, statistical thinking ‘involves “big ideas” that underlie statistical investigations’ (Ben-Zvi & Garfield,

The

Teachers who cannot logically explain the mean algorithm may fail to explain why it yields a statistically representative number and why the mean is an important statistic in more advanced procedures. Although there is a substantial amount of research about teachers’ and learners’ explanations of average and mean (Shaughnessy,

The discussion that provides the data for this article took place in the third session of a semester course in introductory Statistics for high school teachers. The course formed part of an honours degree in mathematics education. I was the lecturer of the course and engaged the teachers as students in deep discussions of data contexts, engaging with and contrasting everyday reasoning with statistical reasoning in such contexts. Twelve students were enrolled in the course. I arranged the students into three groups of four and video-recorded the discussions of two of the groups. I constituted the groups in a way that would reflect the language complexities of classroom discourse in South Africa, but also provide the best possible chance of promoting discussion. I mainly controlled for power issues related to age, gender and previous knowledge of Statistics. Group 1 comprised mature students who are experienced mathematics teachers, evenly divided according to gender and previous knowledge of Statistics. Two students (KH and RK) had taken Statistics as an undergraduate course. Only one student (KH) had English as a first language. Group 2 comprised young students, with little or no teaching experience. In this group only one student was male, but gender power issues amongst the younger students were unproblematic. Two students (SDS and GG) had English as their first language and three (SDS, NM and MM) had recently done a Statistics course in their B.Ed. programme. In total, five of the eight students in the video-recorded groups had done Statistics courses prior to this course and five of them were teaching Statistics at Grade 10 level at the time of the research. The third group was not included in the study as a separate group, although the contributions of these students were included in analysis of whole class discussions. I decided not to include the last group since they were least balanced in terms of my criteria. 
The discussions were transcribed from the video tapes and analysed together with the students’ written work.

I studied the group and classroom discussions during the course as part of my doctoral research. Ethical clearance for the study was duly obtained from the ethics committee of the relevant university's School of Education. After a contact session during which information about my research was provided and the conditions for consent were negotiated with the students, they gave informed consent that their recorded discussions and their written work could be used as research data and disseminated at scholarly conferences and in publications. The conditions for consent were anonymity in the wider dissemination of the research and an assurance that withholding consent would not influence their participation in the course or their assessments.

For this case study I undertook discourse analysis of three sessions of the course in order to investigate emergent statistical reasoning. I used Sfard’s (

Commognitive research requires in-depth analysis of the uses of words and discursive patterns in extended discussions. Words are concepts and the ways in which participants elaborate on word uses through other words or representations like gestures allow the researcher to make conjectures about participants’ discourses and hence understanding of concepts.

The word usage of the participants in my research is not independent of culturally validated uses in different discourses. Hence, I begin by contrasting the meanings of average and mean as they are used in three discourses: everyday discourse evident from dictionaries, statistics discourse used in subject dictionaries and mathematics discourse as evident from the historical emergence of the arithmetic mean. Then I discuss literature about discourse on average and mean that emerges in teaching and learning situations.

A study of dictionary entries under ‘average’ and ‘mean’ reveals an opaque and circular relationship between the two terms. In

Comparison of definitions of average in everyday and statistics discourses.

Everyday discourse | Statistics discourse |
---|---|

1. (a) A single value (as mean, mode or median) that summarises or represents the general significance of a set of unequal values | 1. In everyday use the word average is often used loosely to mean typical or representative, as in a statement like ‘William is average at football’. … According to context, it may be any (or none) of mean, mode, median and midrange (p. 14). |

2. An estimation of or approximation to an arithmetic mean. | 2. In technical use, average usually has the same meaning as mean or arithmetic mean. In certain contexts technical use (of average) requires other types of mean … geometric mean and harmonic mean. |

3. A ratio expressing the average performance especially of an athletic team or an athlete, computed according to the number of opportunities for successful performance. | - |

A comparison of the everyday and statistics definitions of average in

A second observation is that in both discourses average is implicitly utilised as

In

Comparison of definitions of mean in everyday and Statistics discourses.

Everyday discourse | Statistics discourse |
---|---|

1. (adjective): Something intervening or intermediate. | - |

2. A middle point between extremes | - |

3. A value that lies within a range of values and is computed according to a prescribed law: as (1): arithmetic mean (2): expected value | 1. A measure of an average value. There are several types of mean, used in appropriate circumstances, but unless stated otherwise the term ‘mean’ is usually taken to be the arithmetic mean (p. 150) |

4. Either of the middle two terms of a proportion | - |

The definitions of ‘mean’ in the Merriam-Webster Online Dictionary (Merriam-Webster,

Research about understanding of the statistical mean in teaching and learning situations indicates that the conflation of average and mean is problematic for teaching, since it leaves the ontologies of the mean and the average unexplained. A teacher who needs to answer the question ‘what is the statistical mean?’ may invoke the calculation procedure to imply ‘the mean is what it does’, but, as the statistics education literature reports, the process-definition is open to varied interpretations.

In-depth interviews as well as large-scale studies that have researched the meanings learners and teachers assign to the mean provide wider context for the meanings of average and mean that are reflected in dictionaries. They also illuminate the potential for confusion in statistics classrooms: literally, participants in a classroom discussion may not be talking about the same thing when they refer to average or to mean.

Various meanings of average in everyday discourse are described in Statistics education literature. Both teachers and learners routinely elaborate the meaning of ‘average’ as ‘middle’. In turn, ‘middle’ is understood in more than one way: sometimes middle is determined by active ordering of measurements of some attribute, where after the middle

Average is also explained as ‘typical’ in everyday discourse. When data are available, ‘typical’ tends to be associated with the most frequent observation (Konold & Pollatsek,

The complexity does not end here. Everyday meanings of average do not depend on the comparison of numerical values. Interpretations of average are often based on qualitative judgements of what is experienced as ‘not extreme’. Hence, a person can be described as average in appearance, based on a qualitative judgement of appearance that lies between extremes, for example the extremes of ugly and attractive. ‘Average’ in context may be so tightly associated with normative contextual descriptions that it is associated with adjectives like good, bad (to score an ‘average’ mark is good or bad, depending on the value of the average mark), low, high, cheap or expensive, rather than reflecting a relationship between overt or covert measurements of an attribute of a collection of objects (Lampen,

These everyday meanings of average held by teachers and learners suggest that simply explaining the number obtained by the mean calculation as the average does not provide access to statistical discourse. Indeed, the equal sharing meaning suggested by the mean algorithm is not associated with average by people who do not know the algorithm (Mokros & Russell,

Attempts to unpack the mean didactically as a statistical object have led to descriptive definitions such as an equal share, true value, signal in noise, balance point or representative value (Konold & Pollatsek,

Makar and Confrey (

students need to create the mean as an adjustment on the measure of group performance … as one runs through the contribution of cases to the mean of the group. (p. 2)

However, in their conceptualisation, the mean as an object is a multiplicative concept that serves as a measurement of group performance, hence it foregrounds the relationship:

Historically the concept of the mean can be traced back to estimation in order to solve practical, measurement-related problems and the geometric construction of different means in mathematics, namely the harmonic, geometric and arithmetic means. Statistical use of the mean can only be traced back to the 19th century (Bakker,

Bakker (

The first procedure uses one representative value multiplicatively to estimate a large total number. Bakker (^{1}^{2}

Structurally, ‘a representative object’ represents the mean and its value can be calculated by a simple transformation of the relationship above. It is important to note that in this historical use of finding a total number of objects the mean was not an unknown or hypothetical value. It was the smallest component unit (a brick in a wall or leaves on a twig) that could be used to access measurements of larger, composite objects (rows of bricks and walls or leaves on a tree). Hence, there is no intuitive conceptual step to ‘creating’ the arithmetic mean by equal sharing. In practice, bricks are made to a standard size whilst the heights of walls vary; it does not make practical sense to ask how wide a brick must be to build a wall of a given height with a given number of rows.

The geometric concepts of arithmetic, geometric and harmonic means existed long before the statistical concept of mean and were studied in Pythagoras's time (around 500 BC). In ancient Greece, where these concepts were mathematically formalised, lengths were constructed with the use of compasses and straight edges and treated as concrete objects (to the extent that numerical discourse on square root lengths was problematic). Bakker (

Theorem of Pappus: OD is the arithmetic mean of AB and BC.

Through the construction of Pappus (ca. 320 AD) the arithmetic mean existed as an object with a measurable length. The formula that was used to calculate

In this equation it is clear that the mean length (

According to Bakker, it is unclear how average in this sense came to signify the arithmetic mean and when and how the shift from the concept of the arithmetic mean to the statistical concept of representative value or balance point of a data set occurred. Such loose ends in overlapping discourses about average and mean are problematic in teaching for statistical reasoning.

The use of mean in a discourse on variation, hence statistical discourse, developed quite recently in the history of mathematics. Until about the 19th century the calculation of the mean was used to find a ‘real’ value, a measurement of a physical object (e.g. the diameter of the moon or the number of leaves on a tree). Bakker (

I now report on the meanings of the statistical mean that emerged in a discussion of the mean algorithm by a group of high school teachers. I then reflect on connections between their narratives about the mean and average and their understanding of the meaning of the division step in the mean algorithm. Finally, I consider possibilities for an integrated discourse for instruction of the mean as a statistical concept.

Prior to the discussion of the meaning of the mean, the students had studied real data of samples of prices of used cars and drawn various graphs of the data with the aid of FATHOM™ in order to investigate shapes of distributions and to estimate measurements that could reasonably serve to represent and summarise central tendency and spread. They had also compared calculated values of the mean and the median to their estimations on graphs. Furthermore, the sensitivity of the mean to extreme values had been explored empirically and discussed as a reason for representing and comparing skewed data sets by the median rather than the mean. Hence, all the students knew how to find the median and how to calculate the mean.
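The sensitivity of the mean to extreme values can be illustrated with a minimal sketch. The prices below are hypothetical (they are not the course data), and Python's standard `statistics` module stands in for FATHOM™. Adding one extreme price raises the mean by about 32 units, whereas the median moves by only one unit.

```python
from statistics import mean, median

# Hypothetical used-car prices in thousands (assumed values, not the study's data).
prices = [52, 55, 58, 60, 63]

# The same sample with one extreme price added.
with_extreme = prices + [250]

# How far each measure of centre moves when the extreme value is included.
mean_shift = mean(with_extreme) - mean(prices)
median_shift = median(with_extreme) - median(prices)

print("shift in mean:", round(mean_shift, 2))
print("shift in median:", median_shift)
```

A comparison of this kind supports the choice of the median over the mean when skewed data sets are represented and compared.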

I introduced the following prompt for the discussion of the meaning of the mean algorithm:

‘What is the logic or common sense behind using the mean as a measure of centre?’

The aim of the discussion as a learning task was to engage the students in analysing the meanings of average and mean, and in constructing a logical connection between the syntax of the mean algorithm and the role of the mean as a statistical measure of centre. In my analysis of the discussions I looked for ‘seed concepts’ that could be used in discourses for instruction to develop statistical reasoning about the mean. In particular, I wanted to understand if and how the participants considered the enacted meanings of addition (putting together) and division (sharing or grouping) in their explanations of the mean algorithm. It transpired that their discourse maps well onto everyday discourses such as those evident from the dictionary entries. The students too explained mean as average and average as mean with ‘middle’ as the predominant spatial image. They were at a loss to give meaning to the mean algorithm, yet they developed a generative narrative of the mean as a norm or a value to which to compare measurements. This narrative holds the key to a new object definition of the mean. I will now report on seven meanings that emerged during group and whole class discussion of the meaning of the mean algorithm. The excerpts are provided in chronological order and provide the opportunity to describe discursive shifts in the discussion. In order to establish confidence in the credibility of my own interpretive narratives (and hence the validity of my research) I provide extended transcripts of the discussions (Sfard,

Throughout the group and class discussions the students explained the mean as the ‘average’ in contexts in which they imagined the mean could be used. The excerpt in

At first glance it appears that the students are treating mean and average simply as synonyms, yet in Turn 10 and Turn 15 KH's utterances suggest a primary ontological position for average. The students seem to share the common sense meaning of average that they believe ‘people’ have. The discussion about the mean as an object (‘the mean is …’) stops here. The ontological collapse in this narrative prevents the students from further reasoning. The requirement to further unpack the meaning of average seems ridiculous: the mean is ‘just’ the average, as if the average were self-evident and needed no further explanation.

Mean is average.

Turn | Student | Utterance |
---|---|---|

7 | SM | All right, we can say that it [ |

8 | KH | Yes, so it's the … just the average. |

9 | SM | Yes just the average price. |

10 | KH | It's what people understand by average. |

14 | RK | … the common sense behind that [ |

15 | KH | I think it is because … it is the average. When you talk to the general public, the mean is the average, they understand average. Median is a different aspect [ |

In the excerpt in

Average gives a general picture.

Turn | Student | Utterance |
---|---|---|

17 | KH | So |

18 | RK | I think it's because it gives them the picture. … It captures … some particular number encapsulating … like the mean height. |

22 | RK | So how could we phrase it? |

23 | RK | It kind of gives the general picture of how tall the kids is. |

24 | KH | Yes, yes. |

34 | RK | In a class |

36 | RK | You give a |

37 | KH | Mm. |

44 | RK | The mean is the general picture. |

45 | KH | Yes, yes an impression. |

Intertwined with the impression narrative in

Average is middle.

Turn | Student | Utterance |
---|---|---|

20 | RK | I want to say, if you say the average height of kids. … Let's say one meter two [ |

23 | RK | It [ |

26 | KH | [ |

28 | RK | I think the median is like … if you have it ordered. |

30 | RK | You take the middle value. |

31 | GK | The middle value is the median yes. |

32 | SM | That is the mean. |

38 | RK | [ |

39 | KH | Exactly half are above that height and exactly half are below. |

40 | GK | And what is the median. |

41 | RK | That is the median. |

42 | GK | The half is the median. |

43 | KH | Aha, yes, OK. |

44 | RK | The mean is the general picture. |

In the excerpt in

RK's leading narrative about the mean as a ‘middle value’ is within everyday discourse in which physical examples and imagined contexts are used to give weight to the argument. KH's narrative, on the other hand, is anchored in statistical discourse, drawing on the procedural definition of the median. The students seem to have control over the median: they are certain they find the middle when they calculate the median position, whilst there is no such agency in their narrative about the mean. Since the logic by which mean becomes middle is not clear, the students are unable to resolve the conflict around the meaning of the mean-as-middle, and RK and KH (Turn 44 and Turn 45) retreat to the initial realisations of mean as ‘the general picture’ and ‘an impression’ of what is going on in a situation in which it is used. An underlying problem is that the objects that support the reasoning at this stage are a concrete, although imagined, collection of ‘kids’. The mean does not have anything more to say about this collection; average is adequate. With no recourse to logical reasoning about the syntax of the mean algorithm in relation to average and average-is-middle, there is no opportunity to develop more abstract statistical narratives about the mean. As I mentioned before, the students knew how to calculate the mean and how to find the median; hence, their confusion between mean and median cannot simply be ascribed to lack of algorithmic knowledge.

In the excerpt in

Average is most.

Turn | Student | Utterance |
---|---|---|

49 | GK | [ |

50 | RK | [ |

51 | GK | Mm-mm [ |

52 | RK | How can we say it? |

In Turn 49 GK agrees with the narrative that the mean as the average gives a general picture of some aspect of a context. She then realises her understanding of the use of the mean algorithm. The result of ‘add[ing] up the total and dividing it by the number’ is realised as a frequency of occurrence ‘how often you can get it’. With her verbal realisation of average as most, GK gestures grouping together of objects within brackets. In Turn 49 (

At this stage in the discussion the student teachers do not have access to narratives that unpack the meaning of the mean; instead, their narratives compare uses of the statistical mean with the everyday, self-evident notion of average.

Three narratives about the mean as the average in everyday discourse.

The ontology of the mean – what the mean is – is completely realised in intuitive everyday understanding of average in which similarity and extremity are observed properties of objects. The epistemology of the mean is similarly intuitive and practical: we come to know what the mean is through its uses in everyday contexts. Hence, both ontology and epistemology of the mean in these teachers’ narratives are intuitive and restricted to everyday discourse. The meanings they assign to the mean as average are reflected in the dictionary definitions I mentioned earlier. The problem is that even the definitions in the statistics dictionary do not provide a way out of the conundrum of the conflation of mean and average.

In the ensuing discussion the conflation of mean and average is gradually resolved. By comparing measurements to the mean, the mean is useful to determine what is

In order to focus the discussion on the syntax of the mean algorithm, I led the student teachers to think about the division step as equal sharing and then challenged: ‘What does it help you to pretend they are all the same? They are not the same!’ (in reference to the sample of car prices that was used in the group discussion). The students haltingly started to compare a state in which all the cars were hypothetically assigned the same price and the actual state of variable prices.

In the excerpt in

The mean is a value to compare to.

Turn | Student | Utterance |
---|---|---|

77 | RK | Example of buying a car. I mean. If … if you typically buy a car, it tells you in this car shop, you know that this brand of car, the RunX I want to buy, it generally costs around this [mean] price. [ |

Concurrent with the discussion of the first group reported so far, the second video-recorded group of four students raised the distance of a point from the mean as a way to judge in context whether an object is average or not.

In Turn 269 (

Far from the mean is not average.

Turn | Student | Utterance |
---|---|---|

269 | NM | I think it's because … remember that total is coming from all of them, so sharing their effort, for example, that total that you have just before dividing, so if they were to share … [ |

270 | SDS | So then you can judge in terms of it [ |

273 | GG | So for example, if we go to the car one [ |

The discussion of the meaning of the mean algorithm closes with tentative object definitions of the mean as a constant amidst variable measurements and as a norm. The accompanying procedure is that of levelling out variable measurements.

In the excerpt in

Mean is a constant and a norm.

Turn | Student | Utterance |
---|---|---|

144 | RK | It gives me a sense that uh, I think the value of the mean perhaps we should have a constant that we have, [ |

232 | SDS | For example the marks, if you add all these marks together, and you divide by 5 [ |

246 | NM | We are sharing the mark equally to all of them, the total mark after adding their marks up, we’re sharing it equally [ |

247 | SDS | Like in your histogram, when we were looking at the histogram, remember, like I was saying [ |

248 | NM | Yes, add the mark. |

249 | SDS |

SDS's explanation (Turn 232 and Turn 247) of the result of evening out as norm supports the shift in the discourse from intuitive awareness of variation in context to comparing measurements to a fixed number. In these attempts to define the mean as an object, the position of the mean (in the ‘middle’) is not mentioned. Levelling out and fair sharing emerge as process meanings of the division step.

Informal statistical narratives on the meaning of the mean algorithm.

In the discussion of the meaning of the mean algorithm, the mean emerged as a hypothetical, abstract object that serves as an objective point of comparison amongst measurements. Hence, the conflation of average and mean is resolved and the students’ narratives now belong to informal statistical discourse.

The meanings of the mean and average that emerged in my study support findings in the literature that the mean algorithm is poorly understood by teachers. The tendency to accept the mean as a ready-made formula to assign a number to a variety of everyday meanings of average is pervasive and persistent. The reported discussion suggests that, unless teachers consciously work to separate the meanings of the calculated mean and the contextual average, their discourses for instruction will be limited to everyday, experiential meanings.

From the students’ discussion I identified two seed narratives for developing connections between average, the mean algorithm and the statistical mean. The students’ narratives presented the mean as

Evening out is reported in the literature as an intuitive process to find a mean value (Bakker,

The bars in a case-value bar graph can be ordered from small to large to support a narrative about ordered evening out. The process is illustrated in

Evening out differences between ordered measurements.

As a narrative the algorithm proceeds as follows: even out the difference between the smallest and the second smallest measurement by taking half of the difference between them away from the second smallest measurement and adding it to the smallest measurement. Then the difference between the largest measurement and the two evened-out measurements is shared equally amongst all three bars to achieve the mean measurement. This process can be extended to any number of measurements. Modelling the evening-out action closely, the algebraic process yields a mathematical narrative about the algorithm for the statistical mean, as shown in

Algebraic derivation of the algorithm for the statistical mean.
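The ordered evening-out narrative can also be checked numerically. The following is a minimal sketch, not the derivation in the original figure: the function name `even_out` and the bar heights are assumptions made for illustration. At step k the next bar's surplus over the current common level is divided by k, so that all k bars end at the same height.

```python
from fractions import Fraction

def even_out(measurements):
    """Sequentially even out ordered measurements and return the common level."""
    ordered = sorted(Fraction(m) for m in measurements)
    level = ordered[0]  # common height of the bars evened out so far
    for k, value in enumerate(ordered[1:], start=2):
        # The next bar keeps 1/k of its surplus over the current level and
        # hands the rest to the k - 1 shorter bars, 1/k of the surplus each.
        level += (value - level) / k
    return level

heights = [2, 4, 9]  # hypothetical bar heights
result = even_out(heights)
print(result)  # 5, the same as (2 + 4 + 9) / 3
```

With these values the first step halves the difference between 2 and 4 (level 3), and the second step divides the difference between 9 and 3 into thirds (level 5). Exact fractions are used so that the comparison with the add-then-divide algorithm is not affected by rounding.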

The evening-out process to derive the statistical mean can be described as a first-divide-then-redistribute process, since in this enacted narrative division happens first and is effected on a single measurement at a time. Each bar is divided according to the proportion required to even out bars that are shorter. In this example, in the first step the difference between the shortest bar and the second shortest bar is halved, whilst in the second step, the difference between the length of the evened bars and the remaining long bar is divided into thirds. The redistribution between the bars is additive. Consequently, there is a disjunct between the mathematical structure of the mean algorithm (where division is the final action) and the meaning derived from the evening-out process. The disjunct demands a statistical redefinition of the object that is constructed by evening out. The object definition of the mean as a ‘fair share’ is not compatible with the process of sequential sharing between two measures at a time. An object definition based on the narratives that emerged about the mean as a norm in my research is the following: the mean is

Statistics education literature abounds with reports of learners’ inappropriate comparison of distributions according to a contextually meaningful measure, rather than a statistical measure of central tendency (Bakker & Gravemeijer,

In addition to reflecting on the connections between statistical concepts, a teacher who wishes to teach Statistics as a cycle of enquiry (Wild & Pfannkuch,

In this article I have argued that the teachers in my study could not initially create a narrative about the mean as a statistical object. Their explanations conflated mean with vague and varied ideas about average and middle in imagined situations. Through focused discussion of the mathematical structure of the mean algorithm they were able to construct narratives about the statistical mean as a constant and a norm or standard to which actual data can be compared. Such understanding of the statistical mean is a big idea in a discourse in which statistics is the science of measuring variation. Averaging in the sense of calculating a mean pervades the structure of more complicated statistical models. Therefore, for discussions of the mean to be statistical rather than informal, the mean must be used with conscious consideration of variation and, most importantly, the endeavour to measure variation.

The implication of this study for teachers’ statistical discourses for instruction is twofold:

Instructional discourse must consciously strive to separate the meanings of average in context and the statistical mean. The intuitive understanding of the mean as the middle value of an interval of average (not extreme) values in a data set should be taken up in a deviation discourse, which raises the need to measure variation. Hence, I draw the attention of teachers to another big idea, namely that statistics is concerned with the

The object conception of the mean as a norm or a standard has the potential to construct clear narratives of the difference between the statistical mean and the arithmetic mean. In arithmetic narratives the mean is understood as a fair share, whilst in statistical narratives the mean is the origin or zero variation value from which variation is measured. I showed how intuitively accessible evening-out procedures can be ordered and used to derive the mean algebraically. The conception of the mean as a norm or standard is thus rich in connections to intuitive reasoning as well as formal statistical reasoning.
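The idea of the mean as an origin or zero variation value can be illustrated with a short sketch. The marks and the helper name `deviations_from` are hypothetical and serve only as an example. Measured from the mean, the deviations of the data cancel out exactly; measured from any other value, they do not.

```python
from fractions import Fraction

def deviations_from(origin, data):
    """Signed distance of each measurement from a chosen origin."""
    return [Fraction(x) - origin for x in data]

marks = [40, 55, 60, 65, 80]  # hypothetical marks for five learners
mean = Fraction(sum(marks), len(marks))

# Deviations measured from the mean sum to zero.
devs = deviations_from(mean, marks)
print(sum(devs))

# Deviations measured from another value (here 55) leave a non-zero total.
other = deviations_from(Fraction(55), marks)
print(sum(other))
```

Under this reading, the division step fixes the one value from which the positive and negative deviations balance, which is what allows it to serve as a norm for comparison.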

Further classroom-based research is needed to understand how teachers develop instructional discourses about measurement of variation and the mean as an origin for such measurement.

Thank you to the students who participated in this research.

The author declares that she has no financial or personal relationships that may have inappropriately influenced her in writing this article.

An ancient Indian story reported by Hacking (1975).

From the history of the Peloponnesian War (431–404 BC). See Bakker (