
Current views in the teaching and learning of data handling suggest that learners should create graphs of data they collect themselves and not just use textbook data. It is presumed that real-world data creates an ideal environment for learners to draw on their pool of stored knowledge and demonstrate their meta-representational competences. Although prior knowledge is acknowledged as a critical resource out of which expertise is constructed, empirical evidence shows that new levels of mathematical thinking do not always build logically and consistently on previous experience. This suggests that researchers should analyse this resource in more detail in order to understand where prior knowledge could be supportive and where it could be problematic in the process of learning. This article analyses Grade 11 learners’ meta-representational competences when constructing bar graphs. The basic premise was that by examining the process of graph construction and how learners respond to its various stages, it was possible to create a description of a graphical frame or a knowledge representation structure that was stored in the learner's memory. Errors could then be described and explained in terms of the inadequacies of the frame, that is: ‘Is the learner making good use of the stored prior knowledge?’ A total of 43 learners were observed over a week in a classroom environment whilst they attempted to draw graphs for data they had collected for a mathematics project. Four units of analysis are used to focus on how learners created a frequency table, axes, bars and the overall representativeness of the graph.

Traditionally, the instructional focus in the statistics classroom has been on learners’ construction of various graphs, with the instruction being didactic in nature and little attention being given to analysing why the graphs were constructed that way in the first place (Friel, Curcio & Bright, 2001). Similar concerns have been expressed by diSessa, Hammer, Sherin and Kolpakowski (

One of the difficulties with conventional instruction … is that students’ meta-knowledge is often not engaged, and so they come to know ‘how to graph’ without understanding what graphs are for or why the conventions make sense.

Watson and Fitzallen (

Although previously activated knowledge structures (diSessa,

Given this objective, it is doubtful whether one could discuss adequacy, productivity or effectiveness in graph construction without making reference to the conventions that guide us in validating our concepts of adequacy, truth, correctness and accuracy in such mathematical activities. With this in mind, it seems appropriate to develop an understanding of the way graphs are structured in order to appreciate the way in which they communicate information. In doing so I acknowledge Watson and Fitzallen (

Despite this variability in nomenclature and conventions, especially in statistical graphs, researchers warn that writing realistic assessment items and resources to mark them would not be easy if there was no movement towards convergence on conventions (Kosslyn,

The basic-level constituent parts of a graph.

The background is the pattern over which the other component parts of a graph are presented. In most instances the background is blank as it is not necessary to include a pattern or picture. The framework represents the kinds of entities being related, in this case weight on the

Although there is variability in naming these bar-like graphs, in this article I adopt the terminology used by Cooper and Shore (

Watson and Fitzallen (

The number of books read by 12 students.

Name of child | Mary | Anne | George | Barb | Tom | Jerry | Dan | Laura | Carol | Fred | Ken | Pat |
---|---|---|---|---|---|---|---|---|---|---|---|---|
No. of books read | 2 | 4 | 4 | 4 | 3 | 0 | 2 | 3 | 4 | 2 | 1 | 1 |

Source: Watson, J., & Fitzallen, N. (2010).

Cooper and Shore (

Number of books read by each of the 12 children.

Such a representation is often encountered by learners as early as preschool and is typical of the way in which data is represented in elementary and middle school curricula. Without discrediting other terms that have been used elsewhere, in this article I will refer to it as a value bar graph consistent with Cooper and Shore's (

Rainfall for Beijing and Toronto

The critical distinguishing features in both cases (

Let me point out here that, historically, bar-like representations are rooted in geographical analysis of population statistics where a large amount of information was gathered (Cooper & Shore,

Distribution bar graph for the number of books read by 12 children.
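The aggregation behind this kind of graph can be sketched in a few lines of Python. The snippet is only an illustration: it uses the values from the table of books read by the 12 students, while the function name `to_distribution` and the text rendering of the bars are assumptions introduced here, not part of the original study.

```python
from collections import Counter

# Data from the table of books read by 12 students (Watson & Fitzallen, 2010).
books_read = {
    "Mary": 2, "Anne": 4, "George": 4, "Barb": 4, "Tom": 3, "Jerry": 0,
    "Dan": 2, "Laura": 3, "Carol": 4, "Fred": 2, "Ken": 1, "Pat": 1,
}


def to_distribution(values_by_case):
    """Aggregate one-value-per-case data into a frequency for each value.

    Hypothetical helper: the names of the individual cases are discarded,
    which is exactly the information a distribution bar graph loses.
    """
    counts = Counter(values_by_case.values())
    return dict(sorted(counts.items()))


distribution = to_distribution(books_read)

# Text rendering: one row per value, one '#' per child who read that many books.
for value, frequency in distribution.items():
    print(f"{value} books | {'#' * frequency} ({frequency})")
```

In the value bar graph each child has a bar whose height is the number of books read; after aggregation each distinct number of books has one bar whose height is the number of children, so the children's names can no longer be recovered.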

The resultant graph is an aggregation of data (distribution bar graph) as opposed to single cases that characterise a value bar graph. We immediately notice how in the distribution bar graph, the individual cases are lost as we can no longer tell how many books were read by each of the children. According to Cooper and Shore (

Within the group of bar-like representations, the histogram is an innovation developed from the distribution bar graph. According to Cooper and Shore (

Let us say we want to count the number of people in a region who are aged 50 years and older. However, we might not want to report a separate value for every individual case of the 1000 people who fall within this age range (a value bar graph), and neither do we want to report a count for each individual age from 50 to 100 (a distribution bar graph). This age range (50–100) could then be converted into an interval scale by subdividing the full range into smaller ranges, for example, ranges labelled 50–59, 60–69, 70–79, 80–89, and 90–99. According to Few (

A histogram showing the distribution of ages of people in a region.
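The grouping of ages into intervals can be sketched as follows. This is a minimal illustration under stated assumptions: the ages in `sample_ages` are hypothetical (the article does not give raw data for this example), the helper names `bin_label` and `histogram_counts` are invented here, and ages are treated as whole years so that the labels 50–59, 60–69 and so on do not overlap.

```python
from collections import Counter


def bin_label(age, start=50, width=10):
    """Return the label of the interval (e.g. '50-59') containing a whole-year age."""
    lower = start + ((age - start) // width) * width
    return f"{lower}-{lower + width - 1}"


def histogram_counts(ages, start=50, width=10):
    """Count how many ages fall into each interval; individual ages are discarded."""
    counts = Counter(bin_label(age, start, width) for age in ages)
    return dict(sorted(counts.items()))


# Hypothetical ages, used only to illustrate the grouping.
sample_ages = [50, 53, 59, 60, 61, 68, 72, 75, 79, 84, 90, 99]

counts = histogram_counts(sample_ages)
for interval, frequency in counts.items():
    print(f"{interval} | {'#' * frequency} ({frequency})")
```

Once the ages are grouped, only the interval counts remain, so a reader of the resulting histogram cannot tell, for example, whether a person counted in the first interval was 50 or 59.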

Histograms are best used with data where non-integers are actually possible; hence the bars are drawn adjacent to each other as they represent intervals of continuous data. The numbers on the horizontal axis correspond to the midpoints of the intervals (e.g. 55 in the first interval of 50–60), which determine where a particular data point gets counted on the histogram. Because the midpoint value is used, the raw data values are no longer accessible in a histogram. The reader is therefore less likely to calculate a measure of variability, and even when an attempt is made, accuracy is lost in measures of centre such as the mean, which become estimates. In a histogram the counting of a particular data point at the midpoint of intervals is supported by Cooper and Shore (

This article works with archived data collected from four experienced (over seven years on average) Grade 11 teachers, two male and two female (Mhlolo & Schäfer,

There is general consensus on the view that learners’ meta-representational competence is the very resource out of which expertise is constructed (diSessa & Sherin,

We pick up the conversation after the learners had drawn a frequency table on the board showing the results of the survey of the number of children in different households. Initially the table had been drawn without the tally column. In the extracts below, ‘T’ stands for teacher, ‘L’ for learner and ‘Chorus’ indicates a group response.

So what do we do next after you have drawn the frequency table?

We make tallies. We make a pie chart. We make a graph. [

[

Have you ever seen something like this?

Yes

Where?

Last year. Last of last year. The previous maths teacher.

So the previous maths teacher showed you how to tally? OK, can you complete the table then. [

Frequency table for the number of children in different households.

Now after this information, how can you display this information? What it is like here, the information has been collected and now it has been organised. OK now how are you going to display this information?

In a graph.

Graph, we have different types of graphs and also we have different types of data. It's grouped and ungrouped. The way you display grouped data is not the same way as you display ungrouped data. So what type of a graph?

Bar graph.

Can somebody show us how to go about it?

[

OK what do you call this line? [

The

Now on the horizontal or the vertical OK you need to have either the number of children in each family and on the other you need to have maybe type of frequency whatever.

[

Axes drawn for the bar graph.

Now how are you going to display your data? Where, OK here it is number of children [

[

Is he correct?

Somehow, almost, maybe.

That bar shows a quarter and eight, Ma'am.

OK so the zero was supposed to be where? Here?

[

So if the 0 was here he would be correct.

Maybe it's incorrect.

It's incorrect.

Let's see if you can put the bar for 1 and 14. Let's see. Let's try.

[

[

[

[

[

[

Graph showing the number of children in different households.

Learners’ bar-like graph of survey data.

Bar-like graph of the number of children in different households.

If you were to display something like this to a person who doesn't know mathematics will that person be in a position to read? OK remember that you have organised your data and now you are displaying your data, can a person be able to read this?

I think maybe you have to label whether which side is talking about number of children and the households. [

Ok now turn to the notice board. Look at the graph of inflation. This type of graph is called a bar graph. Look at it and the one we have just drawn. What is the difference? [

The spaces, it's decorated, it's neatly displayed. [Lesson ends]

The questions steering this analysis were:

What is the nature of learners’ prior knowledge for graphs?

Where do these ideas come from?

How are they involved both productively and unproductively in the process of constructing bar graphs?

Each unit of analysis attempts to answer these three questions.

From the discussion that took place during the process of making a frequency table for the collected data, it is evident that the learners brought the knowledge of tallying from the ‘previous teacher’. It can be argued that the knowledge of tallying was neither supportive nor problematic, since with or without the tally column the students would still have been able to construct a correct bar graph. However, the agreement by learners that frequencies should be ‘tallied’ opened up a number of questions about their procedural and conceptual understanding of tallying. Let us recall that a tally is a mark used in recording a number of acts or objects, most often consisting of four vertical lines cancelled diagonally or horizontally by a fifth line. Tallying or counting is the act of finding the number of elements of a finite set of objects through a one-to-one correspondence. It is meant to avoid visiting the same element more than once. After tallying, the number assigned to the final object gives the desired number of elements (cardinality) in that set. So if the learners’ frequency table had a column of frequencies, by implication tallying had already been done. Therefore, from the learners’ wanting to tally the number 8 or 14 or 20 (frequencies), it can be concluded that the purpose of tallying and when it should be done were not clear to them. This suggests that learners had a superficial understanding of the concept.
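The order of operations described above, in which tallying turns raw observations into frequencies rather than the other way round, can be sketched as follows. The raw responses are hypothetical (the learners' raw survey data are not reproduced in this article), and the helper names `frequency_table` and `tally_marks` are assumptions made for illustration.

```python
def frequency_table(observations):
    """Tally raw observations, visiting each one exactly once.

    The running count for a value after the last observation is its frequency.
    """
    table = {}
    for value in observations:
        table[value] = table.get(value, 0) + 1
    return dict(sorted(table.items()))


def tally_marks(count):
    """Render a count as tally marks: groups of five, then the remainder."""
    groups, remainder = divmod(count, 5)
    parts = ["||||/"] * groups  # four strokes cancelled by a fifth
    if remainder:
        parts.append("|" * remainder)
    return " ".join(parts)


# Hypothetical raw responses: number of children reported by each household.
responses = [0, 1, 1, 2, 0, 1, 3, 2, 1, 0, 1, 2]

table = frequency_table(responses)
for children, frequency in table.items():
    print(f"{children} | {tally_marks(frequency):<12} | {frequency}")
```

In this sketch the tally column is produced on the way to the frequency column. Drawing tally marks for a frequency that is already known, such as 8 or 14, adds no information to the table.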

When prompted to show the information on a bar graph, what is evident is that learners brought their prior knowledge of a framework of a graph with an

However, to a larger extent, it is evident that their prior knowledge of axes was not very productive as they later struggled to draw the bars for their data. When both the

It is evident that learners did not have a clear understanding of this relationship. By drawing a framework of a graph with an

After drawing the axes, it was evident that learners did bring their prior knowledge of matching the height of bars with the frequencies (see

The problem, however, surfaced in terms of where these bars sit. Because the 0 had been placed at the origin, the class struggled to draw the first bar showing 8 families with 0 children each, and the subsequent bars were also problematic. This suggests that learners were unable to distinguish the type of data set that they were dealing with. Distinguishing between sets of data as discrete cases, discrete categories or grouped numerical data along some scale is a critical factor for constructing appropriate representations of the data. In all three representations of categorical data, that is, value bar graphs, distribution bar graphs and histograms, categories of the variable are typically marked at the midpoints of the category on that particular axis (horizontal if it is a column graph and vertical if it is a bar graph). From the way learners drew their bars, it is evident that this convention was not recognised, as their bars were sitting on two different numbers at the same time.
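The midpoint convention can be made concrete with a small calculation. The sketch below is an illustration only: the frequencies for 0 and 1 children (8 and 14) are taken from the lesson, whereas the bar width of 0.8 and the helper name `bar_edges` are assumptions chosen for the example.

```python
def bar_edges(category, width=0.8):
    """Return the (left, right) edges of a bar centred on its category mark."""
    half = width / 2
    return (category - half, category + half)


# Frequencies mentioned in the lesson: 8 households with 0 children, 14 with 1.
frequencies = {0: 8, 1: 14}

for children, households in frequencies.items():
    left, right = bar_edges(children)
    print(f"{children} children: bar from {left:.1f} to {right:.1f}, height {households}")
```

With an assumed width below 1, the bar for 0 children is centred on the 0 mark and the bar for 1 child on the 1 mark, leaving a gap between them. A bar drawn from 0 to 1 would instead sit on two values at once, which is the difficulty the class ran into once the 0 mark coincided with the origin.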

Another evident failure to recognise a convention was that at times learners drew joint bars as in the histogram and at times disjoint bars as in a bar graph, yet conventionally histograms must have joint bars and bar graphs must have disjoint bars. A study on learners’ conceptual understanding of statistics by delMas et al. (

Let us recall that the learners wanted to represent their own collected data on a bar graph. The question then is: to what extent did they achieve this objective? We notice from the basic-level constituents discussed earlier that the learners’ representation is neither a value bar graph, nor a distribution bar graph nor a histogram. Whilst the heights of bars matched with the frequencies, they were joint bars and were sitting on two different values on the horizontal axis in violation of the midpoint convention that guides where bars should be located in value bar graphs, distribution bar graphs and histograms. The overall mathematical outcome here was something close to a histogram but did not represent the original data set particularly well, either in terms of mathematical structure and convention or with reference to the real-world situation being represented. This suggests that learners’ meta-representational competences were inadequate for bar graph construction.

When numbers are used in bar graphs, the axis that assumes a categorical scale could represent three fundamental types: nominal, ordinal and interval data. These categorical contexts of number are problematic even with adults given that the majority of time spent on number and operations in the earlier grades focuses on numbers in their quantitative contexts, with learners usually encountering the categorical contexts of number only when dealing with data handling tasks. This suggests that to communicate effectively using graphs, one has to understand the nature of the data, graphing conventions and a bit about visual perception. Without guiding principles rooted in a clear understanding of graph design, choices are arbitrary and the resulting communication fails to represent the information effectively, as was the case in this class.

This article has both theoretical and practical implications. In terms of theory, this article has shown that due to the more recent emergence of the field of statistics, there is much more flexibility in nomenclature and lack of convergence on what the conventions should be. Watson and Fitzallen (

In terms of concept formation, as long as these bar-like representations are referred to loosely as bar graphs, learners will not make connections between the different graphical representations of quantitative data and their corresponding ways of conveying information on measures of centre and variability for that data. Research indicates that learners entering college may have only a superficial understanding of centre and variability and are likely to have particular difficulty extracting information about those measures when data are presented in graphical form (Cooper & Shore,

In terms of practice, this study argues that knowing the ways in which these types of bar-like graphs (value bar graphs, distribution bar graphs and histograms) represent certain types of data may help teachers make decisions about the level of complexity for instruction. Whilst the so-called ‘bar graph’ is often encountered by students as early as preschool, this article argues that the level of complexity of the categorical data handled by learners at that early stage is low. This is the kind of data that is best represented in what has been defined in this article as the value bar graph. Friel

I acknowledge the Department for International Development for funding the PhD study from which this article is drawn. The views expressed in this article are not necessarily those of the funders.

The author declares that they have no financial or personal relationships that may have inappropriately influenced them in writing this article.

The Department of Education granted approval to proceed with this study under permit T-728 P01/02 U-848.