
How to Display Data- P8
How to Display Data- P8: The best method to convey a message from a piece of research in health is via a figure. The best advice that a statistician can give a researcher is to first plot the data. Despite this, conventional statistics textbooks give only brief details on how to draw figures and display data.
Displaying univariate categorical data
is that it is difficult to compare intermediate categories such as the mixed
feeding category (both breast & formula milk) in Figure 3.7. In general,
clustered bar charts are preferable.
Summary of the main points when displaying
categorical data
• Categorical data can be displayed using either pie charts or bar charts.
• Bar charts are preferable to pie charts.
• Use pie charts only for displaying one set of proportions.
• Use clustered bar charts to display two or more sets of proportions.
• Always include the total number of subjects; for clustered or stacked bar
charts always include the number in each group.
• Never use three-dimensional bar charts or pie charts; they are difficult to
read and can be misleading.
• Different shades of the same colour are best for distinguishing between
different categories. Colours and patterns to distinguish between different
groups should be used with caution.
• Discrete or count data can be displayed using bar charts.
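[Figure 3.7: stacked bar chart. Horizontal axis: maternal age (years), with
groups Under 20 (n = 270), 20–24 (n = 574), 25–29 (n = 1006), 30–34 (n = 915),
35–39 (n = 350) and 40+ (n = 96). Vertical axis: percent, 0 to 100. Legend:
breast milk only; both breast and formula milk; formula milk only.]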
Figure 3.7 Stacked bar chart showing the relative frequency of feeding methods
between the different age groups.1
References
1 O’Cathain A, Walters S, Nicholl JP, Thomas KJ, Kirkham M. Use of evidence based
leaflets to promote informed choice in maternity care: randomised controlled trial
in everyday practice. British Medical Journal 2002;324:643–6.
2 Ehrenberg ASC. A primer in data reduction. Chichester: John Wiley & Sons; 2000.
Chapter 4 Displaying quantitative data
This chapter describes the basic graphs available for displaying quantitative
data. As described in Chapter 1, quantitative data can be either counted
or continuous. Count data, also known as discrete data, arise when the data
can be counted, such as the number of children in a family or the number of
visits to a GP in a year. Continuous data are data that can be measured and,
in principle, can take any value on the scale on which they are measured;
they are limited only by the precision of the measurement. Examples include
height, weight and blood pressure.
4.1 Count data
Count data can take only whole-number values, and the best way to display
them is with a bar chart. As with categorical data, an initial step is to add
up the number of observations at each value and express these as percentages
of the total sample size. For example, Table 4.1 shows data from
an investigation by Campbell of the effect of environmental temperature on
the number of deaths attributed to Sudden Infant Death Syndrome (SIDS).1
The table summarises the number of deaths from SIDS in England and Wales
each day over a 5-year period (1979–1983) (n = 1819 days). Figure 4.1
displays these data using a bar chart. The horizontal axis shows the number
of deaths per day, from a minimum of 0 to a maximum of 16, while the vertical
axis shows the frequency with which each number occurred during the 5-year
period. The vertical scale for this graph is the frequency; it could easily
be rescaled to percentages. As discussed in Chapter 3, there are advantages
to both types of scale, and the shape of the resulting chart is not affected
by the choice. A percentage scale makes it easier to compare groups. For
example, if it were of interest to compare England and Wales with Scotland,
the smaller numbers for Scotland would make comparison more difficult on a
frequency scale.
Table 4.1 Number of deaths from SIDS per day,
England and Wales, 1979–1983

Number of deaths per day    Number of days (%)
0                           121 (6.7)
1                           277 (15.2)
2                           330 (18.1)
3                           307 (16.9)
4                           270 (14.8)
5                           205 (11.3)
6                           127 (7.0)
7                           89 (4.9)
8                           45 (2.5)
9                           20 (1.1)
10                          14 (0.8)
11                          8 (0.4)
12                          4 (0.2)
13                          1 (0.1)
14                          –
15                          –
16                          1 (0.1)
Total                       1819 (100.0)
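
The percentages in Table 4.1 can be reproduced with a few lines of code. The following is only a minimal sketch in Python, not part of the original analysis. The names days_by_deaths and percentages are illustrative, and the count for 16 deaths per day is assumed to be 1, the value implied by the stated total of 1819 days and the printed 0.1%.

```python
# Number of days on which 0, 1, ..., 16 SIDS deaths occurred (Table 4.1).
# Assumption: the count for 16 deaths is taken as 1, the value implied by
# the stated total of 1819 days; no days had 14 or 15 deaths.
days_by_deaths = {
    0: 121, 1: 277, 2: 330, 3: 307, 4: 270, 5: 205, 6: 127, 7: 89,
    8: 45, 9: 20, 10: 14, 11: 8, 12: 4, 13: 1, 14: 0, 15: 0, 16: 1,
}


def percentages(counts):
    """Return each count as a percentage of the total, to one decimal place."""
    total = sum(counts.values())
    return {value: round(100 * n / total, 1) for value, n in counts.items()}


total_days = sum(days_by_deaths.values())
percent_by_deaths = percentages(days_by_deaths)

# Print the table: deaths per day, number of days, percentage of all days.
for deaths, n_days in days_by_deaths.items():
    print(f"{deaths:>2}  {n_days:>4}  ({percent_by_deaths[deaths]:.1f})")
print(f"Total  {total_days}")
```

Dividing by the total in this way is what allows groups of different sizes to be compared on a common percentage scale.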
[Figure 4.1: bar chart. Horizontal axis: number of deaths per day, 0 to 16.
Vertical axis: frequency, 0 to 350.]
Figure 4.1 Bar chart showing the distribution of number of sudden infant deaths per
day for England and Wales, 1979–1983 (n = 1819).1
Count data are ordered in that there is a natural ordering to the groups:
2 children in a family is more than 1, and 3 is more than 2 and so on. Thus,
a bar chart displays the shape of the distribution of the data. This would
not be obtained from a pie chart. Pie charts should not be used for count
data as they make no use of the additional information that arises from the
ordering of the data.
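
To illustrate how keeping the values in their natural order reveals the shape of a distribution, the sketch below draws a rough text bar chart of the Table 4.1 frequencies in Python. It is an illustration under stated assumptions rather than a substitute for a properly drawn figure: the function name text_bar_chart and the bar width of 50 characters are arbitrary choices, and the count for 16 deaths per day is assumed to be 1, as implied by the total of 1819 days.

```python
# Frequencies from Table 4.1 (number of days with 0, 1, ..., 16 deaths).
# Assumption: the count for 16 deaths is 1, as implied by the total of 1819.
days_by_deaths = {
    0: 121, 1: 277, 2: 330, 3: 307, 4: 270, 5: 205, 6: 127, 7: 89,
    8: 45, 9: 20, 10: 14, 11: 8, 12: 4, 13: 1, 14: 0, 15: 0, 16: 1,
}


def text_bar_chart(counts, width=50):
    """Return one line per value, in ascending order, with a bar scaled to `width`."""
    largest = max(counts.values())
    lines = []
    for value in sorted(counts):  # keep the natural ordering of the counts
        bar = "#" * round(width * counts[value] / largest)
        lines.append(f"{value:>2} | {bar} {counts[value]}")
    return lines


chart_lines = text_bar_chart(days_by_deaths)
print("\n".join(chart_lines))
```

Because the values are listed in order, the peak at 2 deaths per day and the long tail towards 16 are visible at a glance, which a pie chart of the same data would not show.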
4.2 Graphs for continuous data
A variety of graphs exists for plotting continuous data. The simplest graphs
are dotplots and stem and leaf plots and they both display all the data. In
addition there are other graphs which provide useful summaries of the data
such as histograms and box-and-whisker plots.
4.3 Dotplots
A basic principle for displaying data is ‘above all else display the data’.2
Dotplots are perfect for following this maxim as each point represents a
value for a single individual. They are one of the simplest ways of displaying
all the data. As part of a study examining the cost effectiveness of specialist
leg ulcer clinics compared with standard district nursing care, participants
were asked their height.3 Figure 4.2a shows dotplots of the heights of the
participants. Each dot represents the value for an individual and is plotted
along a vertical axis which, in this case, represents height in metres. Data
for several groups can be plotted alongside each other for comparison;
Figure 4.2b shows these data plotted by sex, and here the differences
in height between men and women can be clearly seen.
4.4 Stem and leaf plots
Another simple way of showing all the data is the stem and leaf plot. Each
data point is divided into two parts, a stem and a leaf; the leaf is usually the
last digit and the stem is the other part of the number. For example, for a
height of 1.58 m, the leaf would be 8 and the stem would be 1.5. Each data
point in the sample is thus divided and the results displayed in the form of
a stem and leaf plot. There is a separate line for each different stem value,
but within particular stem values the individual leaf values are arranged
on the same line. The stem is on the left of the plot and the leaves are on
the right. The number of data points in each stem can also be displayed on
the left. It is easiest to understand by means of an example.
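
As one possible illustration, the construction described above can be sketched in Python. This is a minimal sketch, not the procedure used for the book's figures. The heights in example_heights are made-up values chosen only for demonstration (they are not the leg ulcer study data), and the function name stem_and_leaf is hypothetical. Heights are assumed to be recorded in metres to two decimal places, so that 1.58 m has stem 1.5 and leaf 8.

```python
from collections import defaultdict


def stem_and_leaf(heights_m):
    """Build stem and leaf lines for heights in metres given to 2 decimal places."""
    leaves_by_stem = defaultdict(list)
    for height in heights_m:
        cm = round(height * 100)     # 1.58 m -> 158; avoids floating-point digit errors
        stem, leaf = divmod(cm, 10)  # 158 -> stem 15, leaf 8
        leaves_by_stem[stem].append(leaf)

    lines = []
    # One line per stem from the smallest to the largest, including any empty stems.
    for stem in range(min(leaves_by_stem), max(leaves_by_stem) + 1):
        leaves = sorted(leaves_by_stem.get(stem, []))
        label = f"{stem / 10:.1f}"   # stem 15 is displayed as 1.5
        # Count of data points on the left, then the stem, then the leaves.
        lines.append(f"{len(leaves):>2}  {label} | {''.join(map(str, leaves))}")
    return lines


# Hypothetical heights in metres, for illustration only.
example_heights = [1.58, 1.62, 1.55, 1.71, 1.58, 1.66, 1.80, 1.63]
plot_lines = stem_and_leaf(example_heights)
print("\n".join(plot_lines))
```

For these illustrative values the first line reads " 3  1.5 | 588": three heights share the stem 1.5, with leaves 5, 8 and 8. Listing empty stems as well is a design choice made here so that gaps in the data remain visible.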