How to Display Data- P8
Document description
How to Display Data- P8: The best method to convey a message from a piece of research in health is via a figure. The best advice that a statistician can give a researcher is to first plot the data. Despite this, conventional statistics textbooks give only brief details on how to draw figures and display data.
Content summary
is that it is difficult to compare intermediate categories such as the mixed feeding category (both breast and formula milk) in Figure 3.7. In general, clustered bar charts are preferable.

Summary of the main points when displaying categorical data
• Categorical data can be displayed using either pie charts or bar charts.
• Bar charts are preferable to pie charts.
• Use pie charts only for displaying one set of proportions.
• Use clustered bar charts to display two or more sets of proportions.
• Always include the total number of subjects; for clustered or stacked bar charts always include the number in each group.
• Never use three-dimensional bar charts or pie charts; they are difficult to read and can be misleading.
• Different shades of the same colour are best for distinguishing between different categories. Colours and patterns to distinguish between different groups should be used with caution.
• Discrete or count data can be displayed using bar charts.

[Figure 3.7: horizontal axis: maternal age (years), with groups Under 20 (n = 270), 20–24 (n = 574), 25–29 (n = 1006), 30–34 (n = 915), 35–39 (n = 350) and 40+ (n = 96); vertical axis: percent (0 to 100); legend: breast milk only, both breast and formula milk, formula milk only.]
Figure 3.7 Stacked bar chart showing the relative frequency of feeding methods between the different age groups [1].

References
[1] O’Cathain A, Walters S, Nicholl JP, Thomas KJ, Kirkham M. Use of evidence based leaflets to promote informed choice in maternity care: randomised controlled trial in everyday practice. British Medical Journal 2002;324:643–6.
[2] Ehrenberg ASC. A primer in data reduction. Chichester: John Wiley & Sons; 2000.

Chapter 4 Displaying quantitative data

This chapter will describe the basic graphs available for displaying quantitative data. As described in Chapter 1, quantitative data can be either counted or continuous.
Count data are also known as discrete data and, as the name implies, occur when the data can be counted, such as the number of children in a family or the number of visits to a GP in a year. Continuous data are data that can be measured and, in principle, can take any value on the scale on which they are measured; they are limited only by the precision of the scale of measurement. Examples include height, weight and blood pressure.

4.1 Count data

Count data can only take whole-number values, and the best method to display them is a bar chart. As with categorical data, an initial step is to add up the number of observations in each category and express them as percentages of the total sample size. For example, Table 4.1 shows data from an investigation by Campbell of the effect of environmental temperature on the number of deaths attributed to Sudden Infant Death Syndrome (SIDS) [1]. The table summarises the numbers of deaths from SIDS in England and Wales each day over a 5-year period (1979–1983) (n = 1819 days). Figure 4.1 displays these data using a bar chart. The horizontal axis shows the number of deaths per day, from a minimum of 0 to a maximum of 16, while the vertical axis shows the frequency with which each value occurred during this 5-year period.

The vertical scale for this graph is the frequency; this could easily be rescaled to percentages. As discussed in Chapter 3, there are advantages to both types of scale, and the shape of the resultant chart will not be affected by the choice of scale. Use of the percentage scale facilitates the comparison of groups. For example, if it were of interest to compare England and Wales with Scotland, the smaller numbers for Scotland would make comparison more difficult if the frequency scale were used.
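The rescaling from frequencies to percentages described above can be sketched in Python. This is a minimal illustration rather than part of the original text. It uses the counts reported in Table 4.1; the names `days_by_deaths` and `to_percentages` are hypothetical, and the count for 16 deaths per day is taken as 1, the value consistent with the stated total of 1819 days.

```python
# Number of days on which each number of SIDS deaths occurred (Table 4.1).
# Assumption: the entry for 16 deaths is 1, which matches the total of 1819.
days_by_deaths = {
    0: 121, 1: 277, 2: 330, 3: 307, 4: 270, 5: 205, 6: 127, 7: 89,
    8: 45, 9: 20, 10: 14, 11: 8, 12: 4, 13: 1, 14: 0, 15: 0, 16: 1,
}


def to_percentages(counts):
    """Express each count as a percentage of the total sample size."""
    total = sum(counts.values())
    return {value: 100 * n / total for value, n in counts.items()}


total_days = sum(days_by_deaths.values())
percentages = to_percentages(days_by_deaths)

# Print a rough text bar chart: one '#' per percentage point (rounded).
for deaths, n_days in days_by_deaths.items():
    bar = "#" * round(percentages[deaths])
    print(f"{deaths:>2}  {n_days:>4} ({percentages[deaths]:4.1f}%)  {bar}")
```

Because every count is divided by the same total, the bars keep the same relative heights on either scale, which is why the shape of the chart does not depend on whether frequencies or percentages are plotted.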
Table 4.1 Number of deaths from SIDS per day, England and Wales, 1979–1983

Number of deaths per day    Number of days (%)
0                           121 (6.7)
1                           277 (15.2)
2                           330 (18.1)
3                           307 (16.9)
4                           270 (14.8)
5                           205 (11.3)
6                           127 (7.0)
7                           89 (4.9)
8                           45 (2.5)
9                           20 (1.1)
10                          14 (0.8)
11                          8 (0.4)
12                          4 (0.2)
13                          1 (0.1)
14                          –
15                          –
16                          1 (0.1)
Total                       1819 (100.0)

[Figure 4.1: horizontal axis: number of deaths per day (0 to 16); vertical axis: frequency (0 to 350).]
Figure 4.1 Bar chart showing the distribution of number of sudden infant deaths per day for England and Wales, 1979–1983 (n = 1819) [1].

Count data are ordered in that there is a natural ordering to the groups: 2 children in a family is more than 1, 3 is more than 2, and so on. Thus, a bar chart displays the shape of the distribution of the data. This would not be obtained from a pie chart. Pie charts should not be used for count data as they make no use of the additional information that arises from the ordering of the data.

4.2 Graphs for continuous data

A variety of graphs exists for plotting continuous data. The simplest are dotplots and stem and leaf plots, both of which display all the data. In addition, there are other graphs that provide useful summaries of the data, such as histograms and box-and-whisker plots.

4.3 Dotplots

A basic principle for displaying data is ‘above all else display the data’ [2]. Dotplots are well suited to following this maxim, as each point represents a value for a single individual. They are one of the simplest ways of displaying all the data. As part of a study examining the cost-effectiveness of specialist leg ulcer clinics compared with standard district nursing care, participants were asked their height [3]. Figure 4.2a shows dotplots of the heights of the participants. Each dot represents the value for an individual and is plotted along a vertical axis, which, in this case, represents height in metres.
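A dotplot of this kind can be approximated in plain text. The following Python sketch is an illustration only: the heights are hypothetical values, not the data from the leg ulcer study, and `text_dotplot` is a name introduced here.

```python
from collections import Counter

# Hypothetical heights in metres (illustrative only, not the study data).
heights = [1.52, 1.58, 1.58, 1.60, 1.63, 1.63, 1.63, 1.70, 1.75, 1.75]


def text_dotplot(values, step=0.01):
    """Return the rows of a text dotplot: one row per axis value, one 'o' per individual."""
    # Work in whole multiples of `step` to avoid floating-point comparisons.
    units = Counter(round(v / step) for v in values)
    lowest, highest = min(units), max(units)
    rows = []
    # Largest value first, so the rows read like a vertical axis.
    for u in range(highest, lowest - 1, -1):
        rows.append(f"{u * step:.2f} | " + "o" * units.get(u, 0))
    return rows


dotplot_rows = text_dotplot(heights)
print("\n".join(dotplot_rows))
```

Each row is one position on the vertical axis and each `o` is one individual, so repeated values appear as longer rows and no observation is hidden by summarising.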
Data for several groups can be plotted alongside each other for comparison; Figure 4.2b shows these data plotted by sex, and in this case the differences in height between men and women can be clearly seen.

4.4 Stem and leaf plots

Another simple way of showing all the data is the stem and leaf plot. Each data point is divided into two parts, a stem and a leaf; the leaf is usually the last digit and the stem is the rest of the number. For example, for a height of 1.58 m, the leaf would be 8 and the stem would be 1.5. Each data point in the sample is divided in this way and the results are displayed in the form of a stem and leaf plot. There is a separate line for each different stem value, and all the leaf values that share a stem are arranged on the same line. The stem is on the left of the plot and the leaves are on the right. In addition, the number of data points in each stem can be displayed on the left. The plot is easiest to understand by means of an example.
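As a small illustration of the stem and leaf construction just described, the following Python sketch splits each value into a stem and a final-digit leaf. It is not taken from the original text: the heights are hypothetical, the function name `stem_and_leaf` is introduced here, and values are assumed to be recorded to two decimal places.

```python
from collections import defaultdict

# Hypothetical heights in metres (illustrative only, not the study data).
heights = [1.58, 1.52, 1.63, 1.60, 1.58, 1.71, 1.66, 1.49, 1.75, 1.63]


def stem_and_leaf(values):
    """Map each stem (value without its last digit) to its sorted list of leaves."""
    groups = defaultdict(list)
    for v in values:
        hundredths = round(v * 100)          # 1.58 -> 158
        stem, leaf = divmod(hundredths, 10)  # 158 -> stem 15, leaf 8
        groups[stem].append(leaf)
    lowest, highest = min(groups), max(groups)
    # Keep a line for every stem in the range, including stems with no leaves.
    return {s: sorted(groups.get(s, [])) for s in range(lowest, highest + 1)}


plot = stem_and_leaf(heights)
for stem, leaves in plot.items():
    # Count on the left, then the stem (15 is shown as 1.5), then the leaves.
    leaf_text = "".join(str(leaf) for leaf in leaves)
    print(f"{len(leaves):>2}  {stem / 10:.1f} | {leaf_text}")
```

Sorting the leaves within each stem is a design choice that makes the row lengths readable as a rough histogram while still showing every individual value.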