Do you love figures, calculations and graphs? Or would you rather avoid numbers if at all possible?
If you’re an editor or proofreader who routinely works on mathematics or science material, you’ll be used to dealing with figures, percentages, tables and graphs. But even editors who work on other texts (academic, business, and even fiction) will sometimes need to handle numbers and data.
Perhaps you’re editing a survey report, a paper containing the results of a research study, or an organisation’s annual report. Even if the client isn’t expecting you to perform an in-depth analysis of their calculations and data presentation (and you feel it’s out of your area of expertise – or your comfort zone), there are some straightforward things you can look out for to help your author keep their data in line.
Most of what follows is based on my own experience, so it isn’t intended to be an exhaustive list of issues. But I hope it will reassure you that checking figures and data presentation doesn’t necessarily require you to have a PhD in mathematics. Your basic editing and proofreading skills – together with a bit of logic and common sense – can often help you to spot when something’s amiss.
1. Words or figures in the text?
This issue will be familiar to most editors and proofreaders. When writing a number, should it be ‘five’ or ‘5’, for example? What about ‘twenty’ or ‘20’? ‘Three hundred and forty-nine’ or ‘349’? This will often be covered in your client’s style guide. A common convention is for numbers below 10 to be written as words, and 10 and above to be written as figures. But this can vary, so it’s worth checking.
In some (though not all) conventions, numbers within the same sentence should be made consistent:
In formal writing, it’s preferable to avoid starting a sentence with a figure. This might simply mean using words instead:
Alternatively – and particularly if the number is a large one, or it isn’t a whole number – it’s better to reword the sentence.
If you’re editing fiction, conventions might be slightly different. For example, numbers are often expressed in words when they appear in dialogue.
2. Talking about numbers in the text
There are various ways in which statistics can be unintentionally misreported in the text. As an example, let’s look at two statements.
It’s easy to see how the confusion arises: the sentences contain the same words – just in a slightly different order. In some cases – like this one – it’s obvious that one of these sentences is incorrect, purely on the basis of general knowledge. In other cases, such inaccuracies are more difficult to spot without checking elsewhere in the document.
Other things to look out for in the text include:
3. Consistency and common sense
In the same way that you’d proofread names, punctuation and capitalisation of words and phrases, you should be on the lookout for obvious errors and inconsistencies in numerical content. For example, when numbers are mentioned in more than one place (in a table, in the text, in the introduction, on a graph), are they consistent? Are the correct signs used (+, −)?
Similarly, in the same way that you’d use your own knowledge to spot errors when proofreading a general piece of text, you can sometimes see clearly that a mistake has been made. If an author claims that the population of London is 8,787 or 8.7 billion, you’ll probably realise that there’s something wrong. In a school report I was proofreading, I noticed that a student was congratulated on achieving 1.25 cm in the high jump, which conjured up an amusing – though misleading – image.
4. A note about averages
Remember when Michael Gove, the then UK Secretary of State for Education, stated that he wanted all schools to be performing above the national average? Think about it. If all schools are above the national average, that figure can no longer be the average. As proofreaders and editors, we need some understanding of basic mathematical terminology – or at least of words and concepts that are in everyday use – so that we can raise a query if something doesn’t look right.
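If it helps to see this with numbers, here is a minimal Python sketch. The school scores are hypothetical, invented purely for illustration, and ‘average’ is taken to be the mean.

```python
from statistics import mean

# Hypothetical scores for five schools (invented for this illustration).
scores = [52, 61, 47, 70, 58]

average = mean(scores)
above = [s for s in scores if s > average]

# However the scores are distributed, some of them must sit at or below the mean.
print(f"Average: {average}; {len(above)} of {len(scores)} schools are above it")
```

Raising the lowest scores would simply raise the average too, so the schools can never all be above it at once.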
5. Problems with percentages
An awareness of what percentages mean can help you to spot errors and problems. Even if you’re not going to be checking an author’s calculations, it does help to know that 50% is half of something, 33% around a third, and so on. So if an author claims that 24 of the 47 people in a survey are men, and that this proportion is 25% (i.e. a quarter), it’s something you need to query: 24 out of 47 is about 51%, or just over half.
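A calculator or spreadsheet is all you need for this kind of check, but as a rough illustration, here is the same sum as a short Python sketch. The function name `percentage` and the one-point tolerance are my own choices for this example, not a standard.

```python
def percentage(part, whole):
    """Return part as a percentage of whole."""
    return part / whole * 100

men, respondents = 24, 47   # figures from the survey example above
claimed = 25                # the percentage the author reports

actual = percentage(men, respondents)
print(f"Actual: {actual:.1f}%, claimed: {claimed}%")

# Flag the claim if it is more than one percentage point out (an arbitrary tolerance).
if abs(actual - claimed) > 1:
    print("Query this figure with the author.")
```

A small tolerance is useful because authors often round their percentages, so an exact match is too strict a test.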
As well as basic calculation mistakes, it’s worth being on the lookout for inaccurate descriptions of percentages. I’ve seen ‘majority’ used to mean ‘the largest proportion’. Here’s an example:
This does not mean that ‘the majority of people are employed’ – ‘majority’ means ‘most’ (i.e. more than 50%), rather than simply the largest group.
Another thing to check is whether the percentages add up to 100. But beware – that might not always be appropriate. For example, if reporting on answers to a survey question where people could tick more than one option, the total could well be more than 100%. In this example, it’s clear that some people like both apples and bananas:
In cases like this, the percentages won’t necessarily add up to 100.
A more subtle issue when it comes to percentages is whether the author really does mean ‘per cent’ (%), or whether they mean ‘percentage points’. As an example, if the unemployment rate in 2001 was 5% and the unemployment rate in 2011 was 10%, the correct way to describe this would be to say that the 2011 rate is 5 percentage points higher than the 2001 rate. The unemployment rate is definitely not 5% higher (in fact, it is 100% higher!).
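The two calculations behind that distinction can be sketched in a few lines of Python. The variable names are mine; the figures are the ones from the unemployment example.

```python
rate_2001 = 5.0    # unemployment rate in 2001, in per cent
rate_2011 = 10.0   # unemployment rate in 2011, in per cent

# Difference between the two rates, measured in percentage points.
point_change = rate_2011 - rate_2001

# Change relative to the starting rate, measured in per cent.
relative_change = (rate_2011 - rate_2001) / rate_2001 * 100

print(f"{point_change:.0f} percentage points higher")
print(f"{relative_change:.0f}% higher")
```

The first figure is a simple subtraction; the second divides that difference by the starting value, which is why the two can be so far apart.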
6. Calculation check
Even if the client isn’t expecting you to check all the calculations – some of which might be pretty complicated – there’s no reason you shouldn’t check straightforward sums. Adding up a column of numbers in a table can sometimes reveal some surprising errors, or issues that at least need to be queried. It can be handy to use Excel to check figures: simply copy and paste your column(s) of numbers into a blank spreadsheet and use the SUM function (∑ from the ribbon on the Home or Formulas tab).
You can also use Excel to perform other straightforward calculations (%, −, ×, ÷). Even if you don’t routinely use Excel, it’s worth familiarising yourself with the basic functions and with how to create a simple formula. Of course, you can always check figures using pen and paper or a calculator, but Excel can save you quite a bit of time (and – if used with care – can reduce the risk that you’ll make errors in your own calculations).
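If you happen to be more comfortable with a scripting language than with a spreadsheet, the same check is just as quick there. This is a minimal sketch in Python; the column of figures and the stated total are hypothetical, made up for the example.

```python
# Hypothetical column of figures copied from a table,
# together with the total printed at the bottom of that column.
column = [1250, 980, 1430, 765]
stated_total = 4525

actual_total = sum(column)

if actual_total != stated_total:
    print(f"Column adds up to {actual_total}, but the table says {stated_total}: query it.")
else:
    print("Total checks out.")
```

Whichever tool you use, the principle is the same: recalculate the total independently and compare it with what the author has written.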
7. Equations
If mathematics isn’t your thing, it’s possible that your eyes glaze over when you see any type of equation. But even if you’re not expected to proofread equations in the text, you can use your existing skills to spot obvious errors. For example, you might notice a term in the equation that’s different from those mentioned in the text. There might be a reason for this, but it could be a typo, so it’s worth a query to the author. Look out, too, for equations that the author has copied and pasted (to save retyping) with the intention of putting in different figures or other terms. Has the author remembered to update all the details?
8. Number ranges
I occasionally come across a table or chart in which the number ranges look something like this:
This is a problem because the ranges don’t have clear upper and lower limits: they overlap. For example, in which category would a value of ‘20’ be placed? Only the originator of the data would know the answer, so there’s usually very little that the editor or proofreader can do to correct this, other than query it. Ideally, ranges should look like this:
Or like this:
And if the data includes values that are not whole numbers (e.g. 2.8, 19.99, etc.), the ranges will need to be more exact:
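To make the overlap problem concrete, here is a minimal Python sketch that tests a set of ranges. The ranges are hypothetical (they are not taken from any real table), the function name `find_overlaps` is my own, and each range is assumed to include both of its end values.

```python
def find_overlaps(ranges):
    """Return pairs of neighbouring ranges where one starts at or before the previous one ends."""
    ordered = sorted(ranges)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b[0] <= a[1]]

overlapping = [(0, 10), (10, 20), (20, 30)]   # hypothetical: where would 20 go?
clean = [(0, 9), (10, 19), (20, 29)]          # hypothetical whole-number ranges

print(find_overlaps(overlapping))   # two overlapping pairs are reported
print(find_overlaps(clean))         # an empty list: no overlaps
```

A boundary value should fall into exactly one range, so any pair reported by a check like this is worth a query to the author.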
9. Units, decimal separators and thousands separators
Again, consistency is the key here, as well as sticking to the style requirements.
b) Decimal separators
In UK and US English, a full stop (full point) is usually used to separate whole numbers from decimals. The decimal point is usually on the baseline of the text, but sometimes a middle dot is used (e.g. 34·12). And in some countries a comma is used as a decimal separator. I’ve come across this most often in work by European authors, but the convention is also followed elsewhere.
c) Thousands separators
Similarly, there are different conventions for separating groups of digits in larger numbers. In many cases a comma is used (10,000), but some styles call for a non-breaking space (10 000).
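The reason these conventions matter is that the same string of characters can stand for very different values. The Python sketch below illustrates this; `parse_number` is a hypothetical helper written for this example, and it only handles the full stop, comma and space conventions described above.

```python
def parse_number(text, decimal_sep="."):
    """Parse a number string, given which character marks the decimal point."""
    # Whichever of '.' and ',' is not the decimal separator is treated as a thousands separator.
    thousands_sep = {".": ",", ",": "."}[decimal_sep]
    # Remove ordinary and non-breaking spaces, then the thousands separator.
    cleaned = text.replace("\u00a0", "").replace(" ", "").replace(thousands_sep, "")
    return float(cleaned.replace(decimal_sep, "."))

print(parse_number("1,234"))                    # comma read as a thousands separator
print(parse_number("1,234", decimal_sep=","))   # comma read as a decimal separator
print(parse_number("10 000"))                   # space used as a thousands separator
```

The first two calls read the same text as one thousand two hundred and thirty-four and as just over one, which is why it’s worth confirming which convention the author is following.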
10. Checking charts
Whole books have been written about the best way to present data in graphs, charts and diagrams, and the mistakes that can be made in data visualisation. Here, I’m just going to highlight some of the issues that I’ve come across when editing material containing such features.
a) A suitable style
Does the chart present the data clearly and unambiguously? For example, pie charts are often not a good way of presenting data, as this article explains. When checking a chart, graph or diagram, you need to ask whether it makes sense. Can you think of a clearer or better way of presenting the data?
It’s tempting to use colours and special effects to make a chart, graph or diagram more eye-catching and ‘interesting’. However, that’s often not necessary (unless such effects are part of house style or branding). Bear in mind that colours, shapes and fancy shading can be distracting and confusing for the reader. In any case, such effects will often be stripped out at the next stage of the publication process.
As a rule, a chart’s axes should start at zero. In this example, the chart on the right seems to be suggesting that The Times newspaper has twice as many sales as the Daily Telegraph, but that’s simply because the vertical axis starts at 420,000 rather than at zero. Results can easily be distorted if the chart isn’t showing the full picture.
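The size of this distortion is easy to work out. In the Python sketch below, the two sales figures are hypothetical (the real figures aren’t given here); only the 420,000 starting point comes from the example. The function name `apparent_ratio` is my own.

```python
def apparent_ratio(a, b, baseline):
    """Ratio of two bar heights when the axis starts at `baseline` rather than at zero."""
    return (a - baseline) / (b - baseline)

# Hypothetical sales figures, chosen only to illustrate the effect.
times, telegraph = 460_000, 440_000

print(round(apparent_ratio(times, telegraph, 420_000), 2))   # ratio of the bars as drawn
print(round(apparent_ratio(times, telegraph, 0), 2))         # ratio of the actual figures
```

With these made-up figures, one bar is drawn twice as tall as the other even though the underlying values differ by less than 5%.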
d) When charts go wrong
Special mention must be made of this chart, which shows the average female height in various countries. It’s misleading in a couple of important ways:
11. Checking tables
You first need to consider whether the table is suitable for the type of document (e.g. an academic paper vs. a brochure aimed at the general public) and whether it presents the data clearly.
When it comes to checking the details within a table, the advice I’ve already mentioned is relevant. Use your proofreading skills to check whether figures are consistent with those mentioned in the text. Add up columns to check totals. Apply your common sense to make sure the data looks correct.
It’s also worth checking the following specific points:
a) Units
Are the units clearly stated? For a table displaying only one type of data (e.g. percentages or monetary values), the units are sometimes included in the table caption:
Table 1: Owner-occupiers as a proportion of the population, 1950–2017 (%)
If the table shows different types of information, the units might be included with each value. So the columns will look something like this:
Consider whether the table would be less cluttered if the units were added to the column headers and deleted from the data itself, as in this example:
b) Order of information
Check whether items are listed in a consistent way. For example, in the table above, the ‘countries’ are shown in alphabetical order. But the author could also have chosen to list them in ascending or descending order, by either the unemployment rate or the average income. Either of these alternatives would have been acceptable. However, if items are listed in apparently random or inconsistent order (e.g. in a series of tables), this might be confusing to the reader.
c) Row and column headings
Do these clearly explain the data in the table? If there’s more than one table with a similar layout, are the row and column headings presented in a consistent way?
d) Table layout
I recently came across a table similar to this one (I’ve only included part of the table, although the rest of it was similarly misleading). The author had tried to present information about the sample population, but the table was laid out in such a way that it implied an association between different variables. For example, it appeared that all the male participants were in the three younger age groups, and that all the female participants were in the older age groups.
The way this table is laid out suggests connections between the different categories that don’t (in this case) exist. And – to illustrate a point made earlier – you might have noticed that the figures in the ‘Age’ categories add up to 104, rather than 100 (the total for the ‘Gender’ categories).
Rather than presenting the information in this way, it would have been better to keep all the categories separate. This is the layout that I suggested to the author.
This version is not as concise as the original, but at least it doesn’t suggest relationships that aren’t (necessarily) there.
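The mismatch in totals mentioned above (104 for ‘Age’ against 100 for ‘Gender’) is the kind of thing that a quick check will catch. Here is a minimal Python sketch; the individual counts are hypothetical, and only the two totals come from the example.

```python
# Hypothetical counts; only the totals (100 and 104) come from the example above.
breakdowns = {
    "Gender": {"Male": 48, "Female": 52},
    "Age": {"18-29": 22, "30-44": 30, "45-59": 28, "60+": 24},
}

# Add up each breakdown of the same sample.
totals = {name: sum(groups.values()) for name, groups in breakdowns.items()}
print(totals)

# Every breakdown of the same sample should give the same total.
if len(set(totals.values())) > 1:
    print("Category totals disagree: query with the author.")
```

There may be a good reason for a difference (missing responses, for instance), but it’s something the author should explain to the reader.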
I hope this article has reassured you that your usual proofreading skills, together with a bit of common sense, will help you to check numerical information in a document, even if you don’t consider yourself a ‘numbers’ person. Here’s a summary of my top tips for approaching this type of work.