How to Design and Report Experiments



  To draw meaningful conclusions about the relationships between variables, scientists have to measure them in some way (see Box 1.1). Psychologists cannot measure psychological constructs directly and so instead we use techniques such as self-report (e.g. asking people how they feel) and questionnaires. Any device we use to measure something will provide a different quality of data. There are basically four levels at which variables can be measured:

  1. Nominal (a.k.a. categorical)

  2. Ordinal

  3. Interval

  4. Ratio

  The first two levels yield non-parametric data and the last two yield parametric data (a distinction we return to in Chapters 6 and 7).

  We’ll discuss each of these levels in turn but, for those wanting a gentler introduction, Sandy MacRae (1994) covers this material excellently.

  Nominal Data

  The word nominal derives from the Latin word for name, and the nominal scale is literally a scale on which two things that are equivalent in some sense are given the same name (or number). With this scale, there is no relationship between the size of the number and what is being measured; all that you can tell is that two things with the same number are equivalent, whereas two things with different numbers are not. The classic example is numbers in a football team. A player with number 7 on his back should play in mid-field, whereas a player with number 1 on his back plays in goal. However, a number 7 player is not necessarily better than a number 1 (most managers would not want their midfielder playing in goal!). The numbers on the backs of shirts could equally well be letters or names (in fact, until recently many rugby clubs denoted team positions with letters on the backs of shirts).

  Data from a nominal scale should not be used for arithmetic because doing so would be meaningless. For example, imagine if the England coach found that his number 7 (David Beckham) was injured. Would he consider replacing him with seven David Seamans (who plays number 1) or – heaven forbid – combining Phil and Gary Neville (at numbers 2 and 5)? Even more ludicrous, I used to play wing in rugby (number 11 – the fast, good-looking ones who score all the tries, ahem, well maybe not!). Imagine if one day the coach replaced a number 11 with a number 8 (burly bloke at the back of the scrum) piggy-backing a number 3 (huge bullock-like blokes at the front of the scrum)! They certainly wouldn’t be as fast (or good looking!) as a number 11. The only way that nominal data can be used is to consider frequencies. For example, we could look at how frequently number 11s score tries compared to number 3s. Having said this, as Lord (1953) points out in a very amusing and readable article, numbers don’t know where they came from and will behave in the same way, obeying the same arithmetic rules, regardless.
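  Because the only meaningful arithmetic on nominal data is counting, a frequency table is the natural summary. Here is a minimal sketch in Python (the shirt numbers and try counts are invented purely for illustration):

```python
from collections import Counter

# Hypothetical record of which shirt number scored each try in a season.
# The numbers are nominal labels: counting them is meaningful,
# adding or averaging them is not.
try_scorers = [11, 11, 3, 11, 8, 11, 3, 11]

frequencies = Counter(try_scorers)
for shirt, tries in sorted(frequencies.items()):
    print(f"Number {shirt}s scored {tries} tries")

# A tempting but meaningless calculation on nominal labels:
# sum(try_scorers) / len(try_scorers) would give an 'average shirt
# number', which tells us nothing about the players.
```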

  Ordinal Data

  Ordinal data give us more information than nominal data. If we use an ordinal scale to measure something, we can tell not only that things have occurred, but also the order in which they occurred. However, these data tell us nothing about the differences between values. Figure 1.1 illustrates ordinal data: imagine you went to a frog race in which there were three frogs (Silus, Hoppy and Flibbidy – or Flibbs to his mates). The names of the frogs don’t give us any information about where they came in the race; however, if we label them according to their performance – first, second and third – then these labels do tell us something about how each frog performed: these categories are ordered. In using ordered categories we now know that the frog that came second was better than the frog that came third.

  The limitation of ordinal data is that they tell us little about the differences between ordered categories; we don’t know how much better the winner was than the frogs that came second and third. In Figure 1.1 the two races both show Flibbs winning, Hoppy coming second and Silus losing. So, the ordered categories attached to each frog are the same in the two races: Flibbs is 1, Hoppy is 2, and Silus is 3. However, in the first race Flibbs and Hoppy tightly contested first place but Silus was way behind (so first and second place were actually very similar in terms of performance), whereas in the second race Flibbs is a long way ahead and Hoppy and Silus are very similar (so first and second place are very different in terms of performance). This example shows how ordinal data can tell us something about position but nothing about the relative differences between positions (first place is always better than second place, but the difference between first and second place can vary). Nominal and ordinal scales don’t tell us anything about the differences between points on the scale and need to be analysed with non-parametric tests (see Chapter 7).
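  The frog races translate neatly into a few lines of code. The sketch below (in Python, with invented finishing times) shows how two very different races collapse onto identical ranks once we keep only the ordinal information:

```python
# Hypothetical finishing times in seconds for the two races in Figure 1.1.
race1 = {"Flibbs": 10.1, "Hoppy": 10.3, "Silus": 18.0}  # 1st and 2nd close
race2 = {"Flibbs": 10.1, "Hoppy": 17.5, "Silus": 18.0}  # 1st way ahead

def ranks(times):
    # Order the frogs by finishing time and assign 1st, 2nd, 3rd.
    ordered = sorted(times, key=times.get)
    return {frog: position for position, frog in enumerate(ordered, start=1)}

print(ranks(race1))  # {'Flibbs': 1, 'Hoppy': 2, 'Silus': 3}
print(ranks(race2))  # {'Flibbs': 1, 'Hoppy': 2, 'Silus': 3}
# Identical ordinal data, despite very different gaps between the frogs.
```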

  Figure 1.1 Two frog races

  A lot of psychological data, especially questionnaire and self-report data, are ordinal. Imagine we asked several socially anxious individuals to think of embarrassing times in their lives, and then to rate how embarrassing each situation was on a 10-point scale. We might be confident that a memory they rate as 10 was more embarrassing than one they rate as 5, but can we be certain that the first memory was twice as embarrassing as the second? How much more unreliable does this become if we compare different people’s ratings of their memories – would you expect a rating of 10 from one person to represent the same level of embarrassment as another person’s, or will their ratings depend on their subjective beliefs about what is embarrassing? Most self-report responses are likely to be ordinal, and so in any situation in which we ask people to rate things (e.g. rate their confidence about an answer they have given, rate how scared they are about something, rate how disgusting they find some activity) we should regard these data as ordinal, although many psychologists do not.
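  One practical consequence is that rank-based statistics respect the ordinal character of ratings because they use only the order of scores, not the distances between them. The sketch below (Python with scipy; the ratings are invented) compares Pearson’s correlation, which treats the numbers as interval, with Spearman’s, which uses only the ranks:

```python
from scipy.stats import pearsonr, spearmanr

# Invented embarrassment ratings of five memories by two people whose
# ratings agree on the *order* of embarrassment but not on the spacing.
person_a = [1, 2, 3, 4, 10]
person_b = [2, 3, 5, 8, 9]

r, _ = pearsonr(person_a, person_b)     # treats ratings as interval
rho, _ = spearmanr(person_a, person_b)  # uses only the rank order

print(f"Pearson r = {r:.2f}")       # below 1: sensitive to the spacing
print(f"Spearman rho = {rho:.2f}")  # exactly 1: the orders agree perfectly
```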

  Interval Data

  Interval data are considerably more useful than ordinal data, and most of the statistical tests we use in psychology rely on data measured on an interval scale. To say that data are interval, we must be certain that equal intervals on the scale represent equal differences in the property being measured. So, for example, if a psychologist took several spider-phobic individuals, showed them a spider and asked them to rate their anxiety on a 10-point scale, for this scale to be interval it must be the case that the difference between anxiety ratings of 5 and 6 is the same as the difference between, say, 1 and 2, or 9 and 10. Similarly, the difference in anxiety between ratings of 1 and 4 should be identical to the difference between ratings of 6 and 9. If we had four phobic individuals (with their anxiety ratings in brackets) – Nicola (10), Robin (9), Dave (2) and Esther (3) – an interval scale would mean that the extra anxiety that Esther subjectively experiences compared to Dave is equal to the extra anxiety that Nicola experiences compared to Robin. When data have this property they can be analysed with parametric tests (see Chapter 6).
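  Expressed as arithmetic, the interval property is simply an equality of differences. A trivial check using the four ratings above:

```python
# Anxiety ratings from the text: on a true interval scale, equal
# numeric differences mean equal differences in experienced anxiety.
nicola, robin, dave, esther = 10, 9, 2, 3

# Esther vs Dave and Nicola vs Robin differ by the same amount...
print(esther - dave == nicola - robin)  # True: both differences are 1
# ...so an interval scale commits us to saying those two gaps in
# subjective anxiety are the same size, wherever they fall on the scale.
```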

  Ratio Data

  Ratio data are really a step further from interval data. For interval data, all that matters is that the intervals between different points on the scale represent the same difference at all points along the scale. For data to be ratio they must have the properties of interval data but, in addition, the ratios of values along the scale must be meaningful. We use ratio measurement scales all the time in everyday life; for example, when we measure something with a ruler we have a ratio scale, because not only is it true that, say, the difference between 25 and 30 cm (a difference of 5 cm) is the same as the difference between 80 and 85 cm, or 57 and 62 cm, it is also true that distances along the scale are divisible (e.g. we know that something that is 20 cm long is twice as long as something that is 10 cm long and half as long as an object that is 40 cm long).

  It is possible for a scale to be interval but not ratio (although, by definition, if a scale is ratio then it must be interval). One good example is temperature when measured in Celsius. On the Celsius scale it is the case that the difference between, say, 20° and 27° is the same as the difference between 85° and 92° (for both the difference is 7°, and in both cases this 7° difference is equivalent); however, it is not the case that 40° is twice as hot as 20°. The reason is that the Celsius scale has no absolute zero (you can have minus temperatures). One example of a ratio measure used in psychology is reaction time; if, in our spider phobia example above, we measured the speed at which each phobic reacted to the spider (the time between seeing the spider and running away from it), this would give us ratio data. We can use other ratio measures too, such as the percentage score on a test or the number of errors someone makes on a task.
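  A few lines of arithmetic make the temperature example concrete: differences between Celsius values are meaningful, but ratios only become meaningful once we move to a scale with an absolute zero, such as Kelvin:

```python
def to_kelvin(celsius):
    # Kelvin has an absolute zero, so it is a ratio scale.
    return celsius + 273.15

# Interval property: differences are meaningful in Celsius...
print(27 - 20 == 92 - 85)  # True: both differences are 7 degrees

# ...but ratios are not: 40 degrees C is not 'twice as hot' as 20 degrees C.
print(40 / 20)                        # 2.0 -- misleading
print(to_kelvin(40) / to_kelvin(20))  # ~1.07 -- the real ratio
```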

  Discrete versus Continuous Variables

  Earlier on we mentioned that variables can take many forms, and one important distinction is whether a variable is continuous or discrete. A discrete variable is one for which no underlying continuum exists; in other words, the measure classifies items into non-overlapping categories. One example would be being pregnant: a woman can be either pregnant or not, there is no such thing as being ‘a bit pregnant’. Other variables are measured in a continuous way: for example, aggression probably runs along some kind of continuum (beginning at calm and ending at extremely violent). The distinction between discrete and continuous variables can be very fuzzy indeed. For example, at first glance gender seems like a discrete variable (you can be either male or female but not both); however, there probably is some kind of underlying continuum, because some genetic females can be quite masculine and some genetic males can be very feminine (either in looks or behaviour); there are also chromosomal disorders that can confuse the genetic gender of a person. To confuse matters further, some continuous variables can be measured in discrete terms; for example, although reaction times are continuous, they may, in practice, be measured in discrete terms (i.e. we tend to measure to the nearest millisecond, even though in theory reaction times could be measured in infinitely small steps!).
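  The reaction-time point is easy to see in code: the underlying quantity is continuous, but recording it to the nearest millisecond forces it into discrete steps (the numbers below are invented):

```python
# A continuous reaction time (in seconds) as it 'really' unfolds,
# versus what the apparatus actually records.
true_reaction_time = 0.4376219  # continuous: infinitely fine in principle

recorded_ms = round(true_reaction_time * 1000)  # nearest millisecond
print(recorded_ms)  # 438 -- a discrete record of a continuous variable
```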

  1.2 Experimental versus Correlational Research

  * * *

  So far we’ve learnt that scientists do experiments to answer questions, and that questions can usually be answered either by observing what naturally happens, or by manipulating some aspect of the environment and observing the effect this has on some variable of interest. In addition, any question that you want to answer will involve some variables that need to be measured in some way. The main distinction between what might be termed correlational research (that is, where we observe what naturally goes on in the world without directly interfering with it) and experimental research is that experimentation involves the direct manipulation of variables. In correlational research we either observe natural events (such as facial interactions between a mother and child) or we take a snapshot of many variables at a single point in time (such as administering several questionnaires, each measuring a different aspect of personality, to see whether certain personality characteristics occur in the same people at that moment). The good thing about this kind of research is that it provides us with a very natural view of the question we’re researching: we are not influencing what happens, and so we get measures of the variables that should not be biased by the researcher being there (this is an important aspect of ecological validity). If research questions can be answered using the correlational method, then why bother doing experiments? To answer this question we need to look at the philosophy underlying science.

  Claws (. . . groan!) and Effect

  We began the chapter by discussing some research questions and mentioned that these questions implied some outcome had changed as a result of some other variable. As such, research questions often imply some kind of causal link between variables. Sometimes this is a direct statement, such as ‘does smoking cause cancer?’, and sometimes the implication is subtler. Taking one of the other examples (‘does reading a book on experimental design help you to design experiments?’), the implication is that reading a book on experimental design will have an effect (one way or another) on your ability to design experiments. Many research questions can basically be broken down into a proposed cause (in this case reading a book) and a proposed outcome (your ability to design an experiment). Both the cause and the outcome are variables: for the cause, some people will have read this book whereas others won’t (so it is something that varies); and for the outcome, well, people will have different abilities to design experiments (again, something that varies). The key to answering the research question is to uncover how the proposed cause and the proposed outcome relate to each other; is it the case that the people good at designing experiments are the same people that read this book (hopefully so, but probably not!)?

  Hume

  How do we discover a causal relationship between variables? Well, this question has long been debated by people much cleverer than I am (I don’t speak for Graham), and philosophers and methodologists have spent (literally) centuries arguing about it (did they have nothing better to do?). Much of what we accept as conventional wisdom on cause and effect stems from David Hume’s (1739–40, 1748) ideas about causality. Hume stressed the importance of observed temporal regularities between variables. In essence, Hume proposed three criteria that need to be met to infer cause and effect: (1) cause and effect must occur close together in time (contiguity); (2) the cause must occur before the effect does; and (3) the effect should never occur without the presence of the cause. These conditions imply that causality can be inferred through corroborating evidence, and cause is equated to a high degree of correlation between contiguous events. However, Hume also pointed out that the inference to causality was a psychological leap of faith, and not one that was logically justified. What is the problem with these ideas? Think about it while we have a look at an illustration of the principles.

  Figure 1.2 illustrates some of Hume’s principles; in this example we are trying to confirm the causal statement ‘Andy talking about causality causes boredom’. According to what we’ve learnt about Hume, proving this statement requires three things: (1) boredom and me talking about causality must occur contiguously (close in time); (2) me talking about causality must occur before boredom does; and (3) boredom should not occur without me talking about causality (so the correspondence between boredom and me talking about causality should be strong). Looking at Figure 1.2 it’s clear that me talking about causality and boredom occur close in time. Also in all situations in the diagram talking about causality precedes boredom; hence conditions 1 and 2 are satisfied. You should also note that in 5 out of 6 of the situations shown in the diagram talking about causality results in boredom, so the correspondence between cause and effect is very strong, and at no point do we see a bored face preceded by anything other than me talking about causality (which satisfies condition 3).

  Earlier on I asked you to think about possible problems with these criteria, and one already emerges: there is an instance in Figure 1.2 in which talking about causality leads to happiness (not boredom). This instance doesn’t contradict any of Hume’s criteria, yet it surely casts doubt on the causal connection between talking about causality and boredom (because in one situation the proposed cause does not have the proposed effect). There are also mathematical reasons why a correlation between variables does not imply causality (see Field, 2000, Chapter 3 for some general discussion of this issue). There are two main reasons why correlation does not imply causality:

  The tertium quid: This always makes me think of seafood for some reason (‘I’ll have the grilled quid, please’), but it actually means a third person or thing of indeterminate character. In this context it is another variable that has an effect on both the proposed cause and the proposed outcome. For example, we might observe a strong correlation between having dreadlocks and supporting environmental issues. However, it is unlikely that having dreadlocks causes an interest in the environment – presumably, there is an external factor (or factors) that causes both. These extraneous factors are sometimes called confounding variables, or confounds for short. So, in the example of me talking about causality and people being bored, it could be that some other factor is causing the boredom (perhaps the person I’m talking to is tired and hungover, and that is what’s making them bored – I live in hope!). The simulation sketched after this list shows how easily a tertium quid can produce a strong correlation between two variables that have no causal connection at all.

  Direction of causality: Although Hume’s condition of temporal precedence (cause must precede effect) allows logical inference about the direction of causality, mathematics cannot prove the direction of cause. So, although it might look as though me talking about causality causes boredom, there is no reason why it cannot be that boredom causes me to talk about causality (it certainly causes me to write about it!).
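  As promised, here is a small simulation of the tertium quid (in Python with numpy; the variables and effect sizes are entirely invented). A hidden third variable drives both the ‘cause’ and the ‘outcome’, so the two correlate strongly even though neither has any effect on the other:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 1000

# Hidden third variable (the tertium quid), e.g. some underlying
# lifestyle factor. It causes both observed variables below.
quid = rng.normal(size=n)

# Neither variable influences the other; both just reflect the quid.
dreadlocks = quid + rng.normal(scale=0.5, size=n)
environmentalism = quid + rng.normal(scale=0.5, size=n)

r = np.corrcoef(dreadlocks, environmentalism)[0, 1]
print(f"r = {r:.2f}")  # around 0.8: a strong correlation despite
                       # no causal link between the two variables
```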

  Figure 1.2 A demonstration of Hume’s criteria for causality

  Run of the (John Stuart) Mill?

  John Stuart Mill was one of the main proponents of Inductivism, which is a view that science should be based on inductive reasoning. Inductive reasoning is just reasoning based on probable outcomes. For example, ‘Andy likes statistics, Andy writes statistics books, therefore Andy is a dullard’ is an example of inductive reasoning: based on the two premises the conclusion is probably accurate. It could be false – I might be interesting – but it is more likely to be true that I’m as dull as dishwater. In science, inductive reasoning really relates to extrapolating from a set of observations to some more general conclusions.

  Mill (1865) ‘borrowed’ many of Hume’s ideas to formulate his thinking about causality and ultimately expanded upon Hume’s original ideas considerably. He described three conditions necessary to infer cause:

  Cause has to precede effect.

  Cause and effect should correlate.

  All other explanations of the cause-effect relationship must be ruled out.

  The first two conditions mirror those of Hume: temporal precedence of the cause and a strong correlation between cause and effect. Mill’s main contribution was to add the third condition, which requires that the effects of a tertium quid be ruled out. To verify that the third criterion is met, Mill proposed several methods: