 To explain how effect sizes and confidence intervals can be employed
in primary studies as indicators of the amount of psychological change.

 efficacy of the intervention
 validity, sensitivity and relevance of the DV
 appropriateness of the analyses
 interpretation of the results
 > understanding and controlling of causative processes

 The Significance Testing Controversy
 What is Meta-analysis?
 Effect Sizes
 Interpretation of Effect Sizes
 Confidence Intervals
 Graphical Displays
 Benchmarking & Comparisons
 Future directions

 Statistical significance testing was developed by Fisher to determine
whether some agricultural techniques were superior to other techniques

 Statistical significance in a study with:
 N=10?
 N=100?
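
A rough sketch of why the N=10 versus N=100 question matters: the same effect size can be non-significant in a small study and significant in a larger one. The effect size of 0.5 and the equal group sizes are hypothetical, and the p-values use a normal approximation to the t test, which is crude for very small samples.

```python
from math import sqrt
from statistics import NormalDist


def approx_two_sided_p(d, n_per_group):
    """Approximate two-sided p-value for a two-group comparison.

    d is the standardised mean difference. A normal (z) approximation
    replaces the t distribution, so small-sample values are only rough.
    """
    z = d * sqrt(n_per_group / 2)
    return 2 * (1 - NormalDist().cdf(abs(z)))


d = 0.5  # hypothetical effect size, identical in both studies
p_small = approx_two_sided_p(d, 5)    # total N = 10
p_large = approx_two_sided_p(d, 50)   # total N = 100
print(f"N=10:  p ~ {p_small:.3f}")
print(f"N=100: p ~ {p_large:.3f}")
```

Under these assumptions only the larger study crosses the conventional .05 threshold, although the amount of change is the same in both.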

 Statistical significance testing has been utilised with little adaptation
in psychological research, even though quite different questions are
often being asked
 This has undermined the value of much psychological research

 Calls for a shift away from significance testing have been largely
unheeded for approx. 30 years

 Power ~.60 in social science research
i.e. on average, 40% chance of Type II error
 Under-reporting of power
 Under-reporting of effect sizes
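
Power and the Type II error rate can be estimated before a study is run. The sketch below is an illustration only: it assumes a two-sided, two-group comparison with equal group sizes and uses a normal approximation rather than the exact noncentral t calculation. The function name and the example values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(d, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-group test (normal approximation)."""
    norm = NormalDist()
    z_crit = norm.inv_cdf(1 - alpha / 2)
    shift = d * sqrt(n_per_group / 2)  # expected z statistic under the effect
    # Probability of landing in either rejection region
    return norm.cdf(shift - z_crit) + norm.cdf(-shift - z_crit)


power = approx_power(d=0.5, n_per_group=30)  # hypothetical study
type_ii = 1 - power
print(f"power ~ {power:.2f}, Type II error ~ {type_ii:.2f}")
```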

 “Despite numerous efforts to change self-concept there appears to be no
consistent answer as to whether it is possible”
 John Hattie (1992, p. 221)

 Ways of Measuring Psychological Change
Clinical Observation/Opinion
 Difference Scores
 T Scores
 Significance Testing
 Effect Sizes & Confidence Intervals

 Ways of Reviewing Research on Psychological Change
Traditional Literature Review
 Vote Counting
 Secondary Analysis
 Meta-analysis
 Mega-analysis

 Psychotherapy Debate
 To counter what appeared to be selectivity of studies included in a
review of psychotherapy effects by Eysenck, Glass introduced a procedure
he termed meta-analysis [1976, 1977].

 Equivalent to traditional (qualitative) review paper
 Enters summary quantitative data from each study into a new database,
with IV codings
 Overall effects are summarised and variance predicted
 Used in medicine, psychology and education
 Outcome measure of interest is the ‘effect size’
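
As an illustration of how overall effects might be summarised, the sketch below pools study-level effect sizes with inverse-variance weights (a fixed-effect model). This is one common approach, not necessarily the procedure Glass used, and the study values are made up for the example.

```python
from math import sqrt


def fixed_effect_summary(effects, variances):
    """Inverse-variance weighted mean effect size and its standard error."""
    weights = [1 / v for v in variances]
    total = sum(weights)
    mean_es = sum(w * e for w, e in zip(weights, effects)) / total
    return mean_es, sqrt(1 / total)


# Hypothetical effect sizes and sampling variances from three studies
effects = [0.20, 0.45, 0.35]
variances = [0.04, 0.02, 0.05]
mean_es, se = fixed_effect_summary(effects, variances)
print(f"weighted mean ES = {mean_es:.2f} (SE = {se:.2f})")
```

More precise studies (smaller variance) pull the summary towards their own estimate.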

 A standardised measure of
 ‘how much change’ OR
 ‘how much shared variation’

 Cohen’s d
 Hedges’ g
 Pearson’s r
 ANOVA: eta-squared, omega-squared
 Regression: R-squared
 Categorical: Phi & Cramer’s V
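
The ‘how much change’ and ‘how much shared variation’ families can be converted into one another. The sketch below uses the standard conversion between d and r, which assumes two groups of equal size; the function names are illustrative.

```python
from math import sqrt


def d_from_r(r):
    """Convert a correlation r to a standardised mean difference d."""
    return 2 * r / sqrt(1 - r ** 2)


def r_from_d(d):
    """Convert d back to r (equal group sizes assumed)."""
    return d / sqrt(d ** 2 + 4)


d = d_from_r(0.30)
print(f"r = .30 corresponds to d ~ {d:.2f}")
print(f"and back again: r ~ {r_from_d(d):.2f}")
```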

  norms
  control group
  pooled

 A measure of the difference between two means in standard deviation units.
d is equivalent to the difference between two z scores
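
A minimal sketch of the calculation, assuming two independent groups. The standardiser can be the pooled standard deviation or the control group's standard deviation; the scores are invented for illustration, and the small-sample correction shown for Hedges’ g is the usual approximation.

```python
from math import sqrt
from statistics import mean, stdev


def cohens_d(treatment, control, standardiser="pooled"):
    """Difference between two means in standard deviation units."""
    n1, n2 = len(treatment), len(control)
    s1, s2 = stdev(treatment), stdev(control)
    if standardiser == "pooled":
        sd = sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2))
    elif standardiser == "control":
        sd = s2
    else:
        raise ValueError("standardiser must be 'pooled' or 'control'")
    return (mean(treatment) - mean(control)) / sd


def hedges_g(d, n1, n2):
    """Apply the approximate small-sample bias correction to d."""
    return d * (1 - 3 / (4 * (n1 + n2 - 2) - 1))


# Hypothetical post-test scores
treatment = [12, 14, 15, 17, 18]
control = [10, 12, 13, 15, 16]
d = cohens_d(treatment, control)
g = hedges_g(d, len(treatment), len(control))
print(f"d = {d:.2f}, g = {g:.2f}")
```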

 -ve = negative change
 0 = no change
 +ve = positive change

 Cohen (1977): .2 = small, .5 = moderate, .8 = large
 Wolf (1986): .25 = educationally significant,
.50 = practically significant (therapeutic)
 ESs are proportional, e.g., .40 is twice as much change as .20

 No agreed standards
 Interpretation is subjective
 Best approach
 compare with previous findings

 Adult psychotherapy outcomes: .68
(Smith, Glass & Miller, 1980)
 Child psychotherapy outcomes: .71
(Casey & Berman, 1980)
 Classroom intervention, Achievement: .40
(cited in Hattie, Marsh, Neill, & Richards, 1997)
 Classroom intervention, Affective: .28
(cited in Hattie, Marsh, Neill, & Richards, 1997)
 Self-concept intervention programs: .37
(Hattie, J.A., 1992)

 Adolescent OE programs (43 studies): .31
(Cason & Gillis, 1994)
 All OE research (96 studies): .34
(cited in Hattie, Marsh, Neill, & Richards, 1997)
 Adventure Therapy, LOC: .38
(Hans, 1997, 2000)
 USA summer camps with self-focus: .41
(cited in Hattie, Marsh, Neill, & Richards, 1997)

 Psychotherapy: 30% improvement for average client
 Classroom-based affective programs: 11% improvement for average student
 Outdoor education: 13% improvement for average participant
 65% of OE participants are better off than people who don’t do an
OE program (35% are not better off!)
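
One common way to turn an effect size into a percentage of this kind is to treat d as a z score and read off the normal curve. The sketch below assumes normally distributed scores with equal variances, and it is an assumption that the figures above were derived this way. Under it, d = .28 and d = .34 give roughly 11 and 13 percentile points.

```python
from statistics import NormalDist


def percentile_gain(d):
    """Percentile points gained by the average participant.

    The average treated person sits at the percentile Phi(d) of the
    comparison distribution; the gain is measured from the 50th.
    """
    return (NormalDist().cdf(d) - 0.5) * 100


for label, d in [("Classroom affective", 0.28), ("Outdoor education", 0.34)]:
    print(f"{label}: d = {d:.2f}, gain ~ {percentile_gain(d):.0f} percentile points")
```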

 Benchmarking for program evaluation and quality assurance
 Increasing opportunities for cumulative, primary data research
 MA may become common expectation for literature reviewing

 Use MAs and ESs in your literature reviews
 Report ESs and CIs for your primary data
 Discuss relevant ES comparisons
 Suggest benchmarks
 When reporting significance, report power
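
For reporting an ES together with a CI, a hedged sketch: the interval below uses a widely used large-sample approximation to the standard error of d for two independent groups. Exact intervals based on the noncentral t distribution differ somewhat in small samples. The input values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def d_confidence_interval(d, n1, n2, level=0.95):
    """Approximate confidence interval for a standardised mean difference."""
    se = sqrt((n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2)))
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return d - z * se, d + z * se


low, high = d_confidence_interval(d=0.5, n1=50, n2=50)  # hypothetical study
print(f"d = 0.50, 95% CI [{low:.2f}, {high:.2f}]")
```

An interval that excludes zero conveys the same information as a significance test, and also shows how precisely the amount of change has been estimated.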

