The choice between fixed and random
effects models: some considerations
for educational research
Claire Crawford
with Paul Clarke, Fiona Steele & Anna Vignoles
Motivation
• Appropriate modelling of pupil achievement
– Pupils clustered within schools → hierarchical models
• Two popular choices: fixed and random effects
• Which approach is best in which context?
– May depend whether primary interest is pupil or school
characteristics
– But idea is always to move closer to a causal interpretation
Outline of talk
• Why SEN?
• Fixed and random effects models in the
context of our empirical question
• Data and results
• Conclusions
Special educational needs (SEN)
• One in four Year 6 pupils (25% of 10 year olds) in
England identified as having SEN
– With statement (more severe): 3.7%
– Without statement (less severe): 22.3%
• SEN label means different things in different schools
and for different pupils
– Huge variation in numbers of pupils labelled across schools
– Assistance received also varies widely
• Ongoing policy interest (recent Green Paper)
Why adjust for school effects?
• Want to estimate causal effect of SEN on pupil
attainment no matter what school they attend
• Need to adjust for school differences in SEN labelling
– e.g. children with moderate difficulties more likely to be
labelled SEN in a high achieving school than in a low
achieving school (Keslair et al, 2008; Ofsted, 2004)
– May also be differences due to unobserved factors
• Hierarchical models can account for such differences
– Fixed or random school effects?
Fixed effects vs. random effects
• Long debate:
– Economists tend to use FE models
– Educationalists tend to use RE/multi-level models
• But choice must be context and data specific
Basic model
$y_{is} = \beta_0 + \beta_1 X_{is} + u_s + e_{is}$
• FE: $u_s$ is the coefficient on a school dummy variable
• RE: $u_s$ is a school-level residual
– More flexible and efficient than FE, but:
– Additional assumption required: $E[u_s \mid X_{is}] = 0$
• That is, no correlation between unobserved school
characteristics and observed pupil characteristics
• Both models assume: $E[e_{is} \mid X_{is}] = 0$
– That is, no correlation between unobserved pupil
characteristics and observed pupil characteristics
Relationship between FE, RE and OLS
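$y_{is} = \beta_0 + \beta_1 X_{is} + u_s + e_{is}$
FE:
$y_{is} - \bar{y}_s = \beta_1 (X_{is} - \bar{X}_s) + (e_{is} - \bar{e}_s)$
RE:
$y_{is} - \theta \bar{y}_s = \beta_1 (X_{is} - \theta \bar{X}_s) + (e_{is} - \theta \bar{e}_s)$
Where:
$\theta = 1 - \sqrt{\dfrac{1}{1 + S \sigma_u^2 / \sigma_e^2}}$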
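To make the link between the three estimators concrete, the sketch below computes the RE weight $\theta$. It assumes the standard textbook form with a square root, and it assumes that $S$ is the number of pupils per school; neither is spelled out on the slide. The function names are hypothetical. When $\theta = 0$ the RE transformation leaves the data untouched (pooled OLS); as $\theta$ approaches 1 it approaches the full school-demeaning used by FE.

```python
import math


def theta(cluster_size, sigma_u2, sigma_e2):
    """RE quasi-demeaning weight.

    Assumed form: theta = 1 - sqrt(1 / (1 + S * sigma_u^2 / sigma_e^2)),
    where S (cluster_size) is taken to be the number of pupils per school.
    """
    return 1.0 - math.sqrt(1.0 / (1.0 + cluster_size * sigma_u2 / sigma_e2))


def theta_from_icc(cluster_size, icc):
    """Same weight, expressed through the intra-school correlation.

    icc = sigma_u^2 / (sigma_u^2 + sigma_e^2), so the variance ratio
    sigma_u^2 / sigma_e^2 equals icc / (1 - icc).
    """
    return theta(cluster_size, icc, 1.0 - icc)


# No school-level variance: theta is 0 and RE collapses to pooled OLS.
no_school_variance = theta(30, 0.0, 1.0)

# Very large schools: theta is close to 1 and RE is close to FE.
huge_schools = theta(10**9, 1.0, 1.0)

# A small worked case: S = 3 and equal variances give theta = 0.5.
half = theta(3, 1.0, 1.0)
```

The weight grows with both the cluster size and the share of variance that lies between schools, so RE and FE are expected to agree more closely when schools are large or the intra-school correlation is high.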
How to choose between FE and RE
• Very important to consider sources of bias:
– Is the RE assumption (i.e. $E[u_s \mid X_{is}] = 0$) likely to hold?
• Other issues:
– Number of clusters
– Sample size within clusters
– Rich vs. sparse covariates
– Whether variation is within or between clusters
• What is the real world consequence of choosing
the wrong model?
Sources of selection
• Probability of being SEN may depend on:
– Observed school characteristics
• e.g. ability distribution, FSM distribution
– Unobserved school characteristics
• e.g. values/motivation of SEN coordinator
– Observed pupil characteristics
• e.g. prior ability, FSM status
– Unobserved pupil characteristics
• e.g. education values and/or motivation of parents
Intuition 1
• If probability of being labelled SEN depends
ONLY on observed school characteristics:
– e.g. schools with high FSM/low achieving intake
are more or less likely to label a child SEN
• Random effects appropriate as RE assumption
holds (i.e. unobserved school effects are not
correlated with probability of being SEN)
Intuition 2
• If probability of being labelled SEN also depends
on unobserved school characteristics:
– e.g. SEN coordinator tries to label as many kids SEN as
possible, because they attract additional resources
• Random effects inappropriate as RE assumption
fails (i.e. unobserved school effects are correlated
with probability of being SEN)
• FE accounts for these unobserved school
characteristics, so is more appropriate
– Identifies impact of SEN on attainment within schools
rather than between schools
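The following simulation is a minimal sketch of this second case, not the authors' analysis. All numbers (200 schools, 30 pupils each, a true SEN effect of -0.3, labelling rates of 10% and 40%) are invented for illustration. The unobserved school effect drives both attainment and the school's labelling rate, so a between-school comparison is contaminated while the within-school (FE) comparison is not. Pooled OLS is used as the extreme case; RE applies only partial demeaning, so it inherits part of the same bias.

```python
import random


def simulate(n_schools=200, pupils=30, beta=-0.3, seed=0):
    """Return (school, sen, attainment) rows; all parameters are hypothetical."""
    rng = random.Random(seed)
    rows = []
    for school in range(n_schools):
        u = rng.gauss(0.0, 1.0)          # unobserved school effect u_s
        p_sen = 0.4 if u > 0 else 0.1    # labelling rate depends on u_s
        for _ in range(pupils):
            sen = 1.0 if rng.random() < p_sen else 0.0
            y = beta * sen + u + rng.gauss(0.0, 1.0)
            rows.append((school, sen, y))
    return rows


def slope(xs, ys):
    """Simple-regression slope of ys on xs."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def pooled_slope(rows):
    """Pooled OLS: ignores school membership entirely."""
    return slope([x for _, x, _ in rows], [y for _, _, y in rows])


def within_slope(rows):
    """FE (within) estimator: demean x and y by school, then regress."""
    by_school = {}
    for school, x, y in rows:
        by_school.setdefault(school, []).append((x, y))
    xd, yd = [], []
    for pupils in by_school.values():
        mx = sum(x for x, _ in pupils) / len(pupils)
        my = sum(y for _, y in pupils) / len(pupils)
        for x, y in pupils:
            xd.append(x - mx)
            yd.append(y - my)
    return slope(xd, yd)


rows = simulate()
fe_estimate = within_slope(rows)      # should sit near the true -0.3
pooled_estimate = pooled_slope(rows)  # pulled upwards by the school effect
```

In this setup the within-school estimate recovers the simulated effect, while the pooled estimate is pushed towards zero or beyond because high-labelling schools also have higher unobserved attainment.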
Intuition 3
• If probability of being labelled SEN depends
on unobserved pupil/parent characteristics:
– e.g. some parents may push harder for the label
and accompanying additional resources;
– alternatively, some parents may not countenance
the idea of their kid being labelled SEN
• Neither FE nor RE will address the
endogeneity problem:
– Need to resort to other methods, e.g. IV
Other considerations
• Other than its greater efficiency, the RE model
may be favoured over FE where:
– Number of observations per cluster is large
• e.g. ALSPAC vs. NPD
– Most variation is between clusters
• e.g. UK (between) vs. Sweden (within)
– Have rich covariates
Can tests help?
• Hausman test:
– Commonly used to test the RE assumption
• i.e. $E[u_s \mid X_{is}] = 0$
– But really testing for differences between FE and
RE coefficients
• Over-interpretation, as coefficients could be different
due to other forms of model misspecification and
sample size considerations (Fielding, 2004)
– Test also assumes: $E[e_{is} \mid X_{is}] = 0$
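For a single coefficient the Hausman statistic reduces to a simple ratio, which makes it clear that the test only compares the two estimates. The sketch below assumes classical (non-robust) standard errors and that RE is efficient under the null, so that the FE variance exceeds the RE variance. The function name and the input numbers are hypothetical and are not taken from the results in this talk.

```python
import math


def hausman_scalar(b_fe, se_fe, b_re, se_re):
    """Hausman statistic and p-value for one coefficient.

    H = (b_FE - b_RE)^2 / (Var_FE - Var_RE), compared with chi-square(1).
    """
    var_diff = se_fe ** 2 - se_re ** 2
    if var_diff <= 0:
        # Happens in finite samples or with robust SEs; the statistic is undefined.
        raise ValueError("FE variance must exceed RE variance")
    h = (b_fe - b_re) ** 2 / var_diff
    # Upper tail of chi-square with 1 df: P(|Z| > sqrt(h)) = erfc(sqrt(h / 2)).
    p_value = math.erfc(math.sqrt(h / 2.0))
    return h, p_value


# Hypothetical inputs, chosen only to illustrate the arithmetic.
h_stat, p_val = hausman_scalar(b_fe=-0.30, se_fe=0.05, b_re=-0.25, se_re=0.04)
```

A large statistic says only that the FE and RE coefficients differ by more than sampling error would suggest; it does not identify why they differ.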
Data
• Avon Longitudinal Study of Parents and
Children (ALSPAC)
– Recruited pregnant women in Avon with due
dates between April 1991 and December 1992
– Followed these mothers and their children over
time, collecting a wealth of information:
• Family background (including education, income, etc.)
• Medical and genetic information
• Clinic testing of cognitive and non-cognitive skills
• Linked to National Pupil Database
Looking at SEN in ALSPAC
• Why is ALSPAC good for looking at this issue?
– Availability of many usually unobserved individual
and school characteristics:
• e.g. IQ, enjoyment of school, education values of
parents, headteacher tenure
Descriptive statistics
• 17% of sample are identified as having SEN at age 10
Individual characteristics
  Standardised KS1 APS                      -0.104**
  IQ (age 8)                                -0.003**
  SDQ (age 7)                                0.012**
  Mum's highest qualification: vocational   -0.028*
  Mum's highest qualification: O-level      -0.021
  Mum's highest qualification: A-level      -0.033*
  Mum's highest qualification: degree       -0.019
School characteristics
  % eligible for FSM                        -0.002**
  Headteacher tenure: 1-2 yrs               -0.044**
  Headteacher tenure: 3-9 yrs               -0.046**
  Headteacher tenure: 10+ yrs               -0.031
Observations                                 5,417
Notes: relationship between selected individual and school characteristics and SEN status. Omitted
categories are: mum’s highest qualification is CSE level; head teacher tenure < 1 year.
SEN results
Model                          Fixed effects      Random effects     Intra-school correlation   % difference
M1: KS1 APS only               -0.335** [0.025]   -0.330** [0.025]   0.175                      1.5
M2: M1 + admin data            -0.347** [0.025]   -0.342** [0.025]   0.161                      1.4
M3: M2 + typical survey data   -0.355** [0.025]   -0.349** [0.024]   0.086                      1.7
M4: M3 + rich survey data      -0.321** [0.024]   -0.314** [0.024]   0.076                      2.2
M5: M4 + school level data     -0.321** [0.024]   -0.319** [0.024]   0.064                      0.6
Notes: ** indicates significance at the 1% level; * at the 5% level. Robust standard errors are shown in brackets.
Summary of SEN results
• SEN appears to be strongly negatively correlated with
progress between KS1 and KS2
– SEN pupils score around 0.3 SDs lower
• Choice of model does not seem to matter here
– FE and RE give qualitatively similar results
– Suggests correlation between probability of having SEN
and unobserved school characteristics is not important
• Consistency across specifications suggests the regression
assumption ($E[e_{is} \mid X_{is}] = 0$) is also likely to hold
Summary of FSM results
• In contrast to the SEN results, the estimated effects
of FSM on attainment decrease as richer data is used
– Suggests that the regression assumption may fail in models
with few controls, such as those based on admin data
• There are also relatively larger differences between
FE and RE models until we add school characteristics
– Suggests that the RE assumption is less likely to hold here
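The contrast between the two sets of results can be checked directly from the reported coefficients. The sketch below assumes that the "% difference" column is the absolute gap between the RE and FE estimates relative to the FE estimate; this definition is inferred from the reported figures rather than stated in the slides. The coefficient pairs are copied from the SEN and FSM results tables, and the variable names are hypothetical.

```python
# (FE estimate, RE estimate) for models M1 to M5, from the results tables.
SEN = {
    "M1": (-0.335, -0.330),
    "M2": (-0.347, -0.342),
    "M3": (-0.355, -0.349),
    "M4": (-0.321, -0.314),
    "M5": (-0.321, -0.319),
}
FSM = {
    "M1": (-0.157, -0.175),
    "M2": (-0.122, -0.138),
    "M3": (-0.089, -0.103),
    "M4": (-0.089, -0.102),
    "M5": (-0.089, -0.095),
}


def pct_difference(fe_est, re_est):
    """Assumed definition: 100 * |RE - FE| / |FE|, rounded to one decimal."""
    return round(100.0 * abs(re_est - fe_est) / abs(fe_est), 1)


models = ["M1", "M2", "M3", "M4", "M5"]
sen_gap = [pct_difference(*SEN[m]) for m in models]
fsm_gap = [pct_difference(*FSM[m]) for m in models]

print("SEN:", sen_gap)  # SEN: [1.5, 1.4, 1.7, 2.2, 0.6]
print("FSM:", fsm_gap)  # FSM: [11.5, 13.1, 15.7, 14.6, 6.7]
```

Under this definition the FE and RE estimates never differ by more than about 2% for SEN, but differ by 11% to 16% for FSM until school-level data are added in M5, where the gap falls to under 7%.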
Conclusions
• Approach each problem with agnostic view on model
– Should be determined by theory and data, not tradition
• FE should be preferred when the selection of pupils
into schools is poorly understood or data is sparse
• RE should be preferred when the selection of pupils
into schools is well understood and data is rich
• Worth remembering that neither FE nor RE deals
with correlation between observed and unobserved
individual characteristics
FSM results
Model                          Fixed effects      Random effects     Intra-school correlation   % difference
M1: KS1 APS only               -0.157** [0.028]   -0.175** [0.028]   0.145                      11.5
M2: M1 + admin data            -0.122** [0.028]   -0.138** [0.027]   0.161                      13.1
M3: M2 + typical survey data   -0.089** [0.029]   -0.103** [0.028]   0.086                      15.7
M4: M3 + rich survey data      -0.089** [0.028]   -0.102** [0.028]   0.076                      14.6
M5: M4 + school level data     -0.089** [0.028]   -0.095** [0.028]   0.064                      6.7
Notes: ** indicates significance at the 1% level; * at the 5% level. Robust standard errors are shown in brackets.