# All entries for Wednesday 03 August 2005

## August 03, 2005

### Back to Work

Things to do:

(1) Mark suggests that there are 3 components in a model of missing data: the model of Y given X, the model of X, and the model of R (the missingness variable). For example, TLR assumes that Y is linear in X but assumes neither a distribution for X nor a parametric model for R. So TLR is parametric in Y given X but not in X or R. The sample selection model of Newey et al. seems not to restrict the model of Y given X but does assume something about the model of R.

So we have to read up on this and clarify the point.
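One way to write the three components down, as a sketch only: assume R is an indicator equal to 1 when Y is observed, and let f denote a generic density (both are notational assumptions, not taken from the papers). The joint distribution then factorises as

$$
f(y, x, r) = f(y \mid x)\, f(x)\, \Pr(R = r \mid y, x)
$$

In this notation, the reading of TLR above would restrict only the first factor (linear in X), while a sample selection approach would leave the first factor free and put structure on the third.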

(2) Empirical work: need to read Skinner's paper and do the IPW estimation with the data set.

Also, after doing IPW, think about the criteria for comparing different estimators. For building the wage distribution, we don't have to estimate the fitted values??

(3) One of the problems discussed was that TLR needs as few X variables as possible, because we would like to be non-parametric about the distribution of X.

(4) Mark would like us to do (i) descriptive stats on all variables, (ii) complete case analysis (probably both unweighted and weighted using the basic weight, i.e. the non-income one), (iii) IPW estimation (weighted and unweighted) with the following variables as our X's:

Years of education (= age left full-time education – 5)

Experience (potential) (= Current age – age left full-time education)

Experience squared

[These 3 variables are the core of the "Mincerian earnings equation" that economists generally start with]

Female (dummy)

Married (dummy)

Part-time (dummy)

Size of workplace (dummy for 25+)

London & South-East (combined dummy)

[These 5 variables are all commonly used ones and give a good mixture of characteristics of the individual and characteristics of the job]
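The variable definitions above can be sketched as a small Python helper. The input names, the numeric workplace size, and the region labels `"London"` and `"South East"` are assumptions for illustration; the actual survey coding may be banded or labelled differently.

```python
def build_regressors(age, age_left_education, female, married,
                     part_time, workplace_size, region):
    """Build the eight X variables for one respondent.

    All argument names and codings are hypothetical. `female`, `married`
    and `part_time` are taken as booleans, `workplace_size` as a head
    count, and `region` as a string label.
    """
    education = age_left_education - 5       # years of education
    experience = age - age_left_education    # potential experience
    return {
        "education": education,
        "experience": experience,
        "experience_sq": experience ** 2,
        "female": int(female),
        "married": int(married),
        "part_time": int(part_time),
        "workplace_25plus": int(workplace_size >= 25),
        "london_se": int(region in ("London", "South East")),
    }


# Made-up respondent: aged 40, left full-time education at 18.
x = build_regressors(age=40, age_left_education=18, female=True,
                     married=False, part_time=True,
                     workplace_size=30, region="South East")
```

Keeping the construction in one function makes it easy to apply the same definitions to the complete case and IPW samples.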

(5) An issue with IPW is whether X and Z have to be different.

(6) Empirical work: should we include age < 22 in our sample, as Skinner does?