# All entries for Monday 15 August 2005

## August 15, 2005

### IPW in Wooldridge (2003)

This entry explores two issues: what happens when Z is not completely recorded, and what happens when the model of R is incorrectly specified.

It is shown here that estimated probabilities yield a more efficient estimator than the known ones, as long as the generalised version of the information matrix equality holds in the first-step estimation.
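To fix ideas, here is a sketch of the efficiency comparison as I understand it. The notation is an assumption of this note rather than a quotation: W_i is the data vector, q the objective function, G(Z_i, γ) the parametric selection probability, k_i the weighted score of q at the true values, d_i the score of the first-step selection MLE, and A the expected Hessian of q.

```latex
% Notation assumed for this sketch (not copied from the paper).
\begin{align*}
\hat{\theta}_w &= \arg\min_{\theta} \sum_{i=1}^{N} \frac{R_i}{G(Z_i, \hat{\gamma})}\, q(W_i, \theta) \\
\text{known probabilities:}\quad
  \operatorname{Avar} \sqrt{N}\,(\tilde{\theta}_w - \theta_o) &= A^{-1} B A^{-1},
  \qquad B = E(k_i k_i') \\
\text{estimated probabilities:}\quad
  \operatorname{Avar} \sqrt{N}\,(\hat{\theta}_w - \theta_o) &= A^{-1} D A^{-1},
  \qquad D = B - E(k_i d_i')\,[E(d_i d_i')]^{-1}\,E(d_i k_i')
\end{align*}
```

Since the matrix subtracted from B is positive semi-definite, B - D is positive semi-definite, which is the sense in which estimating the probabilities cannot hurt asymptotically.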

Z can be missing when R=1 if the model of R is the conditional log-likelihood function for the censoring values in the context of censored survival or duration analysis.

When sampling is exogenous (that is, R depends only on X) and the parameter of interest minimises the expectation of the objective function conditional on X (no misspecification), the selection model (the model of R) is allowed to be misspecified if we use the Weighted estimator.
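A small simulation can illustrate the claim above. This is only a sketch under assumptions of my own: a linear conditional mean, a true selection probability that depends on X through x², and a fixed, deliberately wrong logistic function `p_wrong` standing in for the limit of a misspecified selection model. The function names are hypothetical.

```python
import numpy as np

def fit_line(x, y, w):
    """Weighted least squares of y on (1, x) with weights w."""
    X = np.column_stack([np.ones_like(x), x])
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return beta

rng = np.random.default_rng(0)
n = 200_000
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)  # correctly specified mean: E(y|x) = 1 + 2x

# Exogenous selection: R depends on X only (assumed functional form).
p_true = 1.0 / (1.0 + np.exp(-(x**2 - 0.5)))
r = rng.random(n) < p_true

# Misspecified selection model: wrong index, still a function of X only.
p_wrong = 1.0 / (1.0 + np.exp(-(0.2 + 0.5 * x)))

beta_unweighted = fit_line(x[r], y[r], np.ones(r.sum()))
beta_weighted = fit_line(x[r], y[r], 1.0 / p_wrong[r])

print("unweighted:", beta_unweighted)
print("weighted with wrong probabilities:", beta_weighted)
```

Both estimates should land close to (1, 2), because any weight that is a function of X alone leaves the conditional-mean moment condition intact.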

This should work well in Scenario 2, in which X is fully recorded and R depends on X. The incentive for using the Unweighted estimator is that, if the feature of interest is correctly specified and the generalised conditional information matrix equality (GCIME) holds, then it is consistent and efficient. (Note that, in MLE, this requires correct specification of the mean function (for consistency) and of the conditional density (for GCIME)!)

However, using the Weighted estimator allows misspecification in both the model for the feature of interest (population mean or median functions) and the model for the missing-data mechanism. This sounds very promising indeed.

In terms of efficiency, we note below that the asymptotic variances of the estimated and unestimated (known) IPW estimators are the same under exogenous sampling and correct specification. Given the result on misspecification of the selection model, we can relax this a bit: we no longer require the first-step estimation to be MLE, nor its model to be correctly specified. We can now allow any regular estimation problem with conditioning variable Z, and allow misspecification of the selection-probability model (as long as sampling is exogenous and, say, the conditional median is correctly specified).

This result extends the case where GCIME holds: there, Unweighted is more efficient than Weighted (even though the selection model in Weighted estimation is allowed to be misspecified).
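The efficiency ranking can be checked roughly by Monte Carlo. The sketch below assumes a homoskedastic linear model (so that GCIME plausibly holds for least squares), exogenous selection, and a fixed wrong logistic weight function `p_wrong`. All names and functional forms are illustrative assumptions.

```python
import numpy as np

def fit_line(x, y, w):
    """Weighted least squares of y on (1, x) with weights w."""
    X = np.column_stack([np.ones_like(x), x])
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return beta

def one_draw(rng, n=1000):
    """Return (unweighted slope, weighted slope) for one simulated sample."""
    x = rng.normal(size=n)
    y = 1.0 + 2.0 * x + rng.normal(size=n)        # homoskedastic errors
    p_true = 1.0 / (1.0 + np.exp(-(x**2 - 0.5)))  # selection depends on X only
    r = rng.random(n) < p_true
    p_wrong = 1.0 / (1.0 + np.exp(-x))            # misspecified selection model
    b_u = fit_line(x[r], y[r], np.ones(r.sum()))
    b_w = fit_line(x[r], y[r], 1.0 / p_wrong[r])
    return b_u[1], b_w[1]

rng = np.random.default_rng(0)
slopes = np.array([one_draw(rng) for _ in range(500)])
var_unweighted, var_weighted = slopes.var(axis=0, ddof=1)

print("mean slopes (unweighted, weighted):", slopes.mean(axis=0))
print("variance of slope, unweighted:", var_unweighted)
print("variance of slope, weighted:  ", var_weighted)
```

Both slope estimates should centre on 2, with the unweighted one showing the smaller sampling variance, in line with the ranking stated above.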