[Limdep Nlogit List] A selection question--a quick follow-up

Fri Oct 19 05:53:07 EST 2007

Agreed completely on all this.

Bayesians (I'm one, sort of) like to trumpet the superiority of their methods for
*high-dimensional* models. For example, models involving heterogeneity are
very, very difficult to estimate classically outside of certain specific instances
(e.g., Bryk-Raudenbush HLM territory), and are especially tough in discrete choice
settings. Prof. Greene is right that, for homogeneous models, you're only going to
get suspect classical results if your sample is very small. [One other case is
when you are estimating a bounded parameter and want to rely on dividing the
estimate by its standard error... but that is just faulty reasoning, not something
intrinsically wrong with classical estimation.]

As for the standard errors being off sometimes, this is actually what Puhani found
-- unless I'm misremembering -- using certain classical estimators, especially
two-step ones for Heckman-type set-ups. I'd also point out that Bayesian approaches
not only provide the marginal densities of any parameters of interest, but also
give the joint density of all unknown quantities, which is sometimes a legitimate
object of inquiry in its own right.

Sorry if I came across as a Bayesian cultist.

FF

William Greene wrote:

> Fred (and all).  The Terza JAE (2002) paper is a FIML estimator for
> the MNL model with selection that was queried in the first email. The
> use of the Taylor series in this paper was to describe an alternative
> approach by Mullahy and Sindelar.  I am currently working on an implementation
> of the model in LIMDEP, but am not ready to say it is complete. The
> Terza and Kenkel (2001) paper is a formal sample selection model for
> counts that derives the appropriate conditional mean function for the
> Poisson variable under selection, and estimates it by nonlinear least
> squares. It does not use an IMR approach.  Terza and I developed the
> FIML estimator for this model simultaneously (to some extent, jointly)
> in the early 1990s.  It is currently fully implemented in LIMDEP; however,
> being a model for counts, it is not helpful to this discussion.
>   The Boyes, Hoffman and Low paper describes the bivariate probit model
> with sample selection that has been fully implemented in LIMDEP since
> about 1992.  Likewise the Poirier model on partial observability, which
> is one of several forms of this model.  The LIMDEP manual contains
> extensive documentation on all of these models.  FIML estimators for all
> of them are incorporated in LIMDEP.  They do not use IMRs.  On the other
> hand, neither are these helpful; they are not the model that was queried
> about.  As Fred points out, the modeling of the joint distribution of the
> disturbances that is used in the probit and Heckman linear models does not
> carry over to the multinomial logit model, so the problem remains.
>    The business of the standard errors of MLEs in these models relying
> on suspect asymptotic results is a canard.  The idea that the Bayesian
> estimators somehow get everything right and the MLE doesn't has two flaws,
> notwithstanding all the other problems listed: (1) If the sample is
> at all large, and that does not have to be excessively so, though the
> typical application does use a big sample, then the asymptotics for these
> models are fine, and, in fact, they are very well behaved.  I and many
> others have been using them for a couple decades.  Moreover, even in a
> moderately sized sample, unless one has a very strong prior - and that
> would taint the results - the Bayesian estimates are generally close
> to identical to the MLEs (a theorem of Bernstein and von Mises is at work).
> Indeed, it is a signal that something is wrong when the Bayesian
> estimate deviates too much from the MLE.  After all, if the sample is
> relatively large, the log likelihood is nearly symmetric and unimodal
> (central limit theorem), so the mode (MLE) and the mean (posterior mean)
> coincide.
> (2) If the sample is so small that these criticisms would have any
> relevance -- and we would be talking about small two-digit sample sizes --
> then the fact that the Bayesian estimators have exact small sample
> properties is a flaw, not a virtue. The
> results are posterior to just the sample data, and can't be extended beyond
> the small sample in hand. But, it is a heroic (and very classical) assumption
> that somehow this small sample tells you the story about the whole
> population you are trying to characterize.
>
> /Bill Greene
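
A minimal numerical sketch of point (1) in the quoted message, that with a weak
prior the Bayesian estimate converges on the MLE as the sample grows. This is
not the selection model under discussion: it assumes a toy one-parameter
Bernoulli model with a flat Beta(1, 1) prior, chosen only because the posterior
is available in closed form. The function name `compare`, the 30% success share,
and the sample sizes are all hypothetical.

```python
import math


def compare(n, share=0.3, a=1.0, b=1.0):
    """Return (mle, asymptotic_se, posterior_mean, posterior_sd).

    Toy setting (an assumption, not the model from the thread):
    k successes in n Bernoulli trials, with a Beta(a, b) prior on the rate.
    """
    k = round(share * n)                      # hypothetical success count
    mle = k / n                               # classical point estimate
    se = math.sqrt(mle * (1.0 - mle) / n)     # asymptotic standard error
    pa, pb = a + k, b + (n - k)               # posterior is Beta(pa, pb)
    total = pa + pb
    post_mean = pa / total
    post_sd = math.sqrt(pa * pb / (total ** 2 * (total + 1.0)))
    return mle, se, post_mean, post_sd


if __name__ == "__main__":
    for n in (10, 100, 1000):
        mle, se, post_mean, post_sd = compare(n)
        print(f"n={n:5d}  MLE={mle:.4f} (se {se:.4f})  "
              f"posterior mean={post_mean:.4f} (sd {post_sd:.4f})")
```

In this toy case the gap between the posterior mean and the MLE shrinks roughly
in proportion to 1/n, and the posterior standard deviation approaches the
asymptotic standard error. Passing a tighter prior (larger `a` and `b`) slows
that convergence, which is the "very strong prior" caveat in the quoted text.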