[Limdep Nlogit List] MNL Dummy Variable issues

Thomas C. Eagle teagle at tceagle.com
Fri Oct 13 23:17:18 EST 2006


One alternative must have its constant fixed at zero.  If you do not do this,
you have a vector of ones across all alternatives in the choice set, which leads
to a linear dependency.  This is critical in choice modeling.  I say choose one
carefully only to make you think about the ramifications of setting the constant
of the alternative you choose to zero.
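
To see why one constant has to be fixed, here is a minimal Python sketch (not
NLOGIT code; the utilities are made-up numbers and mnl_probs is a hypothetical
helper).  It shows that adding the same constant to every alternative leaves
the logit probabilities unchanged, so a full set of constants cannot be
estimated.

```python
import math


def mnl_probs(utilities):
    """Multinomial logit choice probabilities for one choice set."""
    m = max(utilities)  # subtract the max for numerical stability
    exps = [math.exp(u - m) for u in utilities]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical utilities for a three-alternative choice set.
base = [0.4, -0.2, 1.1]

# Add the same constant to every alternative: a "vector of ones" times 5.0.
shifted = [u + 5.0 for u in base]

p_base = mnl_probs(base)
p_shifted = mnl_probs(shifted)

# The two probability vectors agree up to floating-point rounding.
print(p_base)
print(p_shifted)
```

Because only differences in utility matter, the data cannot tell these two
sets of constants apart; fixing one constant at zero removes the ambiguity.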

If you do not think predictability will be improved with the pRF1*gender
interaction, then do not include it.
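
For reference, a rough Python sketch of how the dummy and interaction columns
could be coded for one choice set.  The names genASC and gendXCon follow my
earlier message; code_choice_set, pRF1xGen and the sample values are
assumptions for illustration only, and gender is assumed to be coded 0/1.

```python
def code_choice_set(prf1_values, gender, base_index=0):
    """Build one row of attributes per alternative in a single choice set.

    prf1_values: pRF1 for each alternative (hypothetical sample values).
    gender:      0/1 dummy, constant within the choice set (assumed coding).
    base_index:  position of the base alternative, whose genASC is 0.
    """
    rows = []
    for i, prf1 in enumerate(prf1_values):
        gen_asc = 0.0 if i == base_index else 1.0
        rows.append({
            "genASC": gen_asc,             # generic constant, 0 for the base
            "gendXCon": gender * gen_asc,  # gender x constant interaction
            "pRF1": prf1,                  # the generic attribute itself
            "pRF1xGen": gender * prf1,     # gender x pRF1 interaction
        })
    return rows


# One choice set with three alternatives, answered by a gender = 1 respondent.
for row in code_choice_set([0.2, 0.5, 0.9], gender=1):
    print(row)
```

Leaving pRF1xGen out of the Rhs list corresponds to assuming the pRF1
parameter is the same for both genders.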

I really think you had better get a handle on what the terms of your model mean
when interpreting the utility function.  Personally, explanation is more
important to me than predictability.  Why build the model if you are not trying
to explain something?  Why add gender if you do not think there is a difference
between genders?  Build your model based upon theory!

Tom

-----Original Message-----
From: limdep-bounces at limdep.itls.usyd.edu.au
[mailto:limdep-bounces at limdep.itls.usyd.edu.au] On Behalf Of contactemt
Sent: Friday, October 13, 2006 8:53 AM
To: Limdep and Nlogit Mailing List
Subject: Re: [Limdep Nlogit List] MNL Dummy Variable issues

Sorry, I don't understand this:

>Choose this single alternative carefully.

Why? All alternatives are generic. What will be so special about the one I
choose?
The only distinction I have is that one of the alternatives is chosen in each
choice set.

> However, if you do this why would you not expect differences in the parameter
> for pRF1 across gender?

Does it matter? All I want is a better fit for predictive purposes, i.e., for a
novel choice set, the probability that each alternative has of being the one
chosen as "the best".

If adding gender provides a better prediction I am happy - if it doesn't
I leave it out. I just want the ability to test its impact in the MNL model.

Cheers


> If you have "unlabeled alternatives" (also called generic) you can create your
> own generic dummy variable that you can use in place of the ONE option given in
> the NLOGIT setup.  Simply code a new attribute (call it genASC) with the value
> of 1.0 for every alternative EXCEPT one.  For that single alternative give the
> same new attribute a coded value of 0.  Choose this single alternative
> carefully.  It becomes your base alternative.  Now create a 2nd new variable
> that is the interaction of gender and genASC (gendXCon = gender * genASC).  Your
> utility function would now be genASC + gendXCon + pRF1.  The code might look
> like:
>
> NLOGIT ; Lhs = CHOICE, SETSIZE
>  ; Rhs = genASC, gendXCon, pRF1
>  ; Prob = probs $
>
> This interacts gender with every alternative except the base.  This essentially
> tests whether the overall constant's utility is different across genders.
> However, if you do this why would you not expect differences in the parameter
> for pRF1 across gender?
>
> Tom
>
> -----Original Message-----
> From: limdep-bounces at limdep.itls.usyd.edu.au
> [mailto:limdep-bounces at limdep.itls.usyd.edu.au] On Behalf Of contactemt
> Sent: Friday, October 13, 2006 6:45 AM
> To: Limdep and Nlogit Mailing List
> Subject: Re: [Limdep Nlogit List] MNL Dummy Variable issues
>
> Hi,
>
> I have understood what ASC are now and realise they are not applicable to my
> model :)
>
> My model deals with generic choices - Greene uses the term "unlabeled" -
> with the further complication that the choice set size can vary. I have
> looked at his book and he describes the problem with unlabeled and ASC's in
> Appendix 10A.
>
> Coincidentally, in his example he uses gender as a non varying parameter
> (within sets) as I did.
> However, to get around the lack of ASC's he uses a pre defined utility model
> and combines it with one of the utility variables. Why he chooses a
> particular variable I don't know - and what is TTgen anyway (if you have the
> book).
>
> My variables are purely measured items - I do not wish to apply utility
> constraints to them. I want the data to describe the model. Any utility
> constructs I place on the model would be arbitrary. So is there another
> approach I can use?
>
> But if not, say I do try to use interactions to include categories - or
> SDC's as he calls them:
>
> How should I choose the variable(s) to interact with?
> Should I choose 1 or many?
> Should I only include the variable interaction term or should I include it
> by itself also - what would be the point as they would be collinear, wouldn't
> they?
> How should I encode gender?
> If I choose 1,0 then half the interactions would be 0.
> If I choose 1,-1 will NLOGIT be able to fit the +ve and -ve values of the
> interacting term(s) OK?
>
> If I have more categories/dummy variables to add to the model, do I need a
> set of interactions for each one or can I combine them?
>
> I have searched and not been able to find any examples of the type of
> unlabeled model I wish to run, and I'm afraid I don't have the ability to
> extrapolate from the exclusively "labeled" models out there.
>
> Any further help appreciated.
>
>
>
>> You get this because the default in NLOGIT is to fit alternative specific
>> constants (when you use the ONE term) and you have 30 alternatives in the
>> choice set.  Add to that the interaction with gender you requested in RH2,
>> and your one generic attribute (pRF1), and you get 59 parameters (29
>> constants, 29 gender interactions, and 1 coefficient for pRF1).  Several of
>> these parameters are fixed, which means you have very low frequencies for
>> them or one level of gender never chose that specific alternative.
>>
>> Perhaps you should read about choice modeling in Greene's book Applied
>> Choice Analysis (co-authored with Hensher and Rose).  It discusses all these
>> issues and the defaults of NLOGIT.
>>
>> Tom
>>
>> -----Original Message-----
>> From: limdep-bounces at limdep.itls.usyd.edu.au
>> [mailto:limdep-bounces at limdep.itls.usyd.edu.au] On Behalf Of contactemt
>> Sent: Thursday, October 12, 2006 1:29 PM
>> To: Limdep and Nlogit Mailing List
>> Subject: Re: [Limdep Nlogit List] MNL Dummy Variable issues
>>
>> Thanks,
>>
>> After a quick look it seems the Limdep rh2 variable is used for this.
>> So I have:
>>
>> NLOGIT ; Lhs = CHOICE, SETSIZE
>>    ; Rhs = v
>>   ; Rh2 = One, GENDER
>>   ; Prob = probs $
>>
>>
>> Incidentally, (I haven't read through the issues but)
>> I run a very simple (one attribute)
>>
>> NLOGIT ; Lhs = CHOICE, SETSIZE
>>    ; Rhs = pRF1
>>   ; Rh2 = One, GENDER
>>    ; Prob = probs $
>>
>>
>> as a test and get 59 parameters. Why is this?
>>
>> Sorry if a stupid Q.
>> +---------------------------------------------+
>> | Discrete choice (multinomial logit) model   |
>> | Maximum Likelihood Estimates                |
>> | Model estimated: Oct 12, 2006 at 06:14:54PM.|
>> | Dependent variable               Choice     |
>> | Weighting variable                 None     |
>> | Number of observations             1704     |
>> | Iterations completed                 18     |
>> | Log likelihood function       -.3005320E-08 |
>> | R2=1-LogL/LogL*  Log-L fncn  R-sqrd  RsqAdj |
>> | No coefficients  -5795.6403 1.00000 1.00000 |
>> | Constants only.  Must be computed directly. |
>> |                  Use NLOGIT ;...; RHS=ONE $ |
>> | Response data are given as ind. choice.     |
>> | Number of obs.=  1704, skipped   0 bad obs. |
>> +---------------------------------------------+
>>
>>
>> +---------+--------------+----------------+--------+---------+
>> |Variable | Coefficient  | Standard Error |b/St.Er.|P[|Z|>z] |
>> +---------+--------------+----------------+--------+---------+
>> PRF1       .28562667     1100.58000      .000   .9998
>> A_Alt.1       121.889077   ......(Fixed Parameter).......
>> AltxHCA1     -86.8291397     47685.8780     -.002   .9985
>> A_Alt.2      -17.6981970     50283.9113      .000   .9997
>> AltxHCA2      11.4289719     38053.7076      .000   .9998
>> A_Alt.3      -15.6341676     28185.0248     -.001   .9996
>> AltxHCA3      9.45993004     23720.4712      .000   .9997
>> A_Alt.4      -14.0208822     18985.8265     -.001   .9994
>> AltxHCA4      7.93083927     14502.9551      .001   .9996
>> A_Alt.5      -13.7822969   ......(Fixed Parameter).......
>> AltxHCA5      7.88816799   ......(Fixed Parameter).......
>> A_Alt.6      -12.9643251   ......(Fixed Parameter).......
>> AltxHCA6      7.03157581     431.216215      .016   .9870
>> A_Alt.7      -11.5834031     2324.85755     -.005   .9960
>> AltxHCA7      5.80855146     1886.76721      .003   .9975
>> A_Alt.8      -9.82066678   ......(Fixed Parameter).......
>> AltxHCA8      4.04330766     271.466257      .015   .9881
>> A_Alt.9      -8.15221444     48.3176779     -.169   .8660
>> AltxHCA9      2.65183332     31.0102930      .086   .9319
>> A_Alt.10     -6.50502468     24.7267852     -.263   .7925
>> AltxHCA*      7.33600806      .02888315   253.989   .0000
>> A_Alt.11     -5.06016269   ......(Fixed Parameter).......
>> AltxHCA*      7.33600806      .02888315   253.989   .0000
>> A_Alt.12     -4.04467058      .00742264  -544.910   .0000
>> AltxHCA*      7.33600806      .02888315   253.989   .0000
>> A_Alt.13     -3.22120224   ......(Fixed Parameter).......
>> AltxHCA*      7.33600806      .02888315   253.989   .0000
>> A_Alt.14     -2.70256143    .202507D-04  ********   .0000
>> AltxHCA*      7.33600806      .02888315   253.989   .0000
>> A_Alt.15     -2.37109609    .211547D-04  ********   .0000
>> AltxHCA*      7.33600806      .02888315   253.989   .0000
>> A_Alt.16     -2.16550807   ......(Fixed Parameter).......
>> AltxHCA*      7.33600806      .02888315   253.989   .0000
>> A_Alt.17     -2.01944461   ......(Fixed Parameter).......
>> AltxHCA*      7.33600806      .02888315   253.989   .0000
>> A_Alt.18     -1.97037254    .217681D-04  ********   .0000
>> AltxHCA*      7.33600806      .02888315   253.989   .0000
>> A_Alt.19     -1.98283571    .242113D-04  ********   .0000
>> AltxHCA*      7.33600806      .02888315   253.989   .0000
>> A_Alt.20     -2.04980440   ......(Fixed Parameter).......
>> AltxHCA*      7.33600806      .02888315   253.989   .0000
>> A_Alt.21     -2.13493369    .113446D-04  ********   .0000
>> AltxHCA*      7.33600806      .02888315   253.989   .0000
>> A_Alt.22     -2.14340813    .125961D-04  ********   .0000
>> AltxHCA*      7.33600806      .02888315   253.989   .0000
>> A_Alt.23     -2.14440660   ......(Fixed Parameter).......
>> AltxHCA*      7.33600806      .02888315   253.989   .0000
>> A_Alt.24     -2.18364464   ......(Fixed Parameter).......
>> AltxHCA*      7.33600806      .02888315   253.989   .0000
>> A_Alt.25     -2.23302643    .174142D-04  ********   .0000
>> AltxHCA*      7.33600806      .02888315   253.989   .0000
>> A_Alt.26     -2.26795137   ......(Fixed Parameter).......
>> AltxHCA*      7.33600806      .02888315   253.989   .0000
>> A_Alt.27     -2.29960229   ......(Fixed Parameter).......
>> AltxHCA*      7.33600806      .02888315   253.989   .0000
>> A_Alt.28     -2.29869412    .153029D-04  ********   .0000
>> AltxHCA*      7.33600806      .02888315   253.989   .0000
>> A_Alt.29     -2.30517803    .174283D-04  ********   .0000
>> AltxHCA*      7.33600806      .02888315   253.989   .0000
>>
>>
>>
>>>
>>  That is exactly right: anything constant across the choice set
>> needs to be put in as an interaction effect, via multiplying
>> with non-constant quantities. That way, you are in effect
>> estimating two coefficients -- assuming you are assessing the effect
>> of a dummy variable -- for each of those other (non-constant)
>> quantities. To keep with your original example, you'd be getting a
>> set of "male coefficients" and "female coefficients" for each of the
>> non-constant variables with which you're interacting. Note that this
>> would only be estimated *across* choice sets, since each individual
>> is, presumably, constant in gender, so the gender "variable" never
>> varies within any one choice set. [You should be careful that you
>> don't have a small proportion of either zeros or ones in your dummy
>> variable, or you may wind up not having enough cases to estimate the
>> gender difference in coefficients. You might also consider some form
>> of hierarchical modeling, particularly hierarchical Bayes.]
>>
>>  FF
>>
>>  Quoting "Thomas C. Eagle" <teagle at tceagle.com>:
>>
>>> You have to interact the category variables with alternative specific
>>> constants, much like you do with socio-demographic effects.
>>>
>>> Tom
>> _______________________________________________
>> Limdep site list
>> Limdep at limdep.itls.usyd.edu.au
>> http://limdep.itls.usyd.edu.au
