Multinomial Logistic Regression

This section builds upon the concepts discussed in the Logistic Regression section. In this section we will focus on unordered choices.

Random Utility Basis

For the $i^{th}$ consumer faced with $J$ choices, the utility of choice $j$ is

$$U_{ij} = \bold{w'}_i \boldsymbol{\beta}_j + \bold{z'}_{ij}\boldsymbol{\gamma}_j + \varepsilon_{ij}$$

Variables are defined as follows:

  • $U_{ij}:=$ Utility that individual $i$ gets from consuming product $j$.
  • $\bold{w'}_i:=$ Characteristics of individual $i$, such as income or sex. These characteristics do not vary across products.
  • $\bold{z'}_{ij}:=$ Attributes of product $j$. Some attributes can vary across individuals, e.g. transit time.
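As a rough illustration of the utility equation above, the sketch below computes the non-random part $\bold{w'}_i \boldsymbol{\beta}_j + \bold{z'}_{ij}\boldsymbol{\gamma}_j$ for one consumer and then adds an error term. The function name `systematic_utility`, the array shapes, and all numbers are assumptions made for this example only.

```python
import numpy as np

def systematic_utility(w, z, beta, gamma):
    """Return w_i' beta_j + z_ij' gamma_j for one consumer i and all J choices.

    w     : (p,)   individual characteristics (identical for every choice)
    z     : (J, q) choice attributes as faced by this individual
    beta  : (J, p) one coefficient vector per choice
    gamma : (J, q) one coefficient vector per choice
    """
    return beta @ w + np.sum(z * gamma, axis=1)

# Hypothetical data: J = 3 choices, p = 2 characteristics, q = 1 attribute.
w = np.array([1.0, 2.0])                      # e.g. income and a sex indicator
z = np.array([[1.0], [2.0], [3.0]])           # e.g. transit time of each choice
beta = np.array([[0.5, 0.1], [0.0, 0.0], [-0.2, 0.3]])
gamma = np.array([[-0.5], [-0.5], [-0.5]])

V = systematic_utility(w, z, beta, gamma)     # non-random part of utility
eps = np.array([0.3, -0.1, 0.8])              # made-up unobserved components
U = V + eps                                   # U_ij for j = 0, 1, 2
print(V, U)
```

Storing one row of `beta` and `gamma` per choice mirrors the subscript $j$ on $\boldsymbol{\beta}_j$ and $\boldsymbol{\gamma}_j$ in the equation.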

If the consumer chooses $j$, then $U_{ij}>U_{ik}$ for every other choice $k$, so the probability of choosing $j$ is

$$P[U_{ij}>U_{ik}, \hspace{10px} \forall k\neq j].$$
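The choice rule can be sketched in a few lines: the consumer picks the alternative with the highest utility. The utilities below are hypothetical numbers, not values from any dataset, and alternatives are indexed from 0 purely for convenience.

```python
import numpy as np

# Hypothetical utilities U_ij of one consumer i over four alternatives.
U_i = np.array([1.2, 0.3, 2.5, -0.4])

# The chosen alternative j is the one with the largest utility.
chosen = int(np.argmax(U_i))

# Equivalent statement of the rule: U_ij > U_ik for all k != j.
others = np.delete(U_i, chosen)
beats_all = bool(np.all(U_i[chosen] > others))
print(chosen, beats_all)
```

Because the error term in $U_{ij}$ is not observed, the analyst cannot evaluate this rule directly and can only assign a probability to each outcome.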

Assume there are 4 choices $\{A,B,C,D\}$. Then the utilities for the $i^{th}$ consumer are given as

$$
\begin{align*}
U_{iA} &= \underbrace{\bold{w'}_i \boldsymbol{\beta}_A + \bold{z'}_{iA}\boldsymbol{\gamma}_A}_{\bold{V}_{iA}} + \varepsilon_{iA}=\bold{V}_{iA}+ \varepsilon_{iA}\\
U_{iB} &= \bold{w'}_i \boldsymbol{\beta}_B + \bold{z'}_{iB}\boldsymbol{\gamma}_B + \varepsilon_{iB}=\bold{V}_{iB}+ \varepsilon_{iB}\\
U_{iC} &= \bold{w'}_i \boldsymbol{\beta}_C + \bold{z'}_{iC}\boldsymbol{\gamma}_C + \varepsilon_{iC}=\bold{V}_{iC}+ \varepsilon_{iC}\\
U_{iD} &= \bold{w'}_i \boldsymbol{\beta}_D + \bold{z'}_{iD}\boldsymbol{\gamma}_D + \varepsilon_{iD}=\bold{V}_{iD}+ \varepsilon_{iD}
\end{align*}
$$

If the $i^{th}$ consumer chooses $C$, then

$$U_{iC}>U_{iA} \text{ and } U_{iC}>U_{iB} \text{ and } U_{iC}>U_{iD}.$$

This implies,

$$
\begin{align*}
P[y_i=C]=P_{iC}&=P[U_{iC}>U_{ij},\hspace{7px}\forall j \neq C]\\
&=P[\bold{V}_{iC}+ \varepsilon_{iC}>\bold{V}_{ij}+ \varepsilon_{ij},\hspace{7px}\forall j \neq C]\\
&=P[\varepsilon_{ij}<\bold{V}_{iC}+ \varepsilon_{iC} -\bold{V}_{ij},\hspace{7px}\forall j \neq C]
\end{align*}
$$

Suppose the value of $\varepsilon_{iC}$ is given. Conditional on it,

$$
\begin{align*}
P_{iC}|\varepsilon_{iC}&=P[\varepsilon_{ij}<\bold{V}_{iC}+ \varepsilon_{iC} -\bold{V}_{ij}|\varepsilon_{iC},\hspace{7px}\forall j \neq C]
\end{align*}
$$

Since the $\varepsilon_{ij}$'s are independent across choices, the joint probability factors into a product:

$$
\begin{align*}
P_{iC}|\varepsilon_{iC}&=P[\varepsilon_{iA}<\bold{V}_{iC}+ \varepsilon_{iC} -\bold{V}_{iA}|\varepsilon_{iC}]\cdot P[\varepsilon_{iB}<\bold{V}_{iC}+ \varepsilon_{iC} -\bold{V}_{iB}|\varepsilon_{iC}]\cdot P[\varepsilon_{iD}<\bold{V}_{iC}+ \varepsilon_{iC} -\bold{V}_{iD}|\varepsilon_{iC}].
\end{align*}
$$

We assume that each $\varepsilon_{ij}$ is independently and identically distributed extreme value. This distribution is also called Gumbel or type I extreme value (and sometimes, mistakenly, Weibull). The density and cumulative distribution function of each unobserved component of utility are

$$
\begin{align*}
f(\varepsilon_{ij})&=e^{-\varepsilon_{ij}}\cdot e^{-e^{-\varepsilon_{ij}}},\\
F(\varepsilon_{ij})&=e^{-e^{-\varepsilon_{ij}}}.
\end{align*}
$$
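The density and cumulative distribution above can be checked numerically. The sketch below codes $f$ and $F$ directly and compares them with a finite-difference derivative and with draws from NumPy's `Generator.gumbel` using `loc=0` and `scale=1`, which is the standard Gumbel. The helper names `gumbel_pdf` and `gumbel_cdf`, the seed, and the sample size are choices made for this example.

```python
import numpy as np

def gumbel_pdf(e):
    """f(e) = exp(-e) * exp(-exp(-e))"""
    return np.exp(-e) * np.exp(-np.exp(-e))

def gumbel_cdf(e):
    """F(e) = exp(-exp(-e))"""
    return np.exp(-np.exp(-e))

# Check 1: the derivative of F (central difference) should match f.
grid = np.linspace(-3.0, 6.0, 10)
h = 1e-5
numeric_pdf = (gumbel_cdf(grid + h) - gumbel_cdf(grid - h)) / (2 * h)
max_pdf_error = float(np.max(np.abs(numeric_pdf - gumbel_pdf(grid))))

# Check 2: the share of simulated draws below 0 should be close to F(0) = exp(-1).
rng = np.random.default_rng(0)
draws = rng.gumbel(loc=0.0, scale=1.0, size=200_000)
empirical_cdf_at_0 = float(np.mean(draws <= 0.0))

print(max_pdf_error, empirical_cdf_at_0, float(gumbel_cdf(0.0)))
```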

Therefore

$$
\begin{align*}
P_{iC}|\varepsilon_{iC}&=e^{-e^{-(\bold{V}_{iC}+ \varepsilon_{iC} -\bold{V}_{iA})}}\cdot e^{-e^{-(\bold{V}_{iC}+ \varepsilon_{iC} -\bold{V}_{iB})}} \cdot e^{-e^{-(\bold{V}_{iC}+ \varepsilon_{iC} -\bold{V}_{iD})}}\\
&=\prod_{j\neq C}e^{-e^{-(\bold{V}_{iC}+ \varepsilon_{iC} -\bold{V}_{ij})}}.
\end{align*}
$$
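To make the conditional probability concrete, the sketch below evaluates the product of Gumbel CDFs for a fixed $\varepsilon_{iC}$ and compares it with a simulation in which only the other three error terms are drawn. The values of $\bold{V}_{ij}$ and $\varepsilon_{iC}$, the function name `conditional_choice_prob`, and the seed are all hypothetical.

```python
import numpy as np

def gumbel_cdf(e):
    """F(e) = exp(-exp(-e))"""
    return np.exp(-np.exp(-e))

def conditional_choice_prob(V, c, eps_c):
    """P[choice = c | eps_c] = prod over j != c of F(V_c + eps_c - V_j)."""
    V = np.asarray(V, dtype=float)
    thresholds = V[c] + eps_c - np.delete(V, c)
    return float(np.prod(gumbel_cdf(thresholds)))

V = np.array([1.0, 0.5, 2.0, -0.3])   # hypothetical V_iA, V_iB, V_iC, V_iD
c = 2                                  # index of choice C
eps_c = 0.4                            # the given value of eps_iC

analytic = conditional_choice_prob(V, c, eps_c)

# Simulation: draw eps_iA, eps_iB, eps_iD and count how often all three
# fall below their thresholds V_iC + eps_iC - V_ij.
rng = np.random.default_rng(1)
eps_others = rng.gumbel(size=(200_000, 3))
thresholds = V[c] + eps_c - np.delete(V, c)
simulated = float(np.mean(np.all(eps_others < thresholds, axis=1)))

print(analytic, simulated)
```

The two numbers should agree up to simulation noise, which is what the independence assumption implies.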

Using the law of total probability, we integrate the conditional probability over the density of $\varepsilon_{iC}$:

$$
\begin{align*}
P_{iC}&=\int P_{iC}|\varepsilon_{iC}\cdot f(\varepsilon_{iC})d\varepsilon_{iC}\\
&=\int\Big(\prod_{j\neq C}e^{-e^{-(\bold{V}_{iC}+ \varepsilon_{iC} -\bold{V}_{ij})}}\Big)\cdot f(\varepsilon_{iC})d\varepsilon_{iC}\\
&=\int\Big(\prod_{j\neq C}e^{-e^{-(\bold{V}_{iC}+ \varepsilon_{iC} -\bold{V}_{ij})}}\Big)\cdot e^{-\varepsilon_{iC}}\cdot e^{-e^{-\varepsilon_{iC}}}d\varepsilon_{iC}\\
&=\int \text{exp}\Big(-e^{-\varepsilon_{iC}}\sum_j e^{-(\bold{V}_{iC}-\bold{V}_{ij})}\Big)\cdot e^{-\varepsilon_{iC}}d\varepsilon_{iC}\\
&=\frac{1}{\sum_j e^{-(\bold{V}_{iC}-\bold{V}_{ij})}}\\
&=\frac{e^{\bold{V}_{iC}}}{\sum_je^{\bold{V}_{ij}}}=\frac{\text{exp}(\bold{V}_{iC})}{\sum_j \text{exp}(\bold{V}_{ij})}\\
&=\frac{\text{exp}(\bold{w'}_i \boldsymbol{\beta}_C + \bold{z'}_{iC}\boldsymbol{\gamma}_C)}{\text{exp}(\bold{w'}_i \boldsymbol{\beta}_A + \bold{z'}_{iA}\boldsymbol{\gamma}_A) + \text{exp}(\bold{w'}_i \boldsymbol{\beta}_B + \bold{z'}_{iB}\boldsymbol{\gamma}_B) + \text{exp}(\bold{w'}_i \boldsymbol{\beta}_C + \bold{z'}_{iC}\boldsymbol{\gamma}_C) + \text{exp}(\bold{w'}_i \boldsymbol{\beta}_D + \bold{z'}_{iD}\boldsymbol{\gamma}_D)}
\end{align*}
$$

Here the sums run over all four choices, including $C$. The fourth line uses the fact that $e^{-e^{-\varepsilon_{iC}}}$ is exactly the $j=C$ term of the product, and the fifth follows from the substitution $t=e^{-\varepsilon_{iC}}$, which turns the integral into $\int_0^\infty e^{-t\sum_j e^{-(\bold{V}_{iC}-\bold{V}_{ij})}}dt$.
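The closed form can be checked numerically. The sketch below integrates the conditional probability against the Gumbel density on a fine grid and compares the result with $\text{exp}(\bold{V}_{iC})/\sum_j \text{exp}(\bold{V}_{ij})$ for every choice. The utilities, the function names `logit_probs` and `integrated_prob`, and the grid limits are assumptions of this example. Subtracting the maximum before exponentiating is an implementation choice that leaves the ratio unchanged and guards against overflow.

```python
import numpy as np

def gumbel_pdf(e):
    """f(e) = exp(-e) * exp(-exp(-e))"""
    return np.exp(-e) * np.exp(-np.exp(-e))

def gumbel_cdf(e):
    """F(e) = exp(-exp(-e))"""
    return np.exp(-np.exp(-e))

def logit_probs(V):
    """Closed form: exp(V_j) / sum_k exp(V_k) for every choice j."""
    V = np.asarray(V, dtype=float)
    expV = np.exp(V - V.max())          # shift by the max for numerical stability
    return expV / expV.sum()

def integrated_prob(V, c, lo=-20.0, hi=40.0, n=60_001):
    """Riemann-sum approximation of the integral of P[c | eps_c] * f(eps_c)."""
    V = np.asarray(V, dtype=float)
    eps = np.linspace(lo, hi, n)
    dx = eps[1] - eps[0]
    # One row per grid value of eps_c, one column per alternative j != c.
    thresholds = V[c] + eps[:, None] - np.delete(V, c)[None, :]
    conditional = np.prod(gumbel_cdf(thresholds), axis=1)
    return float(np.sum(conditional * gumbel_pdf(eps)) * dx)

V = np.array([1.0, 0.5, 2.0, -0.3])     # hypothetical V_iA, V_iB, V_iC, V_iD
closed_form = logit_probs(V)
numeric = np.array([integrated_prob(V, c) for c in range(len(V))])

print(closed_form)
print(numeric)
```

The two arrays should agree to several decimal places, and each should sum to one across the four choices.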