How to Compute D-error for a Choice Experiment

by Justin Yap

D-error is a way of summarizing how good or bad a design is at extracting information from respondents in a choice experiment. A design with a low D-error is better than a design with a high D-error, provided that both designs are for the same experiment; comparing D-error between designs for different experiments is meaningless. Many other related measures exist that also serve this purpose, such as D-optimality.

This article gives an overview of D-error and demonstrates how to compute D-error by working through an example. Concepts in this article are covered in more (mathematical) detail here.

Prior parameter assumptions

Computing D-error requires a prior assumption about the respondent parameters. D₀-error assumes that all parameters are zero, i.e., respondents have no preference for any of the attribute levels. D_P-error assumes that every respondent has the same fixed parameter vector. D_B-error instead assumes that respondent parameters follow a probability distribution, often a multivariate normal distribution with a diagonal covariance matrix.

D_P-error example

A small choice experiment design is shown below:

Version	Task	Question	Alternative	Attribute 1	Attribute 2	Attribute 3
1	1	1	1	1	2	1
1	1	1	2	2	1	2
1	2	2	1	1	2	2
1	2	2	2	2	1	1
1	3	3	1	2	2	1
1	3	3	2	1	1	2
2	4	1	1	2	2	2
2	4	1	2	1	1	1
2	5	2	1	2	2	2
2	5	2	2	1	1	1
2	6	3	1	1	2	1
2	6	3	2	2	1	2

The first step is to encode the design, with either dummy coding or effects coding. I use dummy coding in this example, and split the encoded design by its four questions:

$\textbf{X}_1=\left[\begin{matrix}0&1&1\\1&0&0\end{matrix}\right],\textbf{X}_2=\left[\begin{matrix}1&0&1\\1&1&0\end{matrix}\right]$

$\textbf{X}_3=\left[\begin{matrix}1&0&1\\0&1&0\end{matrix}\right],\textbf{X}_4=\left[\begin{matrix}1&1&1\\0&0&0\end{matrix}\right]$

For D_P-error, I assume that the respondent parameters are given by $\boldsymbol{\beta}=[0.5, -0.8, 1.0]$. The next step is to compute the multinomial logit probabilities, using the formula

$p_{q,i}=\frac{\exp(\textbf{X}_{q,i}\boldsymbol{\beta})}{\sum_{j=1}^{J}{\exp(\textbf{X}_{q,j}\boldsymbol{\beta})}}$

where $p_{q,i}$ refers to the probability of selecting alternative i out of J alternatives in question q, and $\textbf{X}_{q,i}$ is row i of $\textbf{X}_q$. The probabilities are

$\textbf{p}_1=[0.43,0.57],\textbf{p}_2=[0.86,0.14],\textbf{p}_3=[0.91,0.09],\textbf{p}_4=[0.67,0.33]$
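The probability step can be reproduced with a short script. This is a minimal sketch, assuming NumPy is available; the function name `mnl_probabilities` is my own and is not part of any library. It uses the four encoded question matrices and the parameter vector exactly as given above.

```python
import numpy as np

# Encoded question matrices as printed above
# (rows = alternatives, columns = parameters).
X = [
    np.array([[0, 1, 1], [1, 0, 0]]),
    np.array([[1, 0, 1], [1, 1, 0]]),
    np.array([[1, 0, 1], [0, 1, 0]]),
    np.array([[1, 1, 1], [0, 0, 0]]),
]
beta = np.array([0.5, -0.8, 1.0])


def mnl_probabilities(X_q, beta):
    """Multinomial logit choice probabilities for one question (hypothetical helper)."""
    utilities = X_q @ beta
    # Subtracting the largest utility avoids overflow and leaves the ratios unchanged.
    exp_u = np.exp(utilities - utilities.max())
    return exp_u / exp_u.sum()


probabilities = [mnl_probabilities(X_q, beta) for X_q in X]
for q, p in enumerate(probabilities, start=1):
    print(f"p_{q} =", np.round(p, 2))
```

Rounded to two decimals, the printed vectors should agree with the probabilities listed above.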

These probabilities are then used to construct the Fisher information matrix, using the formula

$\textbf{M}=\sum_{q=1}^{Q}{\textbf{X}_q^\prime\left(\textrm{diag}(\textbf{p}_q)-\textbf{p}_q\textbf{p}_q^\prime\right)\textbf{X}_q}$

Plugging the values for $\textbf{X}$ and $\textbf{p}$ into the formula, the information matrix is

$\textbf{M}=\left[\begin{matrix}0.55&-0.11&0.06\\-0.11&0.67&0.26\\0.06&0.26&0.67\end{matrix}\right]$

The D_P-error is $|\textbf{M}|^{-1/K}=1.72$, where $K=3$ is the number of parameters and $|\textbf{M}|\approx 0.196$ is the determinant of the information matrix.
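The whole D_P-error calculation fits in a few lines of code. The sketch below assumes NumPy; the helper names `mnl_probabilities`, `information_matrix` and `d_error` are hypothetical and chosen only for this illustration. It sums the per-question information contributions and then raises the determinant to the power -1/K.

```python
import numpy as np

# Encoded question matrices as printed above
# (rows = alternatives, columns = parameters).
X = [
    np.array([[0, 1, 1], [1, 0, 0]]),
    np.array([[1, 0, 1], [1, 1, 0]]),
    np.array([[1, 0, 1], [0, 1, 0]]),
    np.array([[1, 1, 1], [0, 0, 0]]),
]


def mnl_probabilities(X_q, beta):
    """Multinomial logit choice probabilities for one question."""
    utilities = X_q @ beta
    exp_u = np.exp(utilities - utilities.max())
    return exp_u / exp_u.sum()


def information_matrix(X, beta):
    """Fisher information matrix: sum over questions of X_q' (diag(p_q) - p_q p_q') X_q."""
    K = len(beta)
    M = np.zeros((K, K))
    for X_q in X:
        p = mnl_probabilities(X_q, beta)
        M += X_q.T @ (np.diag(p) - np.outer(p, p)) @ X_q
    return M


def d_error(X, beta):
    """D-error: determinant of the information matrix raised to the power -1/K."""
    M = information_matrix(X, beta)
    return np.linalg.det(M) ** (-1.0 / len(beta))


beta = np.array([0.5, -0.8, 1.0])
M = information_matrix(X, beta)
print(np.round(M, 2))
print(round(d_error(X, beta), 2))  # about 1.72
```

Because the D-error depends on the design only through the determinant of M, two designs for the same experiment can be compared by calling `d_error` on each with the same `beta`.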

D₀-error example

D₀-error is the special case of D_P-error in which $\boldsymbol{\beta}$ is a vector of zeros. In this case, every probability is $p_{q,i}=1/J$ (1/2 in this example) and the information matrix is

$\textbf{M}=\left[\begin{matrix}0.75&-0.25&0.25\\-0.25&1&0\\0.25&0&1\end{matrix}\right]$

The D₀-error is $|\textbf{M}|^{-1/K}=1.17$, where $|\textbf{M}|=0.625$.
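With all parameters at zero, no exponentials are needed, because every alternative in a question is equally likely. The following sketch, again assuming NumPy and using variable names of my own choosing, builds the information matrix directly from equal probabilities.

```python
import numpy as np

# Encoded question matrices as printed above
# (rows = alternatives, columns = parameters).
X = [
    np.array([[0, 1, 1], [1, 0, 0]]),
    np.array([[1, 0, 1], [1, 1, 0]]),
    np.array([[1, 0, 1], [0, 1, 0]]),
    np.array([[1, 1, 1], [0, 0, 0]]),
]
K = 3  # number of parameters

M0 = np.zeros((K, K))
for X_q in X:
    J = X_q.shape[0]
    p = np.full(J, 1.0 / J)  # equal choice probabilities when beta = 0
    M0 += X_q.T @ (np.diag(p) - np.outer(p, p)) @ X_q

d0_error = np.linalg.det(M0) ** (-1.0 / K)
print(M0)
print(round(d0_error, 2))  # about 1.17
```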

D_B-error (Bayesian) example

D_B-error is defined as the expected value of D_P-error under an assumed prior distribution of the respondent parameters, i.e., the integral of D_P-error weighted by the prior density. One way to compute this numerically is Monte Carlo estimation: average the D_P-error over many parameter vectors drawn at random from the prior distribution. To illustrate this, assume that the parameter distribution is multivariate normal with mean $\mu=[0.5, -0.8, 1.0]$ and a diagonal covariance matrix with standard deviations $\sigma=[0.4, 0.4, 0.4]$. I draw 1000 samples from this distribution, as partially shown in the table below:

Draw	Parameter 1	Parameter 2	Parameter 3	D_P-error
1	0.25	-0.73	0.67	1.45
2	1.14	-0.67	0.67	1.90
3	0.69	-0.50	1.23	1.92
...	...	...	...	...
1000	1.15	-1.67	0.57	2.82

D_B-error is estimated as the mean D_P-error over all 1000 draws, which is 1.90 in this example. A more computationally efficient but more complicated alternative is to compute the integral using quadrature¹, which is beyond the scope of this article.
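The Monte Carlo estimate can be sketched as follows. This assumes NumPy; the helper `d_p_error` and the seed are my own choices for illustration. Since the draws are random, the individual values will not match the table above, and the estimate should land near, but not exactly on, the figure quoted in the text.

```python
import numpy as np

# Encoded question matrices as printed in the D_P-error example
# (rows = alternatives, columns = parameters).
X = [
    np.array([[0, 1, 1], [1, 0, 0]]),
    np.array([[1, 0, 1], [1, 1, 0]]),
    np.array([[1, 0, 1], [0, 1, 0]]),
    np.array([[1, 1, 1], [0, 0, 0]]),
]


def d_p_error(X, beta):
    """D_P-error of the design for one parameter vector (hypothetical helper)."""
    K = len(beta)
    M = np.zeros((K, K))
    for X_q in X:
        utilities = X_q @ beta
        exp_u = np.exp(utilities - utilities.max())
        p = exp_u / exp_u.sum()
        M += X_q.T @ (np.diag(p) - np.outer(p, p)) @ X_q
    return np.linalg.det(M) ** (-1.0 / K)


mu = np.array([0.5, -0.8, 1.0])
sigma = np.array([0.4, 0.4, 0.4])

# A diagonal covariance matrix means each parameter is drawn independently.
rng = np.random.default_rng(seed=0)  # arbitrary seed, for reproducibility only
draws = rng.normal(loc=mu, scale=sigma, size=(1000, 3))

errors = np.array([d_p_error(X, b) for b in draws])
d_b_error = errors.mean()
standard_error = errors.std(ddof=1) / np.sqrt(len(errors))
print(f"D_B-error estimate: {d_b_error:.2f} (Monte Carlo s.e. {standard_error:.3f})")
```

The standard error gives a rough sense of how much the estimate would move with a different set of draws; increasing the number of draws reduces it.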


References

1 Christopher M. Gotwalt, Bradley A. Jones & David M. Steinberg, “Fast Computation of Designs Robust to Parameter Uncertainty for Nonlinear Settings,” Technometrics (2009) 51:1, 88-95, DOI: 10.1198/TECH.2009.0009
