The Partial Profiles Algorithm for Experimental Designs
In this blog post, I shall describe the Partial Profiles algorithm for generating choice model designs, which is described in a recent research paper (listed in the references). The paper finds that this algorithm outperforms other algorithms when their designs are benchmarked by the D-error metric. The Efficient algorithm is a special case of the Partial Profiles algorithm where no constant attributes are specified.
What are partial profiles?
Partial profiles designs are designs where a specified number of attributes are held constant in each question. Partial profiles are used to reduce the cognitive effort for respondents when there are a relatively large number of attributes. For example, it is much easier to compare alternatives when there are only 3 attributes to consider (with 7 left constant) than when there are 10 attributes to consider, with different levels in each alternative. The downside is that partial profile designs tend to be less efficient than designs generated without partial profiles, and therefore require more questions and versions to extract the same amount of information from respondents.
Partial Profiles Algorithm
Like the simpler Efficient algorithm, the Partial Profiles algorithm also aims to minimize D-error. Due to the length of the algorithm, I will break the steps into blocks and attempt to explain them separately. The initial steps involve choosing the attributes to be set constant for each question and then randomly generating a design with this constraint. This is the only source of randomness in the algorithm and the remaining steps, which try to minimize the D-error of the design, are deterministic. There is an outer loop over questions, as the algorithm optimizes each question separately in sequence.
1. Randomly choose attributes to be constant per question.
2. Randomly generate a design with the chosen attributes kept constant.
3. Compute the D-error of the the design.
4. Begin loop over questions, with the current question denoted by s.
The next block is designed to improve the selection of constant attributes. This is done by iterating over each constant attribute, and seeing if allowing it to vary and setting another attribute to be constant improves D-error.
5. Let Cs denote the set of attributes that are constant for question s.
6. Begin loop over Cs, with the current attribute denoted by c.
7. Let Cs* denote the set Cs with c removed.
8. Compute the D-error of the the design.
9. Begin loop over alternatives, with the current alternative denoted by j.
10. Loop through each level of attribute c to find the level that minimizes D-error when it is inserted into the design at question s, alternative j, attribute c.
11. Update the design with the optimal level at question s, alternative j, attribute c.
12. Continue loop over alternatives j (step 9) until all alternatives have been covered.
13. If the D-error of the design is better than the D-error at step 8, go back to step 9.
14. Loop over attributes not in Cs*, to find the attribute c* that minimizes D-error when it is held constant in the design at question s.
15. Update the design to have c* held constant at question s and update Cs by removing c and adding c*.
16. Continue loop over attributes Cs (step 6) until all attributes have been covered.
Once the selection of constant attributes has been improved, the next block is designed to improve the selection of levels in the varying attributes. This is done by iterating over each alternative and attribute and seeing if changing the level for the attribute of the alternative improves D-error.
17. Compute the D-error of the the design.
18. Begin loop over alternatives, with the current alternative denoted by j.
19. Begin loop over attributes not in Cs, with the current attribute denoted by f.
20. Loop through each level of attribute f to find the level l* that minimizes D-error when it is inserted into the design at question s, alternative j, attribute f.
21. Update the design with l* at question s, alternative j, attribute f.
22. Continue loop over attributes f (step 19) until all attributes have been covered.
23. Continue loop over alternatives j (step 18) until all alternatives have been covered.
24. If the D-error of the design is better than the D-error at step 17, go back to step 18.
The final block is just to wrap up the loop over questions and repeat the cycle if the previous cycle resulted in a reduction in D-error. The algorithm ends when a cycle is not able to improve upon the D-error.
25. Continue loop over questions s (step 4) until all questions have been covered.
26. If the D-error of the design is better than the D-error at step 3, go back to step 4, otherwise, the algorithm is done!
In a nutshell, the Partial Profiles algorithm cycles between optimizing the choice of constant attributes and the choice of levels in the varying attributes, until there are no more improvements in the design. The resulting design is therefore locally-optimal in terms of D-error, and although it is not necessarily the best possible design for the specifications, it has been shown in the original research paper to be very close to optimal.
Readers seeking the original source of this algorithm should refer to the paper(( D. P. Cuervo, R. Kessels, P. Goos and K. Sörensen (2016). An integrated algorithm for the optimal design of stated choice experiments with partial profiles. Transportation Research Part B 93.)) listed in the references below. The algorithm is presented in pseudocode and is known as the integrated algorithm by the authors.