The influence of decision heuristics and overconfidence on multiattribute choice: A process-tracing study

In the present study it was shown that decision heuristics and confidence judgements play important roles in the building of preferences. Based on a dual-process account of thinking, the study compared people who did well versus poorly on a series of decision heuristics and overconfidence judgement tasks. The two groups were found to differ with regard to their information search behaviour in introduced multiattribute choice tasks. High performers on the judgemental tasks were less influenced in their decision processes by numerical information format (probabilities vs. frequencies) compared to low performers. They also looked at more attributes and spent more time on the multiattribute choice tasks. The results reveal that performance on decision heuristics and overconfidence tasks has a bearing both on heuristic and analytic processes in multiattribute decision making.


THE FREQUENTIST APPROACH TO RATIONAL THOUGHT
There is a growing body of literature showing that frequentist representations cause various cognitive biases related to confidence judgements (Cosmides & Tooby, 1996) and to decision heuristics to disappear. Confidence and frequency judgements allude to the broader concept of metacognition (for overviews, see Chambers, Izaute, & Marescaux, 2002; Metcalfe, 2000; Metcalfe & Shimamura, 1994; Perfect & Schwartz, 2002; Yzerbyt, Lories, & Dardenne, 1998). In a generic sense, metacognition designates our knowledge of our own knowledge and memory or, more specifically, the monitoring and control of the processes and outputs of our cognitions (Fernandez-Duque, Baird, & Posner, 2000; Johansson, 2004; Nelson, 1996; Nelson & Narens, 1990). It has been claimed by Gigerenzer (1991, 1993, 1994) that applying probability as any form of rational norm in decision and judgemental research may be considered controversial. The reason for this is that although probability may be looked upon as a subjective measure or belief, the concept of "probability" may also be interpreted as a series of long-run relative frequencies. This latter position implies, among other things, a refusal to assign probabilities to unique events. In several studies, Gigerenzer has revealed how well-established judgemental errors may be eliminated if questions are asked in terms of frequencies rather than in terms of probabilities. In addition, he has shown that such judgemental errors may be avoided if a procedure of random sampling is strictly applied (Gigerenzer & Murray, 1987).
However, there also exist research findings that do not support the frequentist perspective. For instance, it has recently been demonstrated that subjects who made confidence judgements concerning unique events and subjects who estimated relative frequencies produced essentially the same responses (Brenner, Koehler, Liberman, & Tversky, 1996). In the same study, it was shown that both confidence judgements and frequency estimates exhibited substantial overconfidence, and that these measures were both highly correlated with independent judgements of representativeness (Brenner et al., 1996). Still, it must be noted that only one type of decision heuristic was used in this study.
The role of frequencies and probabilities in confidence judgements and decision heuristics has until now been the focus of much research. For instance, it has previously been revealed by Gigerenzer that presentation mode of uncertainty has an impact on performance on confidence judgements and on decision heuristic tasks themselves.
However, little is yet known about how performance in these tasks governs the degree to which people are sensitive to presentation format (frequencies/probabilities) in multiattribute choice. We therefore investigate whether performance on judgemental tasks (such as decision heuristics and overconfidence tasks) has an influence on people's sensitivity to probability-based or frequency-based presentation of uncertainty in multiattribute choice tasks. The new feature of the present study is that a numerical presentation format is assumed also to play a role in multiattribute choice. More specifically, it is assumed that general performance on decision heuristic tasks and overconfidence tasks governs this influence.

Heuristic processes
Emphasising that performance on judgemental tasks should also matter to reasoning in multiattribute choice tasks, we build on recent dual-process accounts of thinking. Such accounts reveal that there exist individual differences in rational thought on quite a general basis (Evans & Over, 1996; Sloman, 1996; Stanovich & West, 1997, 1998, 2000). These individual differences are particularly present in processes that are characterised as fast, automatic, and largely unconscious in nature (often referred to as System 1 processes). As a common feature, these processes are relatively undemanding of computational capacity. They are also considered rather personal and contextualised in that they take individual goals and pragmatic considerations into account. Based on these characteristics they are often termed heuristic. Deriving from a common process similarity in this respect (confidence vs. frequency judgements), it could be argued that performance on decision heuristics and confidence judgement tasks would have a bearing on performance on multiattribute choice tasks (Kahneman, Slovic, & Tversky, 1982; Kahneman & Tversky, 1996; Tversky & Kahneman, 1974). The reason is that these judgement types share the same heuristic process specificity with the multiattribute choice tasks.
We thus hypothesise: H1. High performers on decision heuristic tasks and on overconfidence judgement tasks will be less influenced by the numerical presentation format (frequencies vs. probabilities) in multiattribute choice than low performers on such tasks.

Analytic processes
The other processes proposed by dual-process theorists (System 2) are considered slower, strategic, and conscious in nature (Evans & Over, 1996; Sloman, 1996; Stanovich & West, 1997, 1998, 2000). These processes pose higher demands on processing capacity and operate in an impersonal and decontextualised way, obeying logic or some other normative system. They are therefore often labelled as analytic. The dual-process theorists agree that System 1 processes have a huge influence on everyday judgement but that abstract and hypothetical System 2 processes can guide our reasoning on specific tasks toward normative answers (Verschueren, Schaeken, & d'Ydewalle, 2004). Hence, high performers on heuristics tasks and overconfidence tasks are also assumed to use more accurate decision strategies in their decision processing (Payne, Bettman, & Johnson, 1993). This means that high performers on judgemental tasks are supposed to use rational decision processes to a greater extent. Such processes display attention patterns that can be tied to systematic weighting and utility maximisation. Thus, it is also hypothesised that: H2. High performers on decision heuristic tasks as well as on overconfidence judgement tasks are expected to use more optimal decision strategies, compensatory in nature (characterised by a willingness to make tradeoffs), in multiattribute choice.

PROCESS TRACING METHOD
In order to test the two proposed hypotheses, a process tracing method was applied using eye tracking equipment. In this connection, attention-based latency time was used as a means of measuring the cognitive processes occurring in multiattribute choice tasks (see Lohse & Johnson, 1996). Compared with other more traditional process tracing methods like verbal protocols and information boards, eye tracking has in recent years received increasing interest in, for instance, the organisational behaviour and marketing research fields.

METHOD
In the present study, participants were initially presented with 24 tasks measuring biases connected to the representativeness, availability, and anchoring and adjustment heuristics (Tversky & Kahneman, 1974). In addition, the tasks were designed to measure biases related to the use of the attribution heuristic (Plous, 1993). Each task was designed so that participants were first instructed to choose one of two options, knowing that one of the answers was correct and the other one incorrect. Subsequently, they were instructed to make a confidence judgement on a half-range scale, indicating how sure they were of having chosen the correct answer (see, for example, Fischhoff, Slovic, & Lichtenstein, 1977; Lichtenstein & Fischhoff, 1977; Oskamp, 1965). The main reason for adding the dimension of confidence to the fulfilment of the heuristic tasks was that it would add critical information about participants' ability to carry out accurate judgements.
Based on the outcome of this test, it was possible to subdivide the participants according to their performance. High performers were characterised by a low degree of biases with regard to the achievement on the heuristic tasks (availability, representativeness, anchoring and adjustment, attribution). Another feature was that they were also quite well calibrated. Low performers, on the other hand, produced a high degree of biases on the heuristic test, and were also characterised by being not so well calibrated.
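To make the notion of calibration concrete, the following is a minimal sketch of the conventional over/underconfidence score for two-alternative items rated on a half-range scale: mean confidence minus percentage correct. It is offered as an illustration only. The function name and the example ratings are hypothetical, and this is not necessarily the index used in the present study.

```python
from statistics import mean


def overconfidence_score(correct, confidence):
    """Mean confidence minus percentage correct, both on a 0-100 scale.

    correct    -- list of 0/1 flags, one per two-alternative item
    confidence -- list of half-range ratings (50 = guess, 100 = certain)
    Positive values indicate overconfidence, negative underconfidence.
    """
    if len(correct) != len(confidence):
        raise ValueError("need one confidence rating per answer")
    if any(not 50 <= c <= 100 for c in confidence):
        raise ValueError("half-range ratings must lie between 50 and 100")
    percent_correct = 100 * mean(correct)
    return mean(confidence) - percent_correct


# Hypothetical participant: two of four items correct, mean confidence 75.
example = overconfidence_score([1, 0, 1, 0], [80, 70, 90, 60])
```

A perfectly calibrated participant would score close to zero on such a measure, because stated confidence would match the proportion of correct answers.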
After having completed this initial paper and pencil block of tasks, participants were asked to perform a number of computerised multiattribute preference tasks. A paper and pencil condition was also conducted for control purposes. All the tasks involved deciding about candidates for a job position, and the participants were asked to take the role of a consulting adviser for a company involved in personnel recruitment. In these tasks, uncertainty was either expressed in terms of probabilities or in terms of frequencies. During the fulfilment of the tasks, both participants' decision processes and their preferential outcomes were registered. A main question of interest was to determine whether participants performing well on the heuristics tasks would also behave more accurately than low performers in the fulfilment of the preference tasks.

Participants
One hundred and ninety-two undergraduates (96 men and 96 women) at Göteborg University participated in the experiment in return for the equivalent of $7. These participants had on prior occasions indicated that they were willing to take part in the experiment. The mean age of the participants was 24.4 years (SD = 3.6), with a range between 18 and 39 years.

Materials
All participants were first requested to complete a paper and pencil test booklet. It consisted of a diagnostic test measuring the degree of biases in different decision heuristics. The test booklet consisted of 24 questions. These questions were adopted from earlier research on heuristics and biases (Kahneman & Tversky, 1973; Tversky & Kahneman, 1973, 1974, 1982). The 24 questions were divided into four decision heuristic categories dependent upon which decision heuristic they measured. Each of the four decision heuristic categories consisted of six questions. The four decision heuristic categories included in the booklet were representativeness, availability, attribution, and anchoring and adjustment. Each question always had one correct and one incorrect answer. Subsequent to having given an answer, participants were requested to give a confidence rating on a scale ranging from 50 (making a guess) to 100 (absolutely sure) as to how sure they were that they had given the correct answer. The participants were then administered either a computerised version of a job recruitment task or a paper and pencil version of the same task. The Eyegaze System (cf. Boe, Selart, & Takemura, 2000; Lohse & Johnson, 1996) eye tracking equipment was used together with the computerised version of the job recruitment task. Each scenario contained information about job candidates expressed with reference to eight different attributes. Four of these attributes concerned profit goals (e.g., improving the company's production, share of the market and profit, and increasing sales) (Cyert & March, 1963). The remaining four attributes were related to environmental goals (e.g., decreasing the company's effluent level, improving the company's working environment, environmental policy, and energy saving). An example of one of the four multiple event scenarios given to participants in the conditions with probability-based information can be found in the Appendix. The eight candidates' ability to attain these goals was for half of the participants expressed on a probability scale ranging from 1% to 100%. The other half of the participants were instead presented with information in terms of how frequent it was that the candidates would achieve the goals, for instance in 3 cases out of 10. All participants were given a total of eight different problems.
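The two presentation formats carry the same numerical information. A minimal sketch of the mapping between them is given here for illustration. The function names, the exact wording of the output, and the default denominator of 10 (taken from the "3 cases out of 10" example) are assumptions rather than details of the actual stimulus material.

```python
from fractions import Fraction


def as_probability(successes, cases):
    """Render a frequency (e.g. 3 out of 10) as a percentage string."""
    if cases <= 0 or not 0 <= successes <= cases:
        raise ValueError("need 0 <= successes <= cases and cases > 0")
    percent = Fraction(successes, cases) * 100
    return f"{float(percent):g}%"


def as_frequency(percent, cases=10):
    """Render a percentage as 'k cases out of n' if it divides evenly."""
    successes = Fraction(percent) * cases / 100
    if successes.denominator != 1:
        raise ValueError(f"{percent}% is not a whole number of {cases} cases")
    return f"{successes.numerator} cases out of {cases}"


probability_label = as_probability(3, 10)   # "30%"
frequency_label = as_frequency(30)          # "3 cases out of 10"
```

Exact rational arithmetic is used so that a percentage which cannot be expressed as a whole number of cases is rejected instead of being silently rounded.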

Design
The design was mixed factorial with frequency-based versus probability-based information as one of the between-subjects factors. A second between-subjects factor consisted of an experimental control condition measuring participants' responsibility for their performance. A within-subjects factor (profit vs. environmental attributes) was used as a control variable with no connection to the hypotheses.

Procedure
Participants attended the experiment individually in laboratory conditions. They were seated in a private booth in front of a computer screen and were requested to first fill out the test that measured the degree of biases in different decision heuristics. After having completed the test, participants were randomly assigned to either an eye-gaze condition or a paper and pencil version of the same task. For the participants in the eye-gaze condition, a calibration procedure was thoroughly performed so that the Eyegaze System could be used in the experiment. This calibration procedure usually took about 2 min for each new participant.
The participants were then given general instructions on how to perform the experiment. They were also instructed that their task in the experiment was to act as a job recruiter and that they had a variety of different organisations in trade and industry as their clients. Participants were told that their task was to make decisions about job candidates (in some cases groups of candidates). In this respect, it was made clear that their decisions were to be based on as thorough a judgement as possible. Participants were also instructed that the different candidates or groups of candidates would differ in the degree to which they could fulfil a certain company's goal. The task in each situation was to select the best four candidates or groups of candidates for a post in an organisation. The participants were also informed that the four candidates they selected would continue to further interviews or analyses.
In the general instruction it was stressed that participants did not have to rank order the chosen alternatives. Moreover, participants were told that it did not matter in which order they were selected. All participants were explicitly instructed to carefully consider all information presented on the screen while making their choices.
After having considered the information they pressed the return button and typed in their choices. Thereafter, they pressed the return button again and another scenario was presented. Participants assigned to the paper and pencil version simply wrote down their choices on the bottom of each page before continuing to the next page. The different environmental or profit attributes as well as the positions of the different groups of candidates were randomised for each scenario. Each participant also received a randomised presentation of the scenarios. After having participated in the sessions, participants were debriefed and paid. The sessions lasted for approximately 50 min.

Measures
Measures of the heuristics tasks. The performance measure was constructed by summing the number of times the respondents answered correctly across the 24 different heuristic questions. The correct answers were coded as 1, and the incorrect as 0. If participants chose the correct answer, the corresponding confidence rating was treated as positive; otherwise it was coded as a negative value. An index measure of confidence was obtained by taking the mean values of the confidence ratings of the same 24 questions. All participants performing above the mean value on the choices (M = 11.20, SD = 2.60) and on the confidence ratings (M = -4.53, SD = 17.47) were coded as high achievers, and those performing below or equal to the mean values were coded as low achievers. In this way, it was possible to create two groups of participants, one consisting of high achievers with high accuracy and calibration, and another group of low achievers with low accuracy and calibration.
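The scoring and grouping rule described above can be sketched as follows. This is a hedged illustration, not the authors' analysis script: the function names and data layout are hypothetical, and because the text does not state how participants who were above the mean on one measure but not the other were treated, the sketch labels them "mixed" rather than guessing.

```python
from statistics import mean


def score_participant(answers):
    """answers: list of (is_correct, confidence) pairs, one per question.

    Returns (number correct, mean signed confidence), where a rating counts
    as positive after a correct answer and as negative otherwise.
    """
    n_correct = sum(1 for ok, _ in answers if ok)
    signed = [conf if ok else -conf for ok, conf in answers]
    return n_correct, mean(signed)


def classify(scores):
    """scores: dict participant id -> (n_correct, mean signed confidence).

    'high'  = above the sample mean on both measures
    'low'   = at or below the sample mean on both measures
    'mixed' = any other profile (assumption: not specified in the text)
    """
    mean_correct = mean(s[0] for s in scores.values())
    mean_conf = mean(s[1] for s in scores.values())
    groups = {}
    for pid, (n, c) in scores.items():
        if n > mean_correct and c > mean_conf:
            groups[pid] = "high"
        elif n <= mean_correct and c <= mean_conf:
            groups[pid] = "low"
        else:
            groups[pid] = "mixed"
    return groups


# Hypothetical scores for three participants.
groups = classify({"a": (14, 20.0), "b": (9, -30.0), "c": (13, -25.0)})
```

Because incorrect answers contribute negative confidence values, a participant who is frequently wrong but very sure receives a strongly negative index, which is consistent with the negative sample mean reported above.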
Measures of the preference task. In both conditions (the process tracing and the paper and pencil condition), the selected alternatives were, for each participant and task, assigned the score of 1 while the rejected alternatives were assigned the score of 0. No reactivity was found for the process tracing condition in the sense that t-tests revealed that the mean response scores for a clear majority of the alternatives did not reliably differ between conditions.
Recoding of the eye fixations data. In order to investigate whether high achievers used more compensatory decision strategies than low achievers, analyses of the Eyegaze recordings were made. Depth of search refers to the total amount of information that is searched (Ford, Schmitt, Schechtman, Hults, & Doherty, 1989; Klayman, 1983; Payne, 1976; Svenson, 1979). In the present experiment, only depth of search was used to examine whether participants used compensatory or noncompensatory decision strategies, due to limitations in the Eyegaze recorder's processing software. Another strategy measure used in the present study was the participants' response latency time (in ms).
A mean value of latency time for each attribute in ms was constructed. Each attribute was measured eight times since there was a total of eight problems for which each of the attributes could be attended to in each problem. A second mean value of the four profit attributes and a third mean value of the four environmental attributes were likewise constructed (both in ms).
The time required to acquire information using eye fixations varies between 200 ms and 400 ms (Card, Moran, & Newell, 1983; Russo, 1978). An index was therefore constructed based upon the mean value (300 ms) of these two endpoints. Any attention to an attribute that lasted less than 300 ms was coded as 0, whereas attention that lasted more than 300 ms was coded as 1. Summing the four profit attributes (now recoded as 1 or 0) that participants had been attending to, a measure indicating the number of attributes that had received attention was created. This ranged from 0 to 4. The same procedure was used to construct another measure for the four environmental attributes (also recoded as 1 or 0). In this way, it was possible to investigate the degree to which participants focused upon profit or environmental attributes.
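The recoding of latencies into an attention count can be sketched as follows. The attribute names and latency values are hypothetical. The treatment of a latency of exactly 300 ms (coded 0 here) is an assumption, since the text only specifies the cases below and above the threshold.

```python
THRESHOLD_MS = 300  # midpoint of the 200-400 ms range cited in the text


def code_attention(latency_ms):
    """Return 1 if an attribute's latency exceeds the threshold, else 0.

    Assumption: exactly 300 ms is coded 0; the text leaves this case open.
    """
    return 1 if latency_ms > THRESHOLD_MS else 0


def attributes_attended(latencies):
    """latencies: dict attribute name -> latency in ms (four attributes).

    Returns how many of the attributes received attention (0 to 4).
    """
    return sum(code_attention(ms) for ms in latencies.values())


# Hypothetical latencies for the four profit attributes of one participant.
profit = {"production": 420, "market_share": 150, "profit": 310, "sales": 300}
profit_count = attributes_attended(profit)
```

Applying the same function to the four environmental attributes gives the second count, so that the two values can be compared within each participant.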

RESULTS

The presentation mode of uncertainty
High achievers were expected to be less affected than low achievers by whether the information was probability or frequency based. A 2 (group: high achievers vs. low achievers) x 2 (condition: probability-based vs. frequency-based information) x 2 (attributes: profit vs. environmental attributes) mixed ANOVA with repeated measures on the last factor performed on the number of attributes yielded a significant main effect of attributes, F(1, 64) = 8.63, p < .01, MSE = 6.55. This effect again confirmed that participants searched for more information concerning profit goals than concerning environmental goals. A main effect of group was also found, F(1, 64) = 5.04, p < .05, MSE = 15.26, revealing that high achievers attended to the attributes reliably more than low achievers. Separate Bonferroni-corrected t-tests at p = .05 performed on the high achievers revealed that no significant differences existed between the probability- or frequency-based information conditions regarding the time spent on searching for information concerning the two types of attributes or concerning the amount of attention paid to these. There were no significant differences between high achievers in the probability- or frequency-based conditions for any of the single profit or environmental attributes. In line with this, low achievers showed no significant differences in the time spent on the environmental or on the profit attributes. However, whether or not low achievers had been searching for information about an attribute was found to have some effects on the time used. Additional separate Bonferroni-corrected t-tests at p = .05 revealed that low achievers' search for information concerning some of the attributes (the profit attribute of improved share of the market, and the environmental attributes of improved energy saving, decreased effluent level, and improved working environment) differed in the frequency- and probability-based information conditions. Evidently, in the majority of these cases low achievers revealed a reliably higher value for searches of probability-based information, but one effect was in the other direction. The results therefore suggest that qualitative differences may exist in the cognitive processes applied by low performers depending on the nature of the attribute/dimension. Table 1 shows the mean percentages of low achievers' searches for information for the abovementioned attributes. The other environmental and profit attributes, as well as the number of environmental or profit attributes that were searched for, revealed no significant differences between the two conditions.
Bonferroni-corrected t-tests at p = .05 on the preference data confirmed Hypothesis 1. As shown in Table 2, low achievers differed significantly between the frequency- and probability-based information groups. Such a difference was not observed for the high achievers.
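For readers unfamiliar with the correction applied to the follow-up tests, a minimal sketch of the Bonferroni procedure is given here. The function name and the p-values in the example are hypothetical, and the size of each family of tests in the present analyses is not stated in the text.

```python
def bonferroni(p_values, alpha=0.05):
    """Flag which tests stay significant after Bonferroni correction.

    Each raw p-value is compared with alpha divided by the number of tests,
    which keeps the family-wise error rate at or below alpha.
    """
    m = len(p_values)
    if m == 0:
        return []
    threshold = alpha / m
    return [p <= threshold for p in p_values]


# Hypothetical family of three follow-up tests.
decisions = bonferroni([0.01, 0.02, 0.04])
```

With three tests and a family-wise level of .05, each individual comparison is evaluated against .05 / 3, so only the first hypothetical p-value would remain significant.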

The decision heuristic test
High achievers were expected to use more compensatory decision strategies when processing information than low achievers. Table 3 shows the mean numbers of attributes attended to by type of goal for high and low achievers. As can be seen, high achievers searched for the information by using more attributes than did low achievers. A 2 (group: high achievers vs. low achievers) x 2 (attributes: profit vs. environmental attributes) mixed ANOVA with repeated measures on the last factor performed on the number of attributes yielded a significant main effect of attributes, F(1, 55) = 4.97, p < .05, MSE = 3J0. This effect revealed that participants searched for more information concerning profit goals than environmental goals. A main effect of group was also found, F(1, 55) = 5.30, p < .05, MSE = 530. In line with the expectations this effect showed that high achievers attended to more attributes than did low achievers. Separate Bonferroni-corrected t-tests at p = .05 revealed that this difference was reliable for the environmental attributes, t(66) = 2.67, p < .05, but not for the profit attributes. Table 4 reveals the mean values (in ms) for time spent on the attributes by type of goal for high and low achievers. Another 2 (group: high achievers vs. low achievers) x 2 (attributes: profit vs. environmental attributes) mixed ANOVA with repeated measures on the last factor performed on the time spent on searching the attributes resulted in a significant main effect of attributes, F(1, 55) = 5.52, p < .05, MSE = 9.03. Again it was revealed that participants attended more to profit attributes than to environmental attributes. However, the main effect of group did not reach significance, F(1, 55) = 3.30, p = .074, MSE = 10.03. Separate Bonferroni-corrected t-tests at p = .05 showed that participants searched the profit attribute information significantly more often than they searched the environmental attribute information, t(77) = 3.03, p < .01. Furthermore, high achievers were found to reveal a tendency to attend more to the environmental attributes as compared to low achievers, t(66) = 2.67, p = .069. No significant differences were found for the time attended to the profit attributes.

DISCUSSION
The results revealed that participants performing well on heuristics tasks (availability, representativeness, anchoring and adjustment, attribution) and on confidence judgement tasks also behaved quite accurately in the fulfilment of preference tasks.

Hypothesis 1
For instance, it was revealed that the high performers were not as influenced as the low performers by whether uncertainty was presented in terms of probabilities or in terms of frequencies. Moreover, the high performers spent a more equal amount of time between the two conditions (probability/frequency) searching for information, compared with the low performers. Also, the high performers were not to the same extent biased by the conditions (probability/frequency) in their preferences for any of the groups of alternatives compared with the low performers. This seems particularly to have been the case with people who were performing below the average on decision heuristic and confidence judgement tasks. All these results support recent dual-process accounts of thinking revealing that there exist individual differences in rational thought on a quite general basis (Evans & Over, 1996; Sloman, 1996; Stanovich & West, 1997, 1998, 2000). They also give some support to the frequentist view (Gigerenzer, 1991, 1993, 1994) in the sense that a frequency format not only matters for performance in decision heuristic tasks but also for confidence judgement tasks. Additionally, the fact that performance on decision heuristics and confidence judgement tasks had a bearing on performance on multiattribute choice tasks corroborates the validity of previous research on decision heuristics (Kahneman et al., 1982; Kahneman & Tversky, 1996; Tversky & Kahneman, 1974). It is hereby also suggested that heuristic thinking may be able to explain the different information search behaviour observed between the two groups with regard to presentation format.

Hypothesis 2
Furthermore, it was shown that high performers invested more time in searching for information than did low performers (Evans & Over, 1996; Sloman, 1996; Stanovich & West, 1997, 1998, 2000). It was also established that high performers investigated more attributes in their search for information in comparison with low performers. All these results suggest that the ability to reason in a logically and statistically correct way, combined with good self-calibration, clearly has an impact on the extent to which people use optimal decision strategies in multiattribute decision situations (Verschueren et al., 2004). The findings therefore add credit to the opinion that accurate decision behaviour may be determined by choice of decision strategies (the weighted additive strategy being used as a normative yardstick for accurate decision behaviour; see Payne et al., 1993). Hereby, it is suggested that analytical thinking also plays an important role for the observed differences between the two groups.

Limitations
In the present study, we have, in line with Payne et al. (1993), focused on preferences, such as between hypothetical job candidates, rather than on inferences about the real world, such as which soccer team will win or which of two cities is larger. This procedure has been criticised by Gigerenzer, Todd, and the ABC Research Group (1999). They claim that it is difficult with such a procedure to measure the accuracy of strategies in terms of their ability to predict real-world outcomes. Instead, such a procedure restricts itself to measuring accuracy by how closely a strategy can match the predictions of a weighted additive rule, which is the traditional gold standard for rational preferences. According to Gigerenzer et al., fast and frugal heuristics may actually sometimes outperform strategies built on systematic weighting and utility maximisation. It should also be mentioned that Gigerenzer (1991, 1993, 1994) stresses the use of frequencies in a natural sampling framework (i.e., "natural frequencies"), which would entail the frequencies of candidates that achieve particular goals, nested within frequencies of candidates that achieve other particular goals.
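The idea of measuring accuracy by agreement with a weighted additive rule can be illustrated with a short sketch. The candidate names, attribute values, and weights are hypothetical, and the lexicographic rule is included only as a generic example of a noncompensatory strategy; it is not a strategy that was fitted or tested in the present study.

```python
def weighted_additive(candidates, weights, k=4):
    """Pick the k candidates with the highest weighted sum of attribute values.

    candidates: dict name -> list of attribute values (e.g. percentages)
    weights:    list of importance weights, one per attribute
    """
    def utility(values):
        return sum(w * v for w, v in zip(weights, values))

    ranked = sorted(candidates, key=lambda name: utility(candidates[name]),
                    reverse=True)
    return set(ranked[:k])


def lexicographic(candidates, weights, k=4):
    """Noncompensatory rule: rank by the most important attribute,
    consulting the next attribute only to break ties."""
    order = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    ranked = sorted(candidates,
                    key=lambda name: [candidates[name][i] for i in order],
                    reverse=True)
    return set(ranked[:k])


def agreement(chosen, benchmark):
    """Share of the benchmark's choices that the other strategy also made."""
    return len(chosen & benchmark) / len(benchmark)


# Hypothetical data: four candidate groups, two attributes, two to select.
data = {"A": [90, 10], "B": [80, 80], "C": [70, 90], "D": [20, 30]}
w = [0.6, 0.4]
wadd_choice = weighted_additive(data, w, k=2)
lex_choice = lexicographic(data, w, k=2)
match = agreement(lex_choice, wadd_choice)
```

In this constructed example the lexicographic rule selects candidate A on the strength of one attribute alone, whereas the weighted additive rule lets A's weak second attribute count against it. This is the kind of tradeoff that distinguishes compensatory from noncompensatory processing.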
The results of the present study are still generally supportive of the Gigerenzer et al. paradigm, because they propose that frequencies are more generally a format with which the mind is better able to work.On the other hand, it may be inferred that there are validity problems tied to the ecological perspective on accuracy presented by Gigerenzer et al. (1999).For instance, it may be the case that the most well-known stocks are those best at predicting financial success on the market in good times, but that this fact may be altered in times of market crisis.

Practical implications
It is evident to Gigerenzer (2002) that statistics are often presented in highly confusing ways, but that our difficulty in thinking about numbers can easily be overcome. With a few helpful techniques we can learn to uncloud our minds, demand helpfully presented information and turn ignorance into insight. In this respect, the choice of presentation format plays an extremely important role. It has been revealed in, for instance, both medical diagnostics and legal practice, that the way vital information is presented has a bearing on judgements and decisions. Here, the knowledge that presentation format (frequencies/probabilities) has different impact on decision-making behaviour for high-performing vs. low-performing rational-thinking groups is a new important fact for practitioners. Also the fact that performance on rational thinking has an impact on information search behaviour in multiattribute choice situations is a result that is highly important for real-world applications.

Future research
A question that remains to be addressed in future research is whether high performers are doing better on the nonfrequency tasks because (a) they are using "System 2" procedures and thus are directly making normatively correct inferences, or (b) they are better at mentally converting the stimuli to alternative formats (e.g., frequency formats) and then applying appropriate "System 1" procedures.

Original manuscript received April 2004
Revised manuscript received December 2004
PrEview proof published online May 2005

Table 4. Mean values of time spent on the attributes (in ms) by type of goal for high and low achievers

APPENDIX
An example of a multiple events information scenario given to participants in the select conditions with probability-based information

Imagine that you as an outside consultant are going to choose among eight different groups of candidates applying for a post in an information division in a chemical company. The eight different groups of candidates differ in the degree to which they can be expected to promote certain aims that the company has. The candidate groups' ability to achieve these aims is expressed on a scale ranging from 1 to 100%. We want you to select the four best groups of candidates who will progress to an interview. As an outside consultant you do not have to justify your decision to others and the identity of decision-makers in the company will remain anonymous.

Groups of candidates