Abstract
Stochastic forecasts in complex environments can benefit from combining the estimates of large groups of forecasters (“judges”). But aggregating multiple opinions faces several challenges. First, human judges are notoriously incoherent when their forecasts involve logically complex events. Second, individual judges may have specialized knowledge, so different judges may produce forecasts for different events. Third, the credibility of individual judges might vary, and one would like to pay greater attention to more trustworthy forecasts. These considerations limit the value of simple aggregation methods like linear averaging. In this paper, a new algorithm is proposed for combining probabilistic assessments from a large pool of judges. Two measures of a judge’s likely credibility are introduced and used in the algorithm to determine the judge’s weight in aggregation. The algorithm was tested on a data set of nearly half a million probability estimates, from about 16,000 judges, of events related to the 2008 U.S. presidential election.