?:abstract
|
-
Search engine users rarely express an information need using the same query, and small differences in queries can lead to very different result sets. These user query variations have been exploited in past TREC CORE tracks to contribute diverse, highly-effective runs in offline evaluation campaigns with the goal of producing reusable test collections. In this paper, we document the query fusion runs submitted to the first and second round of TREC COVID, using ten queries per topic created by the first author. In our analysis, we focus primarily on the effects of having our second priority run omitted from the judgment pool. This run is of particular interest, as it surfaced a number of relevant documents that were not judged until later rounds of the task. If the additional judgments were included in the first round, the performance of this run increased by 35 rank positions when using RBP p=0.5, highlighting the importance of judgment depth and coverage in assessment tasks.
|