Property | Value |
?:abstract
|
-
Motivation: The adaptive B-cell response is driven by the expansion, somatic hypermutation, and selection of B-cell clones. The number, size, and sequence diversity of these clones are essential characteristics of B-cell populations. Identifying clones in B-cell populations is central to several repertoire studies, such as statistical analysis, repertoire comparisons, and clonal tracking. Several clonal grouping methods have been developed to group sequences from B-cell immune repertoires. Such methods have been principally evaluated on simulated benchmarks, since experimental data containing clonally related sequences can be difficult to obtain. However, experimental data may contain multiple sources of sequence variability that hamper their artificial reproduction. Therefore, the generation of high-precision ground truth data that preserves real repertoire distributions is necessary to accurately evaluate clonal grouping methods. Results: We proposed a novel methodology to generate ground truth data sets from real repertoires. Our procedure requires V(D)J annotations to obtain the initial clones, and iteratively applies an optimisation step that moves sequences among clones to increase their cohesion and separation. We first showed that our method accurately identified clonally related sequences in simulated repertoires with higher mutation rates. Next, we demonstrated that real benchmarks generated by our method constitute a challenge for clonal grouping methods, by comparing the performance of a widely used clonal grouping algorithm on several generated benchmarks. Our method can be used to generate a large number of benchmarks and contribute to the construction of more accurate clonal grouping tools. Availability and implementation: The source code and generated data sets are freely available at github.com/NikaAb/BCR_GTG
|
is
?:annotates
of
|
|
?:creator
|
|
?:doi
|
-
10.1101/2020.11.30.404046
|
?:doi
|
|
?:externalLink
|
|
?:journal
|
|
?:license
|
|
?:pdf_json_files
|
-
document_parses/pdf_json/8f2df6b7e8a611cb7d1631960dfd96fed2f62029.json
|
?:publication_isRelatedTo_Disease
|
|
?:sha_id
|
|
?:source
|
|
?:title
|
-
Automatic generation of ground truth data for the evaluation of clonal grouping methods in B-cell populations
|
?:type
|
|
?:year
|
|