inferAlleleClusters - Allele similarity cluster
Description¶
A wrapper function to infer the allele clusters. See details for cluster inference
Usage¶
inferAlleleClusters(
germline_set,
trim_3prime_side = 318,
mask_5prime_side = 0,
family_threshold = 75,
allele_cluster_threshold = 95
)
Arguments¶
- germline_set
- Either a character vector of strings representing Ig sequence alleles, or a path to to the germline set file (must be gapped by IMGT scheme for optimal results).
- trim_3prime_side
- To which nucleotide position to trim the sequences. Default is 318; NULL will take the entire sequence length.
- mask_5prime_side
- Mimic short sequence libraries, gets the length of nucleotides to mask from the 5’ side, the staring position. Default is 0.
- family_threshold
- The similarity threshold for the family level. Default is 75.
- allele_cluster_threshold
- The similarity threshold for the allele cluster level. Default is 95.
Value¶
An object of type GermlineCluster that includes the following slots:
Details¶
The distance between pairs of the alleles germline set sequences is calculated, then the alleles are clustered based on two similarity thresholds. One for the family cluster and the other for the allele cluster. Then the new allele cluster names are generated and the germline set sequences are renamed and duplicated alleles are removed.
The allele cluster names are by the following scheme: IGHVF1-G1*01 - IGH = chain, V = region, F1 = family cluster numbering, G1 - allele cluster numbering, and 01 = allele numbering (given by clustering order, no connection to the expression)
To plot the allele clusters dendrogram use the plot
function on the GermlineCluster object
Slots¶
germlineSet
-
- A character vector with the modified germline set (3’ trimming and 5’ masking).
alleleClusterSet
-
- A character vector of renamed input germline set to the ASC name scheme (Without 3’ and 5’ modifications).
alleleClusterTable
-
- A data.frame of the allele similarity cluster with the new names and the default thresholds.
threshold
-
- A list of the input family and allele cluster similarity thresholds.
hclustAlleleCluster
-
- An hclust object of the germline set hierarchical clustering,
Examples¶
### Not run:
# load the initial germline set
#
# data(HVGERM)
#
# germline <- HVGERM
#
# asc <- inferAlleleClusters(germline)
#
# ## plotting the clusters
#
# plot(asc)
See also¶
By using the plot function on the returned object, a colorful visualization of the allele clusters dendrogram and threshold is received