rnanorm.TMM
- class rnanorm.TMM(m_trim=0.3, a_trim=0.05)
Trimmed mean of M-values (TMM) normalization.
In an RNA-seq experiment, a small fraction of genes is sometimes extremely overexpressed in some samples but not in others. This can artificially inflate the library size of those samples and therefore (after library size normalization) cause the remaining genes to be considered under-sampled in them. Unless this effect is adjusted for, the remaining genes may falsely appear to be down-regulated in those samples. TMM is one approach to correcting for such imbalance. For more explanation, see the edgeR documentation.
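The effect can be seen with plain library-size normalization on two made-up samples. The sketch below is illustrative only and is not part of rnanorm; the helper name `cpm` and the sample names are assumptions. The two samples have identical counts for the first four genes and differ only in one strongly overexpressed gene.

```python
def cpm(counts):
    """Counts per million using the plain library size (sum of counts)."""
    library_size = sum(counts)
    return [c * 1e6 / library_size for c in counts]


# Identical counts for the first four genes; only the last gene
# is strongly overexpressed in sample_b.
sample_a = [200, 300, 500, 2000, 7000]
sample_b = [200, 300, 500, 2000, 17000]

cpm_a = cpm(sample_a)
cpm_b = cpm(sample_b)

# The first gene has the same raw count in both samples, yet its CPM
# is halved in sample_b because the library size doubled.
print(cpm_a[0])  # 20000.0
print(cpm_b[0])  # 10000.0
```

Without a correction, the first four genes would look two-fold down-regulated in `sample_b` even though their raw counts are unchanged.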
The normalization procedure is described in Robinson & Oshlack (2010). In short:
- Use raw counts
- Define the reference sample (self.ref_)
- Compute scaling factors
  - Compute M values, filter by double trimming with m_trim
  - Compute A values, filter by double trimming with a_trim
  - Compute factors as the weighted mean of the M values that pass both filters
  - Factors = 2 ** factors
  - Rescale factors so that their geometric mean is 1
- “Adjusted library size” = library size * normalization factors
- Compute CPM normalization with “Adjusted library size”
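The steps above can be sketched in plain Python. This is a simplified illustration under stated assumptions, not rnanorm's or edgeR's code: the function names (`average_ranks`, `tmm_factor`, `tmm_factors`, `tmm_cpm`) are hypothetical, the reference sample is simply passed in by the caller as `ref_index`, genes with a zero count in either sample are skipped, and the weights are taken as the inverse of an approximate variance of each M value.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def tmm_factor(obs, ref, m_trim=0.3, a_trim=0.05):
    """Raw (not yet rescaled) factor of one sample against the reference."""
    n_obs, n_ref = sum(obs), sum(ref)
    m_values, a_values, variances = [], [], []
    for o, r in zip(obs, ref):
        if o == 0 or r == 0:
            continue  # log ratios are undefined for zero counts
        p_obs, p_ref = o / n_obs, r / n_ref
        m_values.append(math.log2(p_obs / p_ref))                  # M: log fold change
        a_values.append((math.log2(p_obs) + math.log2(p_ref)) / 2)  # A: mean log abundance
        variances.append((n_obs - o) / n_obs / o + (n_ref - r) / n_ref / r)
    # Assumption: samples with (nearly) identical proportions get factor 1.
    if not m_values or max(abs(m) for m in m_values) < 1e-6:
        return 1.0
    n = len(m_values)
    lo_m = math.floor(n * m_trim) + 1
    hi_m = n + 1 - lo_m
    lo_a = math.floor(n * a_trim) + 1
    hi_a = n + 1 - lo_a
    m_rank, a_rank = average_ranks(m_values), average_ranks(a_values)
    # Double trimming: drop both tails of M and both tails of A.
    keep = [i for i in range(n)
            if lo_m <= m_rank[i] <= hi_m and lo_a <= a_rank[i] <= hi_a]
    if not keep:
        return 1.0
    weight_sum = sum(1 / variances[i] for i in keep)
    log_factor = sum(m_values[i] / variances[i] for i in keep) / weight_sum
    return 2 ** log_factor


def tmm_factors(counts, ref_index=0, m_trim=0.3, a_trim=0.05):
    """Factors for all samples, rescaled so their geometric mean is 1."""
    raw = [tmm_factor(s, counts[ref_index], m_trim, a_trim) for s in counts]
    geo_mean = math.exp(sum(math.log(f) for f in raw) / len(raw))
    return [f / geo_mean for f in raw]


def tmm_cpm(counts, ref_index=0):
    """CPM computed with the adjusted library size of each sample."""
    result = []
    for sample, factor in zip(counts, tmm_factors(counts, ref_index)):
        adjusted = sum(sample) * factor
        result.append([c * 1e6 / adjusted for c in sample])
    return result


toy = [
    [200, 300, 500, 2000, 7000],
    [400, 600, 1000, 4000, 14000],
    [200, 300, 500, 2000, 17000],
    [200, 300, 500, 2000, 2000],
]
print(tmm_factors(toy))  # approximately [1.0, 1.0, 0.5, 2.0]
```

Passing the reference explicitly keeps the sketch short; how the reference is actually selected is not covered here.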
This implementation is based on edgeR’s, and its results have been validated to agree with edgeR’s to at least 10 decimal places.
- Parameters:
Examples
>>> from rnanorm.datasets import load_toy_data
>>> from rnanorm import TMM
>>> X = load_toy_data().exp
>>> X
          Gene_1  Gene_2  Gene_3  Gene_4  Gene_5
Sample_1     200     300     500    2000    7000
Sample_2     400     600    1000    4000   14000
Sample_3     200     300     500    2000   17000
Sample_4     200     300     500    2000    2000
>>> TMM().set_output(transform="pandas").fit_transform(X)
           Gene_1   Gene_2   Gene_3    Gene_4     Gene_5
Sample_1  20000.0  30000.0  50000.0  200000.0   700000.0
Sample_2  20000.0  30000.0  50000.0  200000.0   700000.0
Sample_3  20000.0  30000.0  50000.0  200000.0  1700000.0
Sample_4  20000.0  30000.0  50000.0  200000.0   200000.0
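The normalization factors implied by this output can be recovered by inverting the CPM formula. The sketch below uses only the numbers shown in the example and plain Python; it does not call rnanorm, and the variable names are illustrative.

```python
import math

# Raw counts copied from the example (one row per sample).
raw = [
    [200, 300, 500, 2000, 7000],
    [400, 600, 1000, 4000, 14000],
    [200, 300, 500, 2000, 17000],
    [200, 300, 500, 2000, 2000],
]
# Normalized Gene_1 values copied from the example output.
normalized_gene_1 = [20000.0, 20000.0, 20000.0, 20000.0]

factors = []
for counts, cpm_gene_1 in zip(raw, normalized_gene_1):
    library_size = sum(counts)
    # CPM = count / adjusted library size * 1e6, solved for the adjusted size.
    adjusted_size = counts[0] * 1e6 / cpm_gene_1
    factors.append(adjusted_size / library_size)

geometric_mean = math.exp(sum(math.log(f) for f in factors) / len(factors))

print(factors)  # [1.0, 1.0, 0.5, 2.0]
print(geometric_mean)
```

The geometric mean of the recovered factors is 1 up to floating-point rounding, consistent with the rescaling step of the procedure.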
Methods
__init__([m_trim, a_trim])               Initialize class.
fit(X[, y])                              Fit.
fit_transform(X[, y])                    Fit to data, then transform it.
get_feature_names_out([input_features])  Get output feature names for transformation.
get_metadata_routing()                   Get metadata routing of this object.
get_norm_factors(X)                      Get TMM normalization factors (rescaled so that their geometric mean is 1).
get_params([deep])                       Get parameters for this estimator.
set_output(*[, transform])               Set output container.
set_params(**params)                     Set the parameters of this estimator.
transform(X)                             Transform.