Title: | Paired Mass Distance Analysis for GC/LC-MS Based Non-Targeted Analysis and Reactomics Analysis |
---|---|
Description: | Paired mass distance (PMD) analysis proposed in Yu, Olkowicz and Pawliszyn (2018) <doi:10.1016/j.aca.2018.10.062> and PMD-based reactomics analysis proposed in Yu and Petrick (2020) <doi:10.1038/s42004-020-00403-z> for gas/liquid chromatography–mass spectrometry (GC/LC-MS) based non-targeted analysis. PMD analysis includes the GlobalStd algorithm and structure/reaction directed analysis. The GlobalStd algorithm finds independent peaks in m/z-retention time profiles based on retention time hierarchical cluster analysis and frequency analysis of paired mass distances within retention time groups. Structure directed analysis can be used to find potential relationships among those independent peaks in different retention time groups based on the frequency of paired mass distances. Reactomics analysis can also be performed to build PMD networks, assign sources and discover biomarker reactions. GUIs for PMD analysis are also included as 'shiny' applications. |
Authors: | Miao YU [aut, cre] |
Maintainer: | Miao YU <[email protected]> |
License: | GPL-2 |
Version: | 0.2.7 |
Built: | 2025-02-14 06:14:41 UTC |
Source: | https://github.com/yufree/pmd |
Perform correlation directed analysis for peaks list.
getcda(list, corcutoff = 0.9, rtcutoff = 10, accuracy = 4)
list |
a list with mzrt profile |
corcutoff |
cutoff of the correlation coefficient, default 0.9 |
rtcutoff |
cutoff of the distances in retention time hierarchical clustering analysis, default 10 |
accuracy |
measured mass or mass to charge ratio in digits, default 4 |
list with correlation directed analysis results
data(spmeinvivo)
cluster <- getcorcluster(spmeinvivo)
cbp <- enviGCMS::getfilter(cluster, rowindex = cluster$stdmassindex2)
cda <- getcda(cbp)
Get reaction chain for specific mass to charge ratio
getchain(list, diff, mass, digits = 2, accuracy = 4, rtcutoff = 10, corcutoff = 0.6, ppm = 25)
list |
a list with mzrt profile |
diff |
paired mass distance(s) of interest |
mass |
a specific mass for a known compound or a vector of masses. A formula for a compound can also be supplied |
digits |
mass or mass to charge ratio accuracy for pmd, default 2 |
accuracy |
measured mass or mass to charge ratio in digits, default 4 |
rtcutoff |
cutoff of the distances in retention time hierarchical clustering analysis, default 10 |
corcutoff |
cutoff of the correlation coefficient, default 0.6 |
ppm |
mass accuracy in ppm; all peaks within this tolerance of the seed mass or formula are used, default 25 |
a list with mzrt profile and reaction chain dataframe
data(spmeinvivo)
# check metabolites of C18H39NO
pmd <- getchain(spmeinvivo, diff = c(2.02, 14.02, 15.99), mass = 286.3101)
# remove the retention time for mass only data
spmeinvivo$rt <- NULL
pmd <- getchain(spmeinvivo, diff = c(2.02, 14.02, 15.99), mass = 286.3101)
Get Pseudo-Spectrum as peaks cluster based on pmd analysis.
getcluster(list, corcutoff = NULL, accuracy = 4)
list |
a list from getstd function |
corcutoff |
cutoff of the correlation coefficient, default NULL |
accuracy |
measured mass or mass to charge ratio in digits, default 4 |
list with Pseudo-Spectrum index
data(spmeinvivo)
re <- getpaired(spmeinvivo)
re <- getstd(re)
cluster <- getcluster(re)
Get Pseudo-Spectrum as peaks cluster based on correlation analysis.
getcorcluster(list, corcutoff = 0.9, rtcutoff = 10, accuracy = 4)
list |
a list with peaks intensity |
corcutoff |
cutoff of the correlation coefficient, default 0.9 |
rtcutoff |
cutoff of the distances in cluster, default 10 |
accuracy |
measured mass or mass to charge ratio in digits, default 4 |
list with Pseudo-Spectrum index
data(spmeinvivo)
cluster <- getcorcluster(spmeinvivo)
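The idea behind a correlation-based pseudo-spectrum can be sketched in base R: peaks are first grouped by hierarchical clustering on retention time (cut at `rtcutoff`), and co-eluting peaks whose intensities correlate above `corcutoff` across samples are treated as related. This is a minimal illustration on made-up data, not the package's implementation; all variable names are hypothetical.

```r
# Toy peak table: 6 peaks measured in 5 samples (all values are made up)
rt <- c(100, 101, 102, 300, 301, 500)
base1 <- c(10, 20, 30, 40, 50)
base2 <- c(50, 10, 40, 20, 30)
data <- rbind(
  base1,          # peak 1
  base1 * 2,      # peak 2: same profile as peak 1
  base2,          # peak 3: co-elutes with peaks 1-2 but has a different profile
  base2 * 3,      # peak 4
  base2 * 3 + 1,  # peak 5: same profile as peak 4
  base1           # peak 6: elutes alone
)

rtcutoff <- 10
corcutoff <- 0.9

# Step 1: retention time groups from hierarchical clustering
rtg <- cutree(hclust(dist(rt)), h = rtcutoff)

# Step 2: flag peak pairs in the same group with a high correlation
cormat <- cor(t(data))
same_group <- outer(rtg, rtg, "==")
linked <- same_group & cormat >= corcutoff & upper.tri(cormat)
pairs <- which(linked, arr.ind = TRUE)
print(pairs)
```

Peak 6 correlates perfectly with peaks 1 and 2 but is not linked to them, because it falls in a different retention time group.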
Read in an MSP file as a list for MS/MS annotation
getms2pmd(file, digits = 2, icf = 10)
file |
the path to your MSP file |
digits |
mass or mass to charge ratio accuracy for pmd, default 2 |
icf |
intensity cutoff in percent, default 10 |
a list with MSP information for MS/MS annotation
Read in an MSP file as a list for EI-MS annotation
getmspmd(file, digits = 2, icf = 10)
file |
the path to your MSP file |
digits |
mass or mass to charge ratio accuracy for pmd, default 2 |
icf |
intensity cutoff in percent, default 10 |
a list with MSP information for EI-MS annotation
Filter ions/peaks based on retention time hierarchical clustering, paired mass distances(PMD) and PMD frequency analysis.
getpaired(list, rtcutoff = 10, ng = NULL, digits = 2, accuracy = 4, mdrange = NULL)
list |
a list with mzrt profile |
rtcutoff |
cutoff of the distances in retention time hierarchical clustering analysis, default 10 |
ng |
cutoff of the number of retention time groups for a global PMD. If ng = NULL, 20 percent of the RT clusters will be used as ng. Default NULL. |
digits |
mass or mass to charge ratio accuracy for pmd, default 2 |
accuracy |
measured mass or mass to charge ratio in digits, default 4 |
mdrange |
mass defect range to ignore. Default NULL; c(0.25, 0.9) can be used to retain the possible reaction related paired masses |
list with indices of tentative isotope, multiply charged ion, adduct, and neutral loss peaks, plus retention time clusters.
data(spmeinvivo)
pmd <- getpaired(spmeinvivo)
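The filtering described for getpaired combines retention time clustering with a frequency count of paired mass distances. The base R sketch below illustrates that idea on made-up values: it counts in how many retention time groups each rounded PMD occurs and keeps those seen in at least `ng` groups. It is an assumption-laden illustration, not the package's code, and the variable names are hypothetical.

```r
# Toy m/z values and retention times (made up for illustration)
mz <- c(100.0000, 101.0034, 121.9820, 250.0000, 251.0034, 400.0000)
rt <- c(50, 51, 50, 200, 201, 400)
rtcutoff <- 10
digits <- 2
ng <- 2

# Retention time groups via hierarchical clustering
rtg <- cutree(hclust(dist(rt)), h = rtcutoff)

# Pairwise mass distances within each retention time group
pmd_by_group <- lapply(split(mz, rtg), function(m) {
  d <- as.vector(dist(m))
  unique(round(d, digits))  # count each PMD once per group
})

# Number of retention time groups in which each PMD occurs
freq <- table(unlist(pmd_by_group))

# PMDs seen in at least ng groups are candidate "global" PMDs
global_pmd <- as.numeric(names(freq)[freq >= ng])
print(freq)
print(global_pmd)
```

Here only the distance shared by the first two retention time groups survives the `ng` cutoff; distances seen in a single group are dropped.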
Get pmd for specific reaction
getpmd(list, pmd, rtcutoff = 10, corcutoff = NULL, digits = 2, accuracy = 4)
list |
a list with mzrt profile |
pmd |
a specific paired mass distance or a vector of pmds |
rtcutoff |
cutoff of the distances in retention time hierarchical clustering analysis, default 10 |
corcutoff |
cutoff of the correlation coefficient, default NULL |
digits |
mass or mass to charge ratio accuracy for pmd, default 2 |
accuracy |
measured mass or mass to charge ratio in digits, default 4 |
list with paired peaks for specific pmd or pmds.
getpaired, getstd, getsda, getrda
data(spmeinvivo)
pmd <- getpmd(spmeinvivo, pmd = 15.99)
Get pmd details for a specific reaction after the removal of isotopologues.
getpmddf(mz, group = NULL, pmd = NULL, digits = 2, mdrange = c(0.25, 0.9))
mz |
a vector of mass to charge ratio. |
group |
mass to charge ratio group from either retention time or mass spectrometry imaging segmentation. |
pmd |
a specific paired mass distance or a vector of pmds |
digits |
mass or mass to charge ratio accuracy for pmd, default 2. |
mdrange |
mass defect range to ignore. Default c(0.25,0.9) to retain the possible reaction related paired mass. |
dataframe with paired peaks for specific pmd or pmds. When group is provided, a column named net is generated to show whether a given pmd is local (within the same group) or global (across groups).
getpaired, getstd, getsda, getrda
data(spmeinvivo)
pmddf <- getpmddf(spmeinvivo$mz, pmd = 15.99)
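To make the pairing step concrete, here is a minimal base R sketch that finds all mass pairs separated by a given PMD and labels each pair as local or global from a grouping vector. The masses, the group labels and the output column names (`ms1`, `ms2`) are assumptions for illustration; the sketch skips isotopologue removal and mass defect filtering.

```r
# Toy masses and group labels (made up for illustration)
mz <- c(180.0634, 196.0583, 212.0532, 300.1000)
group <- c(1, 1, 2, 2)
pmd <- 15.99
digits <- 2

# Pairwise distances between all masses, rounded to the PMD precision
d <- round(abs(outer(mz, mz, "-")), digits)

# Keep each pair once (upper triangle); compare with a small tolerance
hit <- which(abs(d - pmd) < 1e-8 & upper.tri(d), arr.ind = TRUE)

pmddf <- data.frame(
  ms1 = mz[hit[, "row"]],
  ms2 = mz[hit[, "col"]],
  diff = pmd,
  # local: both masses in the same group; global: across groups
  net = ifelse(group[hit[, "row"]] == group[hit[, "col"]], "local", "global")
)
print(pmddf)
```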
Link a positive mode peak list with a negative mode peak list by pmd.
getposneg(pos, neg, pmd = 2.02, digits = 2)
pos |
a list with mzrt profile collected from positive mode. |
neg |
a list with mzrt profile collected from negative mode. |
pmd |
paired mass distance(s) used to link the two modes, numeric or numeric vector, default 2.02 |
digits |
mass or mass to charge ratio accuracy for pmd, default 2 |
dataframe with filtered positive and negative peak list
Perform structure/reaction directed analysis for mass only.
getrda(mz, pmd = NULL, freqcutoff = 10, digits = 3, top = 20, formula = NULL, mdrange = c(0.25, 0.9), verbose = FALSE)
mz |
numeric vector of independent masses or mass to charge ratios. Mass to charge ratios from the GlobalStd algorithm are suggested. Isomers are excluded automatically |
pmd |
a specific paired mass distance or a vector of pmds, default NULL |
freqcutoff |
pmd frequency cutoff for structures or reactions, default 10 |
digits |
mass or mass to charge ratio accuracy for pmd, default 3 |
top |
top n pmd frequency cutoff, used when freqcutoff is too small for a large data set, default 20 |
formula |
vector for formula when you don't have mass or mass to charge ratio data |
mdrange |
mass defect range to ignore. Default c(0.25,0.9) to retain the possible reaction related paired mass |
verbose |
logical, if TRUE, the return value will be a list with the paired mass distances table. Default FALSE. |
logical matrix with rows in the same order as mz or formula and columns as high frequency pmd groups when verbose is FALSE
data(spmeinvivo)
pmd <- getpaired(spmeinvivo)
std <- getstd(pmd)
sda <- getrda(spmeinvivo$mz[std$stdmassindex])
sda <- getrda(spmeinvivo$mz, pmd = c(2.016, 15.995, 18.011, 14.016))
Get quantitative paired peaks list for specific reaction/pmd
getreact(list, pmd, rtcutoff = 10, digits = 2, accuracy = 4, cvcutoff = 30, outlier = FALSE, method = "static", ...)
list |
a list with mzrt profile and data |
pmd |
a specific paired mass distance |
rtcutoff |
cutoff of the distances in retention time hierarchical clustering analysis, default 10 |
digits |
mass or mass to charge ratio accuracy for pmd, default 2 |
accuracy |
measured mass or mass to charge ratio in digits, default 4 |
cvcutoff |
ratio or intensity cv cutoff for quantitative paired peaks, default 30 |
outlier |
logical, if TRUE, outliers of the ratio will be removed, default FALSE. |
method |
quantification method can be 'static' or 'dynamic'. See details. |
... |
other parameters for getpmd |
PMD-based reaction quantification has two options. 'static' considers only the mass pairs that are stable across samples; such reactions are limited by the enzyme or by factors other than the substrates. 'dynamic' considers the unstable paired masses by normalizing the relatively unstable peak with the stable peak of each pair; such reactions are limited by one or both peaks in the paired masses.
list with quantitative paired peaks.
getpaired, getstd, getsda, getrda, getpmd
data(spmeinvivo)
pmd <- getreact(spmeinvivo, pmd = 15.99)
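The 'static' option described above keeps mass pairs that are stable across samples. One way to picture this, assuming that "stable" means the coefficient of variation (in percent) of the intensity ratio stays below `cvcutoff`, is the following base R sketch. The intensities are made up, and the helper `cv` is hypothetical rather than part of the package.

```r
# Intensities of three peaks across 5 samples (made-up values)
ms1 <- c(100, 200, 300, 400, 500)
ms2 <- c( 52,  98, 151, 205, 248)  # tracks ms1 closely
ms3 <- c(500,  20, 310,  45, 900)  # unrelated to ms1
cvcutoff <- 30

# Coefficient of variation in percent (hypothetical helper)
cv <- function(x) sd(x) / mean(x) * 100

# CV of the intensity ratio for each candidate pair
ratio_cv <- c(pair12 = cv(ms1 / ms2), pair13 = cv(ms1 / ms3))

# Pairs with a ratio CV below the cutoff are treated as stable
keep <- ratio_cv < cvcutoff
print(round(ratio_cv, 1))
print(keep)
```

Under this assumption the first pair would be kept for static quantification, while the second would be dropped or handled by a dynamic approach.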
Perform structure/reaction directed analysis for peaks list.
getsda(list, rtcutoff = 10, corcutoff = NULL, digits = 2, accuracy = 4, freqcutoff = NULL)
list |
a list with mzrt profile |
rtcutoff |
cutoff of the distances in retention time hierarchical clustering analysis, default 10 |
corcutoff |
cutoff of the correlation coefficient, default NULL |
digits |
mass or mass to charge ratio accuracy for pmd, default 2 |
accuracy |
measured mass or mass to charge ratio in digits, default 4 |
freqcutoff |
pmd frequency cutoff for structures or reactions, default NULL. This cutoff will be found by PMD network analysis when it is NULL. |
list with indices of tentative isotope, adduct, and neutral loss peaks, plus retention time clusters.
data(spmeinvivo)
pmd <- getpaired(spmeinvivo)
std <- getstd(pmd)
sda <- getsda(std)
Find the independent ions in each retention time cluster based on the PMD relationships and isotopes within that cluster, and return the index of the std data for each retention time cluster.
getstd(list, corcutoff = NULL, digits = 2, accuracy = 4)
list |
a list from getpaired function |
corcutoff |
cutoff of the correlation coefficient, default NULL |
digits |
mass or mass to charge ratio accuracy for pmd, default 2 |
accuracy |
measured mass or mass to charge ratio in digits, default 4 |
list with std mass index
data(spmeinvivo)
pmd <- getpaired(spmeinvivo)
std <- getstd(pmd)
Get multiple injections index for selected retention time
gettarget(rt, drt = 10, n = 6)
rt |
retention time vector for peaks in seconds |
drt |
retention time drift for targeted analysis in seconds, default 10. |
n |
maximum number of ions within a retention time drift window, default 6 |
index for each injection
data(spmeinvivo)
pmd <- getpaired(spmeinvivo)
std <- getstd(pmd)
index <- gettarget(std$rt[std$stdmassindex])
table(index)
GlobalStd algorithm with structure/reaction directed analysis
globalstd(list, rtcutoff = 10, ng = NULL, corcutoff = NULL, digits = 2, accuracy = 4, freqcutoff = NULL, mdrange = NULL, sda = FALSE)
list |
a peaks list with mass to charge, retention time and intensity data |
rtcutoff |
cutoff of the distances in cluster, default 10 |
ng |
cutoff of the number of retention time groups for a global PMD. If ng = NULL, 20 percent of the RT clusters will be used as ng. Default NULL. |
corcutoff |
cutoff of the correlation coefficient, default NULL |
digits |
mass or mass to charge ratio accuracy for pmd, default 2 |
accuracy |
measured mass or mass to charge ratio in digits, default 4 |
freqcutoff |
pmd frequency cutoff for structures or reactions, default NULL. This cutoff will be found by PMD network analysis when it is NULL. |
mdrange |
mass defect range to ignore. Default NULL; c(0.25, 0.9) can be used to retain the possible reaction related paired masses |
sda |
logical, option to perform structure/reaction directed analysis, default FALSE. |
list with GlobalStd algorithm processed data.
getpaired, getstd, getsda, plotstd, plotstdsda, plotstdrt
data(spmeinvivo)
re <- globalstd(spmeinvivo)
A dataframe containing HMDB-derived unique accurate mass pmds (three digits) with frequency larger than 1 and accuracy percentage larger than 0.9.
data(hmdb)
A dataframe with atom numbers of C, H, O, N, P, S
accuracy of atom numbers prediction
pmd with two digits
pmd with three digits
A dataframe containing reaction related accurate mass pmd and related reaction formula with KEGG ID
data(keggrall)
A dataframe with KEGG reactions, their related pmd and atom numbers of C, H, O, N, P, S
KEGG reaction ID
pmd with three digits
mass spectrometry contaminants database for PMD check
data(MaConDa)
A data frame from doi:10.1093/bioinformatics/bts527 with 308 rows and 5 variables:
MaConDa ID
contaminants
contaminants formula
exact mass of contaminants
type of contaminant
A dataframe containing multiple reaction database ID and their related accurate mass pmd and related reactions
data(omics)
A dataframe with reactions and their related pmd
KEGG reaction ID
RHEA_ID
reaction direction
master reaction RHEA ID
ec reaction ID
ecocyc reaction ID
macie reaction ID
metacyc reaction ID
reactome reaction ID
reaction related compounds
pmd with two digits
pmd with three digits
Compare matrices using PCA similarity factor
pcasf(x, y, dim = NULL)
x |
Matrix with sample in column and features in row |
y |
Matrix to be compared to x. |
dim |
number of retained dimensions in the comparison. Defaults to all. |
Ratio of projected variance to total variance
Edgar Zanella Alvarenga
Singhal, A. and Seborg, D. E. (2005), Clustering multivariate time-series data. J. Chemometrics, 19: 427-438. doi: 10.1002/cem.945
c1 <- matrix(rnorm(16), nrow = 4)
c2 <- matrix(rnorm(16), nrow = 4)
pcasf(c1, c2)
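For orientation, a commonly used unweighted form of the PCA similarity factor is the sum of squared cosines between the first k principal components of two data sets, divided by k. The base R sketch below implements that form under stated assumptions: rows are observations, columns are variables, and `pca_similarity` is a hypothetical helper. It is not necessarily what pcasf computes, which is described above as a ratio of projected to total variance.

```r
# Unweighted PCA similarity factor (hypothetical helper, not pcasf itself)
pca_similarity <- function(x, y, k = NULL) {
  # Loadings (right singular vectors) of the column-centred matrices
  lx <- svd(scale(x, center = TRUE, scale = FALSE))$v
  ly <- svd(scale(y, center = TRUE, scale = FALSE))$v
  if (is.null(k)) k <- min(ncol(lx), ncol(ly))
  lx <- lx[, seq_len(k), drop = FALSE]
  ly <- ly[, seq_len(k), drop = FALSE]
  # Sum of squared cosines between the two sets of components, scaled by k
  sum(crossprod(lx, ly)^2) / k
}

set.seed(42)
a <- matrix(rnorm(40), nrow = 10)
b <- matrix(rnorm(40), nrow = 10)
s_same <- pca_similarity(a, a, k = 2)  # identical subspaces
s_diff <- pca_similarity(a, b, k = 2)  # unrelated random data
print(c(same = s_same, different = s_diff))
```

The value lies between 0 and 1. In this unweighted form k should be smaller than the number of variables, since retaining every component makes the two subspaces identical and the factor trivially 1.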
plot PMD KEGG network for certain compounds and output network average distance and degree
plotcn(formula, name, pmd)
formula |
Chemical formula |
name |
Compound name |
pmd |
specific paired mass distances |
plotcn('C6H12O6','Glucose',c(2.016,14.016,15.995))
Plot the mass pairs and high frequency mass distances
plotpaired(list, index = NULL, ...)
list |
a list from getpaired function |
index |
index for PMD value |
... |
other parameters for plot function |
data(spmeinvivo)
pmd <- getpaired(spmeinvivo)
plotpaired(pmd)
Plot the retention time group
plotrtg(list, ...)
list |
a list from getpaired function |
... |
other parameters for plot function |
data(spmeinvivo)
pmd <- getpaired(spmeinvivo)
plotrtg(pmd)
Plot the specific structure directed analysis(SDA) groups
plotsda(list, ...)
list |
a list from getpmd function |
... |
other parameters for plot function |
getstd, globalstd, plotstd, plotpaired, plotstdrt
data(spmeinvivo)
re <- getpmd(spmeinvivo, pmd = 78.9)
plotsda(re)
Plot the std mass from GlobalStd algorithm
plotstd(list)
list |
a list from getstd function |
data(spmeinvivo)
pmd <- getpaired(spmeinvivo)
std <- getstd(pmd)
plotstd(std)
Plot the std mass from GlobalStd algorithm in certain retention time groups
plotstdrt(list, rtcluster, ...)
list |
a list from getstd function |
rtcluster |
retention time group index |
... |
other parameters for plot function |
getstd, globalstd, plotstd, plotpaired, plotstdsda
data(spmeinvivo)
pmd <- getpaired(spmeinvivo)
std <- getstd(pmd)
plotstdrt(std, rtcluster = 6)
Plot the std mass from GlobalStd algorithm in structure directed analysis(SDA) groups
plotstdsda(list, index = NULL, ...)
list |
a list from getsda function |
index |
index for PMD value |
... |
other parameters for plot function |
getstd, globalstd, plotstd, plotpaired, plotstdrt
data(spmeinvivo)
re <- globalstd(spmeinvivo, sda = TRUE)
plotstdsda(re)
Shiny application for PMD analysis
runPMD()
Shiny application for PMD network analysis
runPMDnet()
A dataset containing common Paired mass distances of substructure, ions replacements, and reaction
data(sda)
A data frame with 94 rows and 4 variables:
Paired mass distances
potential sources
references
positive, negative or both mode to find corresponding PMDs
A peak list dataset containing 9 LC-MS samples from 3 fish, with triplicate samples for each fish.
data(spmeinvivo)
A list with 4 variables from 1459 LC-MS peaks:
mass to charge ratios
retention time
intensity matrix
group information