Title: | GC/LC-MS Data Analysis for Environmental Science |
---|---|
Description: | Gas/Liquid Chromatography-Mass Spectrometry (GC/LC-MS) data analysis for environmental science. This package covers topics such as molecular isotope ratios, matrix effects, and short-chain chlorinated paraffin (SCCP) analysis in environmental analysis. |
Authors: | Miao YU [aut, cre] , Thanh Wang [ctb] |
Maintainer: | Miao YU <[email protected]> |
License: | GPL-2 |
Version: | 0.7.3 |
Built: | 2024-10-29 05:18:43 UTC |
Source: | https://github.com/yufree/envigcms |
Get the MIR and related information from the files
batch(file, mz1, mz2)
file |
data file, CDF or other format supported by xcmsRaw |
mz1 |
the lowest mass |
mz2 |
the highest mass |
Molecular isotope ratio
## Not run:
mr <- batch(data, mz1 = 79, mz2 = 81)
## End(Not run)
Combine two data with similar retention time while different mass range
cbmd(data1, data2, mzstep = 0.1, rtstep = 0.01)
data1 |
data file path of lower mass range |
data2 |
data file path of higher mass range |
mzstep |
the m/z step for generating matrix data from raw mass spectral data |
rtstep |
the alignment accuracy of retention time, e.g. 0.01 means the retention times of the combined data must agree to within 0.01 s. A higher rtstep returns fewer scans in the combined data |
matrix with rows as scan time in seconds and columns as m/z
## Not run:
# mz100_200 and mz201_300 are the paths to the raw data
matrix <- cbmd(mz100_200, mz201_300)
## End(Not run)
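The idea of combining two acquisitions on shared retention times can be sketched in base R. This is a minimal illustration, not the package's implementation: the helper name combine_by_rt, the toy matrices and the rounding rule are all assumptions made here.

```r
# Hypothetical helper: combine two scan-by-m/z matrices on shared retention times.
combine_by_rt <- function(low, rt_low, high, rt_high, rtstep = 0.01) {
  # Integer keys avoid comparing floating-point retention times directly.
  key_low <- round(rt_low / rtstep)
  key_high <- round(rt_high / rtstep)
  common <- intersect(key_low, key_high)
  combined <- cbind(low[match(common, key_low), , drop = FALSE],
                    high[match(common, key_high), , drop = FALSE])
  rownames(combined) <- common * rtstep
  combined
}

# Toy data: 3 scans each, 2 m/z columns per mass range (made-up values)
low <- matrix(1:6, nrow = 3, dimnames = list(NULL, c("100", "100.1")))
high <- matrix(7:12, nrow = 3, dimnames = list(NULL, c("201", "201.1")))
combined <- combine_by_rt(low, c(10.001, 10.502, 11.004),
                          high, c(10.003, 10.498, 11.251))
```

Only scans whose rounded retention times occur in both files are kept, so the third scan of each toy file is dropped.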
Perform MS/MS dot product annotation for mgf file
dotpanno(file, db = NULL, ppm = 10, prems = 1.1, binstep = 1, consinc = 0.6)
file |
mgf file generated from MS/MS data |
db |
database, which could be a list object from 'getMSP' |
ppm |
mass accuracy, default 10 |
prems |
precursor mass range, default 1.1 to include M+H or M-H |
binstep |
bin step for cosine similarity |
consinc |
cosine similarity cutoff for annotation. Default 0.6. |
list with MSMS annotation results
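A dot product (cosine) comparison of two binned spectra can be sketched as follows. The helper names bin_spectrum and cosine_similarity and the toy spectra are assumptions for illustration; the package's own binning and scoring may differ in detail.

```r
# Bin a spectrum (m/z, intensity) onto a fixed grid and sum intensities per bin.
bin_spectrum <- function(mz, intensity, binstep = 1, mzmax = 1000) {
  bins <- numeric(ceiling(mzmax / binstep) + 1)
  idx <- round(mz / binstep) + 1
  for (i in seq_along(idx)) bins[idx[i]] <- bins[idx[i]] + intensity[i]
  bins
}

# Cosine of the angle between two binned intensity vectors
cosine_similarity <- function(a, b) {
  sum(a * b) / sqrt(sum(a^2) * sum(b^2))
}

# Made-up fragment spectra
query <- bin_spectrum(c(57.1, 85.0, 127.2), c(100, 40, 10))
ref <- bin_spectrum(c(57.0, 85.1, 127.0), c(90, 50, 5))
other <- bin_spectrum(c(91.0, 105.0), c(100, 60))

sim_ref <- cosine_similarity(query, ref)
sim_other <- cosine_similarity(query, other)
```

With a cutoff such as the default consinc of 0.6, the first comparison would count as a match and the second would not.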
Find the line of the regression model for GC-MS
findline(data, threshold = 2, temp = c(100, 320))
data |
imported data matrix of GC-MS |
threshold |
the threshold of the response (log base 10) |
temp |
the scale of the oven temperature (constant rate) |
list with the linear regression model for the matrix
## Not run:
data <- getmd(rawdata)
findline(data)
## End(Not run)
Find lipid class of metabolites based on referenced Kendrick mass defect
findlipid(list, mode = "pos")
list |
list with data as peaks list, mz, rt and group information, retention time should be in seconds |
mode |
'pos' for positive mode, 'neg' for negative mode and 'none' for neutral mass, only support [M+H] and [M-H] for each mode |
list with a data frame of the lipid referenced Kendrick mass defect (RKMD) and a logical for class
Method for the Identification of Lipid Classes Based on Referenced Kendrick Mass Analysis. Lerno LA, German JB, Lebrilla CB. Anal Chem. 2010 May 15;82(10):4236–45.
data(list)
RKMD <- findlipid(list)
Screen metabolites by Mass Defect
findmet(list, mass, mdr = 50)
list |
list with data as peaks list, mz, rt and group information, retention time should be in seconds |
mass |
mass to charge ratio of specific compounds |
mdr |
mass defect range, default 50mDa |
list with the mass to charge index of the filtered metabolites for the given compound
Screen organohalogen compounds by retention time, mass defect analysis and isotope relationships, modified from a literature report. Also supports a cutoff on the [M] and [M+2] ratio.
findohc( list, sf = 78/77.91051, step = 0.001, stepsd1 = 0.003, stepsd2 = 0.005, mzc = 700, cutoffint = 1000, cutoffr = 0.4, clustercf = 10 )
list |
list with data as peaks list, mz, rt and group information, retention time should be in seconds |
sf |
scale factor, default 78/77.91051(Br) |
step |
mass defect step, default 0.001 |
stepsd1 |
mass defect uncertainty for lower mass, default 0.003 |
stepsd2 |
mass defect uncertainty for higher mass, default 0.005 |
mzc |
threshold of lower mass and higher mass, default 700 |
cutoffint |
the cutoff of intensity, default 1000 |
cutoffr |
the cutoff of [M] and [M+2] ratio, default 0.4 |
clustercf |
the cutoff of cluster analysis to separate two different ions groups for retention time, default 10 |
list with filtered organohalogen compounds
Identification of Novel Brominated Compounds in Flame Retarded Plastics Containing TBBPA by Combining Isotope Pattern and Mass Defect Cluster Analysis. Ana Ballesteros-Gómez, Joaquín Ballesteros, Xavier Ortiz, Willem Jonker, Rick Helmus, Karl J. Jobst, John R. Parsons, and Eric J. Reiner. Environmental Science & Technology 2017, 51 (3), 1518-1526. DOI: 10.1021/acs.est.6b03294
Find PFCs based on mass defect analysis
findpfc(list)
list |
list with data as peaks list, mz, rt and group information, retention time should be in seconds |
list with the index of potential PFC compounds
Liu, Y.; D’Agostino, L. A.; Qu, G.; Jiang, G.; Martin, J. W. High-Resolution Mass Spectrometry (HRMS) Methods for Nontarget Discovery and Characterization of Poly- and per-Fluoroalkyl Substances (PFASs) in Environmental and Human Samples. TrAC Trends in Analytical Chemistry 2019, 121, 115420.
data(list)
pfc <- findpfc(list)
Align two peaks vectors by mass to charge ratio and/or retention time
getalign(mz1, mz2, rt1 = NULL, rt2 = NULL, ppm = 10, deltart = 10)
mz1 |
the mass to charge of reference peaks |
mz2 |
the mass to charge of peaks to be aligned |
rt1 |
retention time of reference peaks |
rt2 |
retention time of peaks to be aligned |
ppm |
mass accuracy, default 10 |
deltart |
retention time shift table, default 10 seconds |
data frame with aligned peaks table
mz1 <- c(221.1171, 227.1390, 229.1546, 233.1497, 271.0790)
mz2 <- c(282.279, 281.113, 227.139, 227.139, 302.207)
rt1 <- c(590.8710, 251.3820, 102.9230, 85.8850, 313.8240)
rt2 <- c(787.08, 160.02, 251.76, 251.76, 220.26)
getalign(mz1, mz2, rt1, rt2)
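The matching rule behind this kind of alignment, a ppm window on m/z combined with a retention time window in seconds, can be sketched in base R. The helper match_peaks is hypothetical and only illustrates the two tolerances; it does not reproduce the structure of the data frame returned by getalign.

```r
# Hypothetical helper: pair reference peaks with query peaks that fall inside
# both the ppm window (relative to the reference m/z) and the rt window.
match_peaks <- function(mz1, mz2, rt1, rt2, ppm = 10, deltart = 10) {
  hits <- list()
  for (i in seq_along(mz1)) {
    mz_ok <- abs(mz2 - mz1[i]) / mz1[i] * 1e6 <= ppm
    rt_ok <- abs(rt2 - rt1[i]) <= deltart
    j <- which(mz_ok & rt_ok)
    if (length(j) > 0) {
      hits[[length(hits) + 1]] <- data.frame(ref = i, query = j)
    }
  }
  if (length(hits) == 0) {
    return(data.frame(ref = integer(0), query = integer(0)))
  }
  do.call(rbind, hits)
}

mz1 <- c(221.1171, 227.1390, 229.1546, 233.1497, 271.0790)
mz2 <- c(282.279, 281.113, 227.139, 227.139, 302.207)
rt1 <- c(590.8710, 251.3820, 102.9230, 85.8850, 313.8240)
rt2 <- c(787.08, 160.02, 251.76, 251.76, 220.26)
res <- match_peaks(mz1, mz2, rt1, rt2)
```

With the example vectors, only the second reference peak (m/z 227.1390 at 251.382 s) finds partners, namely the third and fourth query peaks.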
Align mass to charge ratio and/or retention time to remove redundancy
getalign2(mz, rt, ppm = 5, deltart = 5)
mz |
the mass to charge of reference peaks |
rt |
retention time of reference peaks |
ppm |
mass accuracy, default 5 |
deltart |
retention time shift table, default 5 seconds |
index for
mz <- c(221.1171, 221.1170, 229.1546, 233.1497, 271.0790)
rt <- c(590.8710, 587.3820, 102.9230, 85.8850, 313.8240)
getalign2(mz, rt)
Get the peak information from samples for SCCPs detection
getarea(data, ismz = 323, ppm = 5, rt = NULL, rts = NULL)
data |
list from 'xcmsRaw' function |
ismz |
internal standards m/z |
ppm |
resolution of mass spectrum |
rt |
retention time range of SCCPs |
rts |
retention time range of internal standards |
list with peak information
Get the peak information from SCCPs standards
getareastd(data = NULL, ismz = 323, ppm = 5, con = 2000, rt = NULL, rts = NULL)
data |
list from 'xcmsRaw' function |
ismz |
internal standards m/z |
ppm |
resolution of mass spectrum |
con |
concentration of standards |
rt |
retention time range of SCCPs |
rts |
retention time range of internal standards |
list with peak information
Get the peak list with blank samples' peaks removed
getbgremove( xset, method = "medret", intensity = "into", file = NULL, rsdcf = 30, inscf = 1000 )
xset |
the xcmsset object with blank and certain group samples' data |
method |
parameter for groupval function |
intensity |
parameter for groupval function |
file |
file name for further annotation, default NULL |
rsdcf |
rsd cutoff for peaks, default 30 |
inscf |
intensity cutoff for peaks, default 1000 |
diff report
## Not run:
library(faahKO)
cdfpath <- system.file("cdf", package = "faahKO")
xset <- getdata(cdfpath, pmethod = ' ')
getbgremove(xset)
## End(Not run)
Get the report for biological replicates.
getbiotechrep( xset, method = "medret", intensity = "into", file = NULL, rsdcf = 30, inscf = 1000 )
xset |
the xcmsSet object containing all technical replicates of the biologically replicated samples in a single group |
method |
parameter for groupval function |
intensity |
parameter for groupval function |
file |
file name for further annotation, default NULL |
rsdcf |
rsd cutoff for peaks, default 30 |
inscf |
intensity cutoff for peaks, default 1000 |
data frame with mean, standard deviation and RSD for the technical and biological replicates, combined with the raw data
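The replicate statistics reported here can be sketched in base R. The peak table below is made up, and applying the intensity cutoff to the replicate mean is an assumption of this sketch rather than a documented detail of the package.

```r
# Made-up peak table: rows are peaks, columns are technical replicates
peaks <- matrix(c(1000, 1100,  900,
                  2000,  200, 5000,
                    50,   55,   45),
                nrow = 3, byrow = TRUE,
                dimnames = list(c("p1", "p2", "p3"),
                                c("rep1", "rep2", "rep3")))

avg <- rowMeans(peaks)
sdev <- apply(peaks, 1, sd)
rsd <- sdev / avg * 100  # relative standard deviation in percent

# Keep peaks that are reproducible (rsdcf = 30) and intense enough (inscf = 1000)
keep <- rsd <= 30 & avg >= 1000
report <- data.frame(mean = avg, sd = sdev, rsd = rsd, keep = keep)
```

Here p1 passes both cutoffs, p2 fails the RSD cutoff and p3 fails the intensity cutoff.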
Align multiple peaks list to one peak list
getcompare(..., index = 1, ppm = 5, deltart = 5)
... |
peaks list, mzrt objects |
index |
numeric, the index of reference peaks. |
ppm |
pmd mass accuracy, default 5 |
deltart |
retention time shift table, default 5 seconds |
list object with aligned mzrt objects
Convert a list object to a csv file.
getcsv(list, name, mzdigit = 4, rtdigit = 1, type = "o", target = FALSE, ...)
list |
list with data as peaks list, mz, rt and group information |
name |
result name for csv and/or eic file, default NULL |
mzdigit |
m/z digits of row names of data frame, default 4 |
rtdigit |
retention time digits of row names of data frame, default 1 |
type |
csv format for further analysis, m means Metaboanalyst, a means xMSannotator, p means Mummichog(NA values are imputed by 'getimputation', and F test is used here to generate stats and p value), o means full information csv (for 'pmd' package), default o. mapo could output all those format files. |
target |
logical, preserve original rowname of data or not for target data, default FALSE. |
... |
other parameters for 'write.table' |
NULL, csv file
Li, S.; Park, Y.; Duraisingham, S.; Strobel, F. H.; Khan, N.; Soltow, Q. A.; Jones, D. P.; Pulendran, B. PLOS Computational Biology 2013, 9 (7), e1003123. Xia, J., Sinelnikov, I.V., Han, B., Wishart, D.S., 2015. MetaboAnalyst 3.0—making metabolomics more meaningful. Nucl. Acids Res. 43, W251–W257.
## Not run:
data(list)
getcsv(list, name = 'demo')
## End(Not run)
Get xcmsset object in one step with optimized methods.
getdata( path, index = FALSE, BPPARAM = BiocParallel::SnowParam(), pmethod = "hplcorbitrap", minfrac = 0.67, ... )
path |
the path to your data |
index |
the index of the files |
BPPARAM |
used for BiocParallel package |
pmethod |
parameters used for different instruments such as 'hplcorbitrap', 'uplcorbitrap', 'hplcqtof', 'hplchqtof', 'uplcqtof', 'uplchqtof'. The parameters were taken from the reference |
minfrac |
minimum fraction of samples necessary in at least one of the sample groups for it to be a valid group, default 0.67 |
... |
arguments for xcmsSet function |
The parameters are extracted from the papers. If you use a name other than those listed above, the default settings of XCMS are used. The IPO or apLCMS packages are also suggested for obtaining reasonable parameters for your own instrument. If you want to submit the results to a paper, remember to include those parameters.
an xcmsSet object for that path or the selected samples
Patti, G. J.; Tautenhahn, R.; Siuzdak, G. Nat. Protocols 2012, 7 (3), 508–516.
## Not run:
library(faahKO)
cdfpath <- system.file('cdf', package = 'faahKO')
xset <- getdata(cdfpath, pmethod = ' ')
## End(Not run)
Get XCMSnExp object in one step from structured folder path for xcms 3.
getdata2( path, index = FALSE, snames = NULL, sclass = NULL, phenoData = NULL, BPPARAM = BiocParallel::SnowParam(), mode = "onDisk", ppp = xcms::CentWaveParam(ppm = 5, peakwidth = c(5, 25), prefilter = c(3, 5000)), rtp = xcms::ObiwarpParam(binSize = 1), gpp = xcms::PeakDensityParam(sampleGroups = 1, minFraction = 0.67, bw = 2, binSize = 0.025), fpp = xcms::FillChromPeaksParam() )
path |
the path to your data |
index |
the index of the files |
snames |
sample names. By default the file name without extension is used |
sclass |
sample classes. |
phenoData |
data.frame or NAnnotatedDataFrame defining the sample names and classes and other sample related properties. If not provided, the argument sclass or the subdirectories in which the samples are stored will be used to specify sample grouping. |
BPPARAM |
used for BiocParallel package |
mode |
'inMemory' or 'onDisk', see '?MSnbase::readMSData' for details, default 'onDisk' |
ppp |
parameters for peaks picking, e.g. xcms::CentWaveParam() |
rtp |
parameters for retention time correction, e.g. xcms::ObiwarpParam() |
gpp |
parameters for peaks grouping, e.g. xcms::PeakDensityParam() |
fpp |
parameters for peaks filling, e.g. xcms::FillChromPeaksParam(), PeakGroupsParam() |
This is a wrapper function for metabolomics data processing with xcms 3.
an XCMSnExp object with processed data
Generate the group level rsd and average intensity based on DoE.
getdoe( list, inscf = 5, rsdcf = 100, rsdcft = 30, imputation = "l", tr = FALSE, BPPARAM = BiocParallel::bpparam() )
list |
list with data as peaks list, mz, rt and group information |
inscf |
Log intensity cutoff for peaks across samples. If a peak shows an intensity higher than the cutoff in any sample, it is not filtered out. Default 5 |
rsdcf |
the rsd cutoff of all peaks in all group |
rsdcft |
the rsd cutoff of all peaks in technical replicates |
imputation |
parameters for 'getimputation' function method |
tr |
logical. TRUE means dataset with technical replicates at the base level folder |
BPPARAM |
An optional BiocParallelParam instance determining the parallel back-end to be used during evaluation. |
list with group mean, standard deviation, and relative standard deviation for all peaks, and filtered peaks index
getdata2, getdata, getmzrt, getimputation, getmr, getpower
data(list)
getdoe(list)
Density weighted intensity for one sample
getdwtus(peak, n = 512, log = FALSE)
peak |
peak intensities of one sample |
n |
the number of equally spaced points at which the density is to be estimated, default 512 |
log |
log transformation |
Density weighted intensity for one sample
data(list)
getdwtus(list$data[,1])
Get the features from anova, with p value, q value, rsd and power restriction
getfeaturesanova( list, power = 0.8, pt = 0.05, qt = 0.05, n = 3, ng = 3, rsdcf = 100, inscf = 5, imputation = "l", index = NULL )
list |
list with data as peaks list, mz, rt and group information (more than two groups) |
power |
defined power |
pt |
p value threshold |
qt |
q value threshold, BH adjust |
n |
sample numbers in one group |
ng |
group numbers |
rsdcf |
the rsd cutoff of all peaks in all group |
inscf |
Log intensity cutoff for peaks across samples. If a peak shows an intensity higher than the cutoff in any sample, it is not filtered out. Default 5 |
imputation |
parameters for 'getimputation' function method |
index |
the index of peaks considered, default NULL |
data frame with the peaks that fit the settings above
Get the features from t test, with p value, q value, rsd and power restriction
getfeaturest(list, power = 0.8, pt = 0.05, qt = 0.05, n = 3, imputation = "l")
list |
list with data as peaks list, mz, rt and group information (two groups) |
power |
defined power |
pt |
p value threshold |
qt |
q value threshold, BH adjust |
n |
sample numbers in one group |
imputation |
parameters for 'getimputation' function method |
data frame with the peaks that fit the settings above
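The p value and q value screening for a two-group design can be sketched with base R's t.test and p.adjust. The peak table and group labels are made up, the use of a Welch test is an assumption, and the power restriction is left out of this sketch.

```r
# Made-up design: two groups with three samples each
group <- rep(c("ctrl", "case"), each = 3)
peaks <- rbind(
  changed   = c(10, 11,  9, 20, 21, 19),
  unchanged = c(10, 12, 11, 11, 10, 12)
)

# Welch t test per peak (row)
p <- apply(peaks, 1, function(v) {
  t.test(v[group == "ctrl"], v[group == "case"])$p.value
})

# Benjamini-Hochberg adjusted p values (q values)
q <- p.adjust(p, method = "BH")

# Peaks passing both thresholds (pt = 0.05, qt = 0.05)
selected <- p < 0.05 & q < 0.05
```

Only the first peak, whose group means differ by ten standard deviations, passes both thresholds.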
Filter the data based on row and column index
getfilter(list, rowindex = TRUE, colindex = TRUE, name = NULL, type = "o", ...)
list |
list with data as peaks list, mz, rt and group information |
rowindex |
logical, row index to keep |
colindex |
logical, column index to keep |
name |
file name for csv and/or eic file, default NULL |
type |
csv format for further analysis, m means Metaboanalyst, a means xMSannotator, p means Mummichog(NA values are imputed by 'getimputation', and F test is used here to generate stats and p value), o means full information csv (for 'pmd' package), default o. mapo could output all those format files. |
... |
other parameters for 'getcsv' |
list with the remaining peaks and the filtered peaks index
getdata2, getdata, getmzrt, getimputation, getmr, getcsv
data(list)
li <- getdoe(list)
lif <- getfilter(li, rowindex = li$rsdindex)
Get chemical formula for mass to charge ratio.
getformula( mz, charge = 0, window = 0.001, elements = list(C = c(1, 50), H = c(1, 50), N = c(0, 50), O = c(0, 50), P = c(0, 1), S = c(0, 1)) )
mz |
a vector with mass to charge ratio |
charge |
The charge value of the formula, default 0 for autodetect |
window |
The window accuracy in the same units as mass |
elements |
Elements list to take into account. |
list with chemical formula
Get the report for samples with biological and technique replicates in different groups
getgrouprep( xset, file = NULL, method = "medret", intensity = "into", rsdcf = 30, inscf = 1000 )
xset |
the xcmsSet object with all of the samples, including technical replicates |
file |
file name for the peaklist to MetaboAnalyst |
method |
parameter for groupval function |
intensity |
parameter for groupval function |
rsdcf |
rsd cutoff for peaks, default 30 |
inscf |
intensity cutoff for peaks, default 1000 |
data frame with mean, standard deviation and RSD for the technical and biological replicates, combined with the raw data in different groups, if file is left at its default NULL.
Impute the peaks list data
getimputation(list, method = "l")
list |
list with data as peaks list, mz, rt and group information |
method |
'r' means remove, 'l' means use half the minimum of the values across the peaks list, 'mean' means mean of the values across the samples, 'median' means median of the values across the samples, '0' means 0, '1' means 1. Default 'l'. |
list with imputed peaks
getdata2, getdata, getmzrt, getdoe, getmr
data(list)
getimputation(list)
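The default 'l' method, replacing missing values with half the minimum of the values across the peaks list, can be sketched as below. The helper impute_half_min and the toy matrix are hypothetical; the sketch works on a bare matrix rather than the package's list object.

```r
# Hypothetical helper: replace NA with half of the smallest observed value.
impute_half_min <- function(x) {
  x[is.na(x)] <- min(x, na.rm = TRUE) / 2
  x
}

# Made-up 2 x 3 peak table with two missing values
x <- matrix(c(10, NA, 30, 4, 50, NA), nrow = 2)
imputed <- impute_half_min(x)
```

The smallest observed value is 4, so both missing entries become 2.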
GetIntegration is mainly used to integrate the chromatogram of a certain ion and plot the data
GetIntegration( data, rt = c(8.3, 9), n = 5, m = 5, slope = c(2, 2), baseline = 10, noslope = TRUE, smoothit = TRUE, half = FALSE )
data |
a data frame with RT in the first column and the intensity of the SIM ions in the second column |
rt |
a rough RT range containing only one peak, used to get the area |
n |
points in the moving average smooth box, default value is 5 |
m |
numbers of points for regression to get the slope |
slope |
the threshold value for start/stop peak as percentage of max slope |
baseline |
numbers of the points for the baseline of the signal |
noslope |
logical, whether to use a horizontal line to get the area |
smoothit |
logical, whether to use a moving average smooth box. If TRUE, n is used |
half |
logical, whether to use the left half of the peak to calculate the area |
integration data such as peak area, peak height, signal and the slope data.
## Not run:
list <- GetIntegration(data)
## End(Not run)
Get the selected isotopologues at certain MS data
Getisotopologues(formula = "C12OH6Br4", charge = 1, width = 0.3)
formula |
the molecular formula. C12OH6Br4 means BDE-47 as default |
charge |
the charge of that molecule. 1 in EI mode as default |
width |
the peak width on the mass spectrum. 0.3 as default for low resolution mass spectra. |
# show isotopologues for BDE-47
Getisotopologues(formula = 'C12OH6Br4')
Get the exact mass of the isotopologues from a chemical formula or reaction's isotope patterns with the highest abundances
getmass(data)
data |
a chemical formula or reaction e.g. 'Cl-H', 'C2H4' |
numerical vector
getmass('CH2')
Get mass defect with certain scaled factor
getmassdefect(mass, sf)
mass |
vector of mass |
sf |
scaled factors |
dataframe with mass, scaled mass and scaled mass defect
mass <- c(100.1022, 245.2122, 267.3144, 400.1222, 707.2294)
sf <- 0.9988
mf <- getmassdefect(mass, sf)
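The arithmetic behind a scaled mass defect can be sketched in a few lines of base R. The helper scaled_mass_defect is hypothetical, and the sign and rounding convention (rounded scaled mass minus scaled mass) is one common choice assumed here; the package's own convention may differ.

```r
# Hypothetical helper: scale the masses, then take the distance to the
# nearest integer as the scaled mass defect.
scaled_mass_defect <- function(mass, sf) {
  sm <- mass * sf
  data.frame(mass = mass, sm = sm, smd = round(sm) - sm)
}

mass <- c(100.1022, 245.2122, 267.3144, 400.1222, 707.2294)
mf <- scaled_mass_defect(mass, sf = 0.9988)

# The default findohc scale factor maps a mass of 77.91051 onto exactly 78,
# so that mass has a scaled mass defect of (numerically) zero.
br <- scaled_mass_defect(77.91051, sf = 78 / 77.91051)
```

Compounds that differ only by repeats of the unit used for scaling share nearly the same scaled mass defect, which is what makes the scaling useful for screening.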
Import data and return the annotated matrix for GC/LC-MS by m/z range and retention time
getmd(data, mzstep = 0.1, mzrange = FALSE, rtrange = FALSE)
data |
file type which xcmsRaw could handle |
mzstep |
the m/z step for generating matrix data from raw mass spectral data |
mzrange |
vector range of the m/z, default all |
rtrange |
vector range of the retention time, default all |
matrix with rows as increasing m/z and columns as increasing scan time
## Not run:
library(faahKO)
cdfpath <- system.file('cdf', package = 'faahKO')
cdffiles <- list.files(cdfpath, recursive = TRUE, full.names = TRUE)
matrix <- getmd(cdffiles[1])
## End(Not run)
Get the high order unit based Mass Defect
getmdh(mz, cus = c("CH2,H2"), method = "round")
mz |
numeric vector for exact mass |
cus |
chemical formula or reaction |
method |
you could use 'round', 'floor' or 'ceiling' |
high order Mass Defect with details
getmdh(getmass('C2H4'))
Get the raw Mass Defect
getmdr(mz)
mz |
numeric vector for exact mass |
raw Mass Defect
getmdr(getmass('C2H4'))
Get the mzrt profile and group information for batch correction and plot as a list directly from path with default setting
getmr( path, index = FALSE, BPPARAM = BiocParallel::SnowParam(), pmethod = "hplcorbitrap", minfrac = 0.67, ... )
path |
the path to your data |
index |
the index of the files |
BPPARAM |
used for BiocParallel package |
pmethod |
parameters used for different instruments such as 'hplcorbitrap', 'uplcorbitrap', 'hplcqtof', 'hplchqtof', 'uplcqtof', 'uplchqtof'. The parameters were taken from the references |
minfrac |
minimum fraction of samples necessary in at least one of the sample groups for it to be a valid group, default 0.67 |
... |
arguments for xcmsSet function |
list with rtmz profile and group information
getdata, getupload, getmzrt, getdoe
## Not run:
library(faahKO)
cdfpath <- system.file('cdf', package = 'faahKO')
list <- getmr(cdfpath, pmethod = ' ')
## End(Not run)
Annotation of MS1 data by compounds database by predefined paired mass distance
getms1anno(pmd, mz, ppm = 10, db = NULL)
pmd |
adducts formula or paired mass distance for ions |
mz |
unknown mass to charge ratios vector |
ppm |
mass accuracy |
db |
compounds database as dataframe. Two required columns are name and monoisotopic molecular weight with column names of name and mass |
list or data frame
Read in an MSP file as a list for MS/MS or MS (EI) annotation
getMSP(file)
file |
the path to your MSP file |
a list with MSP information for annotation
Get the mzrt profile and group information as a mzrt list and/or save them as csv or rds for further analysis.
getmzrt( xset, name = NULL, mzdigit = 4, rtdigit = 1, method = "medret", value = "into", eic = FALSE, type = "o" )
xset |
xcmsSet/XCMSnExp objects |
name |
file name for csv and/or eic file, default NULL |
mzdigit |
m/z digits of row names of data frame, default 4 |
rtdigit |
retention time digits of row names of data frame, default 1 |
method |
parameter for groupval or featureDefinitions function, default medret |
value |
parameter for groupval or featureDefinitions function, default into |
eic |
logical, save xcmsSet and xcmsEIC objects for further investigation under the same file name. You will need the raw files in the same directory as defined in xcmsSet to extract the EIC based on the binned data. You could use 'plot' to plot the EIC for specific peaks. For example, 'plot(xcmsEIC, xcmsSet, groupidx = 'M123.4567T278.9')' shows the EIC for the peak with m/z 123.4567 and retention time 278.9 s. Default FALSE |
type |
csv format for further analysis, m means Metaboanalyst, a means xMSannotator, p means Mummichog(NA values are imputed by 'getimputation', and F test is used here to generate stats and p value), o means full information csv (for 'pmd' package), default o. mapo could output all those format files. |
mzrt object, a list with mzrt profile and group information
Smith, C.A., Want, E.J., O’Maille, G., Abagyan, R., Siuzdak, G., 2006. XCMS: Processing Mass Spectrometry Data for Metabolite Profiling Using Nonlinear Peak Alignment, Matching, and Identification. Anal. Chem. 78, 779–787.
getdata, getdata2, getdoe, getcsv, getfilter
## Not run:
library(faahKO)
cdfpath <- system.file('cdf', package = 'faahKO')
xset <- getdata(cdfpath, pmethod = ' ')
getmzrt(xset, name = 'demo', type = 'mapo')
## End(Not run)
Get the mzrt profile and group information for batch correction and plot as a list for xcms 3 object
getmzrt2(xset, name = NULL)
xset |
a XCMSnExp object with processed data |
name |
file name for csv file, default NULL |
list with rtmz profile and group information
getdata2, getupload2, getmzrt, getdoe, getmzrtcsv
## Not run:
library(faahKO)
cdfpath <- system.file('cdf', package = 'faahKO')
xset <- getdata2(cdfpath,
                 ppp = xcms::MatchedFilterParam(),
                 rtp = xcms::ObiwarpParam(),
                 gpp = xcms::PeakDensityParam())
getmzrt2(xset)
## End(Not run)
Convert the peaks list csv file into a list
getmzrtcsv(path)
path |
the path to your csv file |
list with rtmz profile and group information as the first row
Get the overlap peaks by mass and retention time range
getoverlappeak(list1, list2)
list1 |
list with data as peaks list, mz, rt, mzrange, rtrange and group information to be overlapped |
list2 |
list with data as peaks list, mz, rt, mzrange, rtrange and group information to overlap |
logical index for list 1's peaks
getmzrt, getimputation, getmr, getdoe
Merge positive and negative mode data
getpn(pos, neg, ppm = 5, pmd = 2.02, digits = 2, cutoff = 0.9)
pos |
a list with mzrt profile collected from positive mode. The sample order should match the negative mode. |
neg |
a list with mzrt profile collected from negative mode. The sample order should match the positive mode. |
ppm |
pmd mass accuracy, default 5 |
pmd |
numeric or numeric vector |
digits |
mass or mass to charge ratio accuracy for pmd, default 2 |
cutoff |
correlation coefficients, default 0.9 |
mzrt object with group information from pos mode
Get the index with power restriction for certain study with BH adjusted p-value and certain power.
getpower(list, pt = 0.05, qt = 0.05, powert = 0.8, imputation = "l")
list |
list with data as peaks list, mz, rt and group information |
pt |
p value threshold, default 0.05 |
qt |
q value threshold, BH adjust, default 0.05 |
powert |
power cutoff, default 0.8 |
imputation |
parameters for 'getimputation' function method |
list with current power and sample numbers for each peak
getdata2, getdata, getmzrt, getimputation, getmr, getdoe
data(list)
getpower(list)
Compute pooled QC linear index according to run order
getpqsi(data, order, n = 5)
data |
peaks intensity list with row as peaks and column as samples |
order |
run order of pooled QC samples |
n |
samples numbers used for linear regression |
vector for the peaks proportion with significant changes in linear regression after FDR control.
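The quantity described above, the proportion of peaks with a significant linear trend over run order after FDR control, can be sketched for a single window of pooled QC injections. The data are made up, and how the package slides a window of n samples along the run order is not reproduced here.

```r
# Rows are peaks, columns are pooled QC injections in run order (made-up values)
qc <- rbind(
  drifting = c(100, 121, 139, 162, 180),
  stable   = c(100,  98, 103,  99, 101)
)
run_order <- 1:5

# p value of the slope from a per-peak linear regression on run order
slope_p <- apply(qc, 1, function(y) {
  summary(lm(y ~ run_order))$coefficients[2, 4]
})

# FDR control with Benjamini-Hochberg, then the proportion of changed peaks
q <- p.adjust(slope_p, method = "BH")
prop_changed <- mean(q < 0.05)
```

A proportion close to zero suggests a stable instrument response over that window, while a large proportion points to drift.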
Get the data of the QC compound for a group of data
getQCraw(path, mzrange, rtrange, index = NULL)
path |
data path for your QC samples |
mzrange |
mass of the QC compound |
rtrange |
retention time of the QC compound |
index |
index of the files containing the QC compound, default is all of the files |
numeric vector, each number indicates the peak area within that mass and retention time range
Get a mzrt list and/or save mz and rt range as csv file.
getrangecsv(list, name, ...)
list |
list with data as peaks list, mz, rt and group information |
name |
result name for csv and/or eic file, default NULL |
... |
other parameters for 'write.table' |
NULL, csv file
Perform peaks list alignment and return features table
getretcor(list, ts = 1, ppm = 10, deltart = 5, FUN)
list |
each element should be a data.frame with mz, rt and ins as m/z, retention time in seconds and intensity of certain peaks. |
ts |
template sample index in the list, default 1 |
ppm |
mass accuracy, default 10 |
deltart |
retention time shift table, default 5 seconds |
FUN |
function to deal with multiple aligned peaks from one sample |
mzrt object without group information
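The ppm and deltart arguments describe a mass tolerance and a retention time tolerance. A minimal base-R sketch of such a matching rule follows; match_peak is a hypothetical helper written for illustration and is not the package's alignment code, which may differ in detail.

```r
# Hypothetical helper: does a peak match a template peak within
# a relative mass tolerance (ppm) and an absolute RT tolerance (seconds)?
match_peak <- function(mz, rt, ref_mz, ref_rt, ppm = 10, deltart = 5) {
  mz_ok <- abs(mz - ref_mz) / ref_mz * 1e6 <= ppm   # mass error in ppm
  rt_ok <- abs(rt - ref_rt) <= deltart              # RT shift in seconds
  mz_ok & rt_ok
}

# Three candidate peaks compared with a template peak at m/z 200, 300 s
hits <- match_peak(mz = c(200.0010, 200.0100, 200.0010),
                   rt = c(301, 301, 320),
                   ref_mz = 200.0000, ref_rt = 300)
hits
```

Only the first candidate is within both tolerances: the second is about 50 ppm off in mass and the third is 20 s off in retention time.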
Get the Relative Mass Defect
getrmd(mz)
mz |
numeric vector for exact mass |
Relative Mass Defect
getrmd(getmass('C2H4'))
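For orientation, a relative mass defect is commonly defined as the mass defect divided by the mass and expressed in ppm. The sketch below uses that definition with a hypothetical helper, rmd_sketch; the sign convention and any rounding applied by getrmd itself are assumptions and may differ.

```r
# Hypothetical helper, not the package implementation:
# mass defect relative to the mass itself, in ppm.
rmd_sketch <- function(mz) {
  nominal <- round(mz)          # nearest integer (nominal) mass
  (mz - nominal) / mz * 1e6     # defect scaled by the mass, in ppm
}

# Ethylene (C2H4): 2 * 12 + 4 * 1.007825 = 28.0313
rmd_c2h4 <- rmd_sketch(28.0313)
rmd_c2h4
```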
Quantitative analysis for short-chain chlorinated paraffins (SCCPs)
getsccp( pathstds, pathsample, ismz = 323, ppm = 5, con = 2000, rt = NULL, rts = NULL, log = TRUE )
pathstds |
mzxml file path for SCCPs standards |
pathsample |
mzxml file path for samples |
ismz |
internal standards m/z |
ppm |
resolution of mass spectrum |
con |
concentration of standards |
rt |
retention time range of sccps |
rts |
retention time range of internal standards |
log |
log transformation for response factor |
list with peak information
output the similarity of two datasets
getsim(xset1, xset2)
xset1 |
the first dataset |
xset2 |
the second dataset |
similarity on retention time and rsd
Get the report for technique replicates.
gettechrep( xset, method = "medret", intensity = "into", file = NULL, rsdcf = 30, inscf = 1000 )
xset |
the xcmsset object containing all of the technique replicates for one sample |
method |
parameter for groupval function |
intensity |
parameter for groupval function |
file |
file name for further annotation, default NULL |
rsdcf |
rsd cutoff for peaks, default 30 |
inscf |
intensity cutoff for peaks, default 1000 |
dataframe with mean, standard deviation and RSD for those technique replicates combined with raw data
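The mean, standard deviation and RSD summary with the rsdcf and inscf cutoffs can be illustrated in base R. This is a sketch on an invented intensity matrix; how the package combines the two cutoffs (for example mean intensity versus intensity in any sample) is an assumption here.

```r
# Toy intensity matrix: rows are peaks, columns are replicate injections
ins <- matrix(c(1200, 1100, 1300,
                 500,  900,  100,
                5000, 5100, 4900),
              nrow = 3, byrow = TRUE)

peak_mean <- rowMeans(ins)
peak_sd   <- apply(ins, 1, sd)
peak_rsd  <- peak_sd / peak_mean * 100   # relative standard deviation in percent

# Keep peaks that are reproducible (low RSD) and intense enough
rsdcf <- 30
inscf <- 1000
keep <- peak_rsd < rsdcf & peak_mean > inscf
data.frame(mean = peak_mean, sd = peak_sd, rsd = peak_rsd, keep = keep)
```

The second peak has an RSD of 80% and a mean below 1000, so it is the only one dropped.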
Get the time series or two factor DoE report for samples with biological and technique replicates in different groups
gettimegrouprep( xset, file = NULL, method = "medret", intensity = "into", rsdcf = 30, inscf = 1000 )
xset |
the xcmsset object containing all samples with technique replicates in a time series or two factor DoE |
file |
file name for the peaklist to MetaboAnalyst |
method |
parameter for groupval function |
intensity |
parameter for groupval function |
rsdcf |
rsd cutoff for peaks, default 30 |
inscf |
intensity cutoff for peaks, default 1000 |
dataframe with the mean, standard deviation and RSD of the technique and biological replicates in each group of the time series or two factor DoE, combined with raw data, if file is left as the default NULL.
Get the csv files from xcmsset/XCMSnExp/list object
getupload( xset, method = "medret", value = "into", name = "Peaklist", type = "m", mzdigit = 4, rtdigit = 1 )
xset |
the xcmsset/XCMSnExp/list object which you want to submit to Metaboanalyst |
method |
parameter for groupval function |
value |
parameter for groupval function |
name |
file name |
type |
m means Metaboanalyst, a means xMSannotator, o means full information csv |
mzdigit |
m/z digits of row names of data frame |
rtdigit |
retention time digits of row names of data frame |
dataframe with data needed for Metaboanalyst/xMSannotator/pmd if you want to perform local analysis.
## Not run: library(faahKO) cdfpath <- system.file('cdf', package = 'faahKO') xset <- getdata(cdfpath, pmethod = ' ') getupload(xset) ## End(Not run)
Get the csv files to be submitted to Metaboanalyst
getupload2(xset, value = "into", name = "Peaklist")
xset |
an XCMSnExp object with processed data which you want to submit to Metaboanalyst |
value |
value for 'xcms::featureValues' |
name |
file name |
dataframe with data needed for Metaboanalyst if you want to perform local analysis.
## Not run: library(faahKO) cdfpath <- system.file('cdf', package = 'faahKO') xset <- getdata2(cdfpath) getupload2(xset) ## End(Not run)
Get the csv files to be submitted to Metaboanalyst
getupload3(list, name = "Peaklist")
list |
list with data as peaks list, mz, rt and group information |
name |
file name |
dataframe with data needed for Metaboanalyst if you want to perform local analysis.
## Not run: library(faahKO) cdfpath <- system.file('cdf', package = 'faahKO') xset <- getdata2(cdfpath, ppp = xcms::MatchedFilterParam(), rtp = xcms::ObiwarpParam(), gpp = xcms::PeakDensityParam()) xset <- enviGCMS::getmzrt2(xset) getupload3(xset) ## End(Not run)
plot scatter plot for rt-mz profile and output gif file for multiple groups
gifmr( list, ms = c(100, 500), rsdcf = 30, inscf = 5, imputation = "i", name = "test", ... )
list |
list with data as peaks list, mz, rt and group information |
ms |
the mass range to plot the data |
rsdcf |
the rsd cutoff of all peaks in all group |
inscf |
Log intensity cutoff for peaks across samples. A peak is kept if its intensity exceeds the cutoff in any sample. Default 5 |
imputation |
parameters for 'getimputation' function method |
name |
file name for gif file, default test |
... |
parameters for 'plot' function |
gif file
## Not run: data(list) gifmr(list) ## End(Not run)
Just integrate data according to fixed rt and fixed noise area
Integration(data, rt = c(8.3, 9), brt = c(8.3, 8.4), smoothit = TRUE)
data |
file should be a dataframe with the first column RT and second column intensity of the SIM ions. |
rt |
a rough RT range containing only one peak to get the area |
brt |
a rough RT range containing only one peak and enough noise to get the area |
smoothit |
logical, if using an average smooth box or not. If using, n will be used |
area integration data
## Not run: area <- Integration(data) ## End(Not run)
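The idea of integrating over a fixed retention time window against a baseline taken from a noise region can be sketched in base R. The helper integrate_sketch is hypothetical, estimates the baseline as the mean intensity in brt and uses the trapezoid rule; the package function may estimate the baseline and smooth the signal differently.

```r
# Hypothetical helper: baseline-corrected trapezoid area in a fixed RT window.
# data: first column RT, second column intensity.
integrate_sketch <- function(data, rt, brt) {
  x <- data[[1]]
  y <- data[[2]]
  baseline <- mean(y[x >= brt[1] & x <= brt[2]])   # noise level from the brt window
  sel <- x >= rt[1] & x <= rt[2]
  xs <- x[sel]
  ys <- y[sel] - baseline
  sum(diff(xs) * (head(ys, -1) + tail(ys, -1)) / 2) # trapezoid rule
}

# Synthetic Gaussian peak (height 100, sigma 0.03) on a constant baseline of 10
rtv <- seq(8, 9.5, by = 0.01)
sig <- 100 * exp(-(rtv - 8.7)^2 / (2 * 0.03^2)) + 10
area <- integrate_sketch(data.frame(rt = rtv, ins = sig),
                         rt = c(8.5, 8.9), brt = c(8.1, 8.3))
area
```

For this synthetic peak the result should be close to the analytical Gaussian area, 100 * 0.03 * sqrt(2 * pi).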
Demo data
data(list)
A list object with data, mass to charge ratio, retention time and group information. The list is generated from faahKO package by 'getmr' function.
filter data by average moving box
ma(x, n)
x |
a vector |
n |
A number to identify the size of the moving box. |
The filtered data
ma(rnorm(1000),5)
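A moving-box (moving average) filter can be written with stats::filter. The sketch below, ma_sketch, is a hypothetical stand-in that assumes a centred window and leaves NA at the edges; the package function may treat the edges differently.

```r
# Hypothetical stand-in for a moving-box filter: centred mean of n points
ma_sketch <- function(x, n) {
  as.numeric(stats::filter(x, rep(1 / n, n), sides = 2))
}

smoothed <- ma_sketch(c(1, 2, 3, 4, 5, 6, 7), 3)
smoothed
```

With a window of 3, each interior value is the mean of itself and its two neighbours, and the first and last positions are NA because the window is incomplete there.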
define the Mode function
Mode(x)
x |
vector |
Mode of the vector
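Base R has no function for the statistical mode (mode() returns the storage type), which is why a helper is needed. A minimal sketch of one common way to compute it follows; mode_sketch is a hypothetical name, and returning the first value on ties is an assumption.

```r
# Hypothetical helper: most frequent value; ties resolve to the first one seen
mode_sketch <- function(x) {
  ux <- unique(x)
  ux[which.max(tabulate(match(x, ux)))]
}

mode_sketch(c(2, 3, 3, 5, 5, 5, 7))
```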
Show MS/MS pmd annotation result
plotanno(anno, ...)
anno |
list from MSMS anno function |
... |
other parameter for plot function |
plot the calibration curve with error bar, r squared and equation.
plotcc(x, y, upper, lower = upper, ...)
x |
concentration |
y |
response |
upper |
upper error bar |
lower |
lower error bar |
... |
parameters for 'plot' function |
## Not run: plotcc(x,y,upper) ## End(Not run)
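The elements of such a calibration plot (linear fit, equation, r squared and error bars) can be reproduced in base R. The following is a sketch with invented concentrations, responses and error values, not the package's plotting code.

```r
conc <- c(1, 2, 5, 10, 20)               # invented concentrations
resp <- c(2.1, 3.9, 10.2, 19.8, 40.1)    # invented responses
err  <- rep(0.5, length(conc))           # invented symmetric error

fit <- lm(resp ~ conc)
intercept <- unname(coef(fit)[1])
slope     <- unname(coef(fit)[2])
r2        <- summary(fit)$r.squared
eq <- sprintf("y = %.3f x + %.3f, R^2 = %.4f", slope, intercept, r2)

plot(conc, resp, ylim = range(c(resp - err, resp + err)),
     xlab = "Concentration", ylab = "Response")
# Error bars drawn as double-headed flat arrows
arrows(conc, resp - err, conc, resp + err, angle = 90, code = 3, length = 0.05)
abline(fit)
legend("topleft", legend = eq, bty = "n")
```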
plot the density for multiple samples
plotden(data, lv, index = NULL, name = NULL, lwd = 1, ...)
data |
data row as peaks and column as samples |
lv |
group information |
index |
index for selected peaks |
name |
name on the figure for samples |
lwd |
the line width for density plot, default 1 |
... |
parameters for 'plot' function |
data(list) plotden(list$data, lv = as.character(list$group$sample_group),ylim = c(0,1))
plot density weighted intensity for multiple samples
plotdwtus(list, n = 512, ...)
list |
list with data as peaks list, mz, rt and group information |
n |
the number of equally spaced points at which the density is to be estimated, default 512 |
... |
parameters for 'plot' function |
Density weighted intensity for multiple samples
data(list) plotdwtus(list)
plot EIC and boxplot for all peaks and return diffreport
plote(xset, name = "test", test = "t", nonpara = "n", ...)
xset |
xcmsset object |
name |
filebase of the sub dir |
test |
't' means two-sample Welch t-test, 't.equalvar' means two-sample t-test with equal variance, 'wilcoxon' means Wilcoxon rank sum test, 'f' means F-test, 'pairt' means paired t-test, 'blockf' means two-way analysis of variance, default 't' |
nonpara |
'y' means using nonparametric ranked data, 'n' means original data |
... |
other parameters for 'diffreport' |
diffreport and pdf figure for EIC and boxplot
## Not run: library(faahKO) cdfpath <- system.file('cdf', package = 'faahKO') xset <- getdata(cdfpath, pmethod = ' ') plote(xset) ## End(Not run)
Plot the response group of GC-MS
plotgroup(data, threshold = 2)
data |
imported data matrix of GC-MS |
threshold |
the threshold of the response (log based 10) to separate the group |
list linear regression model for the data matrix
## Not run: data <- getmd(rawdata) plotgroup(data) ## End(Not run)
plot the density of the GC-MS data with EM algorithm to separate the data into two log normal distribution.
plothist(data)
data |
imported data matrix of GC-MS |
## Not run: matrix <- getmd(rawdata) plothist(matrix) ## End(Not run)
Plot the heatmap of mzrt profiles
plothm(data, lv, index = NULL)
data |
data row as peaks and column as samples |
lv |
group information |
index |
index for selected peaks |
data(list) plothm(list$data, lv = as.factor(list$group$sample_group))
plot the information of integration
plotint(list, name = NULL)
list |
list from getintegration |
name |
the title of the plot |
## Not run: list <- getintegration(rawdata) plotint(list) ## End(Not run)
plot the slope information of integration
plotintslope(list, name = NULL)
list |
list from getintegration |
name |
the title of the plot |
## Not run: list <- getintegration(rawdata) plotintslope(list) ## End(Not run)
plot the kendrick mass defect diagram
plotkms(data, cutoff = 1000)
data |
vector of intensities named by m/z |
cutoff |
remove the low intensity |
## Not run: mz <- c(10000,5000,20000,100,40000) names(mz) <- c(100.1022,245.2122,267.3144,400.1222,707.2294) plotkms(mz) ## End(Not run)
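For reference, the Kendrick mass rescales m/z so that a CH2 unit (exact mass 14.01565) weighs exactly 14, and the Kendrick mass defect is the difference between the Kendrick mass and its nominal value. A base-R sketch with a hypothetical helper, kmd_sketch, is given below; the sign convention used inside plotkms is an assumption.

```r
# Hypothetical helper: Kendrick mass defect on the CH2 scale
kmd_sketch <- function(mz) {
  km <- mz * 14 / 14.01565   # Kendrick mass: CH2 weighs exactly 14
  round(km) - km             # one common sign convention for the defect
}

mz <- c(100.1022, 245.2122, 267.3144, 400.1222, 707.2294)
data.frame(mz = mz, kmd = kmd_sketch(mz))

# Members of a CH2 homologous series share the same Kendrick mass defect
kmd_sketch(100.1022) - kmd_sketch(100.1022 + 14.01565)
```

The last expression is zero up to floating point error, which is why homologues line up horizontally in a Kendrick mass defect diagram.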
plot the scatter plot for peaks list with threshold
plotmr( list, rt = NULL, ms = NULL, inscf = 5, rsdcf = 30, imputation = "l", ... )
list |
list with data as peaks list, mz, rt and group information |
rt |
vector range of the retention time |
ms |
vector range of the m/z |
inscf |
Log intensity cutoff for peaks across samples. A peak is kept if its intensity exceeds the cutoff in any sample. Default 5 |
rsdcf |
the rsd cutoff of all peaks in all group, default 30 |
imputation |
parameters for 'getimputation' function method |
... |
parameters for 'plot' function |
data fit the cutoff
data(list) plotmr(list)
plot the diff scatter plot for peaks list with threshold between two groups
plotmrc(list, ms = c(100, 800), inscf = 5, rsdcf = 30, imputation = "l", ...)
list |
list with data as peaks list, mz, rt and group information |
ms |
the mass range to plot the data |
inscf |
Log intensity cutoff for peaks across samples. A peak is kept if its intensity exceeds the cutoff in any sample. Default 5 |
rsdcf |
the rsd cutoff of all peaks in all group |
imputation |
parameters for 'getimputation' function method |
... |
parameters for 'plot' function |
data(list) plotmrc(list)
plot GC/LC-MS data as a heatmap with TIC
plotms(data, log = FALSE)
data |
imported data matrix of GC-MS |
log |
transform the intensity into log based 10 |
heatmap
## Not run: library(faahKO) cdfpath <- system.file('cdf', package = 'faahKO') cdffiles <- list.files(cdfpath, recursive = TRUE, full.names = TRUE) matrix <- getmd(cdffiles[1]) png('test.png') plotms(matrix) dev.off() ## End(Not run)
Plot EIC of certain m/z and return dataframe for integration
plotmsrt(data, ms, rt, n = FALSE)
data |
imported data matrix of GC-MS |
ms |
m/z to be extracted |
rt |
vector range of the retention time |
n |
logical smooth or not |
dataframe with the first column RT and second column intensity of the SIM ions.
## Not run: matrix <- getmd(rawdata) plotmsrt(matrix,rt = c(500,1000),ms = 300) ## End(Not run)
plot GC/LC-MS data as scatter plot
plotmz(data, inscf = 5, ...)
data |
imported data matrix of GC-MS |
inscf |
Log intensity cutoff for peaks, default 5 |
... |
parameters for 'plot' function |
scatter plot
## Not run: library(faahKO) cdfpath <- system.file('cdf', package = 'faahKO') cdffiles <- list.files(cdfpath, recursive = TRUE, full.names = TRUE) matrix <- getmd(cdffiles[1]) png('test.png') plotmz(matrix) dev.off() ## End(Not run)
plot the PCA for multiple samples
plotpca( data, lv = NULL, index = NULL, center = TRUE, scale = TRUE, xrange = NULL, yrange = NULL, pch = NULL, ... )
data |
data row as peaks and column as samples |
lv |
group information |
index |
index for selected peaks |
center |
parameters for PCA |
scale |
parameters for scale |
xrange |
x axis range for return samples, default NULL |
yrange |
y axis range for return samples, default NULL |
pch |
default pch would be the first character of group information or samples name |
... |
other parameters for 'plot' function |
if xrange and yrange are not NULL, return file name of all selected samples on 2D score plot
data(list) plotpca(list$data, lv = as.character(list$group$sample_group))
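Because the data are stored with peaks as rows and samples as columns, a PCA of the samples needs the transposed matrix. The sketch below shows this with stats::prcomp on simulated data; the simulated matrix and group labels are invented, and the package may add further processing.

```r
set.seed(1)
# Simulated data: 50 peaks (rows) x 6 samples (columns), two groups
ins <- matrix(rnorm(300, mean = 10), nrow = 50,
              dimnames = list(NULL, paste0("s", 1:6)))
ins[1:10, 4:6] <- ins[1:10, 4:6] + 3     # group B differs in ten peaks
lv <- rep(c("A", "B"), each = 3)

# prcomp expects observations (samples) in rows, hence t()
pca <- prcomp(t(ins), center = TRUE, scale. = TRUE)
pct <- summary(pca)$importance[2, 1:2] * 100   # % variance of PC1 and PC2

plot(pca$x[, 1], pca$x[, 2], pch = substr(lv, 1, 1),
     xlab = sprintf("PC1 (%.1f%%)", pct[1]),
     ylab = sprintf("PC2 (%.1f%%)", pct[2]))
```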
plot intensity of peaks across samples or samples across peaks
plotpeak(data, lv = NULL, indexx = NULL, indexy = NULL, ...)
data |
matrix |
lv |
factor vector for the column |
indexx |
index for matrix row |
indexy |
index for matrix column |
... |
parameters for 'title' function |
parallel coordinates plot
data(list)
# selected peaks across samples
plotpeak(t(list$data), lv = as.factor(c(rep(1,5),rep(2,nrow(list$data)-5))),1:10,1:10)
# selected samples across peaks
plotpeak(list$data, lv = as.factor(list$group$sample_group),1:10,1:10)
plot ridgeline density plot
plotridge(data, lv = NULL, indexx = NULL, indexy = NULL, ...)
data |
matrix |
lv |
factor vector for the column |
indexx |
index for matrix row |
indexy |
index for matrix column |
... |
parameters for 'title' function |
ridgeline density plot
data(list) plotridge(t(list$data),indexy=c(1:10),xlab = 'Intensity',ylab = 'peaks') plotridge(log(list$data),as.factor(list$group$sample_group),xlab = 'Intensity',ylab = 'peaks')
Relative Log Abundance Ridge (RLAR) plots for samples or peaks
plotridges(data, lv, type = "g")
data |
data row as peaks and column as samples |
lv |
factor vector for the group information of samples |
type |
'g' means group median based, other means all samples median based. |
Relative Log Abundance Ridge (RLAR) plots
data(list) plotridges(list$data, as.factor(list$group$sample_group))
Relative Log Abundance (RLA) plots
plotrla(data, lv, type = "g", ...)
data |
data row as peaks and column as samples |
lv |
factor vector for the group information |
type |
'g' means group median based, other means all samples median based. |
... |
parameters for boxplot |
Relative Log Abundance (RLA) plots
data(list) plotrla(list$data, as.factor(list$group$sample_group))
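The two options of the type argument correspond to two ways of centring log intensities: on each peak's median within the sample's group, or on its median across all samples. A base-R sketch on simulated data follows; it is an illustration of the RLA idea and not the package code, and the simulated intensities and groups are invented.

```r
set.seed(1)
# Simulated intensities: 10 peaks (rows) x 4 samples (columns)
ins <- matrix(rlnorm(40, meanlog = 8), nrow = 10,
              dimnames = list(NULL, paste0("s", 1:4)))
lv <- factor(c("A", "A", "B", "B"))
logins <- log(ins)

# All-samples version: subtract each peak's median across all samples
rla_all <- logins - apply(logins, 1, median)

# Group version: subtract each peak's median within the sample's own group
rla_grp <- logins
for (g in levels(lv)) {
  idx <- lv == g
  grp <- logins[, idx, drop = FALSE]
  rla_grp[, idx] <- grp - apply(grp, 1, median)
}

boxplot(rla_grp, col = as.integer(lv) + 1, ylab = "Relative log abundance")
abline(h = 0, lty = 2)
```

Boxes that sit away from zero or are much wider than the others point to samples with systematic shifts or unusually high variation.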
plot the rsd influences of data in different groups
plotrsd(list, ms = c(100, 800), inscf = 5, rsdcf = 100, imputation = "l", ...)
list |
list with data as peaks list, mz, rt and group information |
ms |
the mass range to plot the data |
inscf |
Log intensity cutoff for peaks across samples. A peak is kept if its intensity exceeds the cutoff in any sample. Default 5 |
rsdcf |
the rsd cutoff of all peaks in all group |
imputation |
parameters for 'getimputation' function method |
... |
other parameters for 'plot' function |
data(list) plotrsd(list)
Plot mass spectrum of certain retention time and return mass spectrum vector (MSP file) for NIST search
plotrtms(data, rt, ms, msp = FALSE)
data |
imported data matrix of GC-MS |
rt |
vector range of the retention time |
ms |
vector range of the m/z |
msp |
logical, return MSP files or not, default FALSE |
plot, vector and MSP files for NIST search
## Not run: matrix <- getmd(rawdata) plotrtms(matrix,rt = c(500,1000),ms = c(300,500)) ## End(Not run)
plot 1-d density for multiple samples
plotrug(data, lv = NULL, indexx = NULL, indexy = NULL, ...)
data |
matrix |
lv |
factor vector for the column |
indexx |
index for matrix row |
indexy |
index for matrix column |
... |
parameters for 'title' function |
data(list) plotrug(list$data) plotrug(log(list$data), lv = as.factor(list$group$sample_group))
Plot the intensity distribution of GC-MS
plotsms(meanmatrix, rsdmatrix)
meanmatrix |
mean data matrix of GC-MS(n=5) |
rsdmatrix |
relative standard deviation matrix of GC-MS(n=5) |
## Not run:
data1 <- getmd('sample1-1')
data2 <- getmd('sample1-2')
data3 <- getmd('sample1-3')
data4 <- getmd('sample1-4')
data5 <- getmd('sample1-5')
# mean matrix of the five replicates
data <- (data1+data2+data3+data4+data5)/5
# sample standard deviation (n - 1 = 4) and relative standard deviation
datasd <- sqrt(((data1-data)^2+(data2-data)^2+(data3-data)^2+(data4-data)^2+(data5-data)^2)/4)
databrsd <- datasd/data
plotsms(data, databrsd)
## End(Not run)
Plot the background of data
plotsub(data)
data |
imported data matrix of GC-MS |
## Not run: matrix <- getmd(rawdata) plotsub(matrix) ## End(Not run)
plot GC-MS data as a heatmap for constant speed of temperature rising
plott(data, log = FALSE, temp = c(100, 320))
data |
imported data matrix of GC-MS |
log |
transform the intensity into log based 10 |
temp |
temperature range for constant speed |
heatmap
## Not run: matrix <- getmd(rawdata) plott(matrix) ## End(Not run)
Plot Total Ion Chromatogram (TIC)
plottic(data, n = FALSE)
data |
imported data matrix of GC-MS |
n |
logical smooth or not |
plot
## Not run: matrix <- getmd(rawdata) plottic(matrix) ## End(Not run)
Get the MIR from the file
qbatch(file, mz1, mz2, rt = c(8.65, 8.74), brt = c(8.74, 8.85))
file |
data file, CDF or other format supported by xcmsRaw |
mz1 |
the lowest mass |
mz2 |
the highest mass |
rt |
a rough RT range containing only one peak to get the area |
brt |
a rough RT range containing only one peak and enough noise to get the area |
arearatio
## Not run: arearatio <- qbatch(datafile) ## End(Not run)
Shiny application for interactive mass defect plots analysis
runMDPlot()
Shiny application for Short-Chain Chlorinated Paraffins analysis
runsccp()
A dataset containing the ions, formula, Cl
data(sccp)
A data frame with 24 rows and 8 variables:
Chlorine atom numbers
Carbon atom numbers
molecular formula
hydrogen atom numbers
[M-Cl]- ions
m/z for the isotopologues with highest intensity
abundance of the isotopologues with highest intensity
Chlorine contents
Get the differences of two GC/LC-MS data
submd(data1, data2, mzstep = 0.1, rtstep = 0.01)
data1 |
data file path of first data |
data2 |
data file path of second data |
mzstep |
the m/z step for generating matrix data from raw mass spectral data |
rtstep |
the alignment accuracy of retention time, e.g. 0.01 means the retention times of combined data should be the same at the accuracy 0.01s. Higher rtstep would return less scans for combined data |
list of four matrices with rows as scan time in seconds and columns as m/z: the first matrix refers to data1, the second to data2, the third to data1 - data2 and the fourth to data2 - data1; negative values are set to 0
## Not run: library(faahKO) cdfpath <- system.file('cdf', package = 'faahKO') cdffiles <- list.files(cdfpath, recursive = TRUE, full.names = TRUE) matrix <- submd(cdffiles[1],cdffiles[7]) ## End(Not run)
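The two one-sided differences in the return value, with negative values replaced by 0, can be illustrated on two small matrices. This is a base-R sketch with invented values; it assumes the matrices are already aligned on the same scan time and m/z grid, which the package handles through mzstep and rtstep.

```r
# Two already-aligned toy intensity matrices (rows: scans, columns: m/z bins)
a <- matrix(c(5, 0, 3, 8), nrow = 2)
b <- matrix(c(2, 4, 3, 1), nrow = 2)

a_minus_b <- pmax(a - b, 0)   # signal present in a beyond b, negatives clipped to 0
b_minus_a <- pmax(b - a, 0)   # signal present in b beyond a, negatives clipped to 0

a_minus_b
b_minus_a
```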
Plot the influences of DoE and Batch effects on each peaks
svabatch(df, dfsv, dfanova)
df |
data output from 'svacor' function |
dfsv |
data output from 'svaplot' function for corrected data |
dfanova |
data output from 'svaplot' function for raw data |
influences plot
## Not run: library(faahKO) cdfpath <- system.file("cdf", package = "faahKO") cdffiles <- list.files(cdfpath, recursive = TRUE, full.names = TRUE) xset <- xcmsSet(cdffiles) xset <- group(xset) xset2 <- retcor(xset, family = "symmetric", plottype = "mdevden") xset2 <- group(xset2, bw = 10) xset3 <- fillPeaks(xset2) df <- svacor(xset3) dfsv <- svaplot(xset3) dfanova <- svaplot(xset3, pqvalues = "anova") svabatch(df,dfsv,dfanova) ## End(Not run)
Surrogate variable analysis(SVA) to correct the unknown batch effects
svacor(xset, lv = NULL, method = "medret", intensity = "into")
xset |
xcmsset object |
lv |
group information |
method |
parameter for groupval function |
intensity |
parameter for groupval function |
this is used for the revised version of SVA to correct the unknown batch effects
list object with various components such as raw data, corrected data, signal part, random errors part, batch part, p-values, q-values, mass, rt, Posterior Probabilities of Surrogate variables and Posterior Probabilities of Mod. If no surrogate variable is found, the corresponding parts will be missing.
## Not run: library(faahKO) cdfpath <- system.file("cdf", package = "faahKO") cdffiles <- list.files(cdfpath, recursive = TRUE, full.names = TRUE) xset <- xcmsSet(cdffiles) xset <- group(xset) xset2 <- retcor(xset, family = "symmetric", plottype = "mdevden") xset2 <- group(xset2, bw = 10) xset3 <- fillPeaks(xset2) df <- svacor(xset3) ## End(Not run)
Filter the data with p value and q value
svadata(list, pqvalues = "sv", pt = 0.05, qt = 0.05)
list |
results from svacor function |
pqvalues |
method for ANOVA or SVA |
pt |
threshold for p value, default is 0.05 |
qt |
threshold for q value, default is 0.05 |
data, corrected data, mz and retention time for filtered data
## Not run: library(faahKO) cdfpath <- system.file("cdf", package = "faahKO") cdffiles <- list.files(cdfpath, recursive = TRUE, full.names = TRUE) xset <- xcmsSet(cdffiles) xset <- group(xset) xset2 <- retcor(xset, family = "symmetric", plottype = "mdevden") xset2 <- group(xset2, bw = 10) xset3 <- fillPeaks(xset2) df <- svacor(xset3) svadata(df) ## End(Not run)
Principal component analysis(PCA) for SVA corrected data and raw data
svapca(list, center = TRUE, scale = TRUE, lv = NULL)
list |
results from svacor function |
center |
parameters for PCA |
scale |
parameters for scale |
lv |
group information |
plot
## Not run: library(faahKO) cdfpath <- system.file("cdf", package = "faahKO") cdffiles <- list.files(cdfpath, recursive = TRUE, full.names = TRUE) xset <- xcmsSet(cdffiles) xset <- group(xset) xset2 <- retcor(xset, family = "symmetric", plottype = "mdevden") xset2 <- group(xset2, bw = 10) xset3 <- fillPeaks(xset2) df <- svacor(xset3) svapca(df) ## End(Not run)
Filter the data with p value and q value and show them
svaplot(list, pqvalues = "sv", pt = 0.05, qt = 0.05, lv = NULL, index = NULL)
list |
results from svacor function |
pqvalues |
method for ANOVA or SVA |
pt |
threshold for p value, default is 0.05 |
qt |
threshold for q value, default is 0.05 |
lv |
group information |
index |
index for selected peaks |
heatmap for the data
## Not run: library(faahKO) cdfpath <- system.file("cdf", package = "faahKO") cdffiles <- list.files(cdfpath, recursive = TRUE, full.names = TRUE) xset <- xcmsSet(cdffiles) xset <- group(xset) xset2 <- retcor(xset, family = "symmetric", plottype = "mdevden") xset2 <- group(xset2, bw = 10) xset3 <- fillPeaks(xset2) df <- svacor(xset3) svaplot(df) ## End(Not run)
Get the corrected data after SVA for MetaboAnalyst
svaupload(xset, lv = NULL)
xset |
xcmsset object |
lv |
group information |
csv files for both raw and corrected data for metaboanalyst if SVA could be applied
## Not run: library(faahKO) cdfpath <- system.file("cdf", package = "faahKO") cdffiles <- list.files(cdfpath, recursive = TRUE, full.names = TRUE) xset <- xcmsSet(cdffiles) xset <- group(xset) xset2 <- retcor(xset, family = "symmetric", plottype = "mdevden") xset2 <- group(xset2, bw = 10) xset3 <- fillPeaks(xset2) svaupload(xset3) ## End(Not run)
Demo data for TBBPA metabolism in Pumpkin
data(TBBPA)
A list object with data, mass to charge ratio, retention time and group information. The peaks list of three pumpkin seedling root samples was extracted by xcms online.
Hou, X., Yu, M., Liu, A., Wang, X., Li, Y., Liu, J., Schnoor, J.L., Jiang, G., 2019. Glycosylation of Tetrabromobisphenol A in Pumpkin. Environ. Sci. Technol. https://doi.org/10.1021/acs.est.9b02122
Write MSP file for NIST search
writeMSP(list, name = "unknown", sep = FALSE)
list |
a list with spectra information |
name |
name of the compounds |
sep |
numeric or logical the numbers of spectra in each file and FALSE to include all of the spectra in one msp file |
None; an MSP file will be created.
## Not run: ins <- c(10000,20000,10000,30000,5000) mz <- c(101,143,189,221,234) writeMSP(list(list(spectra = cbind.data.frame(mz,ins))), name = 'test') ## End(Not run)
Perform MS/MS X rank annotation for mgf file
xrankanno(file, db = NULL, ppm = 10, prems = 1.1, intc = 0.1, quantile = 0.75)
file |
mgf file generated from MS/MS data |
db |
database could be list object from 'getms2pmd' |
ppm |
mass accuracy, default 10 |
prems |
precursor mass range, default 1.1 to include M+H or M-H |
intc |
intensity cutoff for peaks. Default 0.1 |
quantile |
X rank quantiles cutoff for annotation. Default 0.75. |
list with MSMS annotation results