bigPint 1.2.2
The bigPint
package allows users to superimpose a subset of the dataset onto the full dataset. If the bigPint
package is being applied to RNA-seq data, then this subset of genes is often differentially expressed genes (DEGs). Except for the plotSMApp()
function, all functions offer users three options for superimposing data subsets in the bigPint
package. We briefly discuss these options below.
The dataMetrics
input is NULL by default in bigPint
package functions. However, a user can create a dataMetrics
object as is explained in the article Creating data metrics object. If a user does input a dataMetrics
object, then two other input parameters will be used, threshVar
and threshVal
. These two inputs will be used to create the data subset from the dataMetrics
input. Below are their definitions in the help files of bigPint
package functions:
CHARACTER STRING
Name of column in dataMetrics object used to threshold significance; default “FDR”INTEGER
Maximum value to threshold significance from threshVar object; default 0.05Below is an example of superimposing genes that have an FDR value less than 1e-10 (Lauter and Graham 2016).
library(bigPint)
data("soybean_ir_sub")
data("soybean_ir_sub_metrics")
soybean_ir_sub[,-1] <- log(soybean_ir_sub[,-1] + 1)
ret <- plotSM(data=soybean_ir_sub, dataMetrics=soybean_ir_sub_metrics,
threshVar = "FDR", threshVal = 1e-10, pointSize = 0.1, saveFile = FALSE)
ret[["N_P"]]
We can alternatively use the geneList
input object to superimpose a subset of the data onto the full data frame. The geneList
object is NULL by default. However, the user can set it to be equal to the list of the IDs that should be superimposed. For example, we can achieve the same plot above by using the following code.
library(dplyr)
sigGenes = soybean_ir_sub_metrics[["N_P"]] %>% filter(FDR < 1e-10) %>% select(ID)
sigGenes = sigGenes[,1]
ret <- plotSM(data=soybean_ir_sub, geneList = sigGenes, pointSize = 0.1,
saveFile = FALSE)
ret[["N_P"]]
We note that the geneList
object is more flexible than the dataMetrics
object. This is because the dataMetrics
object can only create the subset of data by thresholding one quantitative variable. However, the geneList
object can be created in many more ways. For example, below we can examine genes that have an FDR value less than 1e-10 and a log fold change value greater than the absolute value of 6.
library(dplyr)
sigGenes = soybean_ir_sub_metrics[["N_P"]] %>% filter(FDR < 1e-10) %>%
filter(abs(logFC) > 6) %>% select(ID)
sigGenes = sigGenes[,1]
ret <- plotSM(data=soybean_ir_sub, geneList = sigGenes, pointSize = 0.5,
pointColor = "magenta", saveFile = FALSE)
ret[["N_P"]]
Because of this, if both dataMetrics
and geneList
are both not their default NULL value, then geneList
will take priority and dataMetrics
will be ignored.
The last possibility is to leave both dataMetrics
and geneList
as their default NULL value. This will allow a user to examine the distribution of the full dataset without superimposing any subset of data. We end with an example of this technique.
ret <- plotSM(data=soybean_ir_sub, saveFile = FALSE)
ret[["N_P"]]
Lauter, AN Moran, and MA Graham. 2016. “NCBI Sra Bioproject Accession: PRJNA318409.”