This package provides the barcode, UMI, and set (BUS) format of the following datasets from 10x Genomics:
The original FASTQ files have already been processed into the BUS format, which is a table with the following columns: barcode, UMI, equivalence class/set, and count (i.e., the number of reads with the same barcode, UMI, and set). The datasets have been uploaded to ExperimentHub. This vignette demonstrates how to download the first dataset above with this package. See the BUSpaRse website for more detailed vignettes.
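To make the four-column layout concrete, here is a minimal sketch of a BUS table as a base R data frame. The barcodes, UMIs, set indices, and counts are invented for illustration and do not come from any of the datasets; the column names are also just a convenient choice.

```r
# Toy BUS records; every value here is invented for illustration.
bus <- data.frame(
  barcode = c("AAACCTGAGAAACCAT", "AAACCTGAGAAACCAT", "AAACCTGAGAAACCTA"),
  umi     = c("GGCTTTGCAT", "TTAGCCGGTA", "CGTACGTTAG"),
  ec      = c(12L, 12L, 305L),  # equivalence class (set) index
  count   = c(3L, 1L, 2L),      # reads sharing the same barcode, UMI, and set
  stringsAsFactors = FALSE
)

# Total number of reads per barcode
reads_per_barcode <- tapply(bus$count, bus$barcode, sum)

# Number of distinct UMIs per barcode
umis_per_barcode <- tapply(bus$umi, bus$barcode, function(x) length(unique(x)))
```

Each row is one unique combination of barcode, UMI, and set, so per-barcode summaries reduce to simple group-wise operations on this table.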
library(TENxBUSData)
library(ExperimentHub)
#> Loading required package: BiocGenerics
#> Loading required package: parallel
#>
#> Attaching package: 'BiocGenerics'
#> The following objects are masked from 'package:parallel':
#>
#> clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
#> clusterExport, clusterMap, parApply, parCapply, parLapply,
#> parLapplyLB, parRapply, parSapply, parSapplyLB
#> The following objects are masked from 'package:stats':
#>
#> IQR, mad, sd, var, xtabs
#> The following objects are masked from 'package:base':
#>
#> Filter, Find, Map, Position, Reduce, anyDuplicated, append,
#> as.data.frame, basename, cbind, colnames, dirname, do.call,
#> duplicated, eval, evalq, get, grep, grepl, intersect,
#> is.unsorted, lapply, mapply, match, mget, order, paste, pmax,
#> pmax.int, pmin, pmin.int, rank, rbind, rownames, sapply,
#> setdiff, sort, table, tapply, union, unique, unsplit, which,
#> which.max, which.min
#> Loading required package: AnnotationHub
#> Loading required package: BiocFileCache
#> Loading required package: dbplyr
See which datasets are available with this package.
eh <- ExperimentHub()
#> snapshotDate(): 2019-10-22
listResources(eh, "TENxBUSData")
#> [1] "100 1:1 Mixture of Fresh Frozen Human (HEK293T) and Mouse (NIH3T3) Cells"
#> [2] "1k 1:1 Mixture of Fresh Frozen Human (HEK293T) and Mouse (NIH3T3) Cells (v3 chemistry)"
#> [3] "1k PBMCs from a Healthy Donor (v3 chemistry)"
#> [4] "10k Brain Cells from an E18 Mouse (v3 chemistry)"
#> [5] "SRR8599150 Mouse Retina"
In this vignette, we download the 100-cell dataset. The force argument forces a re-download even if the files are already present.
TENxBUSData(".", dataset = "hgmm100", force = TRUE)
#> snapshotDate(): 2019-10-22
#> see ?TENxBUSData and browseVignettes('TENxBUSData') for documentation
#> downloading 1 resources
#> retrieving 1 resource
#> loading from cache
#> The downloaded files are in /tmp/RtmpgwFWoA/Rbuild58903a40c0b0/TENxBUSData/vignettes/out_hgmm100
#> [1] "/tmp/RtmpgwFWoA/Rbuild58903a40c0b0/TENxBUSData/vignettes/out_hgmm100"
Which files are downloaded?
list.files("./out_hgmm100")
#> [1] "matrix.ec" "output.sorted" "output.sorted.txt"
#> [4] "transcripts.txt"
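As a rough sketch of how the text version of the BUS file could be inspected, the example below reads a small stand-in file with base R. It assumes that output.sorted.txt is tab-separated, has no header, and lists the columns in the order barcode, UMI, set, count; the column names and the toy rows are assumptions made for this illustration, and the temporary file stands in for "./out_hgmm100/output.sorted.txt".

```r
# Stand-in for "./out_hgmm100/output.sorted.txt"; the rows are invented.
bus_file <- tempfile(fileext = ".txt")
writeLines(c(
  "AAACCTGAGAAACCAT\tGGCTTTGCAT\t12\t3",
  "AAACCTGAGAAACCAT\tTTAGCCGGTA\t12\t1",
  "AAACCTGAGAAACCTA\tCGTACGTTAG\t305\t2"
), bus_file)

# Assumed layout: tab-separated, no header, columns in BUS order.
bus <- read.table(
  bus_file, sep = "\t", header = FALSE,
  col.names = c("barcode", "umi", "ec", "count"),
  colClasses = c("character", "character", "integer", "integer")
)

str(bus)
```

Reading barcodes and UMIs as character keeps them from being converted to factors or numbers. For the real file, replace bus_file with the path to output.sorted.txt; the full table may be large.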
These files should be sufficient to construct a sparse matrix with the BUSpaRse package.
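To show how the remaining two files relate to the set column, here is a hedged base R sketch that maps each set index to transcript names. It assumes that each line of matrix.ec holds a set index, a tab, and a comma-separated list of 0-based transcript indices, and that transcripts.txt lists one transcript ID per line in index order. The toy files and the transcript IDs are hypothetical, and this is not how BUSpaRse builds its matrix.

```r
# Toy stand-ins for matrix.ec and transcripts.txt; contents are invented.
ec_file <- tempfile()
tx_file <- tempfile()
writeLines(c("0\t0", "1\t1", "2\t0,1"), ec_file)
writeLines(c("ENST_A", "ENST_B"), tx_file)  # hypothetical transcript IDs

transcripts <- readLines(tx_file)
ec <- read.table(
  ec_file, sep = "\t", header = FALSE,
  col.names = c("ec", "tx_idx"),
  colClasses = c("integer", "character")
)

# Transcript indices are assumed to be 0-based, so add 1 for R indexing.
ec_to_tx <- lapply(
  strsplit(ec$tx_idx, ",", fixed = TRUE),
  function(i) transcripts[as.integer(i) + 1L]
)
names(ec_to_tx) <- ec$ec
```

With this lookup, ec_to_tx[["2"]] gives the transcripts compatible with set 2 in the toy data. Collapsing sets to genes and counting UMIs is the part that BUSpaRse handles.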