Author: Zuguang Gu ( z.gu@dkfz.de )
Date: 2016-05-15
The annotation graphics actually are quite general. The only common characteristic for annotations
is that they are aligned to the columns or rows of the heatmap. Here there is a HeatmapAnnotation
class which is used to
define annotations on columns or rows.
A simple annotation is defined as a vector which contains discrete classes or continuous values.
Since the simple annotation is represented as a vector, multiple simple annotations can be specified
as a data frame. Colors for the simple annotations can be specified by col
with a vector or
color mapping functions, depending on whether the simple annotations are discrete or continuous.
In the heatmap, simple annotations will be represented as rows of grids.
There is a draw()
method for the HeatmapAnnotation
class. draw()
is used internally and here
we just use it for illustration.
library(ComplexHeatmap)
library(circlize)
df = data.frame(type = c(rep("a", 5), rep("b", 5)))
ha = HeatmapAnnotation(df = df)
ha
## A HeatmapAnnotation object with 1 annotation.
##
## An annotation with discrete color mapping
## name: type
## position: column
## show legend: TRUE
draw(ha, 1:10)
The color of simple annotation should be specified as a list with names for which names in the color list (here it is type
in following example)
correspond to the names in the data frame. Each color vector should better has names as well to map to
the levels of annotations.
ha = HeatmapAnnotation(df = df, col = list(type = c("a" = "red", "b" = "blue")))
ha
## A HeatmapAnnotation object with 1 annotation.
##
## An annotation with discrete color mapping
## name: type
## position: column
## show legend: TRUE
draw(ha, 1:10)
For continuous annotation, colors should be a color mapping function.
ha = HeatmapAnnotation(df = data.frame(age = sample(1:20, 10)),
col = list(age = colorRamp2(c(0, 20), c("white", "red"))))
ha
## A HeatmapAnnotation object with 1 annotation.
##
## An annotation with continuous color mapping
## name: age
## position: column
## show legend: TRUE
draw(ha, 1:10)
Color for NA
can be set by na_col
:
df2 = data.frame(type = c(rep("a", 5), rep("b", 5)),
age = sample(1:20, 10))
df2$type[5] = NA
df2$age[5] = NA
ha = HeatmapAnnotation(df = df2,
col = list(type = c("a" = "red", "b" = "blue"),
age = colorRamp2(c(0, 20), c("white", "red"))),
na_col = "grey")
draw(ha, 1:10)
Put more than one annotations by a data frame.
df = data.frame(type = c(rep("a", 5), rep("b", 5)),
age = sample(1:20, 10))
ha = HeatmapAnnotation(df = df,
col = list(type = c("a" = "red", "b" = "blue"),
age = colorRamp2(c(0, 20), c("white", "red")))
)
ha
## A HeatmapAnnotation object with 2 annotations.
##
## An annotation with continuous color mapping
## name: age
## position: column
## show legend: TRUE
##
## An annotation with discrete color mapping
## name: type
## position: column
## show legend: TRUE
draw(ha, 1:10)
Also individual annotations can be specified as vectors:
ha = HeatmapAnnotation(type = c(rep("a", 5), rep("b", 5)),
age = sample(1:20, 10),
col = list(type = c("a" = "red", "b" = "blue"),
age = colorRamp2(c(0, 20), c("white", "red")))
)
ha
## A HeatmapAnnotation object with 2 annotations.
##
## An annotation with continuous color mapping
## name: age
## position: column
## show legend: TRUE
##
## An annotation with discrete color mapping
## name: type
## position: column
## show legend: TRUE
draw(ha, 1:10)
To put column annotation to the heatmap, specify top_annotation
and bottom_annotation
in Heatmap()
.
ha1 = HeatmapAnnotation(df = df,
col = list(type = c("a" = "red", "b" = "blue"),
age = colorRamp2(c(0, 20), c("white", "red")))
)
ha2 = HeatmapAnnotation(df = data.frame(age = sample(1:20, 10)),
col = list(age = colorRamp2(c(0, 20), c("white", "red"))))
set.seed(123)
mat = matrix(rnorm(80, 2), 8, 10)
mat = rbind(mat, matrix(rnorm(40, -2), 4, 10))
rownames(mat) = paste0("R", 1:12)
colnames(mat) = paste0("C", 1:10)
Heatmap(mat, top_annotation = ha1, bottom_annotation = ha2)
Besides simple annotations, there are complex annotations. The complex annotations are always represented as self-defined graphic functions. Actually, for each column annotation, there will be a viewport created waiting for graphics. The annotation function here defines how to put the graphics to this viewport. The only argument of the function is an index of column which is already adjusted by column clustering.
In following example, an annotation of points is created. Please note how we define xscale
so that positions
of points correspond to middle points of the columns if the annotation is added to the heatmap.
value = rnorm(10)
column_anno = function(index) {
n = length(index)
# since middle of columns are in 1, 2, ..., n and each column has width 1
# then the most left should be 1 - 0.5 and the most right should be n + 0.5
pushViewport(viewport(xscale = c(0.5, n + 0.5), yscale = range(value)))
# since order of columns will be adjusted by clustering, here we also
# need to change the order by `[index]`
grid.points(index, value[index], pch = 16, default.unit = "native")
# this is very important in order not to mess up the layout
upViewport()
}
ha = HeatmapAnnotation(points = column_anno) # here the name is arbitrary
ha
## A HeatmapAnnotation object with 1 annotation.
##
## An annotation with self-defined function
## name: points
## position: column
draw(ha, 1:10)
Above code is only for demonstration. You don't realy need to define a points annotation,
there are already several annotation generators provided in the package such as anno_points()
or anno_barplot()
which generate such complex annotation function:
anno_points()
anno_barplot()
anno_boxplot()
anno_histogram()
anno_density()
anno_text()
The input value for these anno_*
functions is quite straightforward. It should be a numeric vector
(e.g. for anno_points()
and anno_barplot()
), a matrix or list (for anno_boxplot()
, anno_histogram()
or anno_density()
), or a character vector (for anno_text()
).
ha = HeatmapAnnotation(points = anno_points(value))
draw(ha, 1:10)
ha = HeatmapAnnotation(barplot = anno_barplot(value))
draw(ha, 1:10)
anno_boxplot()
generates boxplot for each column in the matrix.
ha = HeatmapAnnotation(boxplot = anno_boxplot(mat))
draw(ha, 1:10)
You can mix simple annotaitons and complex annotations:
ha = HeatmapAnnotation(df = df,
points = anno_points(value),
col = list(type = c("a" = "red", "b" = "blue"),
age = colorRamp2(c(0, 20), c("white", "red"))))
ha
## A HeatmapAnnotation object with 3 annotations.
##
## An annotation with self-defined function
## name: points
## position: column
##
## An annotation with continuous color mapping
## name: age
## position: column
## show legend: TRUE
##
## An annotation with discrete color mapping
## name: type
## position: column
## show legend: TRUE
draw(ha, 1:10)
Since simple annotations can also be specified as vectors, actually you arrange annotations in any order:
ha = HeatmapAnnotation(type = c(rep("a", 5), rep("b", 5)),
points = anno_points(value),
age = sample(1:20, 10),
bars = anno_barplot(value),
col = list(type = c("a" = "red", "b" = "blue"),
age = colorRamp2(c(0, 20), c("white", "red"))))
ha
## A HeatmapAnnotation object with 4 annotations.
##
## An annotation with self-defined function
## name: bars
## position: column
##
## An annotation with continuous color mapping
## name: age
## position: column
## show legend: TRUE
##
## An annotation with self-defined function
## name: points
## position: column
##
## An annotation with discrete color mapping
## name: type
## position: column
## show legend: TRUE
draw(ha, 1:10)
For some of the anno_*
functions, graphic parameters can be set by gp
argument.
Also note how we specify baseline
in anno_barplot()
.
ha = HeatmapAnnotation(barplot1 = anno_barplot(value, baseline = 0, gp = gpar(fill = ifelse(value > 0, "red", "green"))),
points = anno_points(value, gp = gpar(col = rep(1:2, 5))),
barplot2 = anno_barplot(value, gp = gpar(fill = rep(3:4, 5))))
ha
## A HeatmapAnnotation object with 3 annotations.
##
## An annotation with self-defined function
## name: barplot2
## position: column
##
## An annotation with self-defined function
## name: points
## position: column
##
## An annotation with self-defined function
## name: barplot1
## position: column
draw(ha, 1:10)
If there are more than one annotations, you can control height of each annotation by annotation_height
.
The value of annotation_height
can either be numeric values or unit
objects.
# set annotation height as relative values
ha = HeatmapAnnotation(df = df, points = anno_points(value), boxplot = anno_boxplot(mat),
col = list(type = c("a" = "red", "b" = "blue"),
age = colorRamp2(c(0, 20), c("white", "red"))),
annotation_height = c(1, 2, 3, 4))
draw(ha, 1:10)
# set annotation height as absolute units
ha = HeatmapAnnotation(df = df, points = anno_points(value), boxplot = anno_boxplot(mat),
col = list(type = c("a" = "red", "b" = "blue"),
age = colorRamp2(c(0, 20), c("white", "red"))),
annotation_height = unit.c((unit(1, "npc") - unit(4, "cm"))*0.5, (unit(1, "npc") - unit(4, "cm"))*0.5,
unit(2, "cm"), unit(2, "cm")))
draw(ha, 1:10)
With the annotation constructed, you can assign to the heatmap either by top_annotation
or bottom_annotation
.
Also you can control the size of total column annotations by top_annotation_height
and bottom_annotation_height
if the height of the annotations are relative values.
If the annotation has proper size (high enough), it would be helpful to add axis on it. anno_points()
, anno_barplot()
and anno_boxplot()
support axes. Please note we didn't pre-allocate space for axes particularly,
we only assume there are already empty spaces for showing axes.
ha = HeatmapAnnotation(df = df, points = anno_points(value),
col = list(type = c("a" = "red", "b" = "blue"),
age = colorRamp2(c(0, 20), c("white", "red"))))
ha_boxplot = HeatmapAnnotation(boxplot = anno_boxplot(mat, axis = TRUE))
Heatmap(mat, name = "foo", top_annotation = ha, bottom_annotation = ha_boxplot,
bottom_annotation_height = unit(3, "cm"))
Gaps below each annotation can be specified by gap
in HeatmapAnnotation()
.
ha = HeatmapAnnotation(df = df, points = anno_points(value), gap = unit(c(2, 4), "mm"),
col = list(type = c("a" = "red", "b" = "blue"),
age = colorRamp2(c(0, 20), c("white", "red"))))
Heatmap(mat, name = "foo", top_annotation = ha)
You can suppress some of the annotation legend by specifying show_legend
to FALSE
when creating the HeatmapAnnotation
object.
ha = HeatmapAnnotation(df = df, show_legend = c(FALSE, TRUE),
col = list(type = c("a" = "red", "b" = "blue"),
age = colorRamp2(c(0, 20), c("white", "red"))))
Heatmap(mat, name = "foo", top_annotation = ha)
More types of annotations which show data distribution in corresponding columns are supported
by anno_histogram()
and anno_density()
.
ha_mix_top = HeatmapAnnotation(histogram = anno_histogram(mat, gp = gpar(fill = rep(2:3, each = 5))),
density_line = anno_density(mat, type = "line", gp = gpar(col = rep(2:3, each = 5))),
violin = anno_density(mat, type = "violin", gp = gpar(fill = rep(2:3, each = 5))),
heatmap = anno_density(mat, type = "heatmap"))
Heatmap(mat, name = "foo", top_annotation = ha_mix_top, top_annotation_height = unit(8, "cm"))
Text is also one of the annotaiton graphics. anno_text()
supports adding text as heatmap annotations. With this annotation
function, it is easy to simulate column names with rotations.
Note you need to calcualte the space for the text annotations by hand and the package doesn't garentee
that all the rotated text are shown in the plot (In following figure, if row names and legend are not drawn,
'C10C10C10' will show completely, but there are some tricks which can be found in the Examples vignette).
long_cn = do.call("paste0", rep(list(colnames(mat)), 3)) # just to construct long text
ha_rot_cn = HeatmapAnnotation(text = anno_text(long_cn, rot = 45, just = "left", offset = unit(2, "mm")))
Heatmap(mat, name = "foo", top_annotation = ha_rot_cn, top_annotation_height = unit(2, "cm"))
Row annotation is also defined by the HeatmapAnnotation
class, but with specifying
which
to row
.
df = data.frame(type = c(rep("a", 6), rep("b", 6)))
ha = HeatmapAnnotation(df = df, col = list(type = c("a" = "red", "b" = "blue")),
which = "row", width = unit(1, "cm"))
draw(ha, 1:12)
There is a helper function rowAnnotation()
which is same as HeatmapAnnotation(..., which = "row")
.
ha = rowAnnotation(df = df, col = list(type = c("a" = "red", "b" = "blue")), width = unit(1, "cm"))
anno_*
functions also works for row annotations, by you need to add which = "row"
in the function.
E.g:
ha = rowAnnotation(points = anno_points(runif(10), which = "row"))
Similar as rowAnnotation()
, there are corresponding wrapper anno_*
functions. There functions
are almost same as the original functions except pre-defined which
argument to row
.
row_anno_points()
row_anno_barplot()
row_anno_boxplot()
row_anno_histogram()
row_anno_density()
row_anno_text()
Similar, there can be more than one row annotations.
ha_combined = rowAnnotation(df = df, boxplot = row_anno_boxplot(mat),
col = list(type = c("a" = "red", "b" = "blue")),
annotation_width = c(1, 3))
draw(ha_combined, 1:12)
Essentially, row annotations and column annotations are identical graphics, but in practice, there is some difference. In ComplexHeatmap package, row annotations have the same place as the heatmap while column annotations are just like accessory components of heatmaps. The idea here is that row annotations can be corresponded to all the heatmaps in the list while column annotations can only be corresponded to its own heatmap. For row annotations, similar as heatmaps, you can append the row annotations to heatmap or heatmap list or even row annotation object itself. The order of elements in row annotations will be adjusted by the clustering of heatmaps.
ha = rowAnnotation(df = df, col = list(type = c("a" = "red", "b" = "blue")),
width = unit(1, "cm"))
ht1 = Heatmap(mat, name = "ht1")
ht1 = Heatmap(mat, name = "ht2")
ht1 + ha + ht2
## Warning in .local(object, ...): Heatmap/row annotaiton names are duplicated: ht2
If km
or split
is set in the main heatmap, the row annotations are
splitted as well.
ht1 = Heatmap(mat, name = "ht1", km = 2)
ha = rowAnnotation(df = df, col = list(type = c("a" = "red", "b" = "blue")),
boxplot = row_anno_boxplot(mat, axis = TRUE),
width = unit(6, "cm"))
ha + ht1
When row split is applied, graphical parameters for annotation function can be specified as with the same length as the number of row slices.
ha = rowAnnotation(boxplot = row_anno_boxplot(mat, gp = gpar(fill = c("red", "blue"))),
width = unit(2, "cm"))
ha + ht1
Since only row clustering and row titles for the main heatmap are kept, they can be adjusted to the most left or right side
of the plot by setting row_hclust_side
and row_sub_title_side
:
draw(ha + ht1, row_dend_side = "left", row_sub_title_side = "right")
Self-defining row annotations is same as self-defining column annotations. The only
difference is that x coordinate and y coordinate are switched. If row annotations
are split by rows, the argument index
will automatically be the index in the 'current' row slice.
value = rowMeans(mat)
row_anno = function(index) {
n = length(index)
pushViewport(viewport(xscale = range(value), yscale = c(0.5, n + 0.5)))
grid.rect()
# recall row order will be adjusted, here we specify `value[index]`
grid.points(value[index], seq_along(index), pch = 16, default.unit = "native")
upViewport()
}
ha = rowAnnotation(points = row_anno, width = unit(1, "cm"))
ht1 + ha
For the self-defined annotation function, there can be a second argument k
which gives the index of 'current' row slice.
row_anno = function(index, k) {
n = length(index)
col = c("blue", "red")[k]
pushViewport(viewport(xscale = range(value), yscale = c(0.5, n + 0.5)))
grid.rect()
grid.points(value[index], seq_along(index), pch = 16, default.unit = "native", gp = gpar(col = col))
upViewport()
}
ha = rowAnnotation(points = row_anno, width = unit(1, "cm"))
ht1 + ha
If you only want to visualize meta data of your matrix, you can set the matrix with zero row. In this case, only one heatmap is allowed.
ha = HeatmapAnnotation(df = data.frame(value = runif(10), type = rep(letters[1:2], 5)),
barplot = anno_barplot(runif(10)),
points = anno_points(runif(10)))
zero_row_mat = matrix(nrow = 0, ncol = 10)
colnames(zero_row_mat) = letters[1:10]
Heatmap(zero_row_mat, top_annotation = ha, column_title = "only annotations")
This feature is very useful if you want to compare multiple metrics. Axes and labels in following plot are added by heatmap decoration. Also notice how we adjust paddings of the plotting regions to give enough space for hte axis labels.
ha = HeatmapAnnotation(df = data.frame(value = runif(10), type = rep(letters[1:2], 5)),
barplot = anno_barplot(runif(10), axis = TRUE),
points = anno_points(runif(10), axis = TRUE),
annotation_height = unit(c(0.5, 0.5, 4, 4), "cm"))
zero_row_mat = matrix(nrow = 0, ncol = 10)
colnames(zero_row_mat) = letters[1:10]
ht = Heatmap(zero_row_mat, top_annotation = ha, column_title = "only annotations")
draw(ht, padding = unit(c(2, 20, 2, 2), "mm"))
decorate_annotation("value", {grid.text("value", unit(-2, "mm"), just = "right")})
decorate_annotation("type", {grid.text("type", unit(-2, "mm"), just = "right")})
decorate_annotation("barplot", {
grid.text("barplot", unit(-10, "mm"), just = "bottom", rot = 90)
grid.lines(c(0, 1), unit(c(0.2, 0.2), "native"), gp = gpar(lty = 2, col = "blue"))
})
decorate_annotation("points", {
grid.text("points", unit(-10, "mm"), just = "bottom", rot = 90)
})
If no heatmap is needed to draw and users only want to arrange a list of row annotations, an empty matrix with no column can be added to the heatmap list. Within the zero-column matrix, you can either split row annotaitons:
ha_boxplot = rowAnnotation(boxplot = row_anno_boxplot(mat), width = unit(3, "cm"))
ha = rowAnnotation(df = df, col = list(type = c("a" = "red", "b" = "blue")), width = unit(2, "cm"))
text = paste0("row", seq_len(nrow(mat)))
ha_text = rowAnnotation(text = row_anno_text(text), width = max_text_width(text))
nr = nrow(mat)
Heatmap(matrix(nrow = nr, ncol = 0), split = sample(c("A", "B"), nr, replace = TRUE)) +
ha_boxplot + ha + ha_text
or add dendrograms to the row annotations:
dend = hclust(dist(mat))
Heatmap(matrix(nrow = nr, ncol = 0), cluster_rows = dend) +
ha_boxplot + ha + ha_text
Remember it is not allowed to only concantenate row annotations because row annotations don't provide information of number of rows.
Finally, if your row annotations are simple annotations, I recommand to use heatmap instead. Following two methods generate similar figures.
df = data.frame(type = c(rep("a", 6), rep("b", 6)))
Heatmap(mat) + rowAnnotation(df = df, col = list(type = c("a" = "red", "b" = "blue")),
width = unit(1, "cm"))
Heatmap(mat) + Heatmap(df, name = "type", col = c("a" = "red", "b" = "blue"),
width = unit(1, "cm"))
Axes for complex annotations are important to show range and direction of the data. anno_*
functions
provide axis
and axis_side
arguments to control the axes.
ha1 = HeatmapAnnotation(b1 = anno_boxplot(mat, axis = TRUE),
p1 = anno_points(colMeans(mat), axis = TRUE))
ha2 = rowAnnotation(b2 = row_anno_boxplot(mat, axis = TRUE),
p2 = row_anno_points(rowMeans(mat), axis = TRUE), width = unit(2, "cm"))
Heatmap(mat, top_annotation = ha1, top_annotation_height = unit(2, "cm")) + ha2
For row annotations, by default direction of the data is from left to right. But it may confuse people
if the row annotation is placed on the left of the heatmap. You can change axis directions for row annotations
by axis_direction
. Compare following two plots:
pushViewport(viewport(layout = grid.layout(nr = 1, nc = 2)))
pushViewport(viewport(layout.pos.row = 1, layout.pos.col = 1))
ha = rowAnnotation(boxplot = row_anno_boxplot(mat, axis = TRUE), width = unit(3, "cm"))
ht_list = ha + Heatmap(mat)
draw(ht_list, column_title = "normal axis direction", newpage = FALSE)
upViewport()
pushViewport(viewport(layout.pos.row = 1, layout.pos.col = 2))
ha = rowAnnotation(boxplot = row_anno_boxplot(mat, axis = TRUE, axis_direction = "reverse"),
width = unit(3, "cm"))
ht_list = ha + Heatmap(mat)
draw(ht_list, column_title = "reverse axis direction", newpage = FALSE)
upViewport(2)
In the layout of the heatmap components, column names are put directly below the heatmap body. This will cause problems when annotations are put at the bottom of the heatmap as well:
ha = HeatmapAnnotation(type = df$type,
col = list(type = c("a" = "red", "b" = "blue")))
Heatmap(mat, bottom_annotation = ha)
To solve this problem, we can replace column names with text annotations, which is, we suppress columns when making the heamtap and create a text annotation which is formed by column names.
ha = HeatmapAnnotation(type = df$type,
colname = anno_text(colnames(mat), rot = 90, just = "right", offset = unit(1, "npc") - unit(2, "mm")),
col = list(type = c("a" = "red", "b" = "blue")),
annotation_height = unit.c(unit(5, "mm"), max_text_width(colnames(mat)) + unit(2, "mm")))
Heatmap(mat, show_column_names = FALSE, bottom_annotation = ha)
When add a text annotation, the maximum width of the text should be calculated and set as the height of the
text annotation viewport so that all text can be completely shown in the plot. Sometimes, you also need to
set rot
, just
and offset
to align the text to the correct anchor positions.
From version 1.8.0, a new annotation function anno_link()
was added which connects labels and subset of the rows
by links. It is helpful when there are many rows/columns and we want to mark some of the rows (e.g. in a gene expression
matrix, we want to mark some important genes of interest.)
mat = matrix(rnorm(10000), nr = 1000)
rownames(mat) = sprintf("%.2f", rowMeans(mat))
subset = sample(1000, 20)
labels = rownames(mat)[subset]
Heatmap(mat, show_row_names = FALSE, show_row_dend = FALSE, show_column_dend = FALSE) +
rowAnnotation(link = row_anno_link(at = subset, labels = labels),
width = unit(1, "cm") + max_text_width(labels))
# here unit(1, "cm") is width of segments
There are also two shortcut functions: row_anno_link()
and column_anno_link()
.
sessionInfo()
## R version 3.3.0 (2016-05-03)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 14.04.4 LTS
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8
## [4] LC_COLLATE=C LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C
## [10] LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] stats4 parallel grid stats graphics grDevices utils datasets methods
## [10] base
##
## other attached packages:
## [1] dendextend_1.1.8 dendsort_0.3.3 cluster_2.0.4 HilbertCurve_1.2.2
## [5] GenomicRanges_1.24.0 GenomeInfoDb_1.8.2 IRanges_2.6.0 S4Vectors_0.10.0
## [9] BiocGenerics_0.18.0 circlize_0.3.6 ComplexHeatmap_1.10.2 knitr_1.13
## [13] markdown_0.7.7
##
## loaded via a namespace (and not attached):
## [1] whisker_0.3-2 XVector_0.12.0 magrittr_1.5 zlibbioc_1.18.0
## [5] lattice_0.20-33 colorspace_1.2-6 rjson_0.2.15 stringr_1.0.0
## [9] tools_3.3.0 png_0.1-7 RColorBrewer_1.1-2 formatR_1.4
## [13] HilbertVis_1.30.0 GlobalOptions_0.0.10 shape_1.4.2 evaluate_0.9
## [17] mime_0.4 stringi_1.0-1 GetoptLong_0.1.3