Chapter 1 Introduction

Underlying cancer hallmarks are genome instability, which generates the genetic diversity that expedites their acquisition, and inflammation, which fosters multiple hallmark functions (Hanahan 2011). Cancer genomes typically harbor more than 1,000 mutations at small (e.g., point mutations, short insertions and deletions) and large (e.g., copy number variations, rearrangements) scales. These mutations accumulate in specific genomic contexts in response to both endogenous processes and exogenous exposures. In recent years, computational approaches (typically non-negative matrix factorization (NMF)) have been applied to the mutation catalogs of human and mouse tumors to detect characteristic mutational patterns, also known as “mutational signatures”.
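To make the NMF idea concrete, here is a minimal sketch in base R that factorizes a toy catalog matrix (components in rows, samples in columns) into a signature matrix and an exposure matrix using multiplicative updates. It is only an illustration under stated assumptions: the function name nmf_sketch, the simulated catalog, and the iteration count are all hypothetical, and this is not how sigminer itself performs the extraction.

```r
# Minimal NMF sketch (multiplicative updates, Frobenius loss).
# Illustrative only: `nmf_sketch` and the toy catalog are hypothetical
# and are not part of sigminer or any other package.
nmf_sketch <- function(V, k, n_iter = 500, eps = 1e-9, seed = 1) {
  set.seed(seed)
  n <- nrow(V)
  m <- ncol(V)
  W <- matrix(runif(n * k), n, k) # components x signatures
  H <- matrix(runif(k * m), k, m) # signatures x samples (exposures)
  for (i in seq_len(n_iter)) {
    H <- H * (t(W) %*% V) / (t(W) %*% W %*% H + eps)
    W <- W * (V %*% t(H)) / (W %*% H %*% t(H) + eps)
  }
  # Rescale so each signature (column of W) sums to 1,
  # moving the scale into the exposures so that W %*% H is unchanged.
  s <- colSums(W)
  W <- sweep(W, 2, s, "/")
  H <- sweep(H, 1, s, "*")
  list(signature = W, exposure = H)
}

# Toy catalog: 6 components x 10 samples, simulated from 2 known signatures.
set.seed(42)
true_sig <- cbind(
  c(0.50, 0.30, 0.10, 0.05, 0.03, 0.02),
  c(0.02, 0.03, 0.05, 0.10, 0.30, 0.50)
)
true_expo <- matrix(runif(2 * 10, min = 50, max = 500), nrow = 2)
catalog <- true_sig %*% true_expo

fit <- nmf_sketch(catalog, k = 2)
recon <- fit$signature %*% fit$exposure
rel_error <- sqrt(sum((catalog - recon)^2)) / sqrt(sum(catalog^2))
```

Each column of fit$signature is a non-negative profile over components that sums to 1, and each column of fit$exposure gives the estimated contribution of every signature to one sample.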

1.1 Biological Significance of Mutational Signature

The following figures illustrate the biological significance of mutational signatures.

Figure 1.1: The illustration of SBS signature, fig source:

Figure 1.2: The illustration of SBS signature (2), fig source:

SBS (single base substitution) signatures are the best-known type of mutational signature. They are well studied and relate to single-strand changes, typically caused by defective DNA repair. Common etiologies include aging, defective DNA mismatch repair, smoking, ultraviolet light exposure and APOBEC activity.

Currently, all SBS signatures are summarized in the COSMIC database, which provides two versions: v2 and v3.

Recently, Alexandrov et al. (2020) extended the concept of mutational signatures to three types of alteration: SBS, DBS (doublet base substitution) and INDEL (short insertion and deletion). All reported common signatures are recorded in COSMIC, so we usually call them COSMIC signatures.

Figure 1.3: The illustration of copy number signatures, fig source:

Copy number signatures are less studied and much work remains to be done. They are introduced in Chapter 3.

Genome rearrangement signatures are limited to whole-genome sequencing data and are also less studied; they are not implemented in the current version of sigminer. We are happy to accept a PR if you are interested in creating an extension function for sigminer.

For more details about mutational signatures, see the wiki page.

1.2 Sigminer

Here, we present an easy-to-use and scalable toolkit for mutational signature analysis and visualization in R. We named it sigminer (signature + miner). This tool helps users extract, analyze and visualize signatures from genomic alteration records, thus providing new insight into cancer research.

Currently, sigminer supports four types of signature:

  • SBS signature in the form of 96 (6, 24, 384, 1536 and 6144) components.
  • DBS signature in the form of 78 (186) components.
  • ID (INDEL) signature in the form of 83 (28) components.
  • Copy number signature, using either the method of Macintyre et al. (2018) or the method from our group's work (Wang et al. 2020).

Here, a component refers to the class that a record (e.g., a mutation) is assigned to; some papers use ‘mutation type’ or just ‘type’ for the same thing. We use ‘component’ because it is the broader concept.
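As an illustration of what the components are, the 96 SBS components arise from 6 substitution classes combined with the 4 possible bases on each side of the mutated base (6 x 4 x 4 = 96); adding a second flanking base on each side gives 6 x 4^4 = 1536. The sketch below enumerates the 96 labels in base R. The variable names and the A[C>A]A label style are assumptions chosen for illustration, and the ordering is not claimed to match the one sigminer uses internally.

```r
# Enumerate the 96 SBS components:
# 6 substitution classes x 4 bases (5' side) x 4 bases (3' side).
# Variable names and label format are illustrative assumptions.
bases <- c("A", "C", "G", "T")
subs <- c("C>A", "C>G", "C>T", "T>A", "T>C", "T>G")

grid <- expand.grid(
  five = bases,
  three = bases,
  sub = subs,
  stringsAsFactors = FALSE
)

# Build labels such as "A[C>A]A": 5' base, substitution, 3' base.
sbs96 <- paste0(grid$five, "[", grid$sub, "]", grid$three)

n_components <- length(sbs96)
```

A catalog matrix for SBS-96 analysis then has one row (or column, depending on orientation) per label in sbs96 and counts how many mutations of each sample fall into that component.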

1.3 Installation

The stable release version of the sigminer package can be installed from CRAN:

install.packages("sigminer", dependencies = TRUE)
# Or
BiocManager::install("sigminer", dependencies = TRUE)

Setting dependencies = TRUE is recommended because many packages are required for the full feature set of sigminer.

The development version of the sigminer package can be installed from GitHub:

# install.packages("remotes")
remotes::install_github("ShixiangWang/sigminer", dependencies = TRUE)

1.4 Issues or Suggestions

Any issue or suggestion can be posted on GitHub Issues; we will reply as soon as possible.

Any pull request is welcome.

1.5 Preparation

To reproduce the examples shown in this manual, users should first load the following packages. sigminer is required to be version 1.0.0 or later.
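A minimal sketch of how the version requirement could be checked before loading the package, using only base R (requireNamespace and utils::packageVersion). The helper name has_min_version is hypothetical and is not part of sigminer.

```r
# Hypothetical helper: TRUE if `pkg` is installed with version >= `min_version`.
has_min_version <- function(pkg, min_version) {
  requireNamespace(pkg, quietly = TRUE) &&
    utils::packageVersion(pkg) >= min_version
}

if (has_min_version("sigminer", "1.0.0")) {
  library(sigminer)
} else {
  message("Please install sigminer >= 1.0.0 before running the examples.")
}
```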


This manual uses sigminer 1.0.19. More information about sigminer is printed as follows:

#> Thanks for using 'sigminer' package!
#> =========================================================================
#> Version: 1.0.19
#> Run citation('sigminer') to see how to cite sigminer in publications.
#> Project home :
#> Bug report   :
#> Documentation:
#> =========================================================================

1.6 Overview of Contents

The contents of this manual have been divided into 4 sections:

  • Common workflow.
    • de novo signature discovery.
    • single sample exposure quantification.
    • subtype prediction.
  • Target visualization.
    • copy number profile.
    • copy number distribution.
    • catalogue profile.
    • signature profile.
    • exposure profile.
  • Universal analysis.
    • association analysis.
    • group analysis.
  • Other utilities.

All functions are organized and documented online (a mirror is also available for Chinese users). For usage of a specific function fun, run ?fun in your R console to see its documentation.

1.7 Citation and LICENSE

#> To cite sigminer in publications use:
#>   Wang, Shixiang, et al. "Copy number signature analyses in prostate cancer reveal
#>   distinct etiologies and clinical outcomes" medRxiv (2020).
#> A BibTeX entry for LaTeX users is
#>   @Article{,
#>     title = {Copy number signature analyses in prostate cancer reveal distinct etiologies and clinical outcomes},
#>     author = {Shixiang Wang and Huimin Li and Minfang Song and Zaoke He and Tao Wu and Xuan Wang and Ziyu Tao and Kai Wu and Xue-Song Liu},
#>     journal = {medRxiv},
#>     year = {2020},
#>     url = {},
#>   }

The software is made available for non-commercial research purposes only under the MIT License. However, notwithstanding any provision of the MIT License, the software currently may not be used for commercial purposes without explicit written permission after contacting Shixiang Wang or Xue-Song Liu.

MIT © 2019-2020 Shixiang Wang, Xue-Song Liu

MIT © 2018 Geoffrey Macintyre

MIT © 2018 Anand Mayakonda

Cancer Biology Group @ShanghaiTech

Research group led by Xue-Song Liu at ShanghaiTech University


Alexandrov, Ludmil B, Jaegil Kim, Nicholas J Haradhvala, Mi Ni Huang, Alvin Wei Tian Ng, Yang Wu, Arnoud Boot, et al. 2020. “The Repertoire of Mutational Signatures in Human Cancer.” Nature 578 (7793): 94–101.

Hanahan, Douglas. 2011. “Hallmarks of Cancer: The Next Generation.” Cell, March, 29.

Macintyre, Geoff, Teodora E Goranova, Dilrini De Silva, Darren Ennis, Anna M Piskorz, Matthew Eldridge, Daoud Sie, et al. 2018. “Copy Number Signatures and Mutational Processes in Ovarian Carcinoma.” Nature Genetics 50 (9): 1262–70.

Wang, Shixiang, Huimin Li, Minfang Song, Zaoke He, Tao Wu, Xuan Wang, Ziyu Tao, Kai Wu, and Xue-Song Liu. 2020. “Copy Number Signature Analyses in Prostate Cancer Reveal Distinct Etiologies and Clinical Outcomes.” medRxiv.