To ensure the classifier is as accurate as possible, it is best to quantify the gene expression values in the same way as was done for the training samples, as follows:

1) We used GENCODE v34 as the reference transcriptome, with some filtering as described in the Methods section of the paper. The resulting reference transcriptome can be downloaded here:

2) Quantify transcript abundance with kallisto.
The index can be constructed like this:

kallisto index -i kallisto_index gencode.v34.transcripts.selected.fa

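and the transcripts in a sample can be quantified, for instance, like this (kallisto requires an output directory given with -o; the path shown is a placeholder):

kallisto quant -i /path/to/kallisto_index -o /path/to/output_dir --single -l 200 -s 20 --rf-stranded sample.fastq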

or with paired-end or unstranded settings as needed. Please refer to the kallisto manual for details.

3) Sum the estimated transcript abundance to the gene level; a script to summarize to the gene level and construct a table from multiple samples is provided here. The resulting table can be uploaded as input to the classifier.
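
For readers who want to see what the gene-level summary in step 3 involves, the following is a minimal Python sketch, not the script linked above. It assumes each sample has a kallisto abundance.tsv file with the standard columns (target_id, length, eff_length, est_counts, tpm), that the transcript names keep the GENCODE pipe-delimited header in which the second field is the gene ID, and that TPM is the value to sum. The function names, the choice of TPM, and the output layout are illustrative assumptions; the provided script remains the reference for the format the classifier expects.

```python
import csv
from collections import defaultdict


def gene_id_from_target(target_id):
    """Extract the gene ID from a GENCODE-style transcript name.

    Assumption: names look like 'ENST...|ENSG...|...', so the second
    pipe-delimited field is the gene ID. Names without a pipe are
    returned unchanged.
    """
    fields = target_id.split("|")
    return fields[1] if len(fields) > 1 else target_id


def sum_to_gene_level(abundance_path, column="tpm"):
    """Sum one kallisto abundance.tsv column over all transcripts of each gene."""
    totals = defaultdict(float)
    with open(abundance_path, newline="") as handle:
        for row in csv.DictReader(handle, delimiter="\t"):
            totals[gene_id_from_target(row["target_id"])] += float(row[column])
    return dict(totals)


def build_table(sample_paths, column="tpm"):
    """Combine several samples into one genes-by-samples table.

    sample_paths maps a sample name to the path of its abundance.tsv.
    Genes missing from a sample are filled with 0.0.
    """
    per_sample = {
        name: sum_to_gene_level(path, column)
        for name, path in sample_paths.items()
    }
    genes = sorted(set().union(*per_sample.values()))
    header = ["gene_id"] + list(per_sample)
    rows = [[g] + [per_sample[s].get(g, 0.0) for s in per_sample] for g in genes]
    return header, rows


def write_table(header, rows, out_path):
    """Write the table as tab-separated text (hypothetical output layout)."""
    with open(out_path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t")
        writer.writerow(header)
        writer.writerows(rows)
```

A call such as build_table({"sample1": "/path/to/output_dir/abundance.tsv"}) returns the header and rows, which write_table can then save to a file. Whether the classifier expects TPM or estimated counts, and versioned or unversioned gene IDs, should be checked against the provided script before uploading.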

Cancer senescence classifier