SSClust Documentation
SSClust runs on any computer that can run
the statistical package R. Supported platforms include
Windows, Mac, and Linux. You can download R here.
Running SSClust
SSClust is run from the R command line by typing:
source("SSClust.R")
Note: The master control text file SSClust.R must be edited before running SSClust.
This file specifies the location of your input data, the number of clusters, the number of RCEM chains to run, and the RCEM threshold.
SSClust is run repeatedly, each time increasing the number of clusters specified, until a minimum BIC score is achieved.
For further details, consult the SSClust Manual and Ma et al. 2006.
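The stopping rule described above (increase the number of clusters until the BIC score reaches a minimum) can be sketched in a few lines of R. This is only an illustration, not part of SSClust itself: the BIC values below are made-up numbers standing in for scores you would copy by hand from the summary file of each run, and the names bic, best, minimum_reached, and best_k are hypothetical.

```r
# Hypothetical BIC scores, one per SSClust run, copied by hand from each
# run's summary file. The names are the number of clusters used in that run.
bic <- c("2" = 5120.4, "3" = 4987.9, "4" = 4931.2, "5" = 4958.6)

# Position of the smallest BIC score among the runs so far.
best <- unname(which.min(bic))

# A minimum has been reached only if at least one run with more clusters
# scored worse; if the smallest score is the last run, keep going.
minimum_reached <- best < length(bic)

# Number of clusters for the run with the smallest BIC score.
best_k <- as.integer(names(bic)[best])

if (minimum_reached) {
  cat("Minimum BIC reached at", best_k, "clusters\n")
} else {
  cat("BIC is still decreasing; run again with more clusters\n")
}
```

With these example numbers the BIC rises again at 5 clusters, so the sketch reports 4 clusters as the minimum.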
Input
The input for SSClust is a simple tab-delimited file consisting of expression values
with genes listed on rows and time-points in columns.
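As a rough illustration of this layout, the following R sketch writes a small tab-delimited file with genes on rows and time-points in columns, then reads it back to check it. The gene names, expression values, and file name are invented for the example. Whether SSClust expects a header row or a leading column of gene names is an assumption here; check the SSClust Manual for the exact format.

```r
# Hypothetical example data: 3 genes (rows) measured at 4 time-points (columns).
expr <- data.frame(
  t0 = c(0.12, -0.40, 1.05),
  t1 = c(0.58, -0.10, 0.97),
  t2 = c(1.31, 0.25, 0.40),
  t3 = c(0.90, 0.62, -0.15),
  row.names = c("geneA", "geneB", "geneC")
)

# Write a tab-delimited text file. col.names = NA writes an empty header
# cell above the gene-name column so the header lines up with the data.
input_file <- file.path(tempdir(), "expression.txt")
write.table(expr, file = input_file, sep = "\t", quote = FALSE, col.names = NA)

# Read the file back, using the first column as gene names, and confirm
# that every time-point column holds numeric expression values.
check <- read.delim(input_file, row.names = 1, check.names = FALSE)
stopifnot(all(sapply(check, is.numeric)))
dim(check)  # number of genes, number of time-points
```

Reading the file back before a run is a cheap way to catch stray text or missing values in the expression columns.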
Output
The output of each SSClust run includes:
- a set of text files containing the names of genes that fall into the same clusters.
- a text file listing all the input genes examined.
- a postscript image file showing the mean curve and 95% confidence bands for each cluster.
- a postscript image file showing the raw expression curves of all the genes in each cluster.
- a summary file containing the model likelihood, BIC score, and numerical value of each mean curve.
Example SSC output: a cluster showing gene expression mean curve and 95% confidence bands.
Further details on installing the program, troubleshooting, and parameter settings can be found in the
SSClust Manual.
Smoothing Spline Clustering - for time course gene expression data.
Jun S. Liu Lab of Computational Biology
Department of Statistics
Harvard University
Cristian I. Castillo-Davis
ccastillo-davis@stat.harvard.edu