Francesca Chiaromonte

Visiting Associate Professor of Mathematics, Biology
Ph.D. in Statistics, University of Minnesota, Minneapolis, MN, 1996; Laurea (cum laude) in Statistical and Economic Sciences, University La Sapienza, Rome, Italy, 1990.

Office Address: 251 Mercer Street
New York, NY 10012
Email:
Phone: 212-998-3307
Fax: 212-995-4121

Research

My interests as a statistician cover multivariate analysis and regression (including dimension reduction, supervised and unsupervised classification, and non-parametric tools), computational techniques (including re-sampling, perturbation and permutation schemes for the empirical assessment of significance), and Markov modeling. In collaboration with R. Dennis Cook (Statistics, UMN) and with Bing Li and Hongyuan Zha (Statistics, PSU), I do research on Sufficient Dimension Reduction (SDR). SDR is a body of theory and methods for handling high-dimensional regression and classification problems prior to the use of parametric models or non-parametric fits, and it is closely related to graphics and data visualization. Its popularity and application scope have increased steadily over the last decade, along with the availability of large-scale, high-dimensional data in many scientific fields. Our recent work concerns foundational aspects, SDR in regressions with a mix of quantitative and categorical predictors, novel SDR techniques, and an ongoing attempt to extend SDR's theoretical framework and methodology to non-linear dimension reduction.

In recent years, I have become heavily involved in the analysis and modeling of large-scale genomic data; most of my research has been at the crossroads between statistics and genomics, computational biology and bioinformatics. This work comprises collaborations with Webb Miller, Ross Hardison, Kateryna Makova and other researchers at the Center for Comparative Genomics and Bioinformatics (PSU), as well as David Haussler and other researchers at the Center for Biomolecular Science and Engineering (UCSC), and has seen us participate in the Mouse, Rat and Chicken Genome Consortia. Pair-wise and multiple whole-genome alignments between human and these species have allowed us to exploit comparative information to investigate various aspects of evolution and function. Among other projects, I have been involved in work on alignment scoring methodology, genome-wide variation and co-variation of divergence processes, estimation of the share of the human genome under purifying selection, and genome-wide scores to aid in the prediction of regulatory elements (RP scores). Ongoing work with graduate students at PSU (James Taylor, Svitlana Tyekucheva) concerns the data reduction, modeling and computational issues involved in using information from short alignment patterns for supervised and unsupervised classification of genomic elements. I have also worked on the analysis of global gene expression data (e.g. from microarrays), which offer an excellent application ground for the type of statistical methods I am interested in.

In a recent collaboration with Jenni Evans (Meteorology, PSU), I am also applying these methods to large-scale meteorological data to investigate the structure and evolution of cyclones.

(The above research is currently funded by NIH and NSF grants.)

Areas of Research/Interest

Multivariate and computational statistics. Analysis and modeling of genomic data. Comparative genomics. Bioinformatic methods for the investigation of regulatory sequences genome-wide.

Publications

Chiaromonte F., Yang S., Elnitski L., Bing Yap V., Miller W., Hardison R.C. (2001) Association between divergence and interspersed repeats in mammalian noncoding genomic DNA. Proceedings of the National Academy of Sciences USA 98(25): 14503-14508.

Chiaromonte F., Bing Yap V., Miller W. (2002) Scoring pairwise genomic sequence alignments. Proceedings of the Pacific Symposium on Biocomputing 2002.

Chiaromonte F., Martinelli J.A. (2002) Dimension reduction strategies for analyzing global gene expression data with a response. Mathematical Biosciences 176(1): 123-144.

Chiaromonte F., Cook R.D., Li B. (2002) Sufficient dimension reduction in regressions with categorical predictors. Annals of Statistics 30(2): 475-497.

Chiaromonte F., Cook R.D. (2002) Sufficient dimension reduction and graphics in regression. Annals of the Institute of Statistical Mathematics 54(4): 768-795.

Waterston, R. et al., Mouse Genome Sequencing Consortium (2002) Initial sequencing and comparative analysis of the mouse genome. Nature 420: 520-562.

Elnitski L., Hardison R.C., Li J., Yang S., Kolbe D., Eswara P., O'Connor M.J., Schwartz S., Miller W., Chiaromonte F. (2003) Distinguishing regulatory DNA from neutral sites. Genome Research 13: 64-72.

Hardison R.C., Roskin K.M., Yang S., Diekhans M., Kent J.W., Weber R., Elnitski L., Li J., O'Connor M., Kolbe D., Schwartz S., Furey T.S., Whelan S., Goldman N., Smit A., Miller W., Chiaromonte F., Haussler D. (2003) Co-variation in frequencies of substitution, deletion, transposition and recombination during eutherian evolution. Genome Research 13: 13-26.

Chiaromonte F., Miller W. and Bouhassira E. (2003) Gene length and proximity to neighbors affect genome-wide expression levels. Genome Research 13: 2602-2608.

Li B., Cook R.D., Chiaromonte F. (2003) Dimension reduction for the conditional mean in regressions with categorical predictors. Annals of Statistics 31: 1636-1668.

Hardison R.C., Chiaromonte F., Kolbe D., Wang H., Petrykowska H., Elnitski L., Yang S., Giardine B., Zhang Y., Riemer C., Schwartz S., Haussler D., Roskin K., Weber R., Diekhans M., Kent W.J., Weiss M.J., Welch J. and Miller W. (2004) Global prediction and tests for erythroid regulatory regions. Cold Spring Harbor Symposia on Quantitative Biology: The Genome of Homo Sapiens 68: 335-345.

Chiaromonte F., Weber R.J., Roskin K.M., Diekhans M., Kent W.J. and Haussler D. (2004) The share of human genomic DNA under selection estimated from human-mouse genomic alignments. Cold Spring Harbor Symposia on Quantitative Biology: The Genome of Homo Sapiens 68: 245-254.

Gibbs, R. et al., Rat Genome Sequencing Project Consortium (2004) Genome sequence of the Brown Norway rat yields insights into mammalian evolution. Nature 428: 493-521.

Makova K.D., Yang S. and Chiaromonte F. (2004) Insertions and deletions are male-biased too: a whole-genome analysis in rodents. Genome Research 14: 567-573.

Kolbe D., Taylor J., Elnitski L., Eswara P., Li J., Miller W., Hardison R.C. and Chiaromonte F. (2004) Regulatory potential scores from genome-wide 3-way alignments of human, mouse and rat. Genome Research 14: 700-707.

Yang S., Smit A.F., Schwartz S., Chiaromonte F., Roskin K.M., Haussler D., Miller W. and Hardison R.C. (2004) Patterns of insertions and their covariation with substitutions in the rat, mouse and human genomes. Genome Research 14: 517-527.

Arnott J., Evans J. and Chiaromonte F. (2004) Characterization of extratropical transition using cluster analysis. Monthly Weather Review (to appear).

Li B., Zha H. and Chiaromonte F. (2004) Contour regression: a general approach to dimension reduction. Annals of Statistics, accepted pending minor revisions.

Hillier, L., et al., International Chicken Sequencing Consortium (2004) Sequencing and comparative analysis of the chicken genome. In revision for Nature.