NIH/NHGRI – Investigator Initiated Research in Computational Genomics and Data Science (R01, R21 Clinical Trial Not Allowed)

July 2, 2018 by School of Medicine Webmaster

The following description was taken from the R01 version of this FOA.

Since its inception, the field of genomics has been grounded in computational approaches. All facets of genomic research, such as processing raw sequencing signals, assembling genomes, calling variants, deriving insight from population sequencing studies, and designing and studying the implementation of genomics in clinical settings, depend on computational, analytical, statistical, and bioinformatics approaches. The scale of genomic data and the commitment of genomics researchers to share digital data resources have necessitated new computational paradigms for data processing, storage, organization, and access.

As the science of genomics continues to develop, producing data is no longer rate-limiting for genomic discovery; instead, processing, storing, accessing, analyzing, and deriving insight from genomic data, all computationally based efforts, are emerging as the major challenges and bottlenecks. Understanding the complex relationships through which genotypes influence phenotypes, a key goal of the NHGRI, is increasingly dependent upon analytical, statistical, and computational approaches. The rapid pace of sequencing technology development remains a driving force in genomics, and new genomic data types produced by novel technologies demand new modes of analytical and computational support. Genomics increasingly underlies the study of complex networks and systems ranging in scale from single cells to complete organisms, presenting opportunities for computational approaches to address previously intractable problems in basic biological sciences. The broadening adoption of genomics in clinical settings also requires new computational approaches to enable improved outcomes, while the sensitive nature of some genomic data demands new computational methods to balance data sharing and privacy considerations. Existing tools also require improvement and hardening, and the exponential growth of genomic data demands new scalable algorithms and new solutions for making genomic data findable, accessible, interoperable, and reusable (FAIR).

In recognition of the central role of computation in genomics and to identify future needs and emerging opportunities, the NHGRI held an Informatics and Data Science workshop on September 29-30, 2016. Participants considered bioinformatics for genomics in both basic biology and clinical sciences, and prioritized scientific opportunities for the NHGRI Computational Genomics and Data Science program over the next 3-5 years. Details from this workshop, including a workshop report, are linked in the FOA. Workshop participants identified several areas where continued or expanded support by NHGRI was considered important. Key recommendations highlighted the importance of maintained or enhanced support for development of: interactive tools for visualization and analysis of genomic data in both basic and clinical sciences; computational methods to investigate how genotype translates to phenotype; tools and approaches to enhance genomic data sharing; scalable algorithms for analysis of genomic data; methods to make genomic and phenotypic data and metadata FAIR; and others (for the full list of recommendations, please see the workshop report).


Through this FOA, NHGRI seeks to fund innovative research efforts in computational genomics, data science, statistics, and bioinformatics for basic or clinical genomic sciences, and broadly applicable to human health and disease, as well as research leading to improvement of existing software or approaches demonstrated to be in broad use by the genomics community.

Research topics appropriate for this FOA include, but are not limited to, development of novel computational, bioinformatics, statistical, or analytical approaches, tools, or software for:

  • Interactive analysis and visualization of large genomic data sets.
  • Identification or prioritization of disease-causal genetic variants.
  • Causal statistical modeling related to genomic research.
  • Analysis of single-cell or sub-cellular genomic data both in situ and in dissociated cells.
  • Integrating model organism data and information with human data.
  • Integrating and interpreting various genomic data types, including sequence data, functional data, phenotypic data, and clinical data.
  • Processing and integrating genome sequence data to enhance representation of population variation.
  • Processing sequence data for sequence assembly, variant detection (SNPs and SVs), imputation, and resolution of haplotypes.
  • Development of efficient and scalable algorithms for compute-intensive genomic applications.
  • Achieving major cost reductions in genomic data processing and analysis.
  • Enabling scalable and cost-effective curation of FAIR metadata for genomic and phenotypic data.
  • Enhancing secure sharing and use of genomic data in combination with clinical data.
  • Processing or analyzing new genomic data types, or major improvement in processing or analyzing existing genomic data types.
  • Rigorous benchmarking of tools, methods, or algorithms for genomics.
  • Hardening an existing widely used genomic data processing pipeline to enable its reproducible implementation by the biomedical research community.

This FOA does not support:

  • Development, maintenance, or curation of genomic databases and other genomic data resources. Applicants considering developing such resources are directed to the Genomic Community Resources (U24) program.
  • Research relevant to only one or a few diseases or biological systems. Research utilizing a small number of disease models or biological systems for proof-of-concept studies may be acceptable when the resulting methods, tools, approaches, or software are generalizable.
  • Development and application of ontologies or controlled vocabularies, or manual curation efforts.
  • Basic data science research that is not developed for genomics.
  • Significant experimental work. Applicants may propose limited experimental work to test predictions generated as a result of computational approaches and/or inform modeling efforts, but this should not be a major focus of the application.
  • Approaches not clearly pertaining to computational genomics and data science and/or lacking relevance to human health and disease.

Applicants are strongly encouraged to contact NHGRI Program Staff to discuss the alignment of their proposed work with the goals of this FOA prior to submitting an application.


Deadlines:

  • non-AIDS – November 16, 2018; July 16, 2019; November 16, 2019; July 16, 2020; November 16, 2020; July 16, 2021
  • AIDS – January 7, 2019; September 7, 2019; January 7, 2020; September 7, 2020; January 7, 2021; September 7, 2021
  • Letters of intent are due 30 days prior to each deadline.

