About NIH
The National Institute of Health is a government agency
leading public health and medical research in South Korea
The Clinical & Omics Data Archive (CODA)
Overview
The Clinical & Omics Data Archive, CODA
Details
The Clinical & Omics Data Archive (CODA) is a national repository for archiving and sharing biomedical data, including multi-omics data. CODA was established in 2016 by the Korea National Institute of Health (KNIH) and archives data generated by research projects of the Ministry of Health and Welfare (MoHW) and the Korea Disease Control and Prevention Agency.
CODA is a trust-based archive that aims to provide the collected data to researchers and to pass it on to future generations. It has been collecting biomedical data such as epidemiological information, health records, imaging data, and multi-omics data (microarray, whole-exome, whole-genome, transcriptome, metabolome) from various studies.
To date, an unprecedented amount of data (approximately 831,755 samples, 2.9 PB) from national research projects has been collected in CODA. CODA has provided qualified research data to many researchers, who can either download it or analyze it within a closed network. It also provides researchers with several pipelines to optimize their analysis. Researchers can search for information on the CODA website (https://coda.nih.go.kr) and apply for the data they need for their research.
Key Achievements
Public Resources Status
- We have collected and refined 176 DBs covering diseases, infectious diseases,
| No. | DB Name | Release | Data summary |
|---|---|---|---|
| 1 | KoGES Whole Genome DB | ’22.8., ’23.4. | 5,000 participants, Clinical Epidemiology Information(1st 199 variables, 2nd 202 variables), Omics Data(FASTQ, BAM, gVCF, VCF) 569.8TB |
| 2 | Colorectal Cancer DB | ’22.8., ’23.6. | 322 patients, Clinical Epidemiology Information(1st 88 variables, 2nd 95 variables), Omics Data(BAM, VCF) 48.17TB |
| 3 | Autism DB | ’22.8., ’23.5. | 892 participants, Clinical Epidemiology Information(1st 69 variables, 2nd 73 variables), Omics Data(BAM, gVCF, VCF) 42.4TB |
| 4 | Rare Disease DB | ’22.9., ’23.5. | 14,905 patients, Clinical Epidemiology Information(1st 19 variables, 2nd 23 variables, 3rd 26 variables), Omics Data(FASTQ, BAM, gVCF, VCF) 1,860.97TB |
| 5 | Ulsan genome DB | ’22.12., ’23.4. | 2,504 participants, Clinical Epidemiology Information(1st 112 variables, 2nd 117 variables), Omics Data(BAM, gVCF, VCF) 94.84TB |
| 6 | K-MASTER DB | ’23.4. | 7,305 patients, Clinical Epidemiology Information(128 variables), Omics Data(FASTQ, BAM, VCF) 48TB |
| 7 | Lung cancer DB | ’23.7. | 84 patients, Clinical Epidemiology Information(19 variables), Omics Data(VCF) 1.4GB |
| 8 | Dementia DB | ’23.7. | 995 participants, Clinical Epidemiology Information(139 variables), Omics Data(BAM, BAI, VCF) 48.58TB |
| 9 | COVID-19 DB | ’23.8. | 659 patients, Clinical Epidemiology Information(2020 245 variables, 2021 320 variables), multi-Omics Data(WGS, Cytokine, COVID-seq, HLA typing, Bulk TCR-seq, Bulk BCR-seq, scRNA-seq, SNP array) 118.13TB |
| 10 | Korea Nurses' Health Study (KNHS) DB | ’23.9., ’24.10., ’25.8. | 87,625 nurses, Clinical Epidemiology Information(1st 404 variables, 2nd 768 variables, 3rd 401 variables, 4th 422 variables, 5th 469 variables, 6th 263 variables, 7th 360 variables) |
| 11 | KoGES Ansan and Ansung study | ’24.2., ’25.3. | 10,030 participants, Clinical Epidemiology Information(Baseline 2,479 variables, 1st 2,310 variables, 2nd 3,023 variables, 3rd 2,627 variables, 4th 2,989 variables, 5th 3,141 variables, 6th 2,930 variables, 7th 2,395 variables, 8th 2,482 variables, 9th 2,545 variables, 10th 1,955 variables) |
| 12 | KoGES Cardiovascular disease association study(CAVAS) | ’24.2. | 28,337 participants, Clinical Epidemiology Information(Baseline 1,578 variables, 1st 1,405 variables, 2nd 1,405 variables, 3rd 1,404 variables, 4th 746 variables) |
| 13 | KoGES Health examinees study(HEXA) | | 173,195 participants, Clinical Epidemiology Information(Baseline 2,401 variables, pilot 1st 959 variables, 1st 1,606 variables) |
| 14 | KoGES baseline study Combined data | | 211,562 participants, Clinical Epidemiology Information(201 variables) |
| 15 | KoGES follow up study Combined data | | 10,030 participants, Clinical Epidemiology Information(502 variables) |
| 16 | KoGES Twin and family study | | 3,202 participants, Clinical Epidemiology Information(1,221 variables) |
| 17 | Undiagnosed disease DB | ’24.5. | 56 patients, Clinical Epidemiology Information(9 variables), Omics Data(WES VCF) 194MB |