This folder contains the data necessary for the analysis described in GrAnnoT's paper (doi), and the files produced with this data. The command lines used to process this data and produce the outputs are described in the file "grannot_analysis_command_lines.txt".

The only data not provided are the 12 genome sequences from Zhou, Y., Chebotarov, D., Kudrna, D. et al. (2020), "A platinum standard pan-genome resource that represents the population structure of Asian rice" (doi:10.1038/s41597-020-0438-2). These genomes were used to build the rice pangenome graph (along with the Nipponbare reference (doi:10.1186/1939-8433-6-4)), and for the Liftoff transfers.

Rice:
The rice annotation comes from the Rice Genome Annotation Project, available at https://rice.uga.edu/

E. coli:
The E. coli genomes used to build the pangenome graph come from the paper available at http://dx.doi.org/10.7554/eLife.78834
The K12_MG1655 annotation is adapted from https://www.ncbi.nlm.nih.gov/nuccore/U00096.3 to match the pangenome graph.

H:
The human pangenome graph was made by the Human Pangenome Reference Consortium, and is available at https://s3-us-west-2.amazonaws.com/human-pangenomics/index.html?prefix=pangenomes/scratch/2022_03_11_minigraph_cactus/
The human genomes for the Liftoff transfer come from https://projects.ensembl.org/hprc/
The CHM13 annotation is adapted from https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/009/914/755/GCF_009914755.1_T2T-CHM13v2.0/ to match the pangenome graph.

This folder is organised as follows:
.
├── data
│   ├── ecoli
│   │   ├── EcoliGraph_MGC.gfa
│   │   ├── feature_types.txt
│   │   ├── K_12_MG1655_09949b0.fasta
│   │   ├── O127_H6_E2348_69_193637c.fasta
│   │   └── sequence_filter_rename_K_12_MG1655_09949b0.gff3
│   ├── human
│   │   ├── CHM13_chr1.gff
│   │   ├── chm13.draft_v1.1_chr1.fasta
│   │   ├── feature_types.txt
│   │   ├── GCA_000001405.15_GRCh38_no_alt_analysis_set_chr1.fna
│   │   └── HumanChr1Graph_renamePaths.gfa
│   └── rice
│       ├── GCA_009830595.1_AzucenaRS1_genomic.fna
│       ├── nb_allFeatures.fa
│       ├── nb_allFeatures.gff3
│       ├── nb_allFeatures_renamed_filter.bed
│       ├── nb_allFeatures_renamepath_annotate.gff3
│       ├── refpath_odgi
│       ├── refpath_vg
│       ├── RiceGraph_MGC.gfa
│       ├── RiceGraph_MGC_paths.gfa
│       ├── RiceGraph_MGC_refOs127652RS1.gfa
│       ├── TIGRv7_ok.fasta
│       └── TIGRv7_ok.genome
├── grannot_analysis_command_lines.txt
├── outputs
│   ├── ecoli
│   │   ├── intermediate_files
│   │   │   ├── reference_all_genes.fa
│   │   │   └── reference_all_to_target_all.sam
│   │   ├── liftoff_transfer_k12_to_0127.gff
│   │   ├── O127_H6_E2348_69_193637c
│   │   │   └── O127_H6_E2348_69_193637c.gff
│   │   └── unmapped_features.txt
│   ├── human
│   │   ├── GRCh38
│   │   │   └── GRCh38.gff
│   │   ├── intermediate_files
│   │   │   ├── reference_all_genes.fa
│   │   │   └── reference_all_to_target_all.sam
│   │   ├── liftoff_transfer_chm13_to_grch38.gff
│   │   └── unmapped_features.txt
│   └── rice
│       ├── back_forth_transfer
│       │   ├── grannot
│       │   │   ├── AzucenaRS1.gff
│       │   │   └── IRGSP.gff
│       │   └── liftoff
│       │       ├── AzucenaRS1.gff3
│       │       └── IRGSP.gff3
│       ├── grannot
│       │   ├── AzucenaRS1
│       │   │   ├── AzucenaRS1.gff
│       │   │   ├── AzucenaRS1_var_sorted.txt
│       │   │   └── AzucenaRS1_var.txt
│       │   ├── AzucenaRS1_refOs127652RS1.gff
│       │   ├── RiceGraph_MGC.gaf
│       │   └── segments.txt
│       ├── grannot_multi
│       │   ├── AzucenaRS1
│       │   │   └── AzucenaRS1.gff
│       │   ├── Os117425RS1
│       │   │   └── Os117425RS1.gff
│       │   ├── etc...
│       │   └── PAV_matrix.txt
│       ├── graphaligner
│       │   └── graphaligner_rice_transfer.gaf
│       ├── liftoff_multi
│       │   ├── AzucenaRS1_named.db.gff
│       │   ├── AzucenaRS1_named.gff
│       │   ├── AzucenaRS1_named_unmappeddb.txt
│       │   ├── AzucenaRS1_named_unmapped.txt
│       │   ├── Os117425RS1_named.db.gff
│       │   ├── Os117425RS1_named.gff
│       │   ├── Os117425RS1_named_unmappeddb.txt
│       │   ├── Os117425RS1_named_unmapped.txt
│       │   └── etc...
│       ├── odgi
│       │   └── odgi_transfer_nb_azu.bed
│       └── vg
│           ├── nb_allFeatures_annotate.gaf
│           ├── nb_allFeatures_annotate.gam
│           ├── nb_allFeatures_renamed_filter.bam
│           ├── nb_allFeatures_renamed_filter.gaf
│           ├── nb_allFeatures_renamed_filter.sam
│           └── RiceGraph_MGC_paths.xg
└── readme.txt
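After downloading, it may help to confirm that the folder matches the layout listed above. The following is a minimal sketch, not part of the dataset: the function name `missing_entries` and the short `EXPECTED` list are illustrative assumptions, and only a handful of paths copied from the tree are checked.

```python
from pathlib import Path

# A few relative paths copied from the tree above.
# This list is illustrative and deliberately incomplete; extend it as needed.
EXPECTED = [
    "data/ecoli/EcoliGraph_MGC.gfa",
    "data/human/HumanChr1Graph_renamePaths.gfa",
    "data/rice/RiceGraph_MGC.gfa",
    "grannot_analysis_command_lines.txt",
    "readme.txt",
]


def missing_entries(root, expected=EXPECTED):
    """Return the expected relative paths that do not exist under root."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).exists()]


if __name__ == "__main__":
    # "." assumes the script is run from the top of the downloaded folder.
    for rel in missing_entries("."):
        print("missing:", rel)
```

An empty result means every listed path was found. It does not say anything about file contents.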
Oct 29, 2025
Marthe, Nina; Sabot, Francois, 2025, "Data used for the analysis described in GrAnnoT paper", https://doi.org/10.23708/DO1RTF, DataSuds, V6
GrAnnoT is an annotation transfer tool for pangenome graphs. It was tested and compared to other approaches, and the data necessary for these tests are available in this dataset. It mainly includes pangenome graphs, genomes, and annotations.
Plain Text - 5.4 KB - MD5: 3754471d5b9f8b8e5bd6739a2cfe643f
Documentation
Contains a description of all files in this dataset.
Unknown - 5.4 MB - MD5: cbf164a04f045bcb8b669b1897d56ee0
Fasta file of the genome O157_H7_EC4115_0a2c271, NCBI ID: NC_011353.1.
Unknown - 3.6 MB - MD5: dc9764aaead09be486297f1466bc4ef7
Annotation file of the genome O157_H7_EC4115_0a2c271, adapted from https://www.ncbi.nlm.nih.gov/datasets/genome/GCA_000021125.1/ to match the pangenome graph.
Unknown - 3.3 MB - MD5: c855c9ae86eeec6061576058aa1273bf
Annotation file of the genome O157_H7_EC4115_0a2c271, adapted from https://www.ncbi.nlm.nih.gov/datasets/genome/GCA_000021125.1/ to match the corresponding fasta file.
Unknown - 4.9 MB - MD5: 726296987f982677091183f37748ed1a
Fasta file of the genome S88_fa4fe08, NCBI ID: NC_011742.1.
Oct 29, 2025
Marthe, Nina; Sabot, Francois, 2025, "Output from the analysis described in GrAnnoT paper", https://doi.org/10.23708/RRSKRA, DataSuds, V6
GrAnnoT is an annotation transfer tool for pangenome graphs. It was tested and compared to other approaches, and the outputs from these tests are available in this dataset. It mainly includes transferred annotations.
Gzip Archive - 244.8 MB - MD5: 1d22e11c2e9b8348811efd77a13bf4cd
Archive containing the results of the transfer on the A. thaliana graph with GrAnnoT.
Plain Text - 8.1 KB - MD5: 6f5be639cf4f13c66772a12e41b711f2
Documentation
Contains all the command lines used to produce the files in this dataset. The input data is available at: https://doi.org/10.23708/DO1RTF
Plain Text - 4.9 KB - MD5: 55ecaf7ec9b62c3778d5866dfde2d81b
Documentation
Contains a description of all folders in this dataset.
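Each file entry above is listed with an MD5 checksum, which can be compared against a downloaded copy to detect a corrupted or incomplete transfer. The sketch below uses only Python's standard `hashlib`. The helper names `md5_of_file` and `matches_listed_md5` and the path in the usage example are hypothetical, and the listing above does not state which file name each checksum belongs to, so that pairing has to be taken from the dataset page.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large genome or graph files
        # do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_md5(path, listed_md5):
    """Compare a file's MD5 digest with the value shown in the listing."""
    return md5_of_file(path) == listed_md5.strip().lower()
```

For example, `matches_listed_md5("path/to/downloaded_file", "3754471d5b9f8b8e5bd6739a2cfe643f")` would return `True` only if the local file (a placeholder path here) has the checksum listed for the first documentation entry.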