CO1-haplotype frequencies of long-spined sea urchin (Diadema setosum) from the Indo-Malay archipelago

Version 1.0

Vimono, Indra B.; Borsa, Philippe; Pouyaud, Laurent, 2022, "CO1-haplotype frequencies of long-spined sea urchin (Diadema setosum) from the Indo-Malay archipelago", https://doi.org/10.23708/ZWQEFN, DataSuds, V1, UNF:6:Gq/xRLgto2IIMOuZ+Jjuwg== [fileUNF]

Dataset Metrics

12 Downloads

Description	The dataset contains three files, in plain-text (.txt) or tab-separated values (.tsv) format, that together characterize haplotype composition at the COI locus in the long-spined sea urchin Diadema setosum from the Indo-Malay archipelago. The dataset was produced in the course of a phylogeographic study of this tropical Indo-Pacific species, itself part of IBV’s PhD project on the comparative phylogeography of Indo-Pacific sea urchins of the genera Diadema and Echinometra.

METHODS
Briefly, long-spined sea urchins were sampled throughout the Indo-Malay archipelago between July 2019 and November 2021. Gonad tissue was dissected and preserved in ethanol. Genomic DNA was extracted, and a 1157-nucleotide-long segment beginning at the 5’ end of the mitochondrial cytochrome oxidase subunit I (COI, also written CO1) gene was amplified by polymerase chain reaction according to Ivanova & Grainger’s (2007) protocols. Amplicons were sequenced according to the Sanger protocol. Sequence chromatograms were verified under Chromas v. 2.6.5 (Technelysium, Brisbane, Australia). All nucleotide sequences were deposited in GenBank (Clark et al. 2016) and allocated accession numbers OP310072 to OP310789. Nucleotide sequences were aligned using the ClustalW algorithm implemented in MegaX v. 10.0.4 (Kumar et al. 2018).

DESCRIPTION OF DATASET
FILE1 is a table in .tsv format containing the sampling details of Diadema setosum in the Indo-Malay archipelago. The columns are the following:
“Sequence_ID”: a unique identifier for each nucleotide sequence (see legend to File 2);
“Organism”: here, Diadema setosum;
“Isolate”: unique identifier given to the DNA extract, identical to the sequence identifier;
“Location”: given in inverted commas; includes country, oceanic region, and precise name of sampling site;
“lat_lon”: latitude, followed by longitude, both in decimal degrees;
“Sampling_date”: three-letter or four-letter abbreviation for month, followed by year;
“GenBank_no”: GenBank accession number (two letters immediately followed by six digits).

FILE2 is in FASTA (.txt) format. It contains the alignment of the nucleotide sequences of 718 Diadema setosum individuals from the Indo-Malay archipelago, over a 1157-nucleotide-long portion of the COI gene. Sequence labels, each preceded by a greater-than sign (“>”) and ended by a line break, were constructed as follows: identifier of individual (e.g., “SBD-1”), followed by abbreviation of sampling site (“Sa1”), followed by haplotype identifier (“H1”). The line that follows each sequence label is the nucleotide sequence under that label.

FILE3, in .tsv format, is a table containing haplotype frequencies by sample. The first line holds the column headings; abbreviations for samples are as in File 1. The following 259 lines contain the haplotype counts by sample. Haplotypes are identified by the same identifiers (“Haplotype_ID”) as those used for individual labels in File 1. The last line of the table contains the sums of each column, equal to sample sizes. The last column contains the sums of each line, equal to haplotype frequencies in the total sample.

REFERENCES
Clark, K., Karsch-Mizrachi, I., Lipman, D. J., Ostell, J., Sayers, E. W. (2016). GenBank. Nucleic Acids Research 44, D67–D72. doi:10.1093/nar/gkv1276.
Ivanova, N., Grainger, C. (2007). COI amplification: Taq polymerase choice. CCDB Protocols (Can Ctr DNA Barcoding, Guelph). www.dnabarcoding.ca.
Kumar, S., Stecher, G., Li, M., Knyaz, C., Tamura, K. (2018). MEGA X: Molecular evolutionary genetics analysis across computing platforms. Molecular Biology and Evolution 35, 1547–1549. doi:10.1093/molbev/msy096.
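The relationship between the File 2 sequence labels and the File 3 haplotype-frequency table can be illustrated with a minimal Python sketch. It assumes that the three parts of each label (individual, sampling site, haplotype) are joined by an underscore; the description does not state the separator, so `sep` is a hypothetical parameter, and the toy records below are invented for illustration and are not taken from the dataset.

```python
from collections import Counter, defaultdict
import io


def read_fasta_labels(handle):
    """Yield the label of each FASTA record (the text after '>')."""
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            yield line[1:]


def haplotype_counts(labels, sep="_"):
    """Count individuals per haplotype and sampling site.

    Labels are assumed to look like 'individual<sep>site<sep>haplotype';
    the separator is an assumption, not documented in the dataset description.
    """
    table = defaultdict(Counter)
    for label in labels:
        _individual, site, haplotype = label.rsplit(sep, 2)
        table[haplotype][site] += 1
    return table


# Toy alignment (invented records, for illustration only).
toy = io.StringIO(
    ">SBD-1_Sa1_H1\nACGT\n"
    ">SBD-2_Sa1_H1\nACGT\n"
    ">XYZ-1_Sa2_H2\nACGA\n"
)
counts = haplotype_counts(read_fasta_labels(toy))

# Row sums: frequency of each haplotype in the total sample (last column of File 3).
row_totals = {hap: sum(sites.values()) for hap, sites in counts.items()}

# Column sums: sample size at each site (last line of File 3).
col_totals = Counter()
for sites in counts.values():
    col_totals.update(sites)
```

On the real files, the same row and column sums could be compared with the totals stored in the last column and last line of File 3 as a consistency check.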
Subject	Medicine, Health and Life Sciences
Keyword	mitochondrial DNA, cytochrome-oxidase subunit 1 gene, nucleotide sequence, phylogeography, Coral Triangle, Indonesian seas
Related Publication	Vimono I., Borsa P., Hocdé R., Pouyaud L. Phylogeography of the long-spined sea urchin Diadema setosum across the Indo-Malay archipelago. (in prep.)
Notes	Data type: Analysis data
License/Data Use Agreement	Custom Dataset Terms

4 Files (folder: Diadema_setosum/; all published Oct 5, 2022)

File	Type	Size	Downloads	Checksum / UNF	Contents	Tags	Access
Diadema_setosum_dataset_contents.txt	Plain Text	3.6 KB	12	MD5: 05639279a39197dedb46afa03b0fefbc		Documentation	Public
Diadema_setosum_FILE1_sampling_details_GenBank_nos.tab	Tabular Data	88.7 KB	0	UNF:6:4bvZDYJ1M3O5NuwHcnBSKQ==	Table in TSV format (tab-separated values) containing the sampling details of Diadema setosum in the Indo-Malay archipelago. 7 Variables, 718 Observations.	Data, Geospatial	Restricted
Diadema_setosum_FILE2_COI_alignment.txt	Plain Text	824.1 KB	0	MD5: fc82449aa4460006096a8dedacd2f29c	Alignment of nucleotide sequences of 718 Diadema setosum individuals (FASTA text format) from the Indo-Malay archipelago, over a 1157-nucleotide-long portion of the CO1 gene.	Data	Restricted
Diadema_setosum_FILE3_haplotype_frequencies.tab	Tabular Data	7.8 KB	0	UNF:6:XGdxjl5mQy7T8qBASxP5vA==	Table in TSV format containing haplotype frequencies by sample. 22 Variables, 260 Observations.	Data, Genomics	Restricted

Users may not request access to restricted files.
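The MD5 values listed for the plain-text files can be used to check that a downloaded copy is intact. The sketch below is one possible way to do this in Python with the standard `hashlib` module; the helper names `md5_hex` and `matches_listing` are hypothetical, and the demonstration uses a throwaway placeholder file rather than the actual dataset file.

```python
import hashlib
import tempfile


def md5_hex(path, chunk_size=65536):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, expected):
    """Compare a file's MD5 digest with the value shown in the file listing."""
    return md5_hex(path) == expected.strip().lower()


# MD5 listed for Diadema_setosum_dataset_contents.txt on the dataset page.
EXPECTED_CONTENTS_MD5 = "05639279a39197dedb46afa03b0fefbc"

# Demonstration on a placeholder file (not the real download),
# so the comparison is expected to fail here.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"placeholder")
    demo_path = tmp.name
demo_ok = matches_listing(demo_path, EXPECTED_CONTENTS_MD5)
```

With the real download, `matches_listing("Diadema_setosum_dataset_contents.txt", EXPECTED_CONTENTS_MD5)` should return `True` if the file arrived unmodified. The UNF values listed for the tabular files are a different kind of fingerprint and are not covered by this sketch.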

Citation Metadata

Persistent Identifier	doi:10.23708/ZWQEFN
Publication Date	2022-10-05
Title	CO1-haplotype frequencies of long-spined sea urchin (Diadema setosum) from the Indo-Malay archipelago
Subtitle	Alignment of long-spined sea urchin Diadema setosum nucleotide sequences from the Indo-Malay archipelago, sampling details and haplotype frequencies by sample
Author	Vimono, Indra B. (Badan Riset dan Inovasi Nasional - Indonesia) - ORCID: 0000-0001-8676-1799; Borsa, Philippe (UMR Entropie - IRD, Univ. La Réunion, CNRS, Ifremer, UNC - France) - ORCID: 0000-0001-9469-8304; Pouyaud, Laurent (UMR ISEM - University of Montpellier, CNRS, IRD, EPHE, CIRAD, INRAP - France) - ORCID: 0000-0003-4415-9198
Point of Contact	Borsa, Philippe (UMR Entropie - IRD, Univ. La Réunion, CNRS, Ifremer, UNC - France)
Subject	Medicine, Health and Life Sciences
Keyword	mitochondrial DNA, cytochrome-oxidase subunit 1 gene, nucleotide sequence, phylogeography, Coral Triangle, Indonesian seas
Scientific Theme	Continental waters and oceans: generalities (NumeriSud) https://uri.ird.fr/so/kos/tnu/030; Zoology (NumeriSud) https://uri.ird.fr/so/kos/tnu/080
Related Publication	Vimono I., Borsa P., Hocdé R., Pouyaud L. Phylogeography of the long-spined sea urchin Diadema setosum across the Indo-Malay archipelago. (in prep.)
Notes	Data type: Analysis data
Language	English
Production Date	2022-09-23
Production Location	Montpellier, France
Contributor	Project Member : Hocdé, Régis (UMR Marbec)
Funding Information	LMI SELAMAT (BRIN - IRD); IRD, through recurrent funding to UMR 226 ISEM and UMR 250 Entropie; Institut français d’Indonésie (French ministry of Europe and foreign affairs); Campus France (French ministry of higher education and research)
Depositor	Borsa, Philippe
Deposit Date	2022-09-29
Date of Collection	Start Date: 2019-08-01 ; End Date: 2021-11-30
Software	Chromas, Version: 2.6.5; MegaX, Version: 10.0.4

Geospatial Metadata

Geographic Coverage	Indonesia
Geographic Unit	Indo-Malay archipelago
Geographic Bounding Box	West 95.254, East 140.737, North 5.891, South -8.7785
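As a hedged illustration, the bounding box can serve as a sanity check on the “lat_lon” values of File 1. The Python sketch below rests on two assumptions: that the four bounding-box values are west longitude, east longitude, north latitude and south latitude, in that order, and that “lat_lon” follows the GenBank-style pattern "5.55 S 119.40 E". The exact layout of that column is not specified in the description, and the coordinates used here are invented examples.

```python
import re

# Assumed order of the bounding-box values: west, east, north, south.
WEST, EAST, NORTH, SOUTH = 95.254, 140.737, 5.891, -8.7785

# Assumed GenBank-style layout: "<lat> N|S <lon> E|W" in decimal degrees.
_LAT_LON = re.compile(
    r"^\s*(\d+(?:\.\d+)?)\s*([NS])\s+(\d+(?:\.\d+)?)\s*([EW])\s*$"
)


def parse_lat_lon(text):
    """Convert e.g. '5.55 S 119.40 E' to signed (latitude, longitude)."""
    match = _LAT_LON.match(text)
    if match is None:
        raise ValueError(f"unrecognised lat_lon value: {text!r}")
    lat = float(match.group(1)) * (1 if match.group(2) == "N" else -1)
    lon = float(match.group(3)) * (1 if match.group(4) == "E" else -1)
    return lat, lon


def in_bounding_box(lat, lon):
    """True if the point lies within the dataset's geographic bounding box."""
    return SOUTH <= lat <= NORTH and WEST <= lon <= EAST


# Invented example points, not taken from File 1.
inside = in_bounding_box(*parse_lat_lon("5.55 S 119.40 E"))
outside = in_bounding_box(*parse_lat_lon("43.61 N 3.88 E"))
```

A sampling site falling outside the box would suggest either a transcription error in the coordinates or a different column layout from the one assumed here.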

Dataset Terms

License/Data Use Agreement

Our Community Norms as well as good scientific practices expect that proper credit is given via citation. Please use the data citation shown on the dataset page.

Custom Dataset Terms — the following Custom Dataset Terms have been defined for this dataset.

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

Restrictions

Data files are under embargo, which will be lifted when the related article is submitted for publication.

Citation Requirements

Please cite the dataset as indicated in the dataset main page.

Restricted Files + Terms of Access

Restricted Files

There are 3 restricted files in this dataset.

Terms of Access for Restricted Files

Data files are under embargo, which will be lifted when the related article is submitted for publication.

Request Access

Users may not request access to files.
