From 6badd1f53ccf544ccd27b2e505288180dc0d26ce Mon Sep 17 00:00:00 2001 From: woollard Date: Tue, 9 Jul 2024 15:30:48 +0100 Subject: [PATCH 01/15] added the missing explicitly in the first column --- submit/samples/missing-values.rst | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/submit/samples/missing-values.rst b/submit/samples/missing-values.rst index d9f3cb84..2a33b691 100644 --- a/submit/samples/missing-values.rst +++ b/submit/samples/missing-values.rst @@ -41,12 +41,12 @@ INSDC Missing Value Reporting Terms | | | | | | | was not collected or reported in records | | | | | | | | predating the 2023 agreement. For use in | | | | | | | | Third Party data submissions. | -| +------------------------------+-----------------------------------------------+----------------------------------+---------------------------------------------------+ -| | not provided | | information of an expected format was not | data agreement established | | Data agreements were established before the | +|----------------------------+------------------------------+-----------------------------------------------+----------------------------------+---------------------------------------------------+ +| missing | not provided | | information of an expected format was not | data agreement established | | Data agreements were established before the | | | | | given, a value may be given at the later | pre-2023 | | 2023 INSDC standard and metadata can not be | | | | | stage | | | provided. 
A value may be given at a later stage | -| +------------------------------+-----------------------------------------------+----------------------------------+---------------------------------------------------+ -| | restricted access | | information exists but can not be released | endangered species | | Information can not be reported as the target | +|----------------------------+------------------------------+-----------------------------------------------+----------------------------------+---------------------------------------------------+ +| missing | restricted access | | information exists but can not be released | endangered species | | Information can not be reported as the target | | | | | openly because of privacy concerns | | | organism is endangered e.g. on the IUCN red- | | | | | | | | list | | | | | +----------------------------------+---------------------------------------------------+ From 20ba68a314b802a146dfeff0f6aab0c308cf0446 Mon Sep 17 00:00:00 2001 From: woollard Date: Thu, 17 Oct 2024 15:29:04 +0100 Subject: [PATCH 02/15] doc: multiple updates to this missing-values.rst file. The table needed to be clearer what the high level term was. Also remove a plain wrong hyphen from the example: "missing: data agreement-established pre-2023" --- submit/samples/missing-values.rst | 1 + 1 file changed, 1 insertion(+) diff --git a/submit/samples/missing-values.rst b/submit/samples/missing-values.rst index 2a33b691..3c4b6e62 100644 --- a/submit/samples/missing-values.rst +++ b/submit/samples/missing-values.rst @@ -60,6 +60,7 @@ Usage of INSDC Missing Value Reporting Terms ============================================ Please use the above standardised missing value vocabulary **only if a true value of an expected format for a mandatory field is missing**. If a true value is missing for a **recommended** or an **optional** field, then these fields should not be used for reporting at all. 
When reporting a missing mandatory field, the eight granular **‘reporting level’** terms need to be preceded with the term *missing:* to declare both the absence of a true value as well as the reason. +*not applicable* is only ever used as top level term, its reporting level terms ought to be prefixed by *missing: *. Example of usage: ----------------- From 7c2409b0d30198d3c675133a676d9749a854c425 Mon Sep 17 00:00:00 2001 From: woollard Date: Thu, 17 Oct 2024 15:54:51 +0100 Subject: [PATCH 03/15] doc: reformat table --- submit/samples/missing-values.rst | 60 +++++++++++++++---------------- 1 file changed, 30 insertions(+), 30 deletions(-) diff --git a/submit/samples/missing-values.rst b/submit/samples/missing-values.rst index 3c4b6e62..6a46f71f 100644 --- a/submit/samples/missing-values.rst +++ b/submit/samples/missing-values.rst @@ -20,39 +20,39 @@ INSDC Missing Value Reporting Terms +----------------------------+------------------------------+-----------------------------------------------+----------------------------------+---------------------------------------------------+ | **INSDC term (top level)** | **INSDC term (lower level)** | **Definition** | **INSDC term (reporting level)** | **Definition** | ++============================+==============================+===============================================+==================================+===================================================+ +| not applicable | | information is inappropriate to report, can | control sample | Information is not applicable as the sample | +| | | indicate that the standard itself fails to | | represents a negative control sample | +| | | model or represent the information | | collected in a lab | +| | | appropriately +----------------------------------+---------------------------------------------------+ +| | | | sample group | Information is not applicable as the sample | +| | | | | represents a group of samples that do not | +| | | | | have a single origin. E.g. 
for co-assembly or | +| | | | | transcriptome assembly. | +----------------------------+------------------------------+-----------------------------------------------+----------------------------------+---------------------------------------------------+ -| not applicable | | | information is inappropriate to report, can | control sample | | Information is not applicable as the sample | -| | | | indicate that the standard itself fails to | | | represents a negative control sample | -| | | | model or represent the information | | | collected in a lab | -| | | | appropriately +----------------------------------+---------------------------------------------------+ -| | | | | sample group | | Information is not applicable as the sample | -| | | | | | | represents a group of samples that do not | -| | | | | | | have a single origin. E.g. for co-assembly or | -| | | | | | | transcriptome assembly. | -+----------------------------+------------------------------+-----------------------------------------------+----------------------------------+---------------------------------------------------+ -| missing | not collected | | information of an expected format was not | synthetic construct | | Information does not exist as the sample | -| | | | given because it has not been collected | | | represents an ab-initio synthetic construct. | -| | | | +----------------------------------+---------------------------------------------------+ -| | | | | lab stock | | Information was not collected as the sample | -| | | | | | | represents a cultured cell line or model | -| | | | | | | organism under long-term lab control. | -| | | | +----------------------------------+---------------------------------------------------+ -| | | | | third party data | | Information does not exist as the metadata | -| | | | | | | was not collected or reported in records | -| | | | | | | predating the 2023 agreement. For use in | -| | | | | | | Third Party data submissions. 
| +| missing | not collected | information of an expected format was not | synthetic construct | Information does not exist as the sample | +| | | given because it has not been collected | | represents an ab-initio synthetic construct. | +| | | +----------------------------------+---------------------------------------------------+ +| | | | lab stock | Information was not collected as the sample | +| | | | | represents a cultured cell line or model | +| | | | | organism under long-term lab control. | +| | | +----------------------------------+---------------------------------------------------+ +| | | | third party data | Information does not exist as the metadata | +| | | | | was not collected or reported in records | +| | | | | predating the 2023 agreement. For use in | +| | | | | Third Party data submissions. | |----------------------------+------------------------------+-----------------------------------------------+----------------------------------+---------------------------------------------------+ -| missing | not provided | | information of an expected format was not | data agreement established | | Data agreements were established before the | -| | | | given, a value may be given at the later | pre-2023 | | 2023 INSDC standard and metadata can not be | -| | | | stage | | | provided. A value may be given at a later stage | +| missing | not provided | information of an expected format was not | data agreement established | Data agreements were established before the | +| | | given, a value may be given at the later | pre-2023 | 2023 INSDC standard and metadata can not be | +| | | stage | | provided. 
A value may be given at a later stage | |----------------------------+------------------------------+-----------------------------------------------+----------------------------------+---------------------------------------------------+ -| missing | restricted access | | information exists but can not be released | endangered species | | Information can not be reported as the target | -| | | | openly because of privacy concerns | | | organism is endangered e.g. on the IUCN red- | -| | | | | | | list | -| | | | +----------------------------------+---------------------------------------------------+ -| | | | | human-identifiable | | Information can not be reported as the | -| | | | | | | metadata would make the sample human- | -| | | | | | | identifiable. | +| missing | restricted access | information exists but can not be released | endangered species | Information can not be reported as the target | +| | | openly because of privacy concerns | | organism is endangered e.g. on the IUCN red- | +| | | | | list | +| | | +----------------------------------+---------------------------------------------------+ +| | | | human-identifiable | Information can not be reported as the | +| | | | | metadata would make the sample human- | +| | | | | identifiable. 
| +----------------------------+------------------------------+-----------------------------------------------+----------------------------------+---------------------------------------------------+ From b9fd46872b2ee7ce685ceb1c98f65c4b54be7e39 Mon Sep 17 00:00:00 2001 From: woollard Date: Thu, 17 Oct 2024 16:32:30 +0100 Subject: [PATCH 04/15] doc: reformat table yet again --- submit/samples/missing-values.rst | 26 +++++++++++++------------- 1 file changed, 13 insertions(+), 13 deletions(-) diff --git a/submit/samples/missing-values.rst b/submit/samples/missing-values.rst index 6a46f71f..17ffd932 100644 --- a/submit/samples/missing-values.rst +++ b/submit/samples/missing-values.rst @@ -21,17 +21,17 @@ INSDC Missing Value Reporting Terms +----------------------------+------------------------------+-----------------------------------------------+----------------------------------+---------------------------------------------------+ | **INSDC term (top level)** | **INSDC term (lower level)** | **Definition** | **INSDC term (reporting level)** | **Definition** | +============================+==============================+===============================================+==================================+===================================================+ -| not applicable | | information is inappropriate to report, can | control sample | Information is not applicable as the sample | -| | | indicate that the standard itself fails to | | represents a negative control sample | -| | | model or represent the information | | collected in a lab | -| | | appropriately +----------------------------------+---------------------------------------------------+ +| not applicable | | information is inappropriate to report, can | control sample | Information is not applicable as the sample | +| | | indicate that the standard itself fails to | | represents a negative control sample | +| | | model or represent the information | | collected in a lab | +| | | appropriately 
+----------------------------------+---------------------------------------------------+ | | | | sample group | Information is not applicable as the sample | | | | | | represents a group of samples that do not | | | | | | have a single origin. E.g. for co-assembly or | | | | | | transcriptome assembly. | +----------------------------+------------------------------+-----------------------------------------------+----------------------------------+---------------------------------------------------+ -| missing | not collected | information of an expected format was not | synthetic construct | Information does not exist as the sample | -| | | given because it has not been collected | | represents an ab-initio synthetic construct. | +| missing | not collected | information of an expected format was not | synthetic construct | Information does not exist as the sample | +| | | given because it has not been collected | | represents an ab-initio synthetic construct. | | | | +----------------------------------+---------------------------------------------------+ | | | | lab stock | Information was not collected as the sample | | | | | | represents a cultured cell line or model | @@ -41,13 +41,13 @@ INSDC Missing Value Reporting Terms | | | | | was not collected or reported in records | | | | | | predating the 2023 agreement. For use in | | | | | | Third Party data submissions. | -|----------------------------+------------------------------+-----------------------------------------------+----------------------------------+---------------------------------------------------+ -| missing | not provided | information of an expected format was not | data agreement established | Data agreements were established before the | -| | | given, a value may be given at the later | pre-2023 | 2023 INSDC standard and metadata can not be | -| | | stage | | provided. 
A value may be given at a later stage | -|----------------------------+------------------------------+-----------------------------------------------+----------------------------------+---------------------------------------------------+ -| missing | restricted access | information exists but can not be released | endangered species | Information can not be reported as the target | -| | | openly because of privacy concerns | | organism is endangered e.g. on the IUCN red- | ++----------------------------+------------------------------+-----------------------------------------------+----------------------------------+---------------------------------------------------+ +| missing | not provided | information of an expected format was not | data agreement established | Data agreements were established before the | +| | | given, a value may be given at the later | pre-2023 | 2023 INSDC standard and metadata can not be | +| | | stage | | provided. A value may be given at a later stage | ++----------------------------+------------------------------+-----------------------------------------------+----------------------------------+---------------------------------------------------+ +| missing | restricted access | information exists but can not be released | endangered species | Information can not be reported as the target | +| | | openly because of privacy concerns | | organism is endangered e.g. 
on the IUCN red- | | | | | | list | | | | +----------------------------------+---------------------------------------------------+ | | | | human-identifiable | Information can not be reported as the | From fbd665ce6d3ca6515a7b71be98756c08ac4e4d74 Mon Sep 17 00:00:00 2001 From: woollard Date: Thu, 17 Oct 2024 16:34:48 +0100 Subject: [PATCH 05/15] doc: fixed rst emphasis formatting issue --- submit/samples/missing-values.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/submit/samples/missing-values.rst b/submit/samples/missing-values.rst index 17ffd932..f6aa5e73 100644 --- a/submit/samples/missing-values.rst +++ b/submit/samples/missing-values.rst @@ -60,7 +60,7 @@ Usage of INSDC Missing Value Reporting Terms ============================================ Please use the above standardised missing value vocabulary **only if a true value of an expected format for a mandatory field is missing**. If a true value is missing for a **recommended** or an **optional** field, then these fields should not be used for reporting at all. When reporting a missing mandatory field, the eight granular **‘reporting level’** terms need to be preceded with the term *missing:* to declare both the absence of a true value as well as the reason. -*not applicable* is only ever used as top level term, its reporting level terms ought to be prefixed by *missing: *. +*not applicable* is only ever used as top level term, its reporting level terms ought to be prefixed by *missing:*. 
Example of usage: ----------------- From 2262f1f3dbe192efc8f5d9f31eaa64d91dc3f442 Mon Sep 17 00:00:00 2001 From: woollard Date: Thu, 17 Oct 2024 16:40:29 +0100 Subject: [PATCH 06/15] doc: tidied the formatting and language syntax --- submit/samples/missing-values.rst | 15 ++++++--------- 1 file changed, 6 insertions(+), 9 deletions(-) diff --git a/submit/samples/missing-values.rst b/submit/samples/missing-values.rst index f6aa5e73..420fa6de 100644 --- a/submit/samples/missing-values.rst +++ b/submit/samples/missing-values.rst @@ -60,13 +60,10 @@ Usage of INSDC Missing Value Reporting Terms ============================================ Please use the above standardised missing value vocabulary **only if a true value of an expected format for a mandatory field is missing**. If a true value is missing for a **recommended** or an **optional** field, then these fields should not be used for reporting at all. When reporting a missing mandatory field, the eight granular **‘reporting level’** terms need to be preceded with the term *missing:* to declare both the absence of a true value as well as the reason. -*not applicable* is only ever used as top level term, its reporting level terms ought to be prefixed by *missing:*. 
+*not applicable* is only ever used as a top-level term; its reporting-level terms should be prefixed by *missing:*.
 
-Example of usage:
------------------
-
-**geographic location (country and/or sea)**: *missing: data agreement-established pre-2023*
-
-**collection date**: *missing: control sample*
-
-**geographic location (country and/or sea)**: *missing: human-identifiable*
+Examples of Usage:
+------------------
+- **geographic location (country and/or sea)**: *missing: data agreement-established pre-2023*
+- **collection date**: *missing: control sample*
+- **geographic location (country and/or sea)**: *missing: human-identifiable*

From ec0f50913f5086af5b5f7f478388c9e50b2311a7 Mon Sep 17 00:00:00 2001
From: woollard
Date: Mon, 21 Oct 2024 13:38:30 +0100
Subject: [PATCH 07/15] doc: re-removing the excess hyphen...

---
 submit/samples/missing-values.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/submit/samples/missing-values.rst b/submit/samples/missing-values.rst
index 420fa6de..b87c1d43 100644
--- a/submit/samples/missing-values.rst
+++ b/submit/samples/missing-values.rst
@@ -64,6 +64,6 @@ Please use the above standardised missing value vocabulary **only if a true valu
 
 Examples of Usage:
 ------------------
-- **geographic location (country and/or sea)**: *missing: data agreement-established pre-2023*
+- **geographic location (country and/or sea)**: *missing: data agreement established pre-2023*
 - **collection date**: *missing: control sample*
 - **geographic location (country and/or sea)**: *missing: human-identifiable*

From c10b48c1e81454eaef7259505aa99e241bffb8a0 Mon Sep 17 00:00:00 2001
From: woollard
Date: Mon, 21 Oct 2024 14:07:17 +0100
Subject: [PATCH 08/15] doc: tabulated the Example usage with short and long names, so works for NCBI, DDBJ as well as ENA

---
 submit/samples/missing-values.rst | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/submit/samples/missing-values.rst
b/submit/samples/missing-values.rst
index b87c1d43..9c5fb004 100644
--- a/submit/samples/missing-values.rst
+++ b/submit/samples/missing-values.rst
@@ -64,6 +64,13 @@ Please use the above standardised missing value vocabulary **only if a true valu
 
 Examples of Usage:
 ------------------
-- **geographic location (country and/or sea)**: *missing: data agreement established pre-2023*
-- **collection date**: *missing: control sample*
-- **geographic location (country and/or sea)**: *missing: human-identifiable*
+
++---------------------------+----------------------------------------------+------------------------------------------------+
+| **Short Field Name**      | **Long Field Name**                          | **Missing Value Example**                      |
++===========================+==============================================+================================================+
+| **geo_loc_name**          | **geographic location (country and/or sea)** | *missing: data agreement established pre-2023* |
++---------------------------+----------------------------------------------+------------------------------------------------+
+| **collection_date**       | **collection date**                          | *missing: control sample*                      |
++---------------------------+----------------------------------------------+------------------------------------------------+
+| **geo_loc_name**          | **geographic location (country and/or sea)** | *missing: human-identifiable*                  |
++---------------------------+----------------------------------------------+------------------------------------------------+

From 99e99086a84b1f71d59f43a93d49f48ae7cb23b4 Mon Sep 17 00:00:00 2001
From: woollard
Date: Mon, 2 Dec 2024 10:23:46 +0000
Subject: [PATCH 09/15] Initial: converted md to rst

---
 .../2024-02-29_Incorporating_MIxS_V6.2.rst | 149 ++++++++++++++++++
 1 file changed, 149 insertions(+)
 create mode 100644 submit/samples/sample_checklist/updates/2024-02-29_Incorporating_MIxS_V6.2.rst

diff --git a/submit/samples/sample_checklist/updates/2024-02-29_Incorporating_MIxS_V6.2.rst
b/submit/samples/sample_checklist/updates/2024-02-29_Incorporating_MIxS_V6.2.rst
new file mode 100644
index 00000000..ac027aa8
--- /dev/null
+++ b/submit/samples/sample_checklist/updates/2024-02-29_Incorporating_MIxS_V6.2.rst
@@ -0,0 +1,149 @@
+=============================================
+ENA Checklists Update Incorporating MIxS V6.2
+=============================================
+
+---------------------------------
+Checklists Updated: February 2024
+---------------------------------
+
+----------------------------------------------------
+Summary of ENA Checklists after the MIxS v6.2 Update
+----------------------------------------------------
+
+* Four new MIxS checklists have been added to ENA: **GSC MIxS Agriculture, GSC MIxS Food and Production, GSC MIxS Symbiont**, and **GSC MIxS Hydrocarbon**.
+* Fifteen existing MIxS checklists in ENA had new checklist terms added.
+
+  * Three had many new terms: GSC MIxS built environment (66), GSC MIxS plant-associated (24) and GSC MIxS sediment (14).
+  * Twelve checklists had between 1 and 8 new terms added.
+* 368 new MIxS terms were added to the ENA checklist system. There are now 1031 ENA sample checklist terms.
+* 47 aliases (synonyms) of terms were added, e.g. where the MIxS term name had changed, or where there was now a MIxS term for the same concept as an existing legacy ENA term. Wherever appropriate we use the MIxS term.
+
+This and similar metadata updates are important to both:
+
+1. meet the needs of the diverse data submitters to ENA and
+2. ensure interoperability for ENA submitted metadata with that of other INSDC members and other portals. Please see the background to sample checklists in ENA for more information.
+
+
+------------
+Introduction
+------------
+
+Please read this background about sample level checklists and GSC MIxS.
+
+A growing proportion of ENA's sample level checklists are from MIxS; currently 22 of the 52 sample checklists are MIxS checklists.
Most of the other sources of ENA’s checklists are legacy. + +--------------------------------------- +Four New MIxS Derived Checklists in ENA +--------------------------------------- + ++----------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ +| New checklist Name in ENA | Deeper background to the checklist creation | Comment for ENA | ++==================================+==================================================================================================================================================================================================+=================================================================================================================================================================================================================================================================+ +| **GSC MIxS Agriculture** | [Community-Driven Metadata Standards for Agricultural Microbiome Research](https://apsjournals.apsnet.org/doi/10.1094/PBIOMES-09-19-0051-P) | | ++----------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ +| **GSC MIxS Food and Production** | | Built from five MIxS lists 
packages as much overlap(food-human foods, food-farm environment, food-food production facility, food-animal and animal feed) N.B. A dozen terms are currently excluded, as they were mainly agriculture and or soil sample related. | ++----------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ +| **GSC MIxS Symbiont** | [MIxS-SA: a MIxS extension defining the minimum information standard for sequence data from symbiont-associated micro-organisms](https://www.nature.com/articles/s43705-022-00092-w) | | ++----------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ +| **GSC MIxS Hydrocarbon** | [MIxS-HCR: a MIxS extension defining a minimal information standard for sequence data from environments pertaining to hydrocarbon resources](https://www.nature.com/articles/s43705-022-00092-w) | All added apart from “additional info” | 
++----------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
+
+--------------------------------------------------------------------------
+Fifteen existing MIxS checklists in ENA have had new checklist terms added
+--------------------------------------------------------------------------
+
+* For twelve checklists, between 1 and 8 new terms were added to these GSC MIxS checklists: air, host, human-associated, human-gut, human-oral, human-vaginal, microbial mat biofilm, miscellaneous natural or artificial environment, soil, wastewater sludge, and water
+* For the following three checklists there was a more substantial addition:
+  * 66 terms added to the **GSC MIxS built environment**
+  * 24 terms added to the **GSC MIxS plant-associated**
+  * 14 terms added to the **GSC MIxS sediment**
+
+-----------------------------------------------------------------------------------
+General changes reflecting INSDC or specifically ENA needs, where Different to MIxS
+-----------------------------------------------------------------------------------
+
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+Additional Controlled Value Terms being allowed
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+* Missing Value exceptions are now allowed for **height, elevation** and additionally **altitude**. This was requested as height and elevation were mandatory in **GSC MIxS Soil**.
+
+**ENA uses separate fields rather than single combined ones in certain cases**
+
+Why Different?
Separate fields are easier for users to populate from controlled vocabulary lists.
+* **geographic location (country and/or sea)** and **region** are captured as separate fields, rather than MIxS' combined **Geographic location (country and/or sea, region)**
+* **geographic location (latitude)** and **geographic location (longitude)** are captured as two separate fields, rather than MIxS' single **geographic location (latitude and longitude)**
+
+------------------------------------
+Minor Changes to non-MIxS checklists
+------------------------------------
+
+Some checklists at ENA are not from MIxS. Nevertheless, we try to keep terms aligned between these and MIxS. This has the obvious benefit of increasing the findability and interoperability of metadata. Legacy term names will be made synonyms for the updated term names.
+
+--------------------------------------------------------------------
+Summary Tables of Term Counts and Terms Added to Existing Checklists
+--------------------------------------------------------------------
+
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+Summary Table of Terms (all sample based)
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
++-------+--------------------------------------------------------------------------+
+| Count | What                                                                     |
++=======+==========================================================================+
+| 1031  | total terms now in ENA                                                   |
++-------+--------------------------------------------------------------------------+
+| 368   | new terms not in ENA were added from MIxS                                |
++-------+--------------------------------------------------------------------------+
+| 47    | aliases added                                                            |
++-------+--------------------------------------------------------------------------+
+| 16    | existing definitions updated                                             |
++-------+--------------------------------------------------------------------------+
+| 3     | MIxS v6.2 terms were not added to ENA, such as "miscellaneous attribute" |
++-------+--------------------------------------------------------------------------+
+
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+Table of terms added to each checklist (all sample-based)
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+For existing checklists, only the newly added terms are listed.
+
++--------------------------------------------------------------+----------+--------------------------------------------------------------------------------------------+
+| Checklist                                                    | Status   | Comment                                                                                    |
++==============================================================+==========+============================================================================================+
+| **GSC MIxS Agriculture**                                     | New      | N.B. From four MIxS packages.                                                              |
++--------------------------------------------------------------+----------+--------------------------------------------------------------------------------------------+
+| **GSC MIxS Food and Production**                             | New      | Combined from several MIxS lists because of the large overlap. About a dozen terms seemed  |
+|                                                              |          | out of place (agriculture and/or soil looked like better fits), so those were excluded.    |
+|                                                              |          | **geographic location (latitude)** and **geographic location (longitude)** are captured as |
+|                                                              |          | two separate fields in ENA's version, rather than MIxS' single                             |
+|                                                              |          | **geographic location (latitude and longitude)**. The existing bio_material field is used  |
+|                                                              |          | rather than MIxS CL's **Repository name**.                                                 |
++--------------------------------------------------------------+----------+--------------------------------------------------------------------------------------------+
+| **GSC MIxS Symbiont**                                        | New      | ENA has the additional term **sample symbiont of**.                                        |
+|                                                              |          | **geographic location (latitude)** and **geographic location (longitude)** are captured as |
+|                                                              |          | two separate fields in ENA's version, rather than MIxS' single                             |
+|                                                              |          | **geographic location (latitude and longitude)**.                                          |
++--------------------------------------------------------------+----------+--------------------------------------------------------------------------------------------+
+| **GSC MIxS Hydrocarbon**                                     | New      | All terms added apart from “additional info”.                                              |
+|                                                              |          | **geographic location (latitude)** and **geographic location (longitude)** are captured as |
+|                                                              |          | two separate fields in ENA's version, rather than MIxS' single                             |
+|                                                              |          | **geographic location (latitude and longitude)**.                                          |
++--------------------------------------------------------------+----------+--------------------------------------------------------------------------------------------+
+| **GSC MIxS air**                                             | existing | new terms added: taxonomic classification                                                  |
++--------------------------------------------------------------+----------+--------------------------------------------------------------------------------------------+
+| **GSC MIxS built environment**                               | existing | new terms added: outside relative humidity, presence of pets, animals, or insects,         |
+|                                                              |          | quadrant position, relative sampling location, room air exchange rate, room architectural  |
+|                                                              |          | elements, room condition, room count, room dimensions, room door distance, room location   |
+|                                                              |          | in building, room moisture damage or mold history, room net area, room occupancy, room     |
+|                                                              |          | sampling position, room type, room volume, room window count, rooms connected by a         |
+|                                                              |          | doorway, rooms that are on the same hallway, rooms that share a door with sampling room,   |
+|                                                              |          | rooms that share a wall with sampling room, sampling day weather, sampling floor, sampling |
+|                                                              |          | room ID or name, sampling time outside, season, seasonal use, shading device condition,    |
+|                                                              |          | shading device location, shading device material, shading device signs of water/mold,      |
+|                                                              |          | shading device type, specific humidity, specifications, surface-air contaminant, taxonomic |
+|                                                              |          | classification, temperature, temperature outside house, train line, train station          |
+|                                                              |          | collection location, train stop collection location, visual media, wall area, wall         |
+|                                                              |          | construction type, wall finish material, wall height, wall location, wall signs of         |
+|                                                              |          | water/mold, wall surface treatment, wall texture, wall thermal mass, water feature size,   |
+|                                                              |          | water feature type, weekday, window area/size, window condition, window covering, window   |
+|                                                              |          | horizontal position, window location, window material, window open frequency, window signs |
+|                                                              |          | of water/mold, window status, window type, window vertical position                        |
++--------------------------------------------------------------+----------+--------------------------------------------------------------------------------------------+
+| **GSC MIxS host**                                            | existing | new terms added: ancestral data, biological status, genetic modification, observed host    |
+|                                                              |          | symbionts, sample capture status, sample collection device or method, sample disease       |
+|                                                              |          | stage, taxonomic classification                                                            |
++--------------------------------------------------------------+----------+--------------------------------------------------------------------------------------------+
+| **GSC MIxS human-gut**                                       | existing | new terms added: host scientific name, observed host symbionts, taxonomic classification   |
++--------------------------------------------------------------+----------+--------------------------------------------------------------------------------------------+
+| **GSC MIxS human-oral**                                      | existing | new terms added: host scientific name, observed host symbionts, taxonomic classification   |
++--------------------------------------------------------------+----------+--------------------------------------------------------------------------------------------+
+| **GSC MIxS human-skin**                                      | existing | new terms added: host scientific name, observed host symbionts, taxonomic classification   |
++--------------------------------------------------------------+----------+--------------------------------------------------------------------------------------------+
+| **GSC MIxS human-vaginal**                                   | existing | new terms added: host scientific name                                                      |
++--------------------------------------------------------------+----------+--------------------------------------------------------------------------------------------+
+| **GSC MIxS microbial mat biofilm**                           | existing | new terms added: taxonomic classification, total nitrogen content                          |
++--------------------------------------------------------------+----------+--------------------------------------------------------------------------------------------+
+| **GSC MIxS miscellaneous natural or artificial environment** | existing | new terms added: taxonomic classification                                                  |
++--------------------------------------------------------------+----------+--------------------------------------------------------------------------------------------+
+| **GSC MIxS plant-associated**                                | existing | new terms added: ancestral data, biological status, biotic regimen, culture rooting        |
+|                                                              |          | medium, genetic modification, growth facility, growth habit, host scientific name, light   |
+|                                                              |          | regimen, observed host symbionts, plant growth medium, plant sex, plant structure, rooting |
+|                                                              |          | conditions, rooting medium carbon, rooting medium macronutrients, rooting medium           |
+|                                                              |          | micronutrients, rooting medium organic supplements, rooting medium pH, rooting medium      |
+|                                                              |          | regulators, rooting medium solidifier, sample capture status, sample disease stage,        |
+|                                                              |          | taxonomic classification                                                                   |
++---------------------------------------------------------------+-----------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------+ +| **GSC MIxS sediment** | existing | new terms added: alkalinity, mean friction velocity, mean peak friction velocity, pH, particle classification, porosity, pressure, sediment type, taxonomic classification, temperature, tidal stage, total depth of water column, total nitrogen content, turbidity, | ++---------------------------------------------------------------+-----------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ +| **GSC MIxS soil** | existing | new terms added: host specificity or range, mean seasonal precipitation, mean seasonal temperature, organic nitrogen, taxonomic classification, | ++---------------------------------------------------------------+-----------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ +| **GSC MIxS wastewater sludge** | existing | new terms added:taxonomic classification, total nitrogen concentration, | ++---------------------------------------------------------------+-----------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ +| **GSC MIxS water** | existing | new terms added: alkalinity method, size-fraction lower threshold, size-fraction upper threshold, taxonomic classification, total nitrogen concentration, | 
++---------------------------------------------------------------+-----------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------+ + + From b5900ba63e393161a40b841b056f21af86e25498 Mon Sep 17 00:00:00 2001 From: woollard Date: Mon, 2 Dec 2024 10:26:44 +0000 Subject: [PATCH 10/15] Initial: converted md to rst --- .../sample_checklist_introduction.rst | 52 +++++++++++++++++++ 1 file changed, 52 insertions(+) create mode 100644 submit/samples/sample_checklist/sample_checklist_introduction.rst diff --git a/submit/samples/sample_checklist/sample_checklist_introduction.rst b/submit/samples/sample_checklist/sample_checklist_introduction.rst new file mode 100644 index 00000000..89430fd8 --- /dev/null +++ b/submit/samples/sample_checklist/sample_checklist_introduction.rst @@ -0,0 +1,52 @@ +============================= +Sample Checklist Introduction +============================= + +------------ +Introduction +------------ + +Sample checklists are used to ensure that both the minimum core metadata and metadata specific to different sample types are submitted to ENA. +Please see: `background to sample checklists in ENA <https://ena-browser-docs.readthedocs.io/en/latest/browser/sample-checklists.html>`_ +and the available `ENA sample checklists <https://www.ebi.ac.uk/ena/browser/checklists>`_. + +ENA has many different sample checklists (about 40). The most basic is the generic checklist, which can be used for any sample. However, please try to pick +the sample checklist most appropriate to your sample type, as the metadata collected is likely to be most relevant and will make your data more useful to the scientific community. +Sample checklists are typically created where a community needs one focused on its requirements. + +The `Genomic Standards Consortium (GSC) <http://www.gensc.org//pages/projects/mixs-gsc-project.html>`_ +works with many communities to generate the *“Minimum Information about any (X) Sequence” (MIxS) specifications*. ENA and other INSDC members implement the MIxS standards. Essentially these consist of: + +* Community-specific checklists, each having a core of shared metadata terms. +* A metadata term (field) has a specific name and definition. 
+* Sometimes there is either: + + * a controlled list of values + * or a required pattern for the value, for example an integer. + +--------------------------------------- +Working together on Improving Standards +--------------------------------------- + +As outlined above, ENA collaborates with +`GSC <http://www.gensc.org//pages/projects/mixs-gsc-project.html>`_, `INSDC <https://www.insdc.org/>`_ and other standards bodies to help meet our increasingly diverse user needs and increase interoperability. Sequencing technologies continue to evolve at pace, and scientists apply them to investigate basic biology, disease and biodiversity. + +One consideration with these standards is that the actual implementation varies between organisations. Generally, we try to minimise the differences to increase interoperability. Here are some examples: + +* In ENA, we use the **"long" term name** (called "title" in GSC MIxS) rather than the **"short" term name**. This is because some of the short names are ambiguous abbreviations, so the longer names provide more clarity. +* In MIxS, many of the checklists are called **combinations**; these consist of **core** terms and **extension** terms. In ENA, a small subset of these terms (e.g. taxonomy) will not be in the sample checklist as they are handled separately. +* In ENA, some terms cover broader concepts than in MIxS, e.g. we use the **depth** term generally, so that it covers **depth below sea level** as well as **soil depth**. +* There are several MIxS terms, such as **miscellaneous attribute**, which are not used in the ENA checklists, as they are ambiguous and not interoperable. +We regularly exchange suggested changes to definitions, term (field) names and additional terms with these bodies. + +------------------------------- +Time Scales of GSC MIxS Updates +------------------------------- + +We try to strike a balance between being stable and predictable and being responsive enough to meet the needs of communities. 
+ +* Generally ENA and other INSDC members commit to checklist updates following the major MIxS releases e.g. 4.0, 5.0, 6.0, 7.0. These are typically every 2 to 3 years. + + * Updates, even with much automation can take many weeks of full time equivalent work to add and quality control. + * Sometimes terms(fields) change names and then change back again between sub-releases. +* If important terms, improved term definitions or even checklists are needed by ENA's user communities, we often promptly add those in. From 3ec058ea7964a1f6b811445721d6482e5ffa55a5 Mon Sep 17 00:00:00 2001 From: woollard Date: Mon, 2 Dec 2024 10:27:44 +0000 Subject: [PATCH 11/15] removed: md files --- submit/samples/missing-values.html | 0 .../sample_checklist_introduction.md | 27 ----- .../2024-02-29_Incorporating_MIxS_V6.2.md | 107 ------------------ 3 files changed, 134 deletions(-) create mode 100644 submit/samples/missing-values.html delete mode 100644 submit/samples/sample_checklist/sample_checklist_introduction.md delete mode 100644 submit/samples/sample_checklist/updates/2024-02-29_Incorporating_MIxS_V6.2.md diff --git a/submit/samples/missing-values.html b/submit/samples/missing-values.html new file mode 100644 index 00000000..e69de29b diff --git a/submit/samples/sample_checklist/sample_checklist_introduction.md b/submit/samples/sample_checklist/sample_checklist_introduction.md deleted file mode 100644 index 8fdf4c13..00000000 --- a/submit/samples/sample_checklist/sample_checklist_introduction.md +++ /dev/null @@ -1,27 +0,0 @@ -# Introduction - -Sample checklists are used to ensure that both the minimum core metadata and metadata specific to different sample types are submitted to ENA. Please see: [background to sample checklists in ENA](https://ena-browser-docs.readthedocs.io/en/latest/browser/sample-checklists.html) and the available [ENA sample checklists](https://www.ebi.ac.uk/ena/browser/checklists). 
- -The [Genome Standards Consortium(GSC)](http://www.gensc.org//pages/projects/mixs-gsc-project.html) works with many communities to generate the _“Minimum Information about any (X) Sequence” (MIxS) specifications_. ENA and other INSDC members implement the MIxS standards. Essentially these consist of: -* Community specific checklists, but with each having a core of shared metadata terms. -* A metadata term(field) has specific name and definition. -* Sometimes there is either: - * a controlled list of values - * or required pattern for the value, for example an integer. - -## Working together on Improving Standards -As outlined above, ENA collaborates with [GSC](http://www.gensc.org//pages/projects/mixs-gsc-project.html), [INSDC](https://www.insdc.org/) and other standards bodies to help meet our increasingly diverse user needs and increase interoperability. The sequence technologies continue to evolve at pace and scientists apply them to help investigate basic biology, disease and biodiversity. - -There are some considerations with these standards especially in that the actual implementation varies in different organisations. Generally we try to minimise the differences to increase interoperability. Here are some examples: -* In ENA, we use the **long term name**(called "title" in GSC MIxS) rather than the **short term name**. This is because some of the short names are ambiguous abbreviations, so the longer names provide more clarity. -* In MIxS, many of the checklists are called **combinations**; these consist of **core** terms and **extension** terms. In ENA, a small subset of these terms(e.g. taxonomy) will not be in the sample checklist as they are handled separately. -* In ENA, some terms have broader concepts than the MIxS e.g. 
we use **depth** term more generally rather than just **soil depth** we also use the same term to cover **depth below sea level** -* There are several MIxS terms such as **miscellaneous attribute**, which are not used in the ENA checklists, as they are ambiguous and not interoperable. -We do regularly mutually share suggested changes to definitions, term naming or additional terms. - -## Time Scales of Updates -We try to get the balance of being stable and predictable, whilst still being responsive enough to meet the needs of communities. -* Generally ENA and other INSDC members commit to checklist updates following the major MIxS releases e.g. 4.0, 5.0, 6.0, 7.0. These are typically every 2 to 3 years. - * Updates, even with much automation can take many weeks of full time equivalent work to add and quality control. - * Sometimes terms change names and then change back again between sub-releases. -* If important terms, improved term definitions or even checklists are needed by ENA's user communities, we often promptly add those in. 
diff --git a/submit/samples/sample_checklist/updates/2024-02-29_Incorporating_MIxS_V6.2.md b/submit/samples/sample_checklist/updates/2024-02-29_Incorporating_MIxS_V6.2.md deleted file mode 100644 index 3cb6d82b..00000000 --- a/submit/samples/sample_checklist/updates/2024-02-29_Incorporating_MIxS_V6.2.md +++ /dev/null @@ -1,107 +0,0 @@ -# ENA Checklists Update Incorporating MIxS V6.2 -Checklists Updated: February 2024 - -* [ENA Checklists Update Incorporating MIxS V6.2](#ena-checklists-update-incorporating-mixs-v62) - * [Summary of ENA Checklists after the MIxS v6.2 Update](#summary-of-ena-checklists-after-the-mixs-v62-update) - * [Introduction](#introduction) - * [Four New MIxS Derived Checklists in ENA](#four-new-mixs-derived-checklists-in-ena) - * [Fifteen existing MIxS checklists in ENA have had new checklists terms added](#fifteen-existing-mixs-checklists-in-ena-have-had-new-checklists-terms-added) - * [General changes reflecting INSDC or specifically ENA needs, where Different to MIxS](#general-changes-reflecting-insdc-or-specifically-ena-needs-where-different-to-mixs) - * [Additional Controlled Value Terms being allowed](#additional-controlled-value-terms-being-allowed) - * [ENA uses separate fields rather than a single combined ones in Certain cases](#ena-uses-separate-fields-rather-than-a-single-combined-ones-in-certain-cases) - * [Minor Changes to non-MIxS checklists](#minor-changes-to-non-mixs-checklists) -* [Summary Tables of Terms counts and Terms added Existing Checklist](#summary-tables-of-terms-counts-and-terms-added-existing-checklist) - * [Summary Table of Terms ( all sample based )](#summary-table-of-terms--all-sample-based-) - * [Table of Terms added to which checklist ( all sample based )](#table-of-terms-added-to-which-checklist--all-sample-based-) - - - -## Summary of ENA Checklists after the MIxS v6.2 Update -* Four new MIxS checklists have been added to ENA: GSC MIxS Agriculture, GSC MIxS Food and Production, GSC MIxS Symbiont, and GSC MIxS 
Hydrocarbon. -* Fifteen existing MIxS checklists in ENA, had new checklists terms added. - * Three had many new terms: GSC MIxS built environment(66), GSC MIxS plant-associated(24) and GSC MIxS sediment(14). - * Twelve checklists had between 1 and 8 new terms added. -* 368 new MIxS terms were added to the ENA checklist system. There are now 1031 ENA sample checklist terms. -* 47 aliases(synonyms) of terms were added, e.g. where the MIxS term name had changed, or there was now a MIxS term for the same concept as an existing legacy ENA term. Wherever appropriate we use the MIxS term. - -This and similar metadata updates are important to both: -1. meet the needs of the diverse data submitters to ENA and -2. ensure interoperability for ENA submitted metadata with that of other INSDC members and other portals. Please see the background to sample checklists in ENA for more information. - -This will take effect from 15-March-2024. - ---- -## Introduction -[Please read this background about sample level checklists](../sample_checklist_introduction.md) and GSC MIxS. - -A growing proportion of ENA's sample level checklists are from MIxS, currently the MIxS are 22 of the 52 sample checklists. Most of the other sources of ENA’s checklists are legacy. 
- -## Four New MIxS Derived Checklists in ENA - -| New checklist Name in ENA | Deeper background to the checklist creation | Comment for ENA | -|---------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| **GSC MIxS Agriculture** | [Community-Driven Metadata Standards for Agricultural Microbiome Research](https://apsjournals.apsnet.org/doi/10.1094/PBIOMES-09-19-0051-P) | | -| **GSC MIxS Food and Production** | | Built from five MIxS lists packages as much overlap(food-human foods, food-farm environment, food-food production facility, food-animal and animal feed) N.B. A dozen terms are currently excluded, as they were mainly agriculture and or soil sample related. 
| -| **GSC MIxS Symbiont** | [MIxS-SA: a MIxS extension defining the minimum information standard for sequence data from symbiont-associated micro-organisms](https://www.nature.com/articles/s43705-022-00092-w) | | -| **GSC MIxS Hydrocarbon** | [MIxS-HCR: a MIxS extension defining a minimal information standard for sequence data from environments pertaining to hydrocarbon resources](https://www.nature.com/articles/s43705-022-00092-w) | All added apart from “additional info” | - -## Fifteen existing MIxS checklists in ENA have had new checklists terms added - -* For twelve checklists, between 1 and 8 new terms were added to these GSC MIxS checklists: air, host, human-associated, human-gut, human-oral, human-vaginal, microbial mat biofilm, miscellaneous natural or artificial environment, soil, wastewater sludge, and water -* For the following three checklists there was a more substantial addition: - * 66 terms being added to the **GSC MIxS built environment** - * 24 terms added to the **GSC MIxS plant-associated** - * 14 terms added to the **GSC MIxS sediment** - -## General changes reflecting INSDC or specifically ENA needs, where Different to MIxS -### Additional Controlled Value Terms being allowed -* Missing Value exceptions are now allowed for ***height, elevation*** and additionally ***altitude***. This was requested a height and elevation were mandatory in ***GSC MIxS Soil*** - -### ENA uses separate fields rather than a single combined ones in Certain cases -Why Different? Separate fields is are easier for users to populate from controlled vocabulary lists -* ***geographic location (country and/or sea)*** and ***region*** rather than MIxS' ***Geographic location (country and/or sea, region)*** -* ***geographic location (latitude)*** and ***geographic location (longitude)*** are captured as two separate fields. 
Rather than MIxS' single ***geographic location (latitude and longitude)*** - -## Minor Changes to non-MIxS checklists - -Some checklists at ENA are not from MIxS. Nevertheless, we try to keep terms aligned between these and MIxS. This has the obvious benefit of increasing the findability and interoperability of metadata. Legacy term names will be made synonyms for the updated term names. - -# Summary Tables of Terms counts and Terms added Existing Checklist - -## Summary Table of Terms ( all sample based ) -| Count | What | -|-------|--------------------------------------------------------------------------| -| 1031 | total terms now in ENA | -| 368 | new terms not in ENA were added from MIxS | -| 47 | aliases added | -| 16 | existing definitions updated | -| 3 | MIxS v6.2 terms were not added to ENA, such as "miscellaneous attribute" | - -## Table of Terms added to which checklist ( all sample based ) -Only listing the terms where there were additional terms to existing checklists. - - -| Checklist | New or existing | Comment | 
-|--------------------------------------------------------------------------------------------------|-------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------------------------------------------------------| -| **GSC MIxS Agriculture** | New | N.B. From four or so MIxS packages | -| **GSC MIxS Food and Production** | New |
  • Combined from several MIxS lists as so much overlap
  • about a dozen terms, seemed out of place: agriculture and or soil looked better bets, so excluded those
  • ***geographic location (latitude)*** and ***geographic location (longitude)*** are captured as two separate fields in ENA's version. Rather than MIxS' single ***geographic location (latitude and longitude)***
  • Existing bio_material field used rather than MIxS CL's ***Repository name*** | -| **GSC MIxS Symbiont** | New |
  • ENA has the additional ***sample symbiont of***
  • ***geographic location (latitude)*** and ***geographic location (longitude)*** are captured as two separate fields in ENA's version. Rather than MIxS' single ***geographic location (latitude and longitude)*** | -| **GSC MIxS Hydrocarbon** | New |
  • All added apart from “additional info”
  • ***geographic location (latitude)*** and ***geographic location (longitude)*** are captured as two separate fields in ENA's version. Rather than MIxS' single ***geographic location (latitude and longitude)*** | -| **GSC MIxS air** | existing | new terms added:
  • taxonomic
  • classification | -| **GSC MIxS built environment** | existing | new terms added:
    • outside relative humidity
    • presence of pets, animals, or insects
    • quadrant position
    • relative sampling location
    • room air exchange rate
    • room architectural elements
    • room condition
    • room count
    • room dimensions
    • room door distance
    • room location in building
    • room moisture damage or mold history
    • room net area
    • room occupancy
    • room sampling position
    • room type
    • room volume
    • room window count
    • rooms connected by a doorway
    • rooms that are on the same hallway
    • rooms that share a door with sampling room
    • rooms that share a wall with sampling room
    • sampling day weather
    • sampling floor
    • sampling room ID or name
    • sampling time outside
    • season
    • seasonal use
    • shading device condition
    • shading device location
    • shading device material
    • shading device signs of water/mold
    • shading device type
    • specific humidity
    • specifications
    • surface-air contaminant
    • taxonomic classification
    • temperature
    • temperature outside house
    • train line
    • train station collection location
    • train stop collection location
    • visual media
    • wall area
    • wall construction type
    • wall finish material
    • wall height
    • wall location
    • wall signs of water/mold
    • wall surface treatment
    • wall texture
    • wall thermal mass
    • water feature size
    • water feature type
    • weekday
    • window area/size
    • window condition
    • window covering
    • window horizontal position
    • window location
    • window material
    • window open frequency
    • window signs of water/mold
    • window status
    • window type
    • window vertical position
      • | -| **GSC MIxS host** | existing | new terms added:
        • ancestral data
        • biological status
        • genetic modification
        • observed host symbionts
        • sample capture status
        • sample collection device or method
        • sample disease stage
        • taxonomic classification
        | -| **GSC MIxS human-associated** | existing | new terms added:
        • nose throat disorder
        • observed host symbionts
        • taxonomic classification
        | -| **GSC MIxS human-gut** | existing | new terms added:
        • host scientific name
        • observed host symbionts
        • taxonomic classification
        | -| **GSC MIxS human-oral** | existing | new terms added:
        • host scientific name
        • observed host symbionts
        • taxonomic classification
        | -| **GSC MIxS human-skin** | existing | new terms added:
        • host scientific name
        • observed host symbionts
        • taxonomic classification
        | -| **GSC MIxS human-vaginal** | existing | new terms added:
        • host scientific name
        | -| **GSC MIxS microbial mat biofilm** | existing | new terms added:
        • taxonomic classification
        • total nitrogen content
        | -| **GSC MIxS miscellaneous natural or artificial environment** | existing | new terms added:
        • taxonomic classification
        | -| **GSC MIxS plant-associated** | existing | new terms added:
        • ancestral data
        • biological status
        • biotic regimen
        • culture rooting medium
        • genetic modification
        • growth facility
        • growth habit
        • host scientific name
        • light regimen
        • observed host symbionts
        • plant growth medium
        • plant sex
        • plant structure
        • rooting conditions
        • rooting medium carbon
        • rooting medium macronutrients
        • rooting medium micronutrients
        • rooting medium organic supplements
        • rooting medium pH
        • rooting medium regulators
        • rooting medium solidifier
        • sample capture status
        • sample disease stage
        • taxonomic classification
        | -| **GSC MIxS sediment** | existing | new terms added:
        • alkalinity
        • mean friction velocity
        • mean peak friction velocity
        • pH
        • particle classification
        • porosity
        • pressure
        • sediment type
        • taxonomic classification
        • temperature
        • tidal stage
        • total depth of water column
        • total nitrogen content
        • turbidity
        | -| **GSC MIxS soil** | existing | new terms added:
        • host specificity or range
        • mean seasonal precipitation
        • mean seasonal temperature
        • organic nitrogen
        • taxonomic classification
        | -| **GSC MIxS wastewater sludge** | existing | new terms added:
        • taxonomic classification
        • total nitrogen concentration
        | -| **GSC MIxS water** | existing | new terms added:
        • alkalinity method
        • size-fraction lower threshold
        • size-fraction upper threshold
        • taxonomic classification
        • total nitrogen concentration
        | - ---- -
From 5adde2d79ff3ad62da2c52636de948e581adcb5a Mon Sep 17 00:00:00 2001 From: woollard Date: Mon, 2 Dec 2024 10:35:57 +0000 Subject: [PATCH 12/15] docs: added path to sample_checklist.rst to samples.rst --- submit/samples.rst | 1 + 1 file changed, 1 insertion(+) diff --git a/submit/samples.rst b/submit/samples.rst index 750afae2..cf73bd17 100644 --- a/submit/samples.rst +++ b/submit/samples.rst @@ -88,3 +88,4 @@ Find specific advice on registering studies using your preferred method below: samples/interactive samples/programmatic + samples/sample_checklist
From 26652d05fd0af96f642b2bf62fad4352c36aa405 Mon Sep 17 00:00:00 2001 From: woollard Date: Mon, 2 Dec 2024 10:37:57 +0000 Subject: [PATCH 13/15] docs: added path to updates/2024-02-29_Incorporating_MIxS_V6.2.rst to updates.rst --- submit/samples/sample_checklist/updates.rst | 10 ++++++++++ 1 file changed, 10 insertions(+) create mode 100644 submit/samples/sample_checklist/updates.rst diff --git a/submit/samples/sample_checklist/updates.rst b/submit/samples/sample_checklist/updates.rst new file mode 100644 index 00000000..01d04eb5 --- /dev/null +++ b/submit/samples/sample_checklist/updates.rst @@ -0,0 +1,10 @@ +=========================================== +Sample Checklist Log of Updates and Changes +=========================================== + +There have been many changes over the decades. Going forward, we will track them more closely. + +.. toctree:: + :maxdepth: 1 + + updates/2024-02-29_Incorporating_MIxS_V6.2.rst
From 0a380bbbc959d4ae4561368c354bec398da174a9 Mon Sep 17 00:00:00 2001 From: woollard Date: Mon, 2 Dec 2024 10:58:49 +0000 Subject: [PATCH 14/15] initial: samples/sample_checklist.rst --- submit/samples/sample_checklist.rst | 11 +++++++++++ 1 file changed, 11 insertions(+) create mode 100644 submit/samples/sample_checklist.rst diff --git a/submit/samples/sample_checklist.rst b/submit/samples/sample_checklist.rst new file mode 100644 index 00000000..7d6728dc --- /dev/null +++ b/submit/samples/sample_checklist.rst @@ -0,0 +1,11 @@ +======================== +Sample Checklist Related +======================== + +Various sample-related information. + +.. toctree:: + :maxdepth: 1 + + sample_checklist/sample_checklist_introduction.rst + sample_checklist/updates.rst
From b1e1888623280a464f6a80e8c13d84a25e10304f Mon Sep 17 00:00:00 2001 From: woollard Date: Fri, 28 Feb 2025 14:06:26 +0000 Subject: [PATCH 15/15] doc: initial rough notes for sample_checklist_infrastructure.rst --- .../sample_checklist_infrastructure.rst | 36 +++++++++++++++++++ 1 file changed, 36 insertions(+) create mode 100644 submit/samples/sample_checklist/sample_checklist_infrastructure.rst diff --git a/submit/samples/sample_checklist/sample_checklist_infrastructure.rst b/submit/samples/sample_checklist/sample_checklist_infrastructure.rst new file mode 100644 index 00000000..6385f972 --- /dev/null +++ b/submit/samples/sample_checklist/sample_checklist_infrastructure.rst @@ -0,0 +1,36 @@ +=============================== +Sample Checklist Infrastructure +=============================== + + +DRAFT! + +------------ +Introduction +------------ + + +In late 2024/early 2025, ENA is implementing a modernisation of the underlying ENA checklist systems architecture. + +--------------------------------------------------- +Why we moved to using Versioned Sample Checklists?
+--------------------------------------------------- +It will allow ENA to update checklists more rapidly (e.g. for new GSC MIxS releases) and to use ontologies for terms. + +In practice, every checklist now has a version, and you will need to pull down the latest one after submission. A change that needs a new version could be as simple as a change to the required pattern. + +---------------------------------- +High Level Infrastructural Changes +---------------------------------- +The changes make the system easier to maintain. + +-------------------------------------------------------- +Technical Endpoints to Computationally Access Checklists +-------------------------------------------------------- +Endpoints you can use (TO BE UPDATED TO PROD instances): + +- Paginate over all versioned schemas. +- Schema and metadata (the JSON Schema is embedded here). +- Just a summary of the schema. +- Paginate over the latest schemas (same as the first endpoint above, with the ``latest=true`` query param). +- Get the JSON Schema of the latest version of a given checklist (e.g. ERC000022).
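The paginated endpoints listed above could be consumed roughly as in the sketch below. It is only an illustration under stated assumptions: the notes do not give the endpoint URLs, so ``BASE_URL``, the ``page`` and ``size`` parameter names, and the list-shaped response are all hypothetical. Only the ``latest=true`` query parameter and the ERC000022 accession come from the notes. The HTTP call is injected as ``fetch_page`` so the paging logic can be exercised without a network connection.

```python
from typing import Callable, Dict, Iterator, List
from urllib.parse import parse_qs, urlencode, urlparse

# Hypothetical placeholder: the real endpoint URLs are not given in the notes.
BASE_URL = "https://example.org/checklist-api/schemas"


def build_url(page: int, size: int = 20, latest: bool = False) -> str:
    """Build a listing URL. `page` and `size` are assumed parameter names."""
    params: Dict[str, object] = {"page": page, "size": size}
    if latest:
        params["latest"] = "true"  # the one query parameter named in the notes
    return f"{BASE_URL}?{urlencode(params)}"


def iter_schemas(
    fetch_page: Callable[[str], List[dict]],
    latest: bool = False,
    size: int = 20,
) -> Iterator[dict]:
    """Yield schema records page by page, stopping at the first short page."""
    page = 0
    while True:
        records = fetch_page(build_url(page, size=size, latest=latest))
        yield from records
        if len(records) < size:
            break
        page += 1


# Stand-in for a real HTTP client: serves five made-up records from memory.
_FAKE_RECORDS = [{"id": f"ERC{n:06d}"} for n in range(20, 25)]


def fake_fetch(url: str) -> List[dict]:
    query = parse_qs(urlparse(url).query)
    page, size = int(query["page"][0]), int(query["size"][0])
    return _FAKE_RECORDS[page * size:(page + 1) * size]


if __name__ == "__main__":
    for record in iter_schemas(fake_fetch, latest=True, size=2):
        print(record["id"])
```

Swapping ``fake_fetch`` for a function that performs the real request and returns the decoded records would be the only change needed once the production URLs and response format are known.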