Formatting records for the NBN Atlas: An introduction to Darwin Core

  1. Formatting records for the NBN Atlas: An introduction to Darwin Core
     SOPHIA RATCLIFFE, NBN Trust Technical & Data Partner Support Officer
     REUBEN ROBERTS, NBN Systems Developer
     NBN Conference 2018, Knowledge Transfer Session
     Sharing UK wildlife data

  2. Session Aims
     - What is Darwin Core (DwC)?
     - How is DwC used in the NBN Atlas?
     - Can we (NBN) use DwC better?
     - What can we contribute to DwC?
     - (Improvements to NBN Atlas pages)

  3. What is DwC?
     - Darwin Core is the data standard for publishing and integrating biodiversity information
     - A library of terms aimed at providing common naming conventions and data structure
     - Primarily based on taxa and their occurrence
     Adapted from: http://rs.tdwg.org/dwc/ and Wieczorek et al. (2012) Darwin Core: An Evolving Community-Developed Biodiversity Data Standard. PLoS ONE 7(1): e29715

  4. DwC timeline
     - 1985: Taxonomic Databases Working Group
     - 1998: 1st DwC protocol
     - 2009: DwC ratified
     - 2018
     Number of DwC terms along the timeline: 20, 169, 255

  5. Who (else) uses DwC?

  6. DwC reference guide: http://rs.tdwg.org/dwc/terms/

  7. DwC classes and terms
     - Record-level terms: institution, collection, nature of record, licence, rightsholder
     - Occurrence: occurrence ID, recorder, individual count, quantity (and quantity type), sex, life stage, behaviour, status (presence/absence)
     - Organism: organism scope (colony, nest, clump), organism remarks
     - Event: date, sampling protocol and methods, field notes
     - Location: latitude and longitude coordinates, geodetic datum, location name and remarks
     - Identification: verification status, identifier
     - Taxon: taxon ID (UKSI taxon version key), scientific name, vernacular name

  8. DwC term example: http://rs.tdwg.org/dwc/terms/

  9. DwC example
     What does it mean in terms of the data? Example: HBRG Insects Dataset

  10. DwC Extensions
      - Simple multimedia
      - Literature references
      - Minimum Information about any (x) Sequence (MIxS)

  11. Who manages DwC?
      Darwin Core Maintenance Group: https://www.tdwg.org/standards/dwc/maintenance/
      - Issues submitted to a GitHub site: https://github.com/tdwg/dwc/issues
      - 30-day public review
      - Review by TDWG's Technical Architecture Group

  12. DwC Archives
      Sharing data as a Darwin Core Archive, a ZIP file containing:
      - Core (TXT)
      - Extensions (TXT), which extend the core
      - Meta.XML, which describes the data files
      - EML.XML
      Tool: http://tools.gbif.org/dwca-assistant/
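The archive layout above can be sketched in a few lines of Python. This is a minimal illustration, not the NBN Atlas or GBIF tooling: the function names, the file name `occurrence.txt` and the choice of columns are assumptions, and a real archive would normally also carry an `EML.XML` metadata file and use the correct namespace for non-DwC terms such as the Dublin Core `license`.

```python
import io
import zipfile

DWC_NS = "http://rs.tdwg.org/dwc/terms/"  # namespace of Darwin Core terms


def build_meta_xml(columns, core_file="occurrence.txt"):
    """Describe a tab-delimited core file whose first column is the record id.

    Assumes every column name is a term in the Darwin Core namespace.
    """
    fields = "\n".join(
        f'    <field index="{i}" term="{DWC_NS}{name}"/>'
        for i, name in enumerate(columns)
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<archive xmlns="http://rs.tdwg.org/dwc/text/">\n'
        '  <core encoding="UTF-8" fieldsTerminatedBy="\\t" linesTerminatedBy="\\n"\n'
        f'        ignoreHeaderLines="1" rowType="{DWC_NS}Occurrence">\n'
        f"    <files><location>{core_file}</location></files>\n"
        '    <id index="0"/>\n'
        f"{fields}\n"
        "  </core>\n"
        "</archive>\n"
    )


def build_archive(columns, rows):
    """Return the bytes of a ZIP holding the core TXT file and meta.xml."""
    lines = ["\t".join(columns)]
    for row in rows:
        lines.append("\t".join(str(row.get(name, "")) for name in columns))
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("occurrence.txt", "\n".join(lines) + "\n")
        archive.writestr("meta.xml", build_meta_xml(columns))
    return buffer.getvalue()
```

The sketch does no escaping, so values containing tabs or line breaks would need cleaning first. The GBIF Darwin Core Archive Validator listed under Help is the better check for a real archive.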

  13. DwC on the NBN Atlas
      - Taxon information (species dictionary): updated every 6 to 12 months
      - Occurrence records: monthly processing run (1st weekend of each month)

  14. Species dictionary
      UK Species Inventory Access DB (NHM, London), mapped to the Darwin Core Taxon class:
      - Taxon identifier
      - Scientific names
      - Vernacular names
      - Rank
      - Status (accepted/synonym)
      - Establishment means (native/non-native)
      - Establishment status
      - Realm (terrestrial, marine, freshwater)

  15. Occurrence records
      Accepted formats:
      - DwCA (iRecord, RBGE)
      - NBN Atlas formatted spreadsheets
      - NBN Exchange format (Recorder 6, Marine Recorder)
      - Unformatted spreadsheets

  16. Occurrence records terms
      - Core
      - Desirable
      - Non-DwC
      - Other

  17. Core terms
      - occurrenceID
      - basisOfRecord
      - license
      - rightsholder
      - institutionCode
      - occurrenceStatus (present / absent)
      - identificationVerificationStatus

  18. basisOfRecord

  19. identificationVerificationStatus
      - Accepted
      - Accepted - considered correct
      - Accepted - correct
      - Unconfirmed
      - Unconfirmed - plausible
      - Unconfirmed - not reviewed
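A short sketch of how supplied values could be checked against the six statuses listed on this slide. The function name and the tolerance for case, extra spaces and en dashes are assumptions made for illustration, not documented NBN Atlas behaviour.

```python
# The six values listed on the slide, in their canonical spelling.
VERIFICATION_STATUSES = (
    "Accepted",
    "Accepted - considered correct",
    "Accepted - correct",
    "Unconfirmed",
    "Unconfirmed - plausible",
    "Unconfirmed - not reviewed",
)

_LOOKUP = {status.casefold(): status for status in VERIFICATION_STATUSES}


def normalise_verification_status(value):
    """Return the canonical status, or None if the value is not in the list."""
    # Treat en dashes as hyphens and collapse runs of whitespace before matching.
    key = " ".join(value.replace("\u2013", "-").split()).casefold()
    return _LOOKUP.get(key)
```

Values that come back as `None` would need correcting by the data provider rather than guessing at the intended status.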

  20. Core terms cont.
      - taxonID or scientificName or vernacularName
      - eventDate
      - gridReference / decimalLatitude & decimalLongitude
      - geodeticDatum
      - coordinateUncertaintyInMeters
      - locality
      - recordedBy
      - identifiedBy

  21. eventDate
      - eventDate (YYYY-MM-DD) (ISO 8601)
        - 1998-03-28
        - 1998-03-28/05-31
        - 1998-03
        - 1998-03/05
        - 1998
        - 1998/2002
      - day, month, year (single fields)
        - preferred method for single day events and partial dates (?)
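To make the six eventDate forms above concrete, the sketch below turns each one into an earliest and latest possible day. It is an illustration under stated assumptions: the function name is hypothetical, only the forms shown on the slide are handled, and an abbreviated end part (such as `05-31`) is assumed to inherit its missing leading components from the start.

```python
import calendar
from datetime import date


def _parts(text):
    """Split 'YYYY', 'YYYY-MM' or 'YYYY-MM-DD' (or a shortened end) into ints."""
    return [int(piece) for piece in text.split("-")]


def _first_day(parts):
    year = parts[0]
    month = parts[1] if len(parts) > 1 else 1
    day = parts[2] if len(parts) > 2 else 1
    return date(year, month, day)


def _last_day(parts):
    year = parts[0]
    month = parts[1] if len(parts) > 1 else 12
    day = parts[2] if len(parts) > 2 else calendar.monthrange(year, month)[1]
    return date(year, month, day)


def event_date_bounds(value):
    """Return (earliest, latest) dates covered by an eventDate string."""
    start_text, _, end_text = value.strip().partition("/")
    start = _parts(start_text)
    if end_text:
        end = _parts(end_text)
        missing = len(start) - len(end)
        if missing > 0:
            # Shortened end, e.g. '05-31': borrow the year (and month) from the start.
            end = start[:missing] + end
    else:
        end = start
    return _first_day(start), _last_day(end)
```

For example, `event_date_bounds("1998-03/05")` covers 1 March 1998 to 31 May 1998. Free-text dates such as "spring 1998" are out of scope and would raise a `ValueError`.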

  22. eventDate cont.
      - verbatimEventDate
        - “spring 1998”
      - datePrecision (non-DwC)
      - endDate (non-DwC)
        - endDate day, month, year

  23. Core terms cont.
      - taxonID or scientificName or vernacularName
      - eventDate
      - gridReference / decimalLatitude & decimalLongitude
      - geodeticDatum (default WGS84)
      - coordinateUncertaintyInMeters
      - locality
      - recordedBy
      - identifiedBy
      - datasetName
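The core terms listed across these slides could be checked per record along the following lines. This is a sketch only: treating every core term as mandatory, the function name, and applying the WGS84 default only when latitude and longitude are supplied are all assumptions, not documented NBN Atlas rules.

```python
# Core terms from the slides that are checked one by one.
CORE_TERMS = (
    "occurrenceID", "basisOfRecord", "license", "rightsholder",
    "institutionCode", "occurrenceStatus", "identificationVerificationStatus",
    "eventDate", "coordinateUncertaintyInMeters", "locality",
    "recordedBy", "identifiedBy", "datasetName",
)
# At least one of these must identify the taxon.
TAXON_TERMS = ("taxonID", "scientificName", "vernacularName")


def _has(record, term):
    """True if the record holds a non-empty value for the term."""
    value = record.get(term)
    return value is not None and str(value).strip() != ""


def check_core_terms(record):
    """Return (record with defaults applied, list of missing requirements)."""
    missing = [term for term in CORE_TERMS if not _has(record, term)]
    if not any(_has(record, term) for term in TAXON_TERMS):
        missing.append("taxonID | scientificName | vernacularName")
    has_grid = _has(record, "gridReference")
    has_lat_lon = _has(record, "decimalLatitude") and _has(record, "decimalLongitude")
    if not (has_grid or has_lat_lon):
        missing.append("gridReference | decimalLatitude + decimalLongitude")
    completed = dict(record)
    if has_lat_lon and not _has(record, "geodeticDatum"):
        completed["geodeticDatum"] = "WGS84"  # default stated on the slide
    return completed, missing
```

Returning the list of gaps, rather than raising on the first one, lets a provider see everything that needs fixing in a single pass.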

  24. Non-DwC terms
      - verifier
      - organismStatus (alive/dead)

  25. Desirable terms
      - individualCount
      - organismQuantity
      - organismQuantityType
      - organismScope
      - sex
      - lifeStage

  26. individualCount
      - 3% of records have an individual count (~5m)
      - 29,000 different values
      Examples: “1 Adult”, “Frequent”, “1 Male”, “#NAME?”, “0.25”, “2 Adult Male; 1 Juvenile Female”, “Many”
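The example values above show why individualCount is hard to use as a number. One possible triage, sketched below, separates a clean whole-number count from anything else. The function name and the rule of leaving multi-part values untouched are assumptions for illustration.

```python
import re

_WHOLE_NUMBER = re.compile(r"\d+", re.ASCII)
_LEADING_NUMBER = re.compile(r"(\d+)\s+(\S.*)", re.ASCII)


def split_individual_count(raw):
    """Return (count, remainder).

    count is an int when the value is, or starts with, a whole number;
    otherwise it is None and the original text is returned unchanged.
    """
    text = raw.strip()
    if _WHOLE_NUMBER.fullmatch(text):
        return int(text), ""
    match = _LEADING_NUMBER.fullmatch(text)
    # Values with several counts ("2 Adult Male; 1 Juvenile Female") are left alone.
    if match and ";" not in text:
        return int(match.group(1)), match.group(2)
    return None, text
```

With this rule, "1 Adult" yields a count of 1 and the remainder "Adult", which arguably belongs in lifeStage, while "Frequent", "0.25" and "#NAME?" are returned unparsed for manual review.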

  27. organismQuantity

  28. organismQuantity
      - 540,000 records with organismQuantity
      - 2,000 different values
      Examples:
      - “Many”, “Several”, “sev.”, “Present”
      - “Occasional” or “O” (organismQuantityType: DAFOR)
      - “50” (organismQuantityType: % cover)
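Where organismQuantityType is DAFOR, the quantity may arrive as a word or a single letter, as in the “Occasional” or “O” example. The sketch below expands either form to the full word of the DAFOR scale (Dominant, Abundant, Frequent, Occasional, Rare). The function name is hypothetical, and storing the full word rather than the letter is an assumption.

```python
# The DAFOR abundance scale: letter code to full word.
DAFOR = {
    "D": "Dominant",
    "A": "Abundant",
    "F": "Frequent",
    "O": "Occasional",
    "R": "Rare",
}


def expand_dafor(value):
    """Return the full DAFOR word for a letter or word, or None if not DAFOR."""
    text = value.strip()
    if text.upper() in DAFOR:
        return DAFOR[text.upper()]
    if text.capitalize() in DAFOR.values():
        return text.capitalize()
    return None
```

Values such as “Many” or “sev.” return `None`, since they do not belong to the DAFOR scale and need a different organismQuantityType.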

  29. Desirable terms
      - individualCount
      - organismQuantity
      - organismQuantityType
      - organismScope
      - sex
      - lifeStage

  30. organismScope

  31. organismScope
      5,227 records with organismScope
      [Chart of values: Breeding pair, droppings, Female, heard, Male, nest, pair, Pair, shell, territories; labelled shares: 9.1%, 9.1%, 13%, 32.4%]
      Other examples: sett, spraint, tracks, nest, burrow, eggs

  32. lifeStage
      473,631 records with lifeStage
      [Chart of values: ad, adult, calves, gall, larva, larvae, male, not recorded, pre, preadult; labelled shares: 6%, 20%, 65.4%]
      Other examples: immature, nymph, young, dead, chick

  33. Comment (remarks) fields
      - occurrenceRemarks
      - organismRemarks
      - eventRemarks
      - locationRemarks
      - identificationRemarks

  34. Other terms
      Event:
      - eventID
      - samplingProtocol
      - sampleSizeValue
      - sampleSizeUnit
      - samplingEffort

  35. Other terms cont.
      Record-level:
      - bibliographicCitation
      - references
      - informationWithheld
      - dataGeneralizations
      - dynamicProperties

  36. dynamicProperties
      A list of additional measurements, facts, characteristics, or assertions about the record. Meant to provide a mechanism for structured content.

  37. dynamicProperties
      A list of additional measurements, facts, characteristics, or assertions about the record. Meant to provide a mechanism for structured content.
      Example from the National Dormouse Database (NDD):
      - NDMPsite: Yes
      - RecordType: Live specimen
      - RecordTypeReliability: Good
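The National Dormouse Database example can be packed into a single dynamicProperties value as key:value pairs. The sketch below uses JSON, which the Darwin Core term definition suggests as a suitable interchange format. The helper names are hypothetical, and whether the NBN Atlas stores the value exactly this way is an assumption.

```python
import json


def to_dynamic_properties(properties):
    """Serialise a dict of extra facts into one dynamicProperties string."""
    return json.dumps(properties, ensure_ascii=False, sort_keys=True)


def from_dynamic_properties(text):
    """Parse a dynamicProperties string back into a dict (empty if blank)."""
    return json.loads(text) if text.strip() else {}


# Values taken from the NDD example on the slide.
ndd_example = {
    "NDMPsite": "Yes",
    "RecordType": "Live specimen",
    "RecordTypeReliability": "Good",
}
encoded = to_dynamic_properties(ndd_example)
```

Sorting the keys keeps the output stable between runs, which makes changes to a record easier to spot when datasets are re-supplied.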

  38. Data processing

  39. Data processing
      Three steps feed the search index (raw and processed values):
      - 1. Processing
      - 2. Sampling
      - 3. Indexing

  40. 1. Processing
      - Name matching routine
      - OSGR <> latitude/longitude coordinates
      - Dates
      - Species list membership
      - Sensitive species

  41. Sensitive species

  42. 1. Processing cont.
      - Data quality checks
        - recordHasIssues
        - recordIssues

  43. Data quality checks

  44. 2. Sampling
      - Boundaries
      - Habitats
      - Environmental layers

  45. 2. Sampling

  46. 3. Indexing
      - SOLR
      - Occurrence record fields: https://records-ws.nbnatlas.org/index/fields
      - Only possible to search / filter / facet indexed fields
      - Can add fields to the index (e.g. lifeStage)
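Because only indexed fields can be searched, filtered or faceted, it can help to check the field list before building a query. The sketch below reads the URL given on the slide. The response shape is an assumption (a JSON array of objects with a `name` key and an optional `indexed` flag), as are the function names and the sample field names in the usage note, so treat it as a starting point rather than a description of the service.

```python
import json
from urllib.request import urlopen

INDEX_FIELDS_URL = "https://records-ws.nbnatlas.org/index/fields"  # from the slide


def fetch_index_fields(url=INDEX_FIELDS_URL, timeout=30):
    """Download the field list (assumed to be a JSON array of objects)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


def indexed_field_names(fields):
    """Return the names of fields that are indexed.

    Assumes each entry has a 'name' key; entries without an 'indexed'
    flag are treated as indexed.
    """
    return {entry["name"] for entry in fields if entry.get("indexed", True)}


def unsearchable(wanted, fields):
    """Return the wanted field names that are missing from the index."""
    return sorted(set(wanted) - indexed_field_names(fields))
```

A call such as `unsearchable(["life_stage"], fetch_index_fields())` would then show whether a field still needs adding to the index; the field name here is illustrative only.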

  47. Worked examples
      - Recorder 6 dataset (Highlands Biological Records Centre)
      - CEDaR Northern Ireland Seal Survey

  48. Help
      - NBN Atlas documentation site: https://docs.nbnatlas.org/share-species-occurrence-records-with-the-nbn-atlas/
      - Darwin Core quick reference guide: https://dwc.tdwg.org/terms/
      - Darwin Core Archive Assistant (GBIF): http://tools.gbif.org/dwca-assistant/
      - Darwin Core Archive Validator (GBIF): https://tools.gbif.org/dwca-validator/

  49. Can we use DwC better?
      Controlled vocabularies:
      - lifeStage
      - sex
      - organismScope

  50. What can we contribute back?
      - organismStatus
      - verifier

  51. Improvements
      Improvements to the presentation of records in the NBN Atlas:
      - 1. Occurrence records page
      - 2. Data resource metadata page
      - 3. Advanced records search

  52. Improvements: https://github.com/nbnuk/nbnatlas-issues
