Mathias Dillen

28 posts

@MathiasDillen

Joined August 2019
31 Following, 41 Followers
Mathias Dillen @MathiasDillen
@rdmpage @GBIF @JSTORPlants GBIF is actively updated, JSTOR was often dump-populated 10+ years ago. So it's not that surprising. Given how difficult it is to access JSTOR content, it seems more productive to focus on GBIF data and match with taxon name databases for type links.
Roderic Page @rdmpage
Working on linking @JSTORPlants type specimens to corresponding occurrences in @GBIF, this is not nearly as easy as it should be. Matching identifiers is hard when people insist on mangling them, remixing them, or just deleting them. #PIDs
Mathias Dillen @MathiasDillen
@rdmpage @GBIF @JSTORPlants It'll be mainly differences in file compression and dimensions. Rescanning or other changes post publication should be rare. I daresay that the hash method would work for the majority of specimen images. And if not, it would be intriguing to find out exactly why.
Roderic Page @rdmpage
@MathiasDillen @GBIF @JSTORPlants So we’d need to test for image identity (same file), derived image (same image different size), and same thing but different image (although this case seems rare).
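The three cases named here (same file, derived image, different image) can be told apart with two kinds of hash. The following is only a rough sketch: it assumes a cryptographic hash of the file bytes for identity and a simple "average hash" over grayscale pixels for derived images, which is a swapped-in technique and not something the thread specifies. The names `average_hash`, `classify`, and the `max_distance` threshold are hypothetical, and decoding real image files into pixel grids is left out because it would need an imaging library.

```python
import hashlib


def average_hash(pixels, size=8):
    """Return a list of size*size bits from a 2D grid of grayscale ints.

    The grid is reduced to size x size cells by block averaging; each bit
    records whether a cell is brighter than the mean of all cells.
    Assumes the grid is at least size x size.
    """
    height, width = len(pixels), len(pixels[0])
    cells = []
    for i in range(size):
        r0, r1 = i * height // size, (i + 1) * height // size
        for j in range(size):
            c0, c1 = j * width // size, (j + 1) * width // size
            block = [pixels[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    return [1 if value > mean else 0 for value in cells]


def hamming(bits_a, bits_b):
    """Count the positions where two equal-length bit lists differ."""
    return sum(a != b for a, b in zip(bits_a, bits_b))


def classify(bytes_a, pixels_a, bytes_b, pixels_b, max_distance=5):
    """Label a pair as identical file, derived image, or different image.

    max_distance is an arbitrary illustrative threshold, not a tuned value.
    """
    if hashlib.sha256(bytes_a).digest() == hashlib.sha256(bytes_b).digest():
        return "identical file"
    if hamming(average_hash(pixels_a), average_hash(pixels_b)) <= max_distance:
        return "derived image"
    return "different image"
```

A perceptual hash like this cannot separate a derived image from a fresh scan of the same specimen that happens to look alike, so the rare third case would still need manual checking.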
Mathias Dillen @MathiasDillen
@rdmpage @GBIF @JSTORPlants Reverse image search as a service is not that easy. Easiest could be if GBIF implemented a hash archive for the media references people publish? Wouldn't work for derivative images, but would make it easier to go through the broken id cleanup routine.
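To make the hash archive idea concrete, here is a minimal sketch, assuming SHA-256 over the raw media bytes and an in-memory index. The `HashArchive` class, its methods, and the example URLs are all hypothetical; nothing here reflects an actual GBIF service.

```python
import hashlib
from collections import defaultdict


class HashArchive:
    """Map the SHA-256 digest of a media file to every identifier seen for it."""

    def __init__(self):
        self._by_digest = defaultdict(set)

    def register(self, media_id, data):
        """Record that media_id pointed at these bytes; return the hex digest."""
        digest = hashlib.sha256(data).hexdigest()
        self._by_digest[digest].add(media_id)
        return digest

    def lookup(self, data):
        """Return all identifiers previously registered for identical bytes."""
        digest = hashlib.sha256(data).hexdigest()
        return sorted(self._by_digest.get(digest, set()))


# Hypothetical usage: the same bytes published under an old and a new URL.
archive = HashArchive()
image_bytes = b"stand-in for the bytes of a specimen image"
archive.register("https://example.org/old/specimen-1.jpg", image_bytes)
archive.register("https://example.org/new/specimen-1.jpg", image_bytes)
matches = archive.lookup(image_bytes)
```

As the tweet notes, a resized or recompressed copy has different bytes and therefore a different digest, so `lookup` would return an empty list for it.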
Mathias Dillen @MathiasDillen
@andrawaag @GBIF @rdmpage Calling it occurrence ID is a bit misleading, as these records also have a dwc:occurrenceID that is completely different. These IDs should be more persistent, but they can also change.
Alexis Garretson @GarretsonAlexis
Are there plans through @GBIF to track literature usage down to the specimen level? The dataset, publisher, and download citation counters are so awesome, it would be awesome to have that built-in at the specimen level (I know @BionomiaTrack is doing great stuff here too!)
Mathias Dillen retweeted
Deborah Paul @idbdeb
BISS_Journal @BISS_Journal

In the 1st symposium at #TDWG2021:🔹Connecting #biodiversity data with knowledge graphs🔹, led by @rdmpage & @franck_michel2, we'll use @Wikidata, @dbpedia, as well as domain-specific #Ozymandias & #OpenBiodiv as case studies. 🔗Abstracts: biss.pensoft.net/collection/307/ #Bioinformatics

Mathias Dillen @MathiasDillen
@dpsSpiders Data files should be watermarked somehow to indicate if they were ever edited with spreadsheet software. One thing that may help is to always call the first column in a csv file ID. Excel will neatly refuse to open it. journeybytes.com/fix-csv-sylk-e…
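The trick relies on Excel treating a text file whose first two characters are uppercase `ID` as a SYLK file. A small sketch of how such a file could be written and checked, assuming Python's standard `csv` module; the helper names `write_csv` and `starts_with_sylk_marker` and the sample rows are hypothetical, and how Excel reacts (hard refusal or a warning dialog) may differ between versions.

```python
import csv
import io


def write_csv(header, rows):
    """Serialize a header and rows to CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


def starts_with_sylk_marker(text):
    """True if the text begins with uppercase 'ID', the trigger for the SYLK check."""
    return text.startswith("ID")


# First column named ID: the file starts with the marker.
guarded = write_csv(["ID", "scientificName"], [["1", "Quercus robur"]])
# Lowercase id: same data, but no marker.
plain = write_csv(["id", "scientificName"], [["1", "Quercus robur"]])
```

The check is case-sensitive, so the header has to be exactly `ID` and it has to be the very first thing in the file, with no byte order mark or quoting in front of it.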