Best practices for stable URIs
Revision as of 09:38, 28 May 2013
Introduction
1. It is important to keep the mission-critical URIs (or URLs, or IRIs, or web addresses) stable. Make a deliberate choice about which pages and which classes of objects you want to manage as stable. Do not aim to keep all your URIs/URLs stable forever: this often becomes unmanageable.
2. The primary purpose of this discussion is to support others in finding good URI patterns. The secondary purpose is to assess whether some institutions could voluntarily share the same pattern, to ease recognition and to set a recognizable example for others to follow.
3. Linked Open Data and the Semantic Web in particular use http-URIs both to identify resources and to retrieve information about them. The Semantic Web works with any kind of http-URI, including those that do not follow these best practices. However, it works best if URIs are kept stable. This can be difficult for some URI patterns; the present discussion suggests how to make it reasonably likely that you can keep your URIs stable.
4. While the present discussion may be useful when looking for stable URI patterns for purposes other than Linked Open Data and the Semantic Web, it largely focuses on these, and some aspects are specific to the Semantic Web.
5. In the face of changing technology, at some point you will have to use the webserver's rewrite module to keep URIs stable. The simpler the URI pattern, the easier this becomes. Thus the first recommendation is: use rewriting from the start. Define simple URI patterns (no ports, no extensions like .php or .aspx, no parameters with ? or &) that are rewritten to your current technology.
6. If several URIs exist within a particular dereferencing service (e.g. two http-URIs), declare one as the "preferred" (canonical) URI. E.g. if both
- http://zoobank.org/7D39CAAA-4B4B-4588-A372-D4097162B1CD
- http://zoobank.org/NomenclaturalActs/7D39CAAA-4B4B-4588-A372-D4097162B1CD
refer to the same object, one should redirect to the other (e.g. with HTTP status 301).
7. Highly recommended references: (1) Sauermann & Cyganiak 2008, Cool URIs for the Semantic Web; (2) Hyam, R.D., Drinkwater, R.E. & Harris, D.J. 2012, Stable citations for herbarium specimens on the internet: an illustration from a taxonomic revision of Duboscia (Malvaceae), Phytotaxa 73: 17–30; (3) Kevin Richards, Richard White, Nicola Nicolson, Richard Pyle 2011, A Beginner’s Guide to Persistent Identifiers (good general discussion, but most solutions discussed would not function in a Linked Data world).
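Points 5 and 6 can be sketched in a few lines of Python. This is only an illustration of the logic, not a webserver configuration: the names `is_simple` and `resolve`, the regular expressions, and the choice of the longer ZooBank form as the canonical one are assumptions made for this example. The internal target URL is the RBGE one quoted in the list of preferred patterns on this page. In practice the same rules would be expressed in the webserver's rewrite module.

```python
import re
from urllib.parse import urlsplit

# Assumption for illustration: the longer ZooBank form is treated as canonical.
CANONICAL_PREFIX = "http://zoobank.org/NomenclaturalActs/"

UUID = r"[0-9A-Fa-f]{8}(?:-[0-9A-Fa-f]{4}){3}-[0-9A-Fa-f]{12}"
SHORT_ZOOBANK = re.compile(r"^http://zoobank\.org/(?P<id>" + UUID + r")$")

# Stable pattern -> current technology (hypothetical rule table).
REWRITES = [
    (re.compile(r"^http://data\.rbge\.org\.uk/herb/(?P<id>[A-Za-z0-9-]+)$"),
     "http://elmer.rbge.org.uk/bgbase/vherb/bgbasevherb.php"
     "?cfg=bgbase/vherb/bgbasevherb.cfg&specimens_barcode={id}"),
]


def is_simple(uri):
    """True if the URI has no port, no query parameters and no file extension."""
    parts = urlsplit(uri)
    last_segment = parts.path.rsplit("/", 1)[-1]
    return parts.port is None and parts.query == "" and "." not in last_segment


def resolve(uri):
    """Return (action, target) for a requested URI.

    "redirect-301": the URI is a non-canonical alias of the target.
    "rewrite":      the stable URI is served internally from the target.
    "not-found":    no rule matches.
    """
    match = SHORT_ZOOBANK.match(uri)
    if match:
        return "redirect-301", CANONICAL_PREFIX + match.group("id")
    for pattern, template in REWRITES:
        match = pattern.match(uri)
        if match:
            return "rewrite", template.format(id=match.group("id"))
    return "not-found", None
```

Because the stable pattern carries nothing but the identifier, replacing the technology behind it later only means editing the right-hand side of one rule.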
Parts of stable URI patterns
A recommended URI pattern is the following:
- subdomain.yourdomain.org/path/variable-identifier#hash
- subdomain: If the stable URIs use a general-purpose domain with many different services, it may be desirable to add a dedicated subdomain for specific services. Using subdomains offers the flexibility that several institutions can in the future share operations for a specific subdomain without affecting the stability of these URIs. If the main domain is already dedicated to a specific service, using a subdomain is probably irrelevant.
- yourdomain.org/: The main domain name, like rbge.org.uk, zoobank.org, ipni.org, naturalis.nl, nhm.ac.uk/
- path: The part that remains constant for different identifiers of the same class (e.g., taxa, specimens). Similar to a subdomain, this increases the ease with which identifiers can be kept stable over decades (using web server rewrite modules).
- It is possible to ignore this and use URIs like http://zoobank.org/7D39CAAA-4B4B-4588-A372-D4097162B1CD. However, this makes future rewrite rules more sensitive, because they can only match on the form of the identifier itself.
 
- variable-identifier: The part that changes for each object. It will usually be a number or code you also use otherwise, like a simple locally unique number or code (123, a123, M-2361318, ...), or it may be a UUID like 1C4EDC178AD79DD7F1A5AB856E8C5BCA.
- #hash: Relevant only when using the hash method to distinguish between the abstract concept or concrete object (e.g. Formica rufa or a specific physical specimen, which cannot be transmitted through the internet, only described) and the web pages about it (html, pdf, rdf-data).
- (The alternative method is to use 303 redirects.)
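As a rough illustration of the parts listed above, the following Python sketch splits a URI into subdomain, domain, path, variable identifier and hash. The function names and the `registered_domain` parameter are assumptions made for this example (the caller has to say which part of the host is the main domain). The second function shows why the hash method works: HTTP clients strip the fragment before sending a request, so the URI with the hash can name the object while the URI without it retrieves the document.

```python
from urllib.parse import urlsplit, urldefrag


def split_stable_uri(uri, registered_domain):
    """Split a URI into the parts named above.

    `registered_domain` (e.g. "example.org") is supplied by the caller;
    this sketch does not try to guess it.
    """
    parts = urlsplit(uri)
    host = parts.hostname or ""
    suffix = "." + registered_domain
    subdomain = host[: -len(suffix)] if host.endswith(suffix) else ""
    # Everything before the last slash is the constant path,
    # the last segment is the variable identifier.
    path, _, identifier = parts.path.strip("/").rpartition("/")
    return {
        "subdomain": subdomain,
        "domain": registered_domain,
        "path": path,
        "identifier": identifier,
        "hash": parts.fragment,
    }


def document_uri(uri):
    """URI actually requested over HTTP: the fragment is never sent to the server."""
    return urldefrag(uri).url


print(split_stable_uri("http://specimen.example.org/permanent/123#rsc", "example.org"))
print(document_uri("http://specimen.example.org/permanent/123#rsc"))
```

For a URI without a path, such as the ZooBank example above, the `path` entry comes back empty and the whole remainder is treated as the identifier.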
 
Social considerations for labeling parts of stable URIs
With the hash method and using both subdomain and path for stability, three strings are needed.
1. Technically irrelevant, but confusing to humans are repetitions like http://objects.example.org/objects/123#object or concatenations of closely overlapping terms like http://objects.example.org/concepts/123#topic
2. Terms may come from these categories:
- Generic terms like: "resource", "portal", "content", "object", "concept", "thing", "topic", "id", "identifier", "citable". NOTE: ADDITIONAL PROPOSAL WELCOME!
- Terms for classes of objects or concepts like: "taxon"/"taxa", "taxonconcept", "name", "term", "sample", "specimen", "treatment", "description", "morphology", "collection", "person"/"people", "organisation"/"institution", "locality", "herbarium".
- An indicator of stability like "stable", "permanent", "stable-id", "purl" (= permanent URL). NOTE: ADDITIONAL PROPOSAL WELCOME!
- Terms with no or reduced semantics like: "dx", "zb" (abbreviation of Zoobank), "res", "it", "o", "t", "s", "p".
3. In the semantic web, the word "data" should be avoided where referring to the concept or thing itself (as opposed to the data about it). A URI like data.organisation.org/specimen/123 for a specimen itself (but redirected to another URI when the data are being returned) is easily misinterpreted as referring to the data rather than the object.
4. In principle a similar concern may be raised over the use of "id" or "identifier" (the semantic web would speak about the thing by means of an identifier, not about the identifier), but these concerns are probably negligible.
5. Terms from the categories above can probably be used interchangeably for subdomain and path, i.e. specimen.example.org/object/123 and object.example.org/specimen/123 work similarly well.
- If you foresee that operations for different object classes may in the future be consolidated within different consortia, it may be desirable to put the object class (like specimen) in the subdomain.
6. For the hash part, which indicates that the URI with the hash is the real thing and the one without it is the data, the choices are more limited. Examples:
- specimen.example.org/res/123#specimen
- specimen.example.org/res/123#object
- specimen.example.org/res/123#obj
- specimen.example.org/res/123#id
- specimen.example.org/res/123#itself
- PLEASE ADD YOUR EXAMPLES!
- (The above applies only to the hash method, not to 303 redirection.)
Examples:
- http://objects.example.org/res/123#specimen
- http://specimen.example.org/stable-id/123#physical
- http://id.example.org/specimen/123#obj
- http://res.example.org/specimen/123#id
- http://permanent.example.org/specimen/123#id
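The labeling considerations above can be turned into a small check. The sketch below, in Python, builds the object URI and the document URI from three labels and flags the two problems named in points 1 and 3: repeated labels and the word "data" in a URI for the thing itself. The function names, the default domain and the crude plural handling (dropping one trailing "s") are assumptions made for this example; it does not detect closely overlapping terms such as "concepts" and "topic".

```python
def build_uris(subdomain, path, identifier, fragment, domain="example.org"):
    """Return (object_uri, document_uri) for the hash method."""
    document = f"http://{subdomain}.{domain}/{path}/{identifier}"
    return document + "#" + fragment, document


def _normalize(label):
    # Crude assumption: treat "objects" and "object" as the same label.
    label = label.lower()
    return label[:-1] if label.endswith("s") else label


def label_warnings(subdomain, path, fragment):
    """Flag label choices that are likely to confuse human readers."""
    labels = [_normalize(x) for x in (subdomain, path, fragment)]
    warnings = []
    if len(set(labels)) < len(labels):
        warnings.append("repeated label")
    if "data" in labels:
        warnings.append('"data" may be read as the data rather than the object')
    return warnings


print(build_uris("specimen", "permanent", "123", "rsc"))
print(label_warnings("objects", "objects", "object"))
print(label_warnings("data", "specimen", "id"))
```

A pattern such as specimen.example.org/permanent/123#rsc passes both checks, which is one way to compare the candidate patterns collected in the next section.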
YOUR Preferred pattern for specimen or scientific names
Gregor Hagedorn:
- object at http://specimen.example.org/permanent/123#rsc
- rdf/html at http://specimen.example.org/permanent/123
Richard Pyle (from gplus discussion, "{UUID-identifier}" is a concrete UUID):
- object at http://dx.zoobank.org/NomenclaturalAct/{UUID-identifier}#stuff
- rdf/html at http://dx.zoobank.org/NomenclaturalAct/{UUID-identifier}
Peter DeVries:
- object at http://ocs.taxonconcept.org/ocs/0da685c9-9cdc-4dff-baf3-38d1bdbc6552
- rdf/html at http://ocs.taxonconcept.org/ocs/0da685c9-9cdc-4dff-baf3-38d1bdbc6552.html
Roger Hyam:
- object at http://data.rbge.org.uk/herb/E00435912
- rdf/html at http://elmer.rbge.org.uk/bgbase/vherb/bgbasevherb.php?cfg=bgbase/vherb/bgbasevherb.cfg&specimens_barcode=E00435912
Falko Glöckler:
- object at http://www.digicoll.info/ID:/ZMB_Phasm_D006
- rdf/html at http://www.digicoll.info/?referer=&UID=ZMB_Phasm_D006 or with namespace at http://www.digicoll.info/?referer=www.naturkundemuseum-berlin.de&UID=ZMB_Phasm_D006
Quentin Groom:
- object at http://herbarium.belgium.museum/permanent/BR5030008086350#id
- rdf/html at http://herbarium.belgium.museum/permanent/BR5030008086350
Terry Catapano:
- object at http://plazi.org/treatments/AD59E141A223170EB833D4A3F0A266D1
- rdf/html at http://plazi.org/treatments/AD59E141A223170EB833D4A3F0A266D1.[html|rdf|json|xml]
Jordan Biserkov:
- object at http://stable.example.org/specimens/7D39CAAA-4B4B-4588-A372-D4097162B1CD#concept
- rdf/html at http://stable.example.org/specimens/7D39CAAA-4B4B-4588-A372-D4097162B1CD
YOUR NAME:
- object at
- rdf/html at
YOUR NAME:
- object at
- rdf/html at
YOUR NAME:
- object at
- rdf/html at
Please add your preferred pattern, based on the notes above as well as on new ideas. Can we achieve a set of patterns (not a single one) that others could mimic? I think this might help to spread the idea...
All accounts of biowikifarm instances work here; if you have no account, please request an account.


