Quoted from: http://www.w3.org/TR/uri-clarification/
URIs, URLs, and URNs: Clarifications and Recommendations 1.0
Report from the joint W3C/IETF URI Planning Interest Group
W3C Note 21 September 2001
- This version:
- http://www.w3.org/TR/2001/NOTE-uri-clarification-20010921/
- Latest version:
- http://www.w3.org/TR/uri-clarification
- by:
- URI Planning Interest Group, W3C/IETF
(see Acknowledgements)
Copyright © 2001 W3C® (MIT, INRIA, Keio), All Rights Reserved. W3C liability, trademark, document use, and software licensing rules apply.
Abstract
This paper addresses and attempts to clarify two issues pertaining to URIs, and presents recommendations. Section 1 addresses how URI space is partitioned and the relationship between URIs, URLs, and URNs. Section 2 describes how URI schemes and URN namespace ids are registered. Section 3 mentions additional unresolved issues not considered by this paper and section 4 presents recommendations.
Status of this Document
This section describes the status of this document at the time of its publication. Other documents may supersede this document. The latest status of this document series is maintained at the W3C in a list of current W3C technical reports at http://www.w3.org/TR/ .
This is a report from the W3C/IETF URI Planning Interest Group, for review by W3C members, the IETF community, and other interested parties. We invite review and discussion of our recommendations for future work in the IETF and/or W3C. Please address your feedback to uri@w3.org, a mailing list with public archive.
This document has been produced as part of the W3C URI Activity.
1 URI Partitioning
There is some confusion in the web community over the partitioning of URI space, specifically over the relationship among the concepts of URL, URN, and URI. The confusion stems from the incompatibility between two different views of URI partitioning, which we call the "classical" and "contemporary" views.
1.1 Classical View
During the early years of discussion of web identifiers (early to mid 90s), people
assumed that an identifier type would be cast into one of two (or possibly more)
classes. An identifier might specify the location of a resource (a URL) or its name
(a URN) independent of location. Thus a URI was either a URL or a URN. There was
discussion about generalizing this by addition of a discrete number of additional
classes; for example, a URI might point to metadata rather than the resource itself,
in which case the URI would be a URC (citation). URI space was thus viewed as partitioned
into subspaces: URL and URN, and additional subspaces, to be defined. The only such
additional space ever proposed was URC, and there never was any buy-in; so without
loss of generality it's reasonable to say that URI space was thought to be partitioned
into two classes: URL and URN. Thus, for example, "http:" was a URL scheme, and
"isbn:" would (someday) be a URN scheme. Any new scheme would be cast into one or
the other of these two classes.
1.2 Contemporary View
Over time, the importance of this additional level of hierarchy seemed to lessen;
the view became that an individual scheme does not need to be cast into one of a
discrete set of URI types such as "URL", "URN", "URC", etc. Web-identifier schemes
are in general URI schemes; a given URI scheme may define subspaces. Thus "http:"
is a URI scheme. "urn:" is also a URI scheme; it defines subspaces, called
"namespaces". For example, the set of URNs of the form "urn:isbn:n-nn-nnnnnn-n"
is a URN namespace. ("isbn" is a URN namespace identifier. It is not a "URN scheme"
nor a "URI scheme".)
Further, according to the contemporary view, the term "URL" does not refer to a formal
partition of URI space; rather, URL is a useful but informal concept: a URL is a
type of URI that identifies a resource via a representation of its primary access
mechanism (e.g., its network "location"), rather than by some other attributes it
may have. Thus, as we noted, "http:" is a URI scheme. An http URI is a URL. The
phrase "URL scheme" is now used infrequently, usually to refer to some subclass of
URI schemes which exclude URNs.
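The distinction between a URI scheme and a URN namespace identifier can be illustrated with a minimal Python sketch. The function name scheme_and_nid is hypothetical, and the sketch assumes the "urn:<NID>:<NSS>" layout and the case-insensitivity of scheme names and NIDs; it does not validate anything beyond that.

```python
import re

# Scheme syntax per RFC 2396: a letter followed by letters, digits, "+", "-" or ".".
_SCHEME = re.compile(r"^([A-Za-z][A-Za-z0-9+.-]*):(.*)$", re.DOTALL)


def scheme_and_nid(uri):
    """Return (scheme, nid). nid is None unless the scheme is "urn".

    Hypothetical helper for illustration only; it is not a full URI parser.
    """
    match = _SCHEME.match(uri)
    if match is None:
        raise ValueError(f"no URI scheme found in {uri!r}")
    scheme = match.group(1).lower()  # scheme names are case-insensitive
    rest = match.group(2)
    if scheme != "urn":
        return scheme, None
    # Assumed layout "urn:<NID>:<NSS>": the NID is the text before the next colon.
    nid, sep, _nss = rest.partition(":")
    if not nid or not sep:
        raise ValueError(f"malformed URN (expected urn:<NID>:<NSS>): {uri!r}")
    return scheme, nid.lower()


if __name__ == "__main__":
    # "http" is a URI scheme with no namespace level.
    print(scheme_and_nid("http://www.w3.org/TR/uri-clarification"))
    # "urn" is the URI scheme; "isbn" is a namespace identifier within it.
    print(scheme_and_nid("urn:isbn:n-nn-nnnnnn-n"))
```

In both cases the first element of the result is the URI scheme; only for "urn" is there a second level, the namespace identifier.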
1.3 Confusion
The body of documents (RFCs, etc.) covering URI architecture, syntax, registration, etc., spans both the classical and contemporary periods. People who are well-versed in URI matters tend to use "URL" and "URI" in ways that seem to be interchangeable. Among these experts, this isn't a problem. But among the Internet community at large, it is. People are not convinced that URI and URL mean the same thing, in documents where they (apparently) do. When one sees an RFC that talks about URI schemes (e.g. [RFC 2396]), another that talks about URL schemes (e.g. [RFC 2717]), and yet another that talks of URN schemes ([RFC 2276]), it is natural to wonder what the difference is, and how they relate to one another. While RFC 2396, section 1.2, attempts to address the distinction between URIs, URLs and URNs, it has not been successful in clearing up the confusion.
2 Registration
This section examines the state of registration of URI schemes and URN namespaces and the mechanisms by which registration currently occurs.
2.1 URI Schemes
2.1.1 Registered URI schemes
The official register of URI scheme names is maintained by IANA, at
http://www.iana.org/assignments/uri-schemes. For each scheme, the RFC that defines
the scheme is listed; for example, "http:" is defined by [RFC 2616]. The table
currently lists 30 schemes.
In addition, there are a few "reserved" scheme names; at one point in time these
were intended to become registered schemes but have since been dropped.
2.1.2 Unregistered URI Schemes
We distinguish between public (unregistered) and private schemes. A public scheme (registered or not) is one for which there is some public document describing it.
2.1.2.1 Public Unregistered Schemes
Dan Connolly maintains a list of known, public URI schemes, both registered and
unregistered, a total of 84 schemes. 50 or so of these are unregistered (not listed
in the IANA register). Some may be obsolete (for example, it appears that "phone"
is obsolete, superseded by "tel"). Some have an RFC, but are not included in the
IANA list.
2.1.2.2 Private Schemes
It's probably impossible to determine all of these, and it's not clear that it's worthwhile to try, except perhaps to get some idea of their number. In the minutes of the August 1997 IETF meeting is the observation that there may be 20-40 in use at Microsoft, with 2-3 being added a day, and that WebTV has 24, with 6 added per year.
2.1.3 Registration of URI Schemes
[RFC 2717] specifies procedures for registering scheme names, and points to [RFC 2718] which supplies guidelines. RFC 2717 describes an organization of schemes into "trees".
2.2 URN Namespaces
A URN namespace is identified by a "Namespace ID", NID, which is registered with IANA (see 2.2.4 Registration Procedures for URN NIDs).
2.2.1 Registered URN NIDs
There are two categories of registered URN NIDs:
- Informal: These are of the form "urn-<number>" where <number> is assigned by IANA. There are three registered in this category (urn-1, urn-2, and urn-3).
- Formal: The official list of registered NIDs is kept by IANA at http://www.iana.org/assignments/urn-namespaces. Currently it lists eight registered NIDs:
  - 'ietf', defined by [RFC 2648], A URN Namespace for IETF Documents
  - 'pin', defined by [RFC 3043], The Network Solutions Personal Internet Name (PIN): A URN Namespace for People and Organizations
  - 'issn', defined by [RFC 3044], Using The ISSN as URN within an ISSN-URN Namespace
  - 'oid', defined by [RFC 3061], A URN Namespace of Object Identifiers
  - 'newsml', defined by [RFC 3085], URN Namespace for NewsML Resources
  - 'oasis', defined by [RFC 3121], A URN Namespace for OASIS
  - 'xmlorg', defined by [RFC 3120], A URN Namespace for XML.org
  - 'publicid', defined by [RFC 3151], A URN Namespace for Public Identifiers
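The two categories can be told apart mechanically, as the following minimal Python sketch suggests. The names FORMAL_NIDS and classify_nid are hypothetical; the set is a snapshot of the eight formal NIDs listed in this section, not a live copy of the IANA registry, and the "informal" result only means the string has the "urn-<number>" form, not that IANA has actually assigned that number.

```python
import re

# Snapshot of the formal NIDs named in this section (an assumption of this
# sketch; the IANA registry is the authority and changes over time).
FORMAL_NIDS = frozenset(
    {"ietf", "pin", "issn", "oid", "newsml", "oasis", "xmlorg", "publicid"}
)

# Informal NIDs have the form "urn-<number>".
_INFORMAL = re.compile(r"urn-[0-9]+")


def classify_nid(nid):
    """Classify an NID string as "informal", "formal" or "unlisted"."""
    nid = nid.lower()  # NIDs are compared case-insensitively
    if _INFORMAL.fullmatch(nid):
        return "informal"
    if nid in FORMAL_NIDS:
        return "formal"
    # Pending, experimental and unregistered NIDs all end up here.
    return "unlisted"


if __name__ == "__main__":
    for candidate in ("urn-2", "oasis", "isbn", "hdl"):
        print(candidate, classify_nid(candidate))
```

NIDs such as 'isbn' and 'hdl' fall into the "unlisted" bucket in this sketch because they are absent from the snapshot, which is exactly the situation the next two sections discuss.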
2.2.2 Pending URN NIDs
There are a number of pending URN NID registration requests but there is no reliable way to discover them, or their status. For example, 'isbn' and 'nbn' have been approved by the IESG and are in the RFC Editor's queue. 'isbn', as a potential URN namespace (or URI scheme), in particular has been a source of much speculation and confusion over several years. It would be helpful if there were some formal means to track the status of NID requests such as 'isbn'.
2.2.3 Unregistered NIDs
In the "unregistered" category (besides the experimental case, not described in this paper) there are bona fide NIDs whose owners have not even explored the process of registration. The most prominent that comes to mind is 'hdl'. In the case of 'hdl', it has been speculated that it has not been registered because it is not clear to the owners whether it should be registered as a URI scheme or as a URN namespace.
2.2.4 Registration Procedures for URN NIDs
[RFC 2611] describes the mechanism to obtain an NID for a URN namespace, which is registered with IANA.
A request for an NID should describe features including: structural characteristics of identifiers (for example, features relevant to caching/shortcuts approaches); specific character encoding rules (e.g., which character should be used for single-quotes); RFCs, standards, etc., that explain the namespace structure; identifier uniqueness considerations; delegation of assignment authority, including how to become an assigner of identifiers; identifier persistence considerations; quality of service considerations; process for identifier resolution; rules for lexical equivalence; any special considerations required for conforming with the URN syntax (particularly applicable in the case of legacy naming systems); validation mechanisms (determining whether a given string is currently a validly-assigned URN); and scope (for example, "United States social security numbers").
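As an illustration only, the features listed above could be tracked as a simple checklist while drafting a request. The class and field names in this Python sketch are hypothetical and are not taken from RFC 2611, which defines its own registration template; the sketch merely mirrors the twelve items named in the preceding paragraph.

```python
from dataclasses import dataclass, fields


@dataclass
class NidRequestChecklist:
    """Hypothetical checklist; one free-text field per feature listed above."""

    structure: str = ""               # structural characteristics of identifiers
    character_encoding: str = ""      # specific character encoding rules
    references: str = ""              # RFCs, standards, etc. explaining the structure
    uniqueness: str = ""              # identifier uniqueness considerations
    assignment_delegation: str = ""   # delegation of assignment authority
    persistence: str = ""             # identifier persistence considerations
    quality_of_service: str = ""      # quality of service considerations
    resolution_process: str = ""      # process for identifier resolution
    lexical_equivalence: str = ""     # rules for lexical equivalence
    urn_syntax_conformance: str = ""  # special considerations for URN syntax
    validation: str = ""              # validation mechanisms
    scope: str = ""                   # scope of the namespace

    def missing(self):
        """Return the names of the features that are still undescribed."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]


if __name__ == "__main__":
    draft = NidRequestChecklist(scope="United States social security numbers")
    print(draft.missing())  # every feature except "scope" is still to be written
```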
3 Additional URI Issues
There are additional unresolved URI issues, not considered by this paper, which we hope will be addressed by a follow-on effort. We have not attempted to enumerate these issues completely; they include (but are not limited to) the following:
- The use of URIs as identifiers that don't actually identify network resources (for example, they identify an abstract object such as an XML schema, or a physical object such as a book or even a person).
- IRIs (International Resource Identifiers): the extension of URI syntax to non-ASCII.
4 Recommendations
We recommend the following:
- The W3C and IETF should jointly develop and endorse a model for URIs, URLs and URNs consistent with the "Contemporary View" described in section 1, and which considers the additional URI issues listed or alluded to in section 3.
- RFCs such as 2717 ("Registration Procedures for URL Scheme Names") and 2718 ("Guidelines for new URL Schemes") should both be generalized to refer to "URI schemes" rather than "URL schemes" and, after refinement, moved forward as Best Current Practice in IETF.
- The registration procedures for alternative trees should be clarified in RFC 2717.
- Public but unregistered schemes should become registered, where possible. Obsolete schemes should be purged or clearly marked as obsolete.
- IANA registration information should be updated:
  - Add 'urn' to the list of registered URI schemes with a pointer to the URN namespace registry.
  - Maintain status information about pending registrations (URI schemes and URN NID requests).
  - Ensure that it is clear that the page is the official registry, e.g., by adding a heading to the effect "This is the Official IANA Registry of URI Schemes".
A Acknowledgements
The participants in the URI Planning Interest Group are:
- Tony Coates
- Dan Connolly
- Diana Dack
- Leslie Daigle
- Ray Denenberg
- Martin Dürst
- Paul Grosso
- Sandro Hawke
- Renato Iannella
- Graham Klyne
- Larry Masinter
- Michael Mealling
- Mark Needleman
- Norman Walsh
B References
- RFC 2717
- IETF (Internet Engineering Task Force). Registration Procedures for URL Scheme Names, ed. R. Petke and I. King. 1999.
- RFC 2718
- IETF (Internet Engineering Task Force). Guidelines for new URL Schemes, ed. L. Masinter, H. Alvestrand, D. Zigmond, and R. Petke. 1999.
- RFC 2648
- IETF (Internet Engineering Task Force). A URN Namespace for IETF Documents, ed. R. Moats. 1999.
- RFC 3043
- IETF (Internet Engineering Task Force). The Network Solutions Personal Internet Name (PIN): A URN Namespace for People and Organizations, ed. M. Mealling. 2001.
- RFC 3044
- IETF (Internet Engineering Task Force). Using The ISSN (International Serial Standard Number) as URN (Uniform Resource Names) within an ISSN-URN Namespace, ed. S. Rozenfeld. 2001.
- RFC 3061
- IETF (Internet Engineering Task Force). A URN Namespace of Object Identifiers, ed. M. Mealling. 2001.
- RFC 3085
- IETF (Internet Engineering Task Force). URN Namespace for NewsML Resources, ed. A. Coates, D. Allen, and D. Rivers-Moore. 2001.
- RFC 3121
- IETF (Internet Engineering Task Force). A URN Namespace for OASIS, ed. K. Best and N. Walsh. 2001.
- RFC 3120
- IETF (Internet Engineering Task Force). A URN Namespace for XML.org, ed. K. Best and N. Walsh. 2001.
- RFC 3151
- IETF (Internet Engineering Task Force). A URN Namespace for Public Identifiers, ed. N. Walsh, J. Cowan, and P. Grosso. 2001.
- RFC 2611
- IETF (Internet Engineering Task Force). URN Namespace Definition Mechanisms, ed. L. Daigle, R. Iannella, and P. Faltstrom. 1999.
- RFC 2396
- IETF (Internet Engineering Task Force). Uniform Resource Identifiers (URI): Generic Syntax, ed. T. Berners-Lee, R. Fielding, and L. Masinter. 1998.
- RFC 2276
- IETF (Internet Engineering Task Force). Architectural Principles of Uniform Resource Name Resolution, ed. K. Sollins. 1998.
- RFC 2616
- IETF (Internet Engineering Task Force). Hypertext Transfer Protocol -- HTTP/1.1, ed. R. Fielding, J. Gettys, J. Mogul, et al. 1999.