Segments
Hierarchical schemes often interpret the path as a slash-delimited sequence of percent-encoded strings called segments. In this library the segments may be accessed using these separate, bidirectional view types which reference the underlying URL:
First we observe these invariants about paths and segments:
-
All URLs have a path
-
A path is absolute, relative, or empty
-
Paths starting with "/" are absolute
-
A relative path can never follow an authority
-
Every path maps to a unique range of segments, plus a
boolindicating if the path is absolute. -
Every range of segments, plus a
boolindicating if the path is absolute, maps to a unique path.
The following URL contains a path with three segments: "path", "to", and "file.txt":
http://www.example.com/path/to/file.txt
To understand the relationship between the path and segments,
we define this function segs which returns a list of
strings corresponding to the elements in a container of segments:
// code_container_4_1
In this table we show the result of invoking segs with
different paths. This demonstrates how the library achieves
the invariants described above for various interesting cases:
This implies that two paths may map to the same sequence of
segments . In the paths "usr" and "./usr", the "./"
is a prefix that might be necessary to maintain the
invariant that instances of url_view_base always
refer to valid URLs. Thus, both paths map to { "usr" }.
On the other hand, each sequence determines a unique path
for a given URL. For instance, setting the segments to
{"a"} would always map to either "./a" or "a", depending
on whether the "." prefix is necessary to keep the URL valid.
Sequences don’t iterate the leading "." when it’s
necessary to keep the URL valid. Thus, when we
assign { "x", "y", "z" } to segments, the sequence
always contains { "x", "y", "z" } after that. It
never contains { ".", "x", "y", "z" } because the "."
needed to be included.
In other words, the contents of the segment container
are authoritative, and the path string is a function
of them. Not vice-versa.
Library algorithms which modify individual segments of the
path or set the entire path attempt to behave consistently
with the behavior expected as if the operation was performed
on the equivalent sequence. If a path maps, say, to the three
element sequence { "a", "b", "c" } then erasing the middle
segment should result in the sequence { "a", "c" }. The
library always strives to do exactly what the caller requests;
however, in some cases this would result in either an invalid
URL, or a dramatic and unwanted change in the URL’s semantics.
For example consider the following URL:
url u = url().set_path( "kyle:xy" );
The library produces the URL string "kyle%3Axy" and not
"kyle:xy", because the latter would have an unintended scheme.
The table below demonstrates the results achieved by performing
various modifications to a URL containing a path:
For the full set of containers and functions for operating on paths and segments, please consult the reference.