containerd/reference/reference.go

142 lines
4 KiB
Go
Raw Normal View History

cmd/dist, remotes: simplify resolution flow After receiving feedback during containerd summit walk through of the pull POC, we found that the resolution flow for names was out of place. We could see this present in awkward places where we were trying to re-resolve whether something was a digest or a tag and extra retries to various endpoints. By centering this problem around, "what do we write in the metadata store?", the following interface comes about: ``` Resolve(ctx context.Context, ref string) (name string, desc ocispec.Descriptor, fetcher Fetcher, err error) ``` The above takes an "opaque" reference (we'll get to this later) and returns the canonical name for the object, a content description of the object and a `Fetcher` that can be used to retrieve the object and its child resources. We can write `name` into the metadata store, pointing at the descriptor. Descisions about discovery, trust, provenance, distribution are completely abstracted away from the pulling code. A first response to such a monstrosity is "that is a lot of return arguments". When we look at the actual, we can see that in practice, the usage pattern works well, albeit we don't quite demonstrate the utility of `name`, which will be more apparent later. Designs that allowed separate resolution of the `Fetcher` and the return of a collected object were considered. Let's give this a chance before we go refactoring this further. With this change, we introduce a reference package with helps for remotes to decompose "docker-esque" references into consituent components, without arbitrarily enforcing those opinions on the backend. Utlimately, the name and the reference used to qualify that name are completely opaque to containerd. Obviously, implementors will need to show some candor in following some conventions, but the possibilities are fairly wide. Structurally, we still maintain the concept of the locator and object but the interpretation is up to the resolver. For the most part, the `dist` tool operates exactly the same, except objects can be fetched with a reference: ``` dist fetch docker.io/library/redis:latest ``` The above should work well with a running containerd instance. I recommend giving this a try with `fetch-object`, as well. With `fetch-object`, it is easy for one to better understand the intricacies of the OCI/Docker image formats. Ultimately, this serves the main purpose of the elusive "metadata store". Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-03-08 00:51:08 +00:00
package reference
import (
"errors"
"fmt"
"net/url"
"path"
"regexp"
"strings"
digest "github.com/opencontainers/go-digest"
)
var (
ErrInvalid = errors.New("invalid reference")
ErrObjectRequired = errors.New("object required")
ErrHostnameRequired = errors.New("hostname required")
)
// Spec defines the main components of a reference specification.
//
// A reference specification is a schema-less URI parsed into common
// components. The two main components, locator and object, are required to be
// supported by remotes. It represents a superset of the naming define in
// docker's reference schema. It aims to be compatible but not prescriptive.
//
// While the interpretation of the components, locator and object, are up to
// the remote, we define a few common parts, accessible via helper methods.
//
// The first is the hostname, which is part of the locator. This doesn't need
// to map to a physical resource, but it must parse as a hostname. We refer to
// this as the namespace.
//
// The other component made accessible by helper method is the digest. This is
// part of the object identifier, always prefixed with an '@'. If present, the
// remote may use the digest portion directly or resolve it against a prefix.
// If the object does not include the `@` symbol, the return value for `Digest`
// will be empty.
cmd/dist, remotes: simplify resolution flow After receiving feedback during containerd summit walk through of the pull POC, we found that the resolution flow for names was out of place. We could see this present in awkward places where we were trying to re-resolve whether something was a digest or a tag and extra retries to various endpoints. By centering this problem around, "what do we write in the metadata store?", the following interface comes about: ``` Resolve(ctx context.Context, ref string) (name string, desc ocispec.Descriptor, fetcher Fetcher, err error) ``` The above takes an "opaque" reference (we'll get to this later) and returns the canonical name for the object, a content description of the object and a `Fetcher` that can be used to retrieve the object and its child resources. We can write `name` into the metadata store, pointing at the descriptor. Descisions about discovery, trust, provenance, distribution are completely abstracted away from the pulling code. A first response to such a monstrosity is "that is a lot of return arguments". When we look at the actual, we can see that in practice, the usage pattern works well, albeit we don't quite demonstrate the utility of `name`, which will be more apparent later. Designs that allowed separate resolution of the `Fetcher` and the return of a collected object were considered. Let's give this a chance before we go refactoring this further. With this change, we introduce a reference package with helps for remotes to decompose "docker-esque" references into consituent components, without arbitrarily enforcing those opinions on the backend. Utlimately, the name and the reference used to qualify that name are completely opaque to containerd. Obviously, implementors will need to show some candor in following some conventions, but the possibilities are fairly wide. Structurally, we still maintain the concept of the locator and object but the interpretation is up to the resolver. For the most part, the `dist` tool operates exactly the same, except objects can be fetched with a reference: ``` dist fetch docker.io/library/redis:latest ``` The above should work well with a running containerd instance. I recommend giving this a try with `fetch-object`, as well. With `fetch-object`, it is easy for one to better understand the intricacies of the OCI/Docker image formats. Ultimately, this serves the main purpose of the elusive "metadata store". Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-03-08 00:51:08 +00:00
type Spec struct {
// Locator is the host and path portion of the specification. The host
// portion may refer to an actual host or just a namespace of related
// images.
//
// Typically, the locator may used to resolve the remote to fetch specific
// resources.
Locator string
// Object contains the identifier for the remote resource. Classically,
// this is a tag but can refer to anything in a remote. By convention, any
// portion that may be a partial or whole digest will be preceeded by an
// `@`. Anything preceeding the `@` will be referred to as the "tag".
//
// In practice, we will see this broken down into the following formats:
//
// 1. <tag>
// 2. <tag>@<digest spec>
// 3. @<digest spec>
//
// We define the tag to be anything except '@' and ':'. <digest spec> may
// be a full valid digest or shortened version, possibly with elided
// algorithm.
Object string
}
var splitRe = regexp.MustCompile(`[:@]`)
// Parse parses the string into a structured ref.
func Parse(s string) (Spec, error) {
u, err := url.Parse("dummy://" + s)
if err != nil {
return Spec{}, err
}
if u.Scheme != "dummy" {
return Spec{}, ErrInvalid
}
if u.Host == "" {
return Spec{}, ErrHostnameRequired
}
parts := splitRe.Split(u.Path, 2)
if len(parts) < 2 {
return Spec{}, ErrObjectRequired
}
// This allows us to retain the @ to signify digests or shortend digests in
// the object.
object := u.Path[len(parts[0]):]
if object[:1] == ":" {
object = object[1:]
}
return Spec{
Locator: path.Join(u.Host, parts[0]),
Object: object,
}, nil
}
// Hostname returns the hostname portion of the locator.
//
// Remotes are not required to directly access the resources at this host. This
// method is provided for convenience.
func (r Spec) Hostname() string {
i := strings.Index(r.Locator, "/")
if i < 0 {
i = len(r.Locator) + 1
}
return r.Locator[:i]
}
// Digest returns the digest portion of the reference spec. This may be a
// partial or invalid digest, which may be used to lookup a complete digest.
func (r Spec) Digest() digest.Digest {
_, dgst := SplitObject(r.Object)
return dgst
}
// String returns the normalized string for the ref.
func (r Spec) String() string {
if r.Object[:1] == "@" {
return fmt.Sprintf("%v%v", r.Locator, r.Object)
}
return fmt.Sprintf("%v:%v", r.Locator, r.Object)
}
// SplitObject provides two parts of the object spec, delimiited by an `@`
// symbol.
//
// Either may be empty and it is the callers job to validate them
// appropriately.
func SplitObject(obj string) (tag string, dgst digest.Digest) {
parts := strings.SplitAfterN(obj, "@", 2)
if len(parts) < 2 {
return parts[0], ""
} else {
return parts[0], digest.Digest(parts[1])
}
}