containerd/cmd/dist/fetchobject.go

362 lines
9.3 KiB
Go
Raw Normal View History

cmd/dist: POC implementation of dist fetch With this changeset we introduce several new things. The first is the top-level dist command. This is a toolkit that implements various distribution primitives, such as fetching, unpacking and ingesting. The first component to this is a simple `fetch` command. It is a low-level command that takes a "remote", identified by a `locator`, and an object identifier. Keyed by the locator, this tool can identify a remote implementation to fetch the content and write it back to standard out. By allowing this to be the unit of pluggability in fetching content, we can have quite a bit of flexibility in how we retrieve images. The current `fetch` implementation provides anonymous access to docker hub images, through the namespace `docker.io`. As an example, one can fetch the manifest for `redis` with the following command: ``` $ ./dist fetch docker.io/library/redis latest mediatype:application/vnd.docker.distribution.manifest.v2+json ``` Note that we have provided a mediatype "hint", nudging the fetch implementation to grab the correct endpoint. We can hash the output of that to fetch the same content by digest: ``` $ ./dist fetch docker.io/library/redis sha256:$(./dist fetch docker.io/library/redis latest mediatype:application/vnd.docker.distribution.manifest.v2+json | shasum -a256) ``` Note that the hint is now elided, since we have affixed the content to a particular hash. If you are not yet entertained, let's bring `jq` and `xargs` into the mix for maximum fun. The following incantation fetches the same manifest and downloads all layers into the convenience of `/dev/null`: ``` $ ./dist fetch docker.io/library/redis sha256:a027a470aa2b9b41cc2539847a97b8a14794ebd0a4c7c5d64e390df6bde56c73 | jq -r '.layers[] | .digest' | xargs -n1 -P10 ./dist fetch docker.io/library/redis > /dev/null ``` This is just the beginning. We should be able to centralize configuration around fetch to implement a number of distribution methodologies that have been challenging or impossible up to this point. The `locator`, mentioned earlier, is a schemaless URL that provides a host and path that can be used to resolve the remote. By dispatching on this common identifier, we should be able to support almost any protocol and discovery mechanism imaginable. When this is more solidified, we can roll these up into higher-level operations that can be orchestrated through the `dist` tool or via GRPC. What a time to be alive! Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-01-20 03:03:44 +00:00
package main
import (
contextpkg "context"
"encoding/json"
"fmt"
"io"
"io/ioutil"
"net/http"
"net/url"
"os"
"path"
cmd/dist, remotes: simplify resolution flow After receiving feedback during containerd summit walk through of the pull POC, we found that the resolution flow for names was out of place. We could see this present in awkward places where we were trying to re-resolve whether something was a digest or a tag and extra retries to various endpoints. By centering this problem around, "what do we write in the metadata store?", the following interface comes about: ``` Resolve(ctx context.Context, ref string) (name string, desc ocispec.Descriptor, fetcher Fetcher, err error) ``` The above takes an "opaque" reference (we'll get to this later) and returns the canonical name for the object, a content description of the object and a `Fetcher` that can be used to retrieve the object and its child resources. We can write `name` into the metadata store, pointing at the descriptor. Descisions about discovery, trust, provenance, distribution are completely abstracted away from the pulling code. A first response to such a monstrosity is "that is a lot of return arguments". When we look at the actual, we can see that in practice, the usage pattern works well, albeit we don't quite demonstrate the utility of `name`, which will be more apparent later. Designs that allowed separate resolution of the `Fetcher` and the return of a collected object were considered. Let's give this a chance before we go refactoring this further. With this change, we introduce a reference package with helps for remotes to decompose "docker-esque" references into consituent components, without arbitrarily enforcing those opinions on the backend. Utlimately, the name and the reference used to qualify that name are completely opaque to containerd. Obviously, implementors will need to show some candor in following some conventions, but the possibilities are fairly wide. Structurally, we still maintain the concept of the locator and object but the interpretation is up to the resolver. For the most part, the `dist` tool operates exactly the same, except objects can be fetched with a reference: ``` dist fetch docker.io/library/redis:latest ``` The above should work well with a running containerd instance. I recommend giving this a try with `fetch-object`, as well. With `fetch-object`, it is easy for one to better understand the intricacies of the OCI/Docker image formats. Ultimately, this serves the main purpose of the elusive "metadata store". Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-03-08 00:51:08 +00:00
"strconv"
cmd/dist: POC implementation of dist fetch With this changeset we introduce several new things. The first is the top-level dist command. This is a toolkit that implements various distribution primitives, such as fetching, unpacking and ingesting. The first component to this is a simple `fetch` command. It is a low-level command that takes a "remote", identified by a `locator`, and an object identifier. Keyed by the locator, this tool can identify a remote implementation to fetch the content and write it back to standard out. By allowing this to be the unit of pluggability in fetching content, we can have quite a bit of flexibility in how we retrieve images. The current `fetch` implementation provides anonymous access to docker hub images, through the namespace `docker.io`. As an example, one can fetch the manifest for `redis` with the following command: ``` $ ./dist fetch docker.io/library/redis latest mediatype:application/vnd.docker.distribution.manifest.v2+json ``` Note that we have provided a mediatype "hint", nudging the fetch implementation to grab the correct endpoint. We can hash the output of that to fetch the same content by digest: ``` $ ./dist fetch docker.io/library/redis sha256:$(./dist fetch docker.io/library/redis latest mediatype:application/vnd.docker.distribution.manifest.v2+json | shasum -a256) ``` Note that the hint is now elided, since we have affixed the content to a particular hash. If you are not yet entertained, let's bring `jq` and `xargs` into the mix for maximum fun. The following incantation fetches the same manifest and downloads all layers into the convenience of `/dev/null`: ``` $ ./dist fetch docker.io/library/redis sha256:a027a470aa2b9b41cc2539847a97b8a14794ebd0a4c7c5d64e390df6bde56c73 | jq -r '.layers[] | .digest' | xargs -n1 -P10 ./dist fetch docker.io/library/redis > /dev/null ``` This is just the beginning. We should be able to centralize configuration around fetch to implement a number of distribution methodologies that have been challenging or impossible up to this point. The `locator`, mentioned earlier, is a schemaless URL that provides a host and path that can be used to resolve the remote. By dispatching on this common identifier, we should be able to support almost any protocol and discovery mechanism imaginable. When this is more solidified, we can roll these up into higher-level operations that can be orchestrated through the `dist` tool or via GRPC. What a time to be alive! Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-01-20 03:03:44 +00:00
"strings"
"github.com/Sirupsen/logrus"
"github.com/docker/containerd/log"
cmd/dist, remotes: simplify resolution flow After receiving feedback during containerd summit walk through of the pull POC, we found that the resolution flow for names was out of place. We could see this present in awkward places where we were trying to re-resolve whether something was a digest or a tag and extra retries to various endpoints. By centering this problem around, "what do we write in the metadata store?", the following interface comes about: ``` Resolve(ctx context.Context, ref string) (name string, desc ocispec.Descriptor, fetcher Fetcher, err error) ``` The above takes an "opaque" reference (we'll get to this later) and returns the canonical name for the object, a content description of the object and a `Fetcher` that can be used to retrieve the object and its child resources. We can write `name` into the metadata store, pointing at the descriptor. Descisions about discovery, trust, provenance, distribution are completely abstracted away from the pulling code. A first response to such a monstrosity is "that is a lot of return arguments". When we look at the actual, we can see that in practice, the usage pattern works well, albeit we don't quite demonstrate the utility of `name`, which will be more apparent later. Designs that allowed separate resolution of the `Fetcher` and the return of a collected object were considered. Let's give this a chance before we go refactoring this further. With this change, we introduce a reference package with helps for remotes to decompose "docker-esque" references into consituent components, without arbitrarily enforcing those opinions on the backend. Utlimately, the name and the reference used to qualify that name are completely opaque to containerd. Obviously, implementors will need to show some candor in following some conventions, but the possibilities are fairly wide. Structurally, we still maintain the concept of the locator and object but the interpretation is up to the resolver. For the most part, the `dist` tool operates exactly the same, except objects can be fetched with a reference: ``` dist fetch docker.io/library/redis:latest ``` The above should work well with a running containerd instance. I recommend giving this a try with `fetch-object`, as well. With `fetch-object`, it is easy for one to better understand the intricacies of the OCI/Docker image formats. Ultimately, this serves the main purpose of the elusive "metadata store". Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-03-08 00:51:08 +00:00
"github.com/docker/containerd/reference"
cmd/dist: POC implementation of dist fetch With this changeset we introduce several new things. The first is the top-level dist command. This is a toolkit that implements various distribution primitives, such as fetching, unpacking and ingesting. The first component to this is a simple `fetch` command. It is a low-level command that takes a "remote", identified by a `locator`, and an object identifier. Keyed by the locator, this tool can identify a remote implementation to fetch the content and write it back to standard out. By allowing this to be the unit of pluggability in fetching content, we can have quite a bit of flexibility in how we retrieve images. The current `fetch` implementation provides anonymous access to docker hub images, through the namespace `docker.io`. As an example, one can fetch the manifest for `redis` with the following command: ``` $ ./dist fetch docker.io/library/redis latest mediatype:application/vnd.docker.distribution.manifest.v2+json ``` Note that we have provided a mediatype "hint", nudging the fetch implementation to grab the correct endpoint. We can hash the output of that to fetch the same content by digest: ``` $ ./dist fetch docker.io/library/redis sha256:$(./dist fetch docker.io/library/redis latest mediatype:application/vnd.docker.distribution.manifest.v2+json | shasum -a256) ``` Note that the hint is now elided, since we have affixed the content to a particular hash. If you are not yet entertained, let's bring `jq` and `xargs` into the mix for maximum fun. The following incantation fetches the same manifest and downloads all layers into the convenience of `/dev/null`: ``` $ ./dist fetch docker.io/library/redis sha256:a027a470aa2b9b41cc2539847a97b8a14794ebd0a4c7c5d64e390df6bde56c73 | jq -r '.layers[] | .digest' | xargs -n1 -P10 ./dist fetch docker.io/library/redis > /dev/null ``` This is just the beginning. We should be able to centralize configuration around fetch to implement a number of distribution methodologies that have been challenging or impossible up to this point. The `locator`, mentioned earlier, is a schemaless URL that provides a host and path that can be used to resolve the remote. By dispatching on this common identifier, we should be able to support almost any protocol and discovery mechanism imaginable. When this is more solidified, we can roll these up into higher-level operations that can be orchestrated through the `dist` tool or via GRPC. What a time to be alive! Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-01-20 03:03:44 +00:00
"github.com/docker/containerd/remotes"
digest "github.com/opencontainers/go-digest"
ocispec "github.com/opencontainers/image-spec/specs-go/v1"
cmd/dist: POC implementation of dist fetch With this changeset we introduce several new things. The first is the top-level dist command. This is a toolkit that implements various distribution primitives, such as fetching, unpacking and ingesting. The first component to this is a simple `fetch` command. It is a low-level command that takes a "remote", identified by a `locator`, and an object identifier. Keyed by the locator, this tool can identify a remote implementation to fetch the content and write it back to standard out. By allowing this to be the unit of pluggability in fetching content, we can have quite a bit of flexibility in how we retrieve images. The current `fetch` implementation provides anonymous access to docker hub images, through the namespace `docker.io`. As an example, one can fetch the manifest for `redis` with the following command: ``` $ ./dist fetch docker.io/library/redis latest mediatype:application/vnd.docker.distribution.manifest.v2+json ``` Note that we have provided a mediatype "hint", nudging the fetch implementation to grab the correct endpoint. We can hash the output of that to fetch the same content by digest: ``` $ ./dist fetch docker.io/library/redis sha256:$(./dist fetch docker.io/library/redis latest mediatype:application/vnd.docker.distribution.manifest.v2+json | shasum -a256) ``` Note that the hint is now elided, since we have affixed the content to a particular hash. If you are not yet entertained, let's bring `jq` and `xargs` into the mix for maximum fun. The following incantation fetches the same manifest and downloads all layers into the convenience of `/dev/null`: ``` $ ./dist fetch docker.io/library/redis sha256:a027a470aa2b9b41cc2539847a97b8a14794ebd0a4c7c5d64e390df6bde56c73 | jq -r '.layers[] | .digest' | xargs -n1 -P10 ./dist fetch docker.io/library/redis > /dev/null ``` This is just the beginning. We should be able to centralize configuration around fetch to implement a number of distribution methodologies that have been challenging or impossible up to this point. The `locator`, mentioned earlier, is a schemaless URL that provides a host and path that can be used to resolve the remote. By dispatching on this common identifier, we should be able to support almost any protocol and discovery mechanism imaginable. When this is more solidified, we can roll these up into higher-level operations that can be orchestrated through the `dist` tool or via GRPC. What a time to be alive! Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-01-20 03:03:44 +00:00
"github.com/pkg/errors"
"github.com/urfave/cli"
"golang.org/x/net/context/ctxhttp"
)
const (
MediaTypeDockerSchema2Manifest = "application/vnd.docker.distribution.manifest.v2+json"
MediaTypeDockerSchema2ManifestList = "application/vnd.docker.distribution.manifest.list.v2+json"
)
cmd/dist: POC implementation of dist fetch With this changeset we introduce several new things. The first is the top-level dist command. This is a toolkit that implements various distribution primitives, such as fetching, unpacking and ingesting. The first component to this is a simple `fetch` command. It is a low-level command that takes a "remote", identified by a `locator`, and an object identifier. Keyed by the locator, this tool can identify a remote implementation to fetch the content and write it back to standard out. By allowing this to be the unit of pluggability in fetching content, we can have quite a bit of flexibility in how we retrieve images. The current `fetch` implementation provides anonymous access to docker hub images, through the namespace `docker.io`. As an example, one can fetch the manifest for `redis` with the following command: ``` $ ./dist fetch docker.io/library/redis latest mediatype:application/vnd.docker.distribution.manifest.v2+json ``` Note that we have provided a mediatype "hint", nudging the fetch implementation to grab the correct endpoint. We can hash the output of that to fetch the same content by digest: ``` $ ./dist fetch docker.io/library/redis sha256:$(./dist fetch docker.io/library/redis latest mediatype:application/vnd.docker.distribution.manifest.v2+json | shasum -a256) ``` Note that the hint is now elided, since we have affixed the content to a particular hash. If you are not yet entertained, let's bring `jq` and `xargs` into the mix for maximum fun. The following incantation fetches the same manifest and downloads all layers into the convenience of `/dev/null`: ``` $ ./dist fetch docker.io/library/redis sha256:a027a470aa2b9b41cc2539847a97b8a14794ebd0a4c7c5d64e390df6bde56c73 | jq -r '.layers[] | .digest' | xargs -n1 -P10 ./dist fetch docker.io/library/redis > /dev/null ``` This is just the beginning. We should be able to centralize configuration around fetch to implement a number of distribution methodologies that have been challenging or impossible up to this point. The `locator`, mentioned earlier, is a schemaless URL that provides a host and path that can be used to resolve the remote. By dispatching on this common identifier, we should be able to support almost any protocol and discovery mechanism imaginable. When this is more solidified, we can roll these up into higher-level operations that can be orchestrated through the `dist` tool or via GRPC. What a time to be alive! Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-01-20 03:03:44 +00:00
// TODO(stevvooe): Create "multi-fetch" mode that just takes a remote
// then receives object/hint lines on stdin, returning content as
// needed.
var fetchObjectCommand = cli.Command{
Name: "fetch-object",
cmd/dist: POC implementation of dist fetch With this changeset we introduce several new things. The first is the top-level dist command. This is a toolkit that implements various distribution primitives, such as fetching, unpacking and ingesting. The first component to this is a simple `fetch` command. It is a low-level command that takes a "remote", identified by a `locator`, and an object identifier. Keyed by the locator, this tool can identify a remote implementation to fetch the content and write it back to standard out. By allowing this to be the unit of pluggability in fetching content, we can have quite a bit of flexibility in how we retrieve images. The current `fetch` implementation provides anonymous access to docker hub images, through the namespace `docker.io`. As an example, one can fetch the manifest for `redis` with the following command: ``` $ ./dist fetch docker.io/library/redis latest mediatype:application/vnd.docker.distribution.manifest.v2+json ``` Note that we have provided a mediatype "hint", nudging the fetch implementation to grab the correct endpoint. We can hash the output of that to fetch the same content by digest: ``` $ ./dist fetch docker.io/library/redis sha256:$(./dist fetch docker.io/library/redis latest mediatype:application/vnd.docker.distribution.manifest.v2+json | shasum -a256) ``` Note that the hint is now elided, since we have affixed the content to a particular hash. If you are not yet entertained, let's bring `jq` and `xargs` into the mix for maximum fun. The following incantation fetches the same manifest and downloads all layers into the convenience of `/dev/null`: ``` $ ./dist fetch docker.io/library/redis sha256:a027a470aa2b9b41cc2539847a97b8a14794ebd0a4c7c5d64e390df6bde56c73 | jq -r '.layers[] | .digest' | xargs -n1 -P10 ./dist fetch docker.io/library/redis > /dev/null ``` This is just the beginning. We should be able to centralize configuration around fetch to implement a number of distribution methodologies that have been challenging or impossible up to this point. The `locator`, mentioned earlier, is a schemaless URL that provides a host and path that can be used to resolve the remote. By dispatching on this common identifier, we should be able to support almost any protocol and discovery mechanism imaginable. When this is more solidified, we can roll these up into higher-level operations that can be orchestrated through the `dist` tool or via GRPC. What a time to be alive! Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-01-20 03:03:44 +00:00
Usage: "retrieve objects from a remote",
ArgsUsage: "[flags] <remote> <object> [<hint>, ...]",
Description: `Fetch objects by identifier from a remote.`,
Flags: []cli.Flag{
cli.DurationFlag{
Name: "timeout",
Usage: "total timeout for fetch",
EnvVar: "CONTAINERD_FETCH_TIMEOUT",
},
},
Action: func(context *cli.Context) error {
var (
ctx = background
cmd/dist: POC implementation of dist fetch With this changeset we introduce several new things. The first is the top-level dist command. This is a toolkit that implements various distribution primitives, such as fetching, unpacking and ingesting. The first component to this is a simple `fetch` command. It is a low-level command that takes a "remote", identified by a `locator`, and an object identifier. Keyed by the locator, this tool can identify a remote implementation to fetch the content and write it back to standard out. By allowing this to be the unit of pluggability in fetching content, we can have quite a bit of flexibility in how we retrieve images. The current `fetch` implementation provides anonymous access to docker hub images, through the namespace `docker.io`. As an example, one can fetch the manifest for `redis` with the following command: ``` $ ./dist fetch docker.io/library/redis latest mediatype:application/vnd.docker.distribution.manifest.v2+json ``` Note that we have provided a mediatype "hint", nudging the fetch implementation to grab the correct endpoint. We can hash the output of that to fetch the same content by digest: ``` $ ./dist fetch docker.io/library/redis sha256:$(./dist fetch docker.io/library/redis latest mediatype:application/vnd.docker.distribution.manifest.v2+json | shasum -a256) ``` Note that the hint is now elided, since we have affixed the content to a particular hash. If you are not yet entertained, let's bring `jq` and `xargs` into the mix for maximum fun. The following incantation fetches the same manifest and downloads all layers into the convenience of `/dev/null`: ``` $ ./dist fetch docker.io/library/redis sha256:a027a470aa2b9b41cc2539847a97b8a14794ebd0a4c7c5d64e390df6bde56c73 | jq -r '.layers[] | .digest' | xargs -n1 -P10 ./dist fetch docker.io/library/redis > /dev/null ``` This is just the beginning. We should be able to centralize configuration around fetch to implement a number of distribution methodologies that have been challenging or impossible up to this point. The `locator`, mentioned earlier, is a schemaless URL that provides a host and path that can be used to resolve the remote. By dispatching on this common identifier, we should be able to support almost any protocol and discovery mechanism imaginable. When this is more solidified, we can roll these up into higher-level operations that can be orchestrated through the `dist` tool or via GRPC. What a time to be alive! Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-01-20 03:03:44 +00:00
timeout = context.Duration("timeout")
cmd/dist, remotes: simplify resolution flow After receiving feedback during containerd summit walk through of the pull POC, we found that the resolution flow for names was out of place. We could see this present in awkward places where we were trying to re-resolve whether something was a digest or a tag and extra retries to various endpoints. By centering this problem around, "what do we write in the metadata store?", the following interface comes about: ``` Resolve(ctx context.Context, ref string) (name string, desc ocispec.Descriptor, fetcher Fetcher, err error) ``` The above takes an "opaque" reference (we'll get to this later) and returns the canonical name for the object, a content description of the object and a `Fetcher` that can be used to retrieve the object and its child resources. We can write `name` into the metadata store, pointing at the descriptor. Descisions about discovery, trust, provenance, distribution are completely abstracted away from the pulling code. A first response to such a monstrosity is "that is a lot of return arguments". When we look at the actual, we can see that in practice, the usage pattern works well, albeit we don't quite demonstrate the utility of `name`, which will be more apparent later. Designs that allowed separate resolution of the `Fetcher` and the return of a collected object were considered. Let's give this a chance before we go refactoring this further. With this change, we introduce a reference package with helps for remotes to decompose "docker-esque" references into consituent components, without arbitrarily enforcing those opinions on the backend. Utlimately, the name and the reference used to qualify that name are completely opaque to containerd. Obviously, implementors will need to show some candor in following some conventions, but the possibilities are fairly wide. Structurally, we still maintain the concept of the locator and object but the interpretation is up to the resolver. For the most part, the `dist` tool operates exactly the same, except objects can be fetched with a reference: ``` dist fetch docker.io/library/redis:latest ``` The above should work well with a running containerd instance. I recommend giving this a try with `fetch-object`, as well. With `fetch-object`, it is easy for one to better understand the intricacies of the OCI/Docker image formats. Ultimately, this serves the main purpose of the elusive "metadata store". Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-03-08 00:51:08 +00:00
ref = context.Args().First()
cmd/dist: POC implementation of dist fetch With this changeset we introduce several new things. The first is the top-level dist command. This is a toolkit that implements various distribution primitives, such as fetching, unpacking and ingesting. The first component to this is a simple `fetch` command. It is a low-level command that takes a "remote", identified by a `locator`, and an object identifier. Keyed by the locator, this tool can identify a remote implementation to fetch the content and write it back to standard out. By allowing this to be the unit of pluggability in fetching content, we can have quite a bit of flexibility in how we retrieve images. The current `fetch` implementation provides anonymous access to docker hub images, through the namespace `docker.io`. As an example, one can fetch the manifest for `redis` with the following command: ``` $ ./dist fetch docker.io/library/redis latest mediatype:application/vnd.docker.distribution.manifest.v2+json ``` Note that we have provided a mediatype "hint", nudging the fetch implementation to grab the correct endpoint. We can hash the output of that to fetch the same content by digest: ``` $ ./dist fetch docker.io/library/redis sha256:$(./dist fetch docker.io/library/redis latest mediatype:application/vnd.docker.distribution.manifest.v2+json | shasum -a256) ``` Note that the hint is now elided, since we have affixed the content to a particular hash. If you are not yet entertained, let's bring `jq` and `xargs` into the mix for maximum fun. The following incantation fetches the same manifest and downloads all layers into the convenience of `/dev/null`: ``` $ ./dist fetch docker.io/library/redis sha256:a027a470aa2b9b41cc2539847a97b8a14794ebd0a4c7c5d64e390df6bde56c73 | jq -r '.layers[] | .digest' | xargs -n1 -P10 ./dist fetch docker.io/library/redis > /dev/null ``` This is just the beginning. We should be able to centralize configuration around fetch to implement a number of distribution methodologies that have been challenging or impossible up to this point. The `locator`, mentioned earlier, is a schemaless URL that provides a host and path that can be used to resolve the remote. By dispatching on this common identifier, we should be able to support almost any protocol and discovery mechanism imaginable. When this is more solidified, we can roll these up into higher-level operations that can be orchestrated through the `dist` tool or via GRPC. What a time to be alive! Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-01-20 03:03:44 +00:00
)
if timeout > 0 {
var cancel func()
ctx, cancel = contextpkg.WithTimeout(ctx, timeout)
defer cancel()
}
resolver, err := getResolver(ctx)
if err != nil {
return err
}
cmd/dist, remotes: simplify resolution flow After receiving feedback during containerd summit walk through of the pull POC, we found that the resolution flow for names was out of place. We could see this present in awkward places where we were trying to re-resolve whether something was a digest or a tag and extra retries to various endpoints. By centering this problem around, "what do we write in the metadata store?", the following interface comes about: ``` Resolve(ctx context.Context, ref string) (name string, desc ocispec.Descriptor, fetcher Fetcher, err error) ``` The above takes an "opaque" reference (we'll get to this later) and returns the canonical name for the object, a content description of the object and a `Fetcher` that can be used to retrieve the object and its child resources. We can write `name` into the metadata store, pointing at the descriptor. Descisions about discovery, trust, provenance, distribution are completely abstracted away from the pulling code. A first response to such a monstrosity is "that is a lot of return arguments". When we look at the actual, we can see that in practice, the usage pattern works well, albeit we don't quite demonstrate the utility of `name`, which will be more apparent later. Designs that allowed separate resolution of the `Fetcher` and the return of a collected object were considered. Let's give this a chance before we go refactoring this further. With this change, we introduce a reference package with helps for remotes to decompose "docker-esque" references into consituent components, without arbitrarily enforcing those opinions on the backend. Utlimately, the name and the reference used to qualify that name are completely opaque to containerd. Obviously, implementors will need to show some candor in following some conventions, but the possibilities are fairly wide. Structurally, we still maintain the concept of the locator and object but the interpretation is up to the resolver. For the most part, the `dist` tool operates exactly the same, except objects can be fetched with a reference: ``` dist fetch docker.io/library/redis:latest ``` The above should work well with a running containerd instance. I recommend giving this a try with `fetch-object`, as well. With `fetch-object`, it is easy for one to better understand the intricacies of the OCI/Docker image formats. Ultimately, this serves the main purpose of the elusive "metadata store". Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-03-08 00:51:08 +00:00
ctx = log.WithLogger(ctx, log.G(ctx).WithField("ref", ref))
log.G(ctx).Infof("resolving")
_, desc, fetcher, err := resolver.Resolve(ctx, ref)
cmd/dist: POC implementation of dist fetch With this changeset we introduce several new things. The first is the top-level dist command. This is a toolkit that implements various distribution primitives, such as fetching, unpacking and ingesting. The first component to this is a simple `fetch` command. It is a low-level command that takes a "remote", identified by a `locator`, and an object identifier. Keyed by the locator, this tool can identify a remote implementation to fetch the content and write it back to standard out. By allowing this to be the unit of pluggability in fetching content, we can have quite a bit of flexibility in how we retrieve images. The current `fetch` implementation provides anonymous access to docker hub images, through the namespace `docker.io`. As an example, one can fetch the manifest for `redis` with the following command: ``` $ ./dist fetch docker.io/library/redis latest mediatype:application/vnd.docker.distribution.manifest.v2+json ``` Note that we have provided a mediatype "hint", nudging the fetch implementation to grab the correct endpoint. We can hash the output of that to fetch the same content by digest: ``` $ ./dist fetch docker.io/library/redis sha256:$(./dist fetch docker.io/library/redis latest mediatype:application/vnd.docker.distribution.manifest.v2+json | shasum -a256) ``` Note that the hint is now elided, since we have affixed the content to a particular hash. If you are not yet entertained, let's bring `jq` and `xargs` into the mix for maximum fun. The following incantation fetches the same manifest and downloads all layers into the convenience of `/dev/null`: ``` $ ./dist fetch docker.io/library/redis sha256:a027a470aa2b9b41cc2539847a97b8a14794ebd0a4c7c5d64e390df6bde56c73 | jq -r '.layers[] | .digest' | xargs -n1 -P10 ./dist fetch docker.io/library/redis > /dev/null ``` This is just the beginning. We should be able to centralize configuration around fetch to implement a number of distribution methodologies that have been challenging or impossible up to this point. The `locator`, mentioned earlier, is a schemaless URL that provides a host and path that can be used to resolve the remote. By dispatching on this common identifier, we should be able to support almost any protocol and discovery mechanism imaginable. When this is more solidified, we can roll these up into higher-level operations that can be orchestrated through the `dist` tool or via GRPC. What a time to be alive! Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-01-20 03:03:44 +00:00
if err != nil {
return err
}
log.G(ctx).Infof("fetching")
cmd/dist, remotes: simplify resolution flow After receiving feedback during containerd summit walk through of the pull POC, we found that the resolution flow for names was out of place. We could see this present in awkward places where we were trying to re-resolve whether something was a digest or a tag and extra retries to various endpoints. By centering this problem around, "what do we write in the metadata store?", the following interface comes about: ``` Resolve(ctx context.Context, ref string) (name string, desc ocispec.Descriptor, fetcher Fetcher, err error) ``` The above takes an "opaque" reference (we'll get to this later) and returns the canonical name for the object, a content description of the object and a `Fetcher` that can be used to retrieve the object and its child resources. We can write `name` into the metadata store, pointing at the descriptor. Descisions about discovery, trust, provenance, distribution are completely abstracted away from the pulling code. A first response to such a monstrosity is "that is a lot of return arguments". When we look at the actual, we can see that in practice, the usage pattern works well, albeit we don't quite demonstrate the utility of `name`, which will be more apparent later. Designs that allowed separate resolution of the `Fetcher` and the return of a collected object were considered. Let's give this a chance before we go refactoring this further. With this change, we introduce a reference package with helps for remotes to decompose "docker-esque" references into consituent components, without arbitrarily enforcing those opinions on the backend. Utlimately, the name and the reference used to qualify that name are completely opaque to containerd. Obviously, implementors will need to show some candor in following some conventions, but the possibilities are fairly wide. Structurally, we still maintain the concept of the locator and object but the interpretation is up to the resolver. For the most part, the `dist` tool operates exactly the same, except objects can be fetched with a reference: ``` dist fetch docker.io/library/redis:latest ``` The above should work well with a running containerd instance. I recommend giving this a try with `fetch-object`, as well. With `fetch-object`, it is easy for one to better understand the intricacies of the OCI/Docker image formats. Ultimately, this serves the main purpose of the elusive "metadata store". Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-03-08 00:51:08 +00:00
rc, err := fetcher.Fetch(ctx, desc)
cmd/dist: POC implementation of dist fetch With this changeset we introduce several new things. The first is the top-level dist command. This is a toolkit that implements various distribution primitives, such as fetching, unpacking and ingesting. The first component to this is a simple `fetch` command. It is a low-level command that takes a "remote", identified by a `locator`, and an object identifier. Keyed by the locator, this tool can identify a remote implementation to fetch the content and write it back to standard out. By allowing this to be the unit of pluggability in fetching content, we can have quite a bit of flexibility in how we retrieve images. The current `fetch` implementation provides anonymous access to docker hub images, through the namespace `docker.io`. As an example, one can fetch the manifest for `redis` with the following command: ``` $ ./dist fetch docker.io/library/redis latest mediatype:application/vnd.docker.distribution.manifest.v2+json ``` Note that we have provided a mediatype "hint", nudging the fetch implementation to grab the correct endpoint. We can hash the output of that to fetch the same content by digest: ``` $ ./dist fetch docker.io/library/redis sha256:$(./dist fetch docker.io/library/redis latest mediatype:application/vnd.docker.distribution.manifest.v2+json | shasum -a256) ``` Note that the hint is now elided, since we have affixed the content to a particular hash. If you are not yet entertained, let's bring `jq` and `xargs` into the mix for maximum fun. The following incantation fetches the same manifest and downloads all layers into the convenience of `/dev/null`: ``` $ ./dist fetch docker.io/library/redis sha256:a027a470aa2b9b41cc2539847a97b8a14794ebd0a4c7c5d64e390df6bde56c73 | jq -r '.layers[] | .digest' | xargs -n1 -P10 ./dist fetch docker.io/library/redis > /dev/null ``` This is just the beginning. We should be able to centralize configuration around fetch to implement a number of distribution methodologies that have been challenging or impossible up to this point. The `locator`, mentioned earlier, is a schemaless URL that provides a host and path that can be used to resolve the remote. By dispatching on this common identifier, we should be able to support almost any protocol and discovery mechanism imaginable. When this is more solidified, we can roll these up into higher-level operations that can be orchestrated through the `dist` tool or via GRPC. What a time to be alive! Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-01-20 03:03:44 +00:00
if err != nil {
return err
}
defer rc.Close()
if _, err := io.Copy(os.Stdout, rc); err != nil {
return err
}
return nil
},
}
cmd/dist, remotes: simplify resolution flow After receiving feedback during containerd summit walk through of the pull POC, we found that the resolution flow for names was out of place. We could see this present in awkward places where we were trying to re-resolve whether something was a digest or a tag and extra retries to various endpoints. By centering this problem around, "what do we write in the metadata store?", the following interface comes about: ``` Resolve(ctx context.Context, ref string) (name string, desc ocispec.Descriptor, fetcher Fetcher, err error) ``` The above takes an "opaque" reference (we'll get to this later) and returns the canonical name for the object, a content description of the object and a `Fetcher` that can be used to retrieve the object and its child resources. We can write `name` into the metadata store, pointing at the descriptor. Descisions about discovery, trust, provenance, distribution are completely abstracted away from the pulling code. A first response to such a monstrosity is "that is a lot of return arguments". When we look at the actual, we can see that in practice, the usage pattern works well, albeit we don't quite demonstrate the utility of `name`, which will be more apparent later. Designs that allowed separate resolution of the `Fetcher` and the return of a collected object were considered. Let's give this a chance before we go refactoring this further. With this change, we introduce a reference package with helps for remotes to decompose "docker-esque" references into consituent components, without arbitrarily enforcing those opinions on the backend. Utlimately, the name and the reference used to qualify that name are completely opaque to containerd. Obviously, implementors will need to show some candor in following some conventions, but the possibilities are fairly wide. Structurally, we still maintain the concept of the locator and object but the interpretation is up to the resolver. For the most part, the `dist` tool operates exactly the same, except objects can be fetched with a reference: ``` dist fetch docker.io/library/redis:latest ``` The above should work well with a running containerd instance. I recommend giving this a try with `fetch-object`, as well. With `fetch-object`, it is easy for one to better understand the intricacies of the OCI/Docker image formats. Ultimately, this serves the main purpose of the elusive "metadata store". Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-03-08 00:51:08 +00:00
// getResolver prepares the resolver from the environment and options.
func getResolver(ctx contextpkg.Context) (remotes.Resolver, error) {
return newDockerResolver(), nil
}
cmd/dist: POC implementation of dist fetch With this changeset we introduce several new things. The first is the top-level dist command. This is a toolkit that implements various distribution primitives, such as fetching, unpacking and ingesting. The first component to this is a simple `fetch` command. It is a low-level command that takes a "remote", identified by a `locator`, and an object identifier. Keyed by the locator, this tool can identify a remote implementation to fetch the content and write it back to standard out. By allowing this to be the unit of pluggability in fetching content, we can have quite a bit of flexibility in how we retrieve images. The current `fetch` implementation provides anonymous access to docker hub images, through the namespace `docker.io`. As an example, one can fetch the manifest for `redis` with the following command: ``` $ ./dist fetch docker.io/library/redis latest mediatype:application/vnd.docker.distribution.manifest.v2+json ``` Note that we have provided a mediatype "hint", nudging the fetch implementation to grab the correct endpoint. We can hash the output of that to fetch the same content by digest: ``` $ ./dist fetch docker.io/library/redis sha256:$(./dist fetch docker.io/library/redis latest mediatype:application/vnd.docker.distribution.manifest.v2+json | shasum -a256) ``` Note that the hint is now elided, since we have affixed the content to a particular hash. If you are not yet entertained, let's bring `jq` and `xargs` into the mix for maximum fun. The following incantation fetches the same manifest and downloads all layers into the convenience of `/dev/null`: ``` $ ./dist fetch docker.io/library/redis sha256:a027a470aa2b9b41cc2539847a97b8a14794ebd0a4c7c5d64e390df6bde56c73 | jq -r '.layers[] | .digest' | xargs -n1 -P10 ./dist fetch docker.io/library/redis > /dev/null ``` This is just the beginning. We should be able to centralize configuration around fetch to implement a number of distribution methodologies that have been challenging or impossible up to this point. The `locator`, mentioned earlier, is a schemaless URL that provides a host and path that can be used to resolve the remote. By dispatching on this common identifier, we should be able to support almost any protocol and discovery mechanism imaginable. When this is more solidified, we can roll these up into higher-level operations that can be orchestrated through the `dist` tool or via GRPC. What a time to be alive! Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-01-20 03:03:44 +00:00
// NOTE(stevvooe): Most of the code below this point is prototype code to
// demonstrate a very simplified docker.io fetcher. We have a lot of hard coded
// values but we leave many of the details down to the fetcher, creating a lot
// of room for ways to fetch content.
cmd/dist, remotes: simplify resolution flow After receiving feedback during containerd summit walk through of the pull POC, we found that the resolution flow for names was out of place. We could see this present in awkward places where we were trying to re-resolve whether something was a digest or a tag and extra retries to various endpoints. By centering this problem around, "what do we write in the metadata store?", the following interface comes about: ``` Resolve(ctx context.Context, ref string) (name string, desc ocispec.Descriptor, fetcher Fetcher, err error) ``` The above takes an "opaque" reference (we'll get to this later) and returns the canonical name for the object, a content description of the object and a `Fetcher` that can be used to retrieve the object and its child resources. We can write `name` into the metadata store, pointing at the descriptor. Descisions about discovery, trust, provenance, distribution are completely abstracted away from the pulling code. A first response to such a monstrosity is "that is a lot of return arguments". When we look at the actual, we can see that in practice, the usage pattern works well, albeit we don't quite demonstrate the utility of `name`, which will be more apparent later. Designs that allowed separate resolution of the `Fetcher` and the return of a collected object were considered. Let's give this a chance before we go refactoring this further. With this change, we introduce a reference package with helps for remotes to decompose "docker-esque" references into consituent components, without arbitrarily enforcing those opinions on the backend. Utlimately, the name and the reference used to qualify that name are completely opaque to containerd. Obviously, implementors will need to show some candor in following some conventions, but the possibilities are fairly wide. Structurally, we still maintain the concept of the locator and object but the interpretation is up to the resolver. For the most part, the `dist` tool operates exactly the same, except objects can be fetched with a reference: ``` dist fetch docker.io/library/redis:latest ``` The above should work well with a running containerd instance. I recommend giving this a try with `fetch-object`, as well. With `fetch-object`, it is easy for one to better understand the intricacies of the OCI/Docker image formats. Ultimately, this serves the main purpose of the elusive "metadata store". Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-03-08 00:51:08 +00:00
type dockerFetcher struct {
base url.URL
token string
}
cmd/dist: POC implementation of dist fetch With this changeset we introduce several new things. The first is the top-level dist command. This is a toolkit that implements various distribution primitives, such as fetching, unpacking and ingesting. The first component to this is a simple `fetch` command. It is a low-level command that takes a "remote", identified by a `locator`, and an object identifier. Keyed by the locator, this tool can identify a remote implementation to fetch the content and write it back to standard out. By allowing this to be the unit of pluggability in fetching content, we can have quite a bit of flexibility in how we retrieve images. The current `fetch` implementation provides anonymous access to docker hub images, through the namespace `docker.io`. As an example, one can fetch the manifest for `redis` with the following command: ``` $ ./dist fetch docker.io/library/redis latest mediatype:application/vnd.docker.distribution.manifest.v2+json ``` Note that we have provided a mediatype "hint", nudging the fetch implementation to grab the correct endpoint. We can hash the output of that to fetch the same content by digest: ``` $ ./dist fetch docker.io/library/redis sha256:$(./dist fetch docker.io/library/redis latest mediatype:application/vnd.docker.distribution.manifest.v2+json | shasum -a256) ``` Note that the hint is now elided, since we have affixed the content to a particular hash. If you are not yet entertained, let's bring `jq` and `xargs` into the mix for maximum fun. The following incantation fetches the same manifest and downloads all layers into the convenience of `/dev/null`: ``` $ ./dist fetch docker.io/library/redis sha256:a027a470aa2b9b41cc2539847a97b8a14794ebd0a4c7c5d64e390df6bde56c73 | jq -r '.layers[] | .digest' | xargs -n1 -P10 ./dist fetch docker.io/library/redis > /dev/null ``` This is just the beginning. We should be able to centralize configuration around fetch to implement a number of distribution methodologies that have been challenging or impossible up to this point. The `locator`, mentioned earlier, is a schemaless URL that provides a host and path that can be used to resolve the remote. By dispatching on this common identifier, we should be able to support almost any protocol and discovery mechanism imaginable. When this is more solidified, we can roll these up into higher-level operations that can be orchestrated through the `dist` tool or via GRPC. What a time to be alive! Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-01-20 03:03:44 +00:00
cmd/dist, remotes: simplify resolution flow After receiving feedback during containerd summit walk through of the pull POC, we found that the resolution flow for names was out of place. We could see this present in awkward places where we were trying to re-resolve whether something was a digest or a tag and extra retries to various endpoints. By centering this problem around, "what do we write in the metadata store?", the following interface comes about: ``` Resolve(ctx context.Context, ref string) (name string, desc ocispec.Descriptor, fetcher Fetcher, err error) ``` The above takes an "opaque" reference (we'll get to this later) and returns the canonical name for the object, a content description of the object and a `Fetcher` that can be used to retrieve the object and its child resources. We can write `name` into the metadata store, pointing at the descriptor. Descisions about discovery, trust, provenance, distribution are completely abstracted away from the pulling code. A first response to such a monstrosity is "that is a lot of return arguments". When we look at the actual, we can see that in practice, the usage pattern works well, albeit we don't quite demonstrate the utility of `name`, which will be more apparent later. Designs that allowed separate resolution of the `Fetcher` and the return of a collected object were considered. Let's give this a chance before we go refactoring this further. With this change, we introduce a reference package with helps for remotes to decompose "docker-esque" references into consituent components, without arbitrarily enforcing those opinions on the backend. Utlimately, the name and the reference used to qualify that name are completely opaque to containerd. Obviously, implementors will need to show some candor in following some conventions, but the possibilities are fairly wide. Structurally, we still maintain the concept of the locator and object but the interpretation is up to the resolver. For the most part, the `dist` tool operates exactly the same, except objects can be fetched with a reference: ``` dist fetch docker.io/library/redis:latest ``` The above should work well with a running containerd instance. I recommend giving this a try with `fetch-object`, as well. With `fetch-object`, it is easy for one to better understand the intricacies of the OCI/Docker image formats. Ultimately, this serves the main purpose of the elusive "metadata store". Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-03-08 00:51:08 +00:00
func (r *dockerFetcher) url(ps ...string) string {
url := r.base
url.Path = path.Join(url.Path, path.Join(ps...))
return url.String()
}
func (r *dockerFetcher) Fetch(ctx contextpkg.Context, desc ocispec.Descriptor) (io.ReadCloser, error) {
ctx = log.WithLogger(ctx, log.G(ctx).WithFields(
logrus.Fields{
"base": r.base.String(),
"digest": desc.Digest,
},
))
paths, err := getV2URLPaths(desc)
if err != nil {
return nil, err
}
for _, path := range paths {
u := r.url(path)
cmd/dist: POC implementation of dist fetch With this changeset we introduce several new things. The first is the top-level dist command. This is a toolkit that implements various distribution primitives, such as fetching, unpacking and ingesting. The first component to this is a simple `fetch` command. It is a low-level command that takes a "remote", identified by a `locator`, and an object identifier. Keyed by the locator, this tool can identify a remote implementation to fetch the content and write it back to standard out. By allowing this to be the unit of pluggability in fetching content, we can have quite a bit of flexibility in how we retrieve images. The current `fetch` implementation provides anonymous access to docker hub images, through the namespace `docker.io`. As an example, one can fetch the manifest for `redis` with the following command: ``` $ ./dist fetch docker.io/library/redis latest mediatype:application/vnd.docker.distribution.manifest.v2+json ``` Note that we have provided a mediatype "hint", nudging the fetch implementation to grab the correct endpoint. We can hash the output of that to fetch the same content by digest: ``` $ ./dist fetch docker.io/library/redis sha256:$(./dist fetch docker.io/library/redis latest mediatype:application/vnd.docker.distribution.manifest.v2+json | shasum -a256) ``` Note that the hint is now elided, since we have affixed the content to a particular hash. If you are not yet entertained, let's bring `jq` and `xargs` into the mix for maximum fun. The following incantation fetches the same manifest and downloads all layers into the convenience of `/dev/null`: ``` $ ./dist fetch docker.io/library/redis sha256:a027a470aa2b9b41cc2539847a97b8a14794ebd0a4c7c5d64e390df6bde56c73 | jq -r '.layers[] | .digest' | xargs -n1 -P10 ./dist fetch docker.io/library/redis > /dev/null ``` This is just the beginning. We should be able to centralize configuration around fetch to implement a number of distribution methodologies that have been challenging or impossible up to this point. The `locator`, mentioned earlier, is a schemaless URL that provides a host and path that can be used to resolve the remote. By dispatching on this common identifier, we should be able to support almost any protocol and discovery mechanism imaginable. When this is more solidified, we can roll these up into higher-level operations that can be orchestrated through the `dist` tool or via GRPC. What a time to be alive! Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-01-20 03:03:44 +00:00
cmd/dist, remotes: simplify resolution flow After receiving feedback during containerd summit walk through of the pull POC, we found that the resolution flow for names was out of place. We could see this present in awkward places where we were trying to re-resolve whether something was a digest or a tag and extra retries to various endpoints. By centering this problem around, "what do we write in the metadata store?", the following interface comes about: ``` Resolve(ctx context.Context, ref string) (name string, desc ocispec.Descriptor, fetcher Fetcher, err error) ``` The above takes an "opaque" reference (we'll get to this later) and returns the canonical name for the object, a content description of the object and a `Fetcher` that can be used to retrieve the object and its child resources. We can write `name` into the metadata store, pointing at the descriptor. Descisions about discovery, trust, provenance, distribution are completely abstracted away from the pulling code. A first response to such a monstrosity is "that is a lot of return arguments". When we look at the actual, we can see that in practice, the usage pattern works well, albeit we don't quite demonstrate the utility of `name`, which will be more apparent later. Designs that allowed separate resolution of the `Fetcher` and the return of a collected object were considered. Let's give this a chance before we go refactoring this further. With this change, we introduce a reference package with helps for remotes to decompose "docker-esque" references into consituent components, without arbitrarily enforcing those opinions on the backend. Utlimately, the name and the reference used to qualify that name are completely opaque to containerd. Obviously, implementors will need to show some candor in following some conventions, but the possibilities are fairly wide. Structurally, we still maintain the concept of the locator and object but the interpretation is up to the resolver. For the most part, the `dist` tool operates exactly the same, except objects can be fetched with a reference: ``` dist fetch docker.io/library/redis:latest ``` The above should work well with a running containerd instance. I recommend giving this a try with `fetch-object`, as well. With `fetch-object`, it is easy for one to better understand the intricacies of the OCI/Docker image formats. Ultimately, this serves the main purpose of the elusive "metadata store". Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-03-08 00:51:08 +00:00
req, err := http.NewRequest(http.MethodGet, u, nil)
if err != nil {
return nil, err
}
cmd/dist, remotes: simplify resolution flow After receiving feedback during containerd summit walk through of the pull POC, we found that the resolution flow for names was out of place. We could see this present in awkward places where we were trying to re-resolve whether something was a digest or a tag and extra retries to various endpoints. By centering this problem around, "what do we write in the metadata store?", the following interface comes about: ``` Resolve(ctx context.Context, ref string) (name string, desc ocispec.Descriptor, fetcher Fetcher, err error) ``` The above takes an "opaque" reference (we'll get to this later) and returns the canonical name for the object, a content description of the object and a `Fetcher` that can be used to retrieve the object and its child resources. We can write `name` into the metadata store, pointing at the descriptor. Descisions about discovery, trust, provenance, distribution are completely abstracted away from the pulling code. A first response to such a monstrosity is "that is a lot of return arguments". When we look at the actual, we can see that in practice, the usage pattern works well, albeit we don't quite demonstrate the utility of `name`, which will be more apparent later. Designs that allowed separate resolution of the `Fetcher` and the return of a collected object were considered. Let's give this a chance before we go refactoring this further. With this change, we introduce a reference package with helps for remotes to decompose "docker-esque" references into consituent components, without arbitrarily enforcing those opinions on the backend. Utlimately, the name and the reference used to qualify that name are completely opaque to containerd. Obviously, implementors will need to show some candor in following some conventions, but the possibilities are fairly wide. Structurally, we still maintain the concept of the locator and object but the interpretation is up to the resolver. For the most part, the `dist` tool operates exactly the same, except objects can be fetched with a reference: ``` dist fetch docker.io/library/redis:latest ``` The above should work well with a running containerd instance. I recommend giving this a try with `fetch-object`, as well. With `fetch-object`, it is easy for one to better understand the intricacies of the OCI/Docker image formats. Ultimately, this serves the main purpose of the elusive "metadata store". Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-03-08 00:51:08 +00:00
req.Header.Set("Accept", strings.Join([]string{desc.MediaType, `*`}, ", "))
resp, err := r.doRequest(ctx, req)
cmd/dist: POC implementation of dist fetch With this changeset we introduce several new things. The first is the top-level dist command. This is a toolkit that implements various distribution primitives, such as fetching, unpacking and ingesting. The first component to this is a simple `fetch` command. It is a low-level command that takes a "remote", identified by a `locator`, and an object identifier. Keyed by the locator, this tool can identify a remote implementation to fetch the content and write it back to standard out. By allowing this to be the unit of pluggability in fetching content, we can have quite a bit of flexibility in how we retrieve images. The current `fetch` implementation provides anonymous access to docker hub images, through the namespace `docker.io`. As an example, one can fetch the manifest for `redis` with the following command: ``` $ ./dist fetch docker.io/library/redis latest mediatype:application/vnd.docker.distribution.manifest.v2+json ``` Note that we have provided a mediatype "hint", nudging the fetch implementation to grab the correct endpoint. We can hash the output of that to fetch the same content by digest: ``` $ ./dist fetch docker.io/library/redis sha256:$(./dist fetch docker.io/library/redis latest mediatype:application/vnd.docker.distribution.manifest.v2+json | shasum -a256) ``` Note that the hint is now elided, since we have affixed the content to a particular hash. If you are not yet entertained, let's bring `jq` and `xargs` into the mix for maximum fun. The following incantation fetches the same manifest and downloads all layers into the convenience of `/dev/null`: ``` $ ./dist fetch docker.io/library/redis sha256:a027a470aa2b9b41cc2539847a97b8a14794ebd0a4c7c5d64e390df6bde56c73 | jq -r '.layers[] | .digest' | xargs -n1 -P10 ./dist fetch docker.io/library/redis > /dev/null ``` This is just the beginning. We should be able to centralize configuration around fetch to implement a number of distribution methodologies that have been challenging or impossible up to this point. The `locator`, mentioned earlier, is a schemaless URL that provides a host and path that can be used to resolve the remote. By dispatching on this common identifier, we should be able to support almost any protocol and discovery mechanism imaginable. When this is more solidified, we can roll these up into higher-level operations that can be orchestrated through the `dist` tool or via GRPC. What a time to be alive! Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-01-20 03:03:44 +00:00
if err != nil {
return nil, err
}
cmd/dist, remotes: simplify resolution flow After receiving feedback during containerd summit walk through of the pull POC, we found that the resolution flow for names was out of place. We could see this present in awkward places where we were trying to re-resolve whether something was a digest or a tag and extra retries to various endpoints. By centering this problem around, "what do we write in the metadata store?", the following interface comes about: ``` Resolve(ctx context.Context, ref string) (name string, desc ocispec.Descriptor, fetcher Fetcher, err error) ``` The above takes an "opaque" reference (we'll get to this later) and returns the canonical name for the object, a content description of the object and a `Fetcher` that can be used to retrieve the object and its child resources. We can write `name` into the metadata store, pointing at the descriptor. Descisions about discovery, trust, provenance, distribution are completely abstracted away from the pulling code. A first response to such a monstrosity is "that is a lot of return arguments". When we look at the actual, we can see that in practice, the usage pattern works well, albeit we don't quite demonstrate the utility of `name`, which will be more apparent later. Designs that allowed separate resolution of the `Fetcher` and the return of a collected object were considered. Let's give this a chance before we go refactoring this further. With this change, we introduce a reference package with helps for remotes to decompose "docker-esque" references into consituent components, without arbitrarily enforcing those opinions on the backend. Utlimately, the name and the reference used to qualify that name are completely opaque to containerd. Obviously, implementors will need to show some candor in following some conventions, but the possibilities are fairly wide. Structurally, we still maintain the concept of the locator and object but the interpretation is up to the resolver. For the most part, the `dist` tool operates exactly the same, except objects can be fetched with a reference: ``` dist fetch docker.io/library/redis:latest ``` The above should work well with a running containerd instance. I recommend giving this a try with `fetch-object`, as well. With `fetch-object`, it is easy for one to better understand the intricacies of the OCI/Docker image formats. Ultimately, this serves the main purpose of the elusive "metadata store". Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-03-08 00:51:08 +00:00
if resp.StatusCode > 299 {
if resp.StatusCode == http.StatusNotFound {
continue // try one of the other urls.
}
cmd/dist, remotes: simplify resolution flow After receiving feedback during containerd summit walk through of the pull POC, we found that the resolution flow for names was out of place. We could see this present in awkward places where we were trying to re-resolve whether something was a digest or a tag and extra retries to various endpoints. By centering this problem around, "what do we write in the metadata store?", the following interface comes about: ``` Resolve(ctx context.Context, ref string) (name string, desc ocispec.Descriptor, fetcher Fetcher, err error) ``` The above takes an "opaque" reference (we'll get to this later) and returns the canonical name for the object, a content description of the object and a `Fetcher` that can be used to retrieve the object and its child resources. We can write `name` into the metadata store, pointing at the descriptor. Descisions about discovery, trust, provenance, distribution are completely abstracted away from the pulling code. A first response to such a monstrosity is "that is a lot of return arguments". When we look at the actual, we can see that in practice, the usage pattern works well, albeit we don't quite demonstrate the utility of `name`, which will be more apparent later. Designs that allowed separate resolution of the `Fetcher` and the return of a collected object were considered. Let's give this a chance before we go refactoring this further. With this change, we introduce a reference package with helps for remotes to decompose "docker-esque" references into consituent components, without arbitrarily enforcing those opinions on the backend. Utlimately, the name and the reference used to qualify that name are completely opaque to containerd. Obviously, implementors will need to show some candor in following some conventions, but the possibilities are fairly wide. Structurally, we still maintain the concept of the locator and object but the interpretation is up to the resolver. For the most part, the `dist` tool operates exactly the same, except objects can be fetched with a reference: ``` dist fetch docker.io/library/redis:latest ``` The above should work well with a running containerd instance. I recommend giving this a try with `fetch-object`, as well. With `fetch-object`, it is easy for one to better understand the intricacies of the OCI/Docker image formats. Ultimately, this serves the main purpose of the elusive "metadata store". Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-03-08 00:51:08 +00:00
resp.Body.Close()
return nil, errors.Errorf("unexpected status code %v: %v", u, resp.Status)
}
return resp.Body, nil
}
return nil, errors.New("not found")
}
func (r *dockerFetcher) doRequest(ctx contextpkg.Context, req *http.Request) (*http.Response, error) {
ctx = log.WithLogger(ctx, log.G(ctx).WithField("url", req.URL.String()))
log.G(ctx).WithField("request.headers", req.Header).Debug("fetch content")
if r.token != "" {
req.Header.Set("Authorization", fmt.Sprintf("Bearer %s", r.token))
}
resp, err := ctxhttp.Do(ctx, http.DefaultClient, req)
if err != nil {
return nil, err
}
log.G(ctx).WithFields(logrus.Fields{
"status": resp.Status,
"response.headers": resp.Header,
}).Debug("fetch response received")
return resp, err
}
type dockerResolver struct{}
var _ remotes.Resolver = &dockerResolver{}
func newDockerResolver() remotes.Resolver {
return &dockerResolver{}
}
func (r *dockerResolver) Resolve(ctx contextpkg.Context, ref string) (string, ocispec.Descriptor, remotes.Fetcher, error) {
refspec, err := reference.Parse(ref)
if err != nil {
return "", ocispec.Descriptor{}, nil, err
}
cmd/dist, remotes: simplify resolution flow After receiving feedback during containerd summit walk through of the pull POC, we found that the resolution flow for names was out of place. We could see this present in awkward places where we were trying to re-resolve whether something was a digest or a tag and extra retries to various endpoints. By centering this problem around, "what do we write in the metadata store?", the following interface comes about: ``` Resolve(ctx context.Context, ref string) (name string, desc ocispec.Descriptor, fetcher Fetcher, err error) ``` The above takes an "opaque" reference (we'll get to this later) and returns the canonical name for the object, a content description of the object and a `Fetcher` that can be used to retrieve the object and its child resources. We can write `name` into the metadata store, pointing at the descriptor. Descisions about discovery, trust, provenance, distribution are completely abstracted away from the pulling code. A first response to such a monstrosity is "that is a lot of return arguments". When we look at the actual, we can see that in practice, the usage pattern works well, albeit we don't quite demonstrate the utility of `name`, which will be more apparent later. Designs that allowed separate resolution of the `Fetcher` and the return of a collected object were considered. Let's give this a chance before we go refactoring this further. With this change, we introduce a reference package with helps for remotes to decompose "docker-esque" references into consituent components, without arbitrarily enforcing those opinions on the backend. Utlimately, the name and the reference used to qualify that name are completely opaque to containerd. Obviously, implementors will need to show some candor in following some conventions, but the possibilities are fairly wide. Structurally, we still maintain the concept of the locator and object but the interpretation is up to the resolver. For the most part, the `dist` tool operates exactly the same, except objects can be fetched with a reference: ``` dist fetch docker.io/library/redis:latest ``` The above should work well with a running containerd instance. I recommend giving this a try with `fetch-object`, as well. With `fetch-object`, it is easy for one to better understand the intricacies of the OCI/Docker image formats. Ultimately, this serves the main purpose of the elusive "metadata store". Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-03-08 00:51:08 +00:00
var (
base url.URL
token string
)
switch refspec.Hostname() {
case "docker.io":
base.Scheme = "https"
base.Host = "registry-1.docker.io"
prefix := strings.TrimPrefix(refspec.Locator, "docker.io/")
base.Path = path.Join("/v2", prefix)
token, err = getToken(ctx, "repository:"+prefix+":pull")
if err != nil {
return "", ocispec.Descriptor{}, nil, err
}
case "localhost:5000":
base.Scheme = "http"
base.Host = "localhost:5000"
base.Path = path.Join("/v2", strings.TrimPrefix(refspec.Locator, "localhost:5000/"))
default:
return "", ocispec.Descriptor{}, nil, errors.Errorf("unsupported locator: %q", refspec.Locator)
}
fetcher := &dockerFetcher{
base: base,
token: token,
}
var (
urls []string
dgst = refspec.Digest()
)
if dgst != "" {
if err := dgst.Validate(); err != nil {
// need to fail here, since we can't actually resolve the invalid
// digest.
return "", ocispec.Descriptor{}, nil, err
}
// turns out, we have a valid digest, make a url.
urls = append(urls, fetcher.url("manifests", dgst.String()))
} else {
urls = append(urls, fetcher.url("manifests", refspec.Object))
}
// fallback to blobs on not found.
urls = append(urls, fetcher.url("blobs", dgst.String()))
for _, u := range urls {
req, err := http.NewRequest(http.MethodHead, u, nil)
if err != nil {
return "", ocispec.Descriptor{}, nil, err
}
// set headers for all the types we support for resolution.
req.Header.Set("Accept", strings.Join([]string{
MediaTypeDockerSchema2Manifest,
MediaTypeDockerSchema2ManifestList,
ocispec.MediaTypeImageManifest,
ocispec.MediaTypeImageIndex, "*"}, ", "))
log.G(ctx).Debug("resolving")
resp, err := fetcher.doRequest(ctx, req)
if err != nil {
return "", ocispec.Descriptor{}, nil, err
}
resp.Body.Close() // don't care about body contents.
if resp.StatusCode > 299 {
if resp.StatusCode == http.StatusNotFound {
continue
cmd/dist: POC implementation of dist fetch With this changeset we introduce several new things. The first is the top-level dist command. This is a toolkit that implements various distribution primitives, such as fetching, unpacking and ingesting. The first component to this is a simple `fetch` command. It is a low-level command that takes a "remote", identified by a `locator`, and an object identifier. Keyed by the locator, this tool can identify a remote implementation to fetch the content and write it back to standard out. By allowing this to be the unit of pluggability in fetching content, we can have quite a bit of flexibility in how we retrieve images. The current `fetch` implementation provides anonymous access to docker hub images, through the namespace `docker.io`. As an example, one can fetch the manifest for `redis` with the following command: ``` $ ./dist fetch docker.io/library/redis latest mediatype:application/vnd.docker.distribution.manifest.v2+json ``` Note that we have provided a mediatype "hint", nudging the fetch implementation to grab the correct endpoint. We can hash the output of that to fetch the same content by digest: ``` $ ./dist fetch docker.io/library/redis sha256:$(./dist fetch docker.io/library/redis latest mediatype:application/vnd.docker.distribution.manifest.v2+json | shasum -a256) ``` Note that the hint is now elided, since we have affixed the content to a particular hash. If you are not yet entertained, let's bring `jq` and `xargs` into the mix for maximum fun. The following incantation fetches the same manifest and downloads all layers into the convenience of `/dev/null`: ``` $ ./dist fetch docker.io/library/redis sha256:a027a470aa2b9b41cc2539847a97b8a14794ebd0a4c7c5d64e390df6bde56c73 | jq -r '.layers[] | .digest' | xargs -n1 -P10 ./dist fetch docker.io/library/redis > /dev/null ``` This is just the beginning. We should be able to centralize configuration around fetch to implement a number of distribution methodologies that have been challenging or impossible up to this point. The `locator`, mentioned earlier, is a schemaless URL that provides a host and path that can be used to resolve the remote. By dispatching on this common identifier, we should be able to support almost any protocol and discovery mechanism imaginable. When this is more solidified, we can roll these up into higher-level operations that can be orchestrated through the `dist` tool or via GRPC. What a time to be alive! Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-01-20 03:03:44 +00:00
}
cmd/dist, remotes: simplify resolution flow After receiving feedback during containerd summit walk through of the pull POC, we found that the resolution flow for names was out of place. We could see this present in awkward places where we were trying to re-resolve whether something was a digest or a tag and extra retries to various endpoints. By centering this problem around, "what do we write in the metadata store?", the following interface comes about: ``` Resolve(ctx context.Context, ref string) (name string, desc ocispec.Descriptor, fetcher Fetcher, err error) ``` The above takes an "opaque" reference (we'll get to this later) and returns the canonical name for the object, a content description of the object and a `Fetcher` that can be used to retrieve the object and its child resources. We can write `name` into the metadata store, pointing at the descriptor. Descisions about discovery, trust, provenance, distribution are completely abstracted away from the pulling code. A first response to such a monstrosity is "that is a lot of return arguments". When we look at the actual, we can see that in practice, the usage pattern works well, albeit we don't quite demonstrate the utility of `name`, which will be more apparent later. Designs that allowed separate resolution of the `Fetcher` and the return of a collected object were considered. Let's give this a chance before we go refactoring this further. With this change, we introduce a reference package with helps for remotes to decompose "docker-esque" references into consituent components, without arbitrarily enforcing those opinions on the backend. Utlimately, the name and the reference used to qualify that name are completely opaque to containerd. Obviously, implementors will need to show some candor in following some conventions, but the possibilities are fairly wide. Structurally, we still maintain the concept of the locator and object but the interpretation is up to the resolver. For the most part, the `dist` tool operates exactly the same, except objects can be fetched with a reference: ``` dist fetch docker.io/library/redis:latest ``` The above should work well with a running containerd instance. I recommend giving this a try with `fetch-object`, as well. With `fetch-object`, it is easy for one to better understand the intricacies of the OCI/Docker image formats. Ultimately, this serves the main purpose of the elusive "metadata store". Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-03-08 00:51:08 +00:00
return "", ocispec.Descriptor{}, nil, errors.Errorf("unexpected status code %v: %v", u, resp.Status)
}
cmd/dist: POC implementation of dist fetch With this changeset we introduce several new things. The first is the top-level dist command. This is a toolkit that implements various distribution primitives, such as fetching, unpacking and ingesting. The first component to this is a simple `fetch` command. It is a low-level command that takes a "remote", identified by a `locator`, and an object identifier. Keyed by the locator, this tool can identify a remote implementation to fetch the content and write it back to standard out. By allowing this to be the unit of pluggability in fetching content, we can have quite a bit of flexibility in how we retrieve images. The current `fetch` implementation provides anonymous access to docker hub images, through the namespace `docker.io`. As an example, one can fetch the manifest for `redis` with the following command: ``` $ ./dist fetch docker.io/library/redis latest mediatype:application/vnd.docker.distribution.manifest.v2+json ``` Note that we have provided a mediatype "hint", nudging the fetch implementation to grab the correct endpoint. We can hash the output of that to fetch the same content by digest: ``` $ ./dist fetch docker.io/library/redis sha256:$(./dist fetch docker.io/library/redis latest mediatype:application/vnd.docker.distribution.manifest.v2+json | shasum -a256) ``` Note that the hint is now elided, since we have affixed the content to a particular hash. If you are not yet entertained, let's bring `jq` and `xargs` into the mix for maximum fun. The following incantation fetches the same manifest and downloads all layers into the convenience of `/dev/null`: ``` $ ./dist fetch docker.io/library/redis sha256:a027a470aa2b9b41cc2539847a97b8a14794ebd0a4c7c5d64e390df6bde56c73 | jq -r '.layers[] | .digest' | xargs -n1 -P10 ./dist fetch docker.io/library/redis > /dev/null ``` This is just the beginning. We should be able to centralize configuration around fetch to implement a number of distribution methodologies that have been challenging or impossible up to this point. The `locator`, mentioned earlier, is a schemaless URL that provides a host and path that can be used to resolve the remote. By dispatching on this common identifier, we should be able to support almost any protocol and discovery mechanism imaginable. When this is more solidified, we can roll these up into higher-level operations that can be orchestrated through the `dist` tool or via GRPC. What a time to be alive! Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-01-20 03:03:44 +00:00
cmd/dist, remotes: simplify resolution flow After receiving feedback during containerd summit walk through of the pull POC, we found that the resolution flow for names was out of place. We could see this present in awkward places where we were trying to re-resolve whether something was a digest or a tag and extra retries to various endpoints. By centering this problem around, "what do we write in the metadata store?", the following interface comes about: ``` Resolve(ctx context.Context, ref string) (name string, desc ocispec.Descriptor, fetcher Fetcher, err error) ``` The above takes an "opaque" reference (we'll get to this later) and returns the canonical name for the object, a content description of the object and a `Fetcher` that can be used to retrieve the object and its child resources. We can write `name` into the metadata store, pointing at the descriptor. Descisions about discovery, trust, provenance, distribution are completely abstracted away from the pulling code. A first response to such a monstrosity is "that is a lot of return arguments". When we look at the actual, we can see that in practice, the usage pattern works well, albeit we don't quite demonstrate the utility of `name`, which will be more apparent later. Designs that allowed separate resolution of the `Fetcher` and the return of a collected object were considered. Let's give this a chance before we go refactoring this further. With this change, we introduce a reference package with helps for remotes to decompose "docker-esque" references into consituent components, without arbitrarily enforcing those opinions on the backend. Utlimately, the name and the reference used to qualify that name are completely opaque to containerd. Obviously, implementors will need to show some candor in following some conventions, but the possibilities are fairly wide. Structurally, we still maintain the concept of the locator and object but the interpretation is up to the resolver. For the most part, the `dist` tool operates exactly the same, except objects can be fetched with a reference: ``` dist fetch docker.io/library/redis:latest ``` The above should work well with a running containerd instance. I recommend giving this a try with `fetch-object`, as well. With `fetch-object`, it is easy for one to better understand the intricacies of the OCI/Docker image formats. Ultimately, this serves the main purpose of the elusive "metadata store". Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-03-08 00:51:08 +00:00
// this is the only point at which we trust the registry. we use the
// content headers to assemble a descriptor for the name. when this becomes
// more robust, we mostly get this information from a secure trust store.
dgstHeader := digest.Digest(resp.Header.Get("Docker-Content-Digest"))
cmd/dist: POC implementation of dist fetch With this changeset we introduce several new things. The first is the top-level dist command. This is a toolkit that implements various distribution primitives, such as fetching, unpacking and ingesting. The first component to this is a simple `fetch` command. It is a low-level command that takes a "remote", identified by a `locator`, and an object identifier. Keyed by the locator, this tool can identify a remote implementation to fetch the content and write it back to standard out. By allowing this to be the unit of pluggability in fetching content, we can have quite a bit of flexibility in how we retrieve images. The current `fetch` implementation provides anonymous access to docker hub images, through the namespace `docker.io`. As an example, one can fetch the manifest for `redis` with the following command: ``` $ ./dist fetch docker.io/library/redis latest mediatype:application/vnd.docker.distribution.manifest.v2+json ``` Note that we have provided a mediatype "hint", nudging the fetch implementation to grab the correct endpoint. We can hash the output of that to fetch the same content by digest: ``` $ ./dist fetch docker.io/library/redis sha256:$(./dist fetch docker.io/library/redis latest mediatype:application/vnd.docker.distribution.manifest.v2+json | shasum -a256) ``` Note that the hint is now elided, since we have affixed the content to a particular hash. If you are not yet entertained, let's bring `jq` and `xargs` into the mix for maximum fun. The following incantation fetches the same manifest and downloads all layers into the convenience of `/dev/null`: ``` $ ./dist fetch docker.io/library/redis sha256:a027a470aa2b9b41cc2539847a97b8a14794ebd0a4c7c5d64e390df6bde56c73 | jq -r '.layers[] | .digest' | xargs -n1 -P10 ./dist fetch docker.io/library/redis > /dev/null ``` This is just the beginning. We should be able to centralize configuration around fetch to implement a number of distribution methodologies that have been challenging or impossible up to this point. The `locator`, mentioned earlier, is a schemaless URL that provides a host and path that can be used to resolve the remote. By dispatching on this common identifier, we should be able to support almost any protocol and discovery mechanism imaginable. When this is more solidified, we can roll these up into higher-level operations that can be orchestrated through the `dist` tool or via GRPC. What a time to be alive! Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-01-20 03:03:44 +00:00
cmd/dist, remotes: simplify resolution flow After receiving feedback during containerd summit walk through of the pull POC, we found that the resolution flow for names was out of place. We could see this present in awkward places where we were trying to re-resolve whether something was a digest or a tag and extra retries to various endpoints. By centering this problem around, "what do we write in the metadata store?", the following interface comes about: ``` Resolve(ctx context.Context, ref string) (name string, desc ocispec.Descriptor, fetcher Fetcher, err error) ``` The above takes an "opaque" reference (we'll get to this later) and returns the canonical name for the object, a content description of the object and a `Fetcher` that can be used to retrieve the object and its child resources. We can write `name` into the metadata store, pointing at the descriptor. Descisions about discovery, trust, provenance, distribution are completely abstracted away from the pulling code. A first response to such a monstrosity is "that is a lot of return arguments". When we look at the actual, we can see that in practice, the usage pattern works well, albeit we don't quite demonstrate the utility of `name`, which will be more apparent later. Designs that allowed separate resolution of the `Fetcher` and the return of a collected object were considered. Let's give this a chance before we go refactoring this further. With this change, we introduce a reference package with helps for remotes to decompose "docker-esque" references into consituent components, without arbitrarily enforcing those opinions on the backend. Utlimately, the name and the reference used to qualify that name are completely opaque to containerd. Obviously, implementors will need to show some candor in following some conventions, but the possibilities are fairly wide. Structurally, we still maintain the concept of the locator and object but the interpretation is up to the resolver. For the most part, the `dist` tool operates exactly the same, except objects can be fetched with a reference: ``` dist fetch docker.io/library/redis:latest ``` The above should work well with a running containerd instance. I recommend giving this a try with `fetch-object`, as well. With `fetch-object`, it is easy for one to better understand the intricacies of the OCI/Docker image formats. Ultimately, this serves the main purpose of the elusive "metadata store". Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-03-08 00:51:08 +00:00
if dgstHeader != "" {
if err := dgstHeader.Validate(); err != nil {
if err == nil {
return "", ocispec.Descriptor{}, nil, errors.Errorf("%q in header not a valid digest", dgstHeader)
cmd/dist: POC implementation of dist fetch With this changeset we introduce several new things. The first is the top-level dist command. This is a toolkit that implements various distribution primitives, such as fetching, unpacking and ingesting. The first component to this is a simple `fetch` command. It is a low-level command that takes a "remote", identified by a `locator`, and an object identifier. Keyed by the locator, this tool can identify a remote implementation to fetch the content and write it back to standard out. By allowing this to be the unit of pluggability in fetching content, we can have quite a bit of flexibility in how we retrieve images. The current `fetch` implementation provides anonymous access to docker hub images, through the namespace `docker.io`. As an example, one can fetch the manifest for `redis` with the following command: ``` $ ./dist fetch docker.io/library/redis latest mediatype:application/vnd.docker.distribution.manifest.v2+json ``` Note that we have provided a mediatype "hint", nudging the fetch implementation to grab the correct endpoint. We can hash the output of that to fetch the same content by digest: ``` $ ./dist fetch docker.io/library/redis sha256:$(./dist fetch docker.io/library/redis latest mediatype:application/vnd.docker.distribution.manifest.v2+json | shasum -a256) ``` Note that the hint is now elided, since we have affixed the content to a particular hash. If you are not yet entertained, let's bring `jq` and `xargs` into the mix for maximum fun. The following incantation fetches the same manifest and downloads all layers into the convenience of `/dev/null`: ``` $ ./dist fetch docker.io/library/redis sha256:a027a470aa2b9b41cc2539847a97b8a14794ebd0a4c7c5d64e390df6bde56c73 | jq -r '.layers[] | .digest' | xargs -n1 -P10 ./dist fetch docker.io/library/redis > /dev/null ``` This is just the beginning. We should be able to centralize configuration around fetch to implement a number of distribution methodologies that have been challenging or impossible up to this point. The `locator`, mentioned earlier, is a schemaless URL that provides a host and path that can be used to resolve the remote. By dispatching on this common identifier, we should be able to support almost any protocol and discovery mechanism imaginable. When this is more solidified, we can roll these up into higher-level operations that can be orchestrated through the `dist` tool or via GRPC. What a time to be alive! Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-01-20 03:03:44 +00:00
}
cmd/dist, remotes: simplify resolution flow After receiving feedback during containerd summit walk through of the pull POC, we found that the resolution flow for names was out of place. We could see this present in awkward places where we were trying to re-resolve whether something was a digest or a tag and extra retries to various endpoints. By centering this problem around, "what do we write in the metadata store?", the following interface comes about: ``` Resolve(ctx context.Context, ref string) (name string, desc ocispec.Descriptor, fetcher Fetcher, err error) ``` The above takes an "opaque" reference (we'll get to this later) and returns the canonical name for the object, a content description of the object and a `Fetcher` that can be used to retrieve the object and its child resources. We can write `name` into the metadata store, pointing at the descriptor. Descisions about discovery, trust, provenance, distribution are completely abstracted away from the pulling code. A first response to such a monstrosity is "that is a lot of return arguments". When we look at the actual, we can see that in practice, the usage pattern works well, albeit we don't quite demonstrate the utility of `name`, which will be more apparent later. Designs that allowed separate resolution of the `Fetcher` and the return of a collected object were considered. Let's give this a chance before we go refactoring this further. With this change, we introduce a reference package with helps for remotes to decompose "docker-esque" references into consituent components, without arbitrarily enforcing those opinions on the backend. Utlimately, the name and the reference used to qualify that name are completely opaque to containerd. Obviously, implementors will need to show some candor in following some conventions, but the possibilities are fairly wide. Structurally, we still maintain the concept of the locator and object but the interpretation is up to the resolver. For the most part, the `dist` tool operates exactly the same, except objects can be fetched with a reference: ``` dist fetch docker.io/library/redis:latest ``` The above should work well with a running containerd instance. I recommend giving this a try with `fetch-object`, as well. With `fetch-object`, it is easy for one to better understand the intricacies of the OCI/Docker image formats. Ultimately, this serves the main purpose of the elusive "metadata store". Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-03-08 00:51:08 +00:00
return "", ocispec.Descriptor{}, nil, errors.Wrapf(err, "%q in header not a valid digest", dgstHeader)
}
dgst = dgstHeader
}
cmd/dist: POC implementation of dist fetch With this changeset we introduce several new things. The first is the top-level dist command. This is a toolkit that implements various distribution primitives, such as fetching, unpacking and ingesting. The first component to this is a simple `fetch` command. It is a low-level command that takes a "remote", identified by a `locator`, and an object identifier. Keyed by the locator, this tool can identify a remote implementation to fetch the content and write it back to standard out. By allowing this to be the unit of pluggability in fetching content, we can have quite a bit of flexibility in how we retrieve images. The current `fetch` implementation provides anonymous access to docker hub images, through the namespace `docker.io`. As an example, one can fetch the manifest for `redis` with the following command: ``` $ ./dist fetch docker.io/library/redis latest mediatype:application/vnd.docker.distribution.manifest.v2+json ``` Note that we have provided a mediatype "hint", nudging the fetch implementation to grab the correct endpoint. We can hash the output of that to fetch the same content by digest: ``` $ ./dist fetch docker.io/library/redis sha256:$(./dist fetch docker.io/library/redis latest mediatype:application/vnd.docker.distribution.manifest.v2+json | shasum -a256) ``` Note that the hint is now elided, since we have affixed the content to a particular hash. If you are not yet entertained, let's bring `jq` and `xargs` into the mix for maximum fun. The following incantation fetches the same manifest and downloads all layers into the convenience of `/dev/null`: ``` $ ./dist fetch docker.io/library/redis sha256:a027a470aa2b9b41cc2539847a97b8a14794ebd0a4c7c5d64e390df6bde56c73 | jq -r '.layers[] | .digest' | xargs -n1 -P10 ./dist fetch docker.io/library/redis > /dev/null ``` This is just the beginning. We should be able to centralize configuration around fetch to implement a number of distribution methodologies that have been challenging or impossible up to this point. The `locator`, mentioned earlier, is a schemaless URL that provides a host and path that can be used to resolve the remote. By dispatching on this common identifier, we should be able to support almost any protocol and discovery mechanism imaginable. When this is more solidified, we can roll these up into higher-level operations that can be orchestrated through the `dist` tool or via GRPC. What a time to be alive! Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-01-20 03:03:44 +00:00
cmd/dist, remotes: simplify resolution flow After receiving feedback during containerd summit walk through of the pull POC, we found that the resolution flow for names was out of place. We could see this present in awkward places where we were trying to re-resolve whether something was a digest or a tag and extra retries to various endpoints. By centering this problem around, "what do we write in the metadata store?", the following interface comes about: ``` Resolve(ctx context.Context, ref string) (name string, desc ocispec.Descriptor, fetcher Fetcher, err error) ``` The above takes an "opaque" reference (we'll get to this later) and returns the canonical name for the object, a content description of the object and a `Fetcher` that can be used to retrieve the object and its child resources. We can write `name` into the metadata store, pointing at the descriptor. Descisions about discovery, trust, provenance, distribution are completely abstracted away from the pulling code. A first response to such a monstrosity is "that is a lot of return arguments". When we look at the actual, we can see that in practice, the usage pattern works well, albeit we don't quite demonstrate the utility of `name`, which will be more apparent later. Designs that allowed separate resolution of the `Fetcher` and the return of a collected object were considered. Let's give this a chance before we go refactoring this further. With this change, we introduce a reference package with helps for remotes to decompose "docker-esque" references into consituent components, without arbitrarily enforcing those opinions on the backend. Utlimately, the name and the reference used to qualify that name are completely opaque to containerd. Obviously, implementors will need to show some candor in following some conventions, but the possibilities are fairly wide. Structurally, we still maintain the concept of the locator and object but the interpretation is up to the resolver. For the most part, the `dist` tool operates exactly the same, except objects can be fetched with a reference: ``` dist fetch docker.io/library/redis:latest ``` The above should work well with a running containerd instance. I recommend giving this a try with `fetch-object`, as well. With `fetch-object`, it is easy for one to better understand the intricacies of the OCI/Docker image formats. Ultimately, this serves the main purpose of the elusive "metadata store". Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-03-08 00:51:08 +00:00
if dgst == "" {
return "", ocispec.Descriptor{}, nil, errors.Wrapf(err, "could not resolve digest for %v", ref)
}
cmd/dist: POC implementation of dist fetch With this changeset we introduce several new things. The first is the top-level dist command. This is a toolkit that implements various distribution primitives, such as fetching, unpacking and ingesting. The first component to this is a simple `fetch` command. It is a low-level command that takes a "remote", identified by a `locator`, and an object identifier. Keyed by the locator, this tool can identify a remote implementation to fetch the content and write it back to standard out. By allowing this to be the unit of pluggability in fetching content, we can have quite a bit of flexibility in how we retrieve images. The current `fetch` implementation provides anonymous access to docker hub images, through the namespace `docker.io`. As an example, one can fetch the manifest for `redis` with the following command: ``` $ ./dist fetch docker.io/library/redis latest mediatype:application/vnd.docker.distribution.manifest.v2+json ``` Note that we have provided a mediatype "hint", nudging the fetch implementation to grab the correct endpoint. We can hash the output of that to fetch the same content by digest: ``` $ ./dist fetch docker.io/library/redis sha256:$(./dist fetch docker.io/library/redis latest mediatype:application/vnd.docker.distribution.manifest.v2+json | shasum -a256) ``` Note that the hint is now elided, since we have affixed the content to a particular hash. If you are not yet entertained, let's bring `jq` and `xargs` into the mix for maximum fun. The following incantation fetches the same manifest and downloads all layers into the convenience of `/dev/null`: ``` $ ./dist fetch docker.io/library/redis sha256:a027a470aa2b9b41cc2539847a97b8a14794ebd0a4c7c5d64e390df6bde56c73 | jq -r '.layers[] | .digest' | xargs -n1 -P10 ./dist fetch docker.io/library/redis > /dev/null ``` This is just the beginning. We should be able to centralize configuration around fetch to implement a number of distribution methodologies that have been challenging or impossible up to this point. The `locator`, mentioned earlier, is a schemaless URL that provides a host and path that can be used to resolve the remote. By dispatching on this common identifier, we should be able to support almost any protocol and discovery mechanism imaginable. When this is more solidified, we can roll these up into higher-level operations that can be orchestrated through the `dist` tool or via GRPC. What a time to be alive! Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-01-20 03:03:44 +00:00
cmd/dist, remotes: simplify resolution flow After receiving feedback during containerd summit walk through of the pull POC, we found that the resolution flow for names was out of place. We could see this present in awkward places where we were trying to re-resolve whether something was a digest or a tag and extra retries to various endpoints. By centering this problem around, "what do we write in the metadata store?", the following interface comes about: ``` Resolve(ctx context.Context, ref string) (name string, desc ocispec.Descriptor, fetcher Fetcher, err error) ``` The above takes an "opaque" reference (we'll get to this later) and returns the canonical name for the object, a content description of the object and a `Fetcher` that can be used to retrieve the object and its child resources. We can write `name` into the metadata store, pointing at the descriptor. Descisions about discovery, trust, provenance, distribution are completely abstracted away from the pulling code. A first response to such a monstrosity is "that is a lot of return arguments". When we look at the actual, we can see that in practice, the usage pattern works well, albeit we don't quite demonstrate the utility of `name`, which will be more apparent later. Designs that allowed separate resolution of the `Fetcher` and the return of a collected object were considered. Let's give this a chance before we go refactoring this further. With this change, we introduce a reference package with helps for remotes to decompose "docker-esque" references into consituent components, without arbitrarily enforcing those opinions on the backend. Utlimately, the name and the reference used to qualify that name are completely opaque to containerd. Obviously, implementors will need to show some candor in following some conventions, but the possibilities are fairly wide. Structurally, we still maintain the concept of the locator and object but the interpretation is up to the resolver. For the most part, the `dist` tool operates exactly the same, except objects can be fetched with a reference: ``` dist fetch docker.io/library/redis:latest ``` The above should work well with a running containerd instance. I recommend giving this a try with `fetch-object`, as well. With `fetch-object`, it is easy for one to better understand the intricacies of the OCI/Docker image formats. Ultimately, this serves the main purpose of the elusive "metadata store". Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-03-08 00:51:08 +00:00
var (
size int64
sizeHeader = resp.Header.Get("Content-Length")
)
cmd/dist: POC implementation of dist fetch With this changeset we introduce several new things. The first is the top-level dist command. This is a toolkit that implements various distribution primitives, such as fetching, unpacking and ingesting. The first component to this is a simple `fetch` command. It is a low-level command that takes a "remote", identified by a `locator`, and an object identifier. Keyed by the locator, this tool can identify a remote implementation to fetch the content and write it back to standard out. By allowing this to be the unit of pluggability in fetching content, we can have quite a bit of flexibility in how we retrieve images. The current `fetch` implementation provides anonymous access to docker hub images, through the namespace `docker.io`. As an example, one can fetch the manifest for `redis` with the following command: ``` $ ./dist fetch docker.io/library/redis latest mediatype:application/vnd.docker.distribution.manifest.v2+json ``` Note that we have provided a mediatype "hint", nudging the fetch implementation to grab the correct endpoint. We can hash the output of that to fetch the same content by digest: ``` $ ./dist fetch docker.io/library/redis sha256:$(./dist fetch docker.io/library/redis latest mediatype:application/vnd.docker.distribution.manifest.v2+json | shasum -a256) ``` Note that the hint is now elided, since we have affixed the content to a particular hash. If you are not yet entertained, let's bring `jq` and `xargs` into the mix for maximum fun. The following incantation fetches the same manifest and downloads all layers into the convenience of `/dev/null`: ``` $ ./dist fetch docker.io/library/redis sha256:a027a470aa2b9b41cc2539847a97b8a14794ebd0a4c7c5d64e390df6bde56c73 | jq -r '.layers[] | .digest' | xargs -n1 -P10 ./dist fetch docker.io/library/redis > /dev/null ``` This is just the beginning. We should be able to centralize configuration around fetch to implement a number of distribution methodologies that have been challenging or impossible up to this point. The `locator`, mentioned earlier, is a schemaless URL that provides a host and path that can be used to resolve the remote. By dispatching on this common identifier, we should be able to support almost any protocol and discovery mechanism imaginable. When this is more solidified, we can roll these up into higher-level operations that can be orchestrated through the `dist` tool or via GRPC. What a time to be alive! Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-01-20 03:03:44 +00:00
cmd/dist, remotes: simplify resolution flow After receiving feedback during containerd summit walk through of the pull POC, we found that the resolution flow for names was out of place. We could see this present in awkward places where we were trying to re-resolve whether something was a digest or a tag and extra retries to various endpoints. By centering this problem around, "what do we write in the metadata store?", the following interface comes about: ``` Resolve(ctx context.Context, ref string) (name string, desc ocispec.Descriptor, fetcher Fetcher, err error) ``` The above takes an "opaque" reference (we'll get to this later) and returns the canonical name for the object, a content description of the object and a `Fetcher` that can be used to retrieve the object and its child resources. We can write `name` into the metadata store, pointing at the descriptor. Descisions about discovery, trust, provenance, distribution are completely abstracted away from the pulling code. A first response to such a monstrosity is "that is a lot of return arguments". When we look at the actual, we can see that in practice, the usage pattern works well, albeit we don't quite demonstrate the utility of `name`, which will be more apparent later. Designs that allowed separate resolution of the `Fetcher` and the return of a collected object were considered. Let's give this a chance before we go refactoring this further. With this change, we introduce a reference package with helps for remotes to decompose "docker-esque" references into consituent components, without arbitrarily enforcing those opinions on the backend. Utlimately, the name and the reference used to qualify that name are completely opaque to containerd. Obviously, implementors will need to show some candor in following some conventions, but the possibilities are fairly wide. Structurally, we still maintain the concept of the locator and object but the interpretation is up to the resolver. For the most part, the `dist` tool operates exactly the same, except objects can be fetched with a reference: ``` dist fetch docker.io/library/redis:latest ``` The above should work well with a running containerd instance. I recommend giving this a try with `fetch-object`, as well. With `fetch-object`, it is easy for one to better understand the intricacies of the OCI/Docker image formats. Ultimately, this serves the main purpose of the elusive "metadata store". Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-03-08 00:51:08 +00:00
size, err = strconv.ParseInt(sizeHeader, 10, 64)
if err != nil || size < 0 {
return "", ocispec.Descriptor{}, nil, errors.Wrapf(err, "%q in header not a valid size", sizeHeader)
}
cmd/dist: POC implementation of dist fetch With this changeset we introduce several new things. The first is the top-level dist command. This is a toolkit that implements various distribution primitives, such as fetching, unpacking and ingesting. The first component to this is a simple `fetch` command. It is a low-level command that takes a "remote", identified by a `locator`, and an object identifier. Keyed by the locator, this tool can identify a remote implementation to fetch the content and write it back to standard out. By allowing this to be the unit of pluggability in fetching content, we can have quite a bit of flexibility in how we retrieve images. The current `fetch` implementation provides anonymous access to docker hub images, through the namespace `docker.io`. As an example, one can fetch the manifest for `redis` with the following command: ``` $ ./dist fetch docker.io/library/redis latest mediatype:application/vnd.docker.distribution.manifest.v2+json ``` Note that we have provided a mediatype "hint", nudging the fetch implementation to grab the correct endpoint. We can hash the output of that to fetch the same content by digest: ``` $ ./dist fetch docker.io/library/redis sha256:$(./dist fetch docker.io/library/redis latest mediatype:application/vnd.docker.distribution.manifest.v2+json | shasum -a256) ``` Note that the hint is now elided, since we have affixed the content to a particular hash. If you are not yet entertained, let's bring `jq` and `xargs` into the mix for maximum fun. The following incantation fetches the same manifest and downloads all layers into the convenience of `/dev/null`: ``` $ ./dist fetch docker.io/library/redis sha256:a027a470aa2b9b41cc2539847a97b8a14794ebd0a4c7c5d64e390df6bde56c73 | jq -r '.layers[] | .digest' | xargs -n1 -P10 ./dist fetch docker.io/library/redis > /dev/null ``` This is just the beginning. We should be able to centralize configuration around fetch to implement a number of distribution methodologies that have been challenging or impossible up to this point. The `locator`, mentioned earlier, is a schemaless URL that provides a host and path that can be used to resolve the remote. By dispatching on this common identifier, we should be able to support almost any protocol and discovery mechanism imaginable. When this is more solidified, we can roll these up into higher-level operations that can be orchestrated through the `dist` tool or via GRPC. What a time to be alive! Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-01-20 03:03:44 +00:00
cmd/dist, remotes: simplify resolution flow After receiving feedback during containerd summit walk through of the pull POC, we found that the resolution flow for names was out of place. We could see this present in awkward places where we were trying to re-resolve whether something was a digest or a tag and extra retries to various endpoints. By centering this problem around, "what do we write in the metadata store?", the following interface comes about: ``` Resolve(ctx context.Context, ref string) (name string, desc ocispec.Descriptor, fetcher Fetcher, err error) ``` The above takes an "opaque" reference (we'll get to this later) and returns the canonical name for the object, a content description of the object and a `Fetcher` that can be used to retrieve the object and its child resources. We can write `name` into the metadata store, pointing at the descriptor. Descisions about discovery, trust, provenance, distribution are completely abstracted away from the pulling code. A first response to such a monstrosity is "that is a lot of return arguments". When we look at the actual, we can see that in practice, the usage pattern works well, albeit we don't quite demonstrate the utility of `name`, which will be more apparent later. Designs that allowed separate resolution of the `Fetcher` and the return of a collected object were considered. Let's give this a chance before we go refactoring this further. With this change, we introduce a reference package with helps for remotes to decompose "docker-esque" references into consituent components, without arbitrarily enforcing those opinions on the backend. Utlimately, the name and the reference used to qualify that name are completely opaque to containerd. Obviously, implementors will need to show some candor in following some conventions, but the possibilities are fairly wide. Structurally, we still maintain the concept of the locator and object but the interpretation is up to the resolver. For the most part, the `dist` tool operates exactly the same, except objects can be fetched with a reference: ``` dist fetch docker.io/library/redis:latest ``` The above should work well with a running containerd instance. I recommend giving this a try with `fetch-object`, as well. With `fetch-object`, it is easy for one to better understand the intricacies of the OCI/Docker image formats. Ultimately, this serves the main purpose of the elusive "metadata store". Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-03-08 00:51:08 +00:00
desc := ocispec.Descriptor{
Digest: dgst,
MediaType: resp.Header.Get("Content-Type"), // need to strip disposition?
Size: size,
}
log.G(ctx).WithField("desc.digest", desc.Digest).Debug("resolved")
return ref, desc, fetcher, nil
}
cmd/dist: POC implementation of dist fetch With this changeset we introduce several new things. The first is the top-level dist command. This is a toolkit that implements various distribution primitives, such as fetching, unpacking and ingesting. The first component to this is a simple `fetch` command. It is a low-level command that takes a "remote", identified by a `locator`, and an object identifier. Keyed by the locator, this tool can identify a remote implementation to fetch the content and write it back to standard out. By allowing this to be the unit of pluggability in fetching content, we can have quite a bit of flexibility in how we retrieve images. The current `fetch` implementation provides anonymous access to docker hub images, through the namespace `docker.io`. As an example, one can fetch the manifest for `redis` with the following command: ``` $ ./dist fetch docker.io/library/redis latest mediatype:application/vnd.docker.distribution.manifest.v2+json ``` Note that we have provided a mediatype "hint", nudging the fetch implementation to grab the correct endpoint. We can hash the output of that to fetch the same content by digest: ``` $ ./dist fetch docker.io/library/redis sha256:$(./dist fetch docker.io/library/redis latest mediatype:application/vnd.docker.distribution.manifest.v2+json | shasum -a256) ``` Note that the hint is now elided, since we have affixed the content to a particular hash. If you are not yet entertained, let's bring `jq` and `xargs` into the mix for maximum fun. The following incantation fetches the same manifest and downloads all layers into the convenience of `/dev/null`: ``` $ ./dist fetch docker.io/library/redis sha256:a027a470aa2b9b41cc2539847a97b8a14794ebd0a4c7c5d64e390df6bde56c73 | jq -r '.layers[] | .digest' | xargs -n1 -P10 ./dist fetch docker.io/library/redis > /dev/null ``` This is just the beginning. We should be able to centralize configuration around fetch to implement a number of distribution methodologies that have been challenging or impossible up to this point. The `locator`, mentioned earlier, is a schemaless URL that provides a host and path that can be used to resolve the remote. By dispatching on this common identifier, we should be able to support almost any protocol and discovery mechanism imaginable. When this is more solidified, we can roll these up into higher-level operations that can be orchestrated through the `dist` tool or via GRPC. What a time to be alive! Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-01-20 03:03:44 +00:00
cmd/dist, remotes: simplify resolution flow After receiving feedback during containerd summit walk through of the pull POC, we found that the resolution flow for names was out of place. We could see this present in awkward places where we were trying to re-resolve whether something was a digest or a tag and extra retries to various endpoints. By centering this problem around, "what do we write in the metadata store?", the following interface comes about: ``` Resolve(ctx context.Context, ref string) (name string, desc ocispec.Descriptor, fetcher Fetcher, err error) ``` The above takes an "opaque" reference (we'll get to this later) and returns the canonical name for the object, a content description of the object and a `Fetcher` that can be used to retrieve the object and its child resources. We can write `name` into the metadata store, pointing at the descriptor. Descisions about discovery, trust, provenance, distribution are completely abstracted away from the pulling code. A first response to such a monstrosity is "that is a lot of return arguments". When we look at the actual, we can see that in practice, the usage pattern works well, albeit we don't quite demonstrate the utility of `name`, which will be more apparent later. Designs that allowed separate resolution of the `Fetcher` and the return of a collected object were considered. Let's give this a chance before we go refactoring this further. With this change, we introduce a reference package with helps for remotes to decompose "docker-esque" references into consituent components, without arbitrarily enforcing those opinions on the backend. Utlimately, the name and the reference used to qualify that name are completely opaque to containerd. Obviously, implementors will need to show some candor in following some conventions, but the possibilities are fairly wide. Structurally, we still maintain the concept of the locator and object but the interpretation is up to the resolver. For the most part, the `dist` tool operates exactly the same, except objects can be fetched with a reference: ``` dist fetch docker.io/library/redis:latest ``` The above should work well with a running containerd instance. I recommend giving this a try with `fetch-object`, as well. With `fetch-object`, it is easy for one to better understand the intricacies of the OCI/Docker image formats. Ultimately, this serves the main purpose of the elusive "metadata store". Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-03-08 00:51:08 +00:00
return "", ocispec.Descriptor{}, nil, errors.Errorf("%v not found", ref)
cmd/dist: POC implementation of dist fetch With this changeset we introduce several new things. The first is the top-level dist command. This is a toolkit that implements various distribution primitives, such as fetching, unpacking and ingesting. The first component to this is a simple `fetch` command. It is a low-level command that takes a "remote", identified by a `locator`, and an object identifier. Keyed by the locator, this tool can identify a remote implementation to fetch the content and write it back to standard out. By allowing this to be the unit of pluggability in fetching content, we can have quite a bit of flexibility in how we retrieve images. The current `fetch` implementation provides anonymous access to docker hub images, through the namespace `docker.io`. As an example, one can fetch the manifest for `redis` with the following command: ``` $ ./dist fetch docker.io/library/redis latest mediatype:application/vnd.docker.distribution.manifest.v2+json ``` Note that we have provided a mediatype "hint", nudging the fetch implementation to grab the correct endpoint. We can hash the output of that to fetch the same content by digest: ``` $ ./dist fetch docker.io/library/redis sha256:$(./dist fetch docker.io/library/redis latest mediatype:application/vnd.docker.distribution.manifest.v2+json | shasum -a256) ``` Note that the hint is now elided, since we have affixed the content to a particular hash. If you are not yet entertained, let's bring `jq` and `xargs` into the mix for maximum fun. The following incantation fetches the same manifest and downloads all layers into the convenience of `/dev/null`: ``` $ ./dist fetch docker.io/library/redis sha256:a027a470aa2b9b41cc2539847a97b8a14794ebd0a4c7c5d64e390df6bde56c73 | jq -r '.layers[] | .digest' | xargs -n1 -P10 ./dist fetch docker.io/library/redis > /dev/null ``` This is just the beginning. We should be able to centralize configuration around fetch to implement a number of distribution methodologies that have been challenging or impossible up to this point. The `locator`, mentioned earlier, is a schemaless URL that provides a host and path that can be used to resolve the remote. By dispatching on this common identifier, we should be able to support almost any protocol and discovery mechanism imaginable. When this is more solidified, we can roll these up into higher-level operations that can be orchestrated through the `dist` tool or via GRPC. What a time to be alive! Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-01-20 03:03:44 +00:00
}
func getToken(ctx contextpkg.Context, scopes ...string) (string, error) {
var (
u = url.URL{
Scheme: "https",
Host: "auth.docker.io",
Path: "/token",
}
q = url.Values{
"scope": scopes,
"service": []string{"registry.docker.io"}, // usually comes from auth challenge
}
)
u.RawQuery = q.Encode()
log.G(ctx).WithField("token.url", u.String()).Debug("requesting token")
resp, err := ctxhttp.Get(ctx, http.DefaultClient, u.String())
if err != nil {
return "", err
}
defer resp.Body.Close()
if resp.StatusCode > 299 {
return "", errors.Errorf("unexpected status code: %v %v", resp.StatusCode, resp.Status)
}
p, err := ioutil.ReadAll(resp.Body)
if err != nil {
return "", err
}
var tokenResponse struct {
Token string `json:"token"`
}
if err := json.Unmarshal(p, &tokenResponse); err != nil {
return "", err
}
return tokenResponse.Token, nil
}
// getV2URLPaths generates the candidate urls paths for the object based on the
cmd/dist: POC implementation of dist fetch With this changeset we introduce several new things. The first is the top-level dist command. This is a toolkit that implements various distribution primitives, such as fetching, unpacking and ingesting. The first component to this is a simple `fetch` command. It is a low-level command that takes a "remote", identified by a `locator`, and an object identifier. Keyed by the locator, this tool can identify a remote implementation to fetch the content and write it back to standard out. By allowing this to be the unit of pluggability in fetching content, we can have quite a bit of flexibility in how we retrieve images. The current `fetch` implementation provides anonymous access to docker hub images, through the namespace `docker.io`. As an example, one can fetch the manifest for `redis` with the following command: ``` $ ./dist fetch docker.io/library/redis latest mediatype:application/vnd.docker.distribution.manifest.v2+json ``` Note that we have provided a mediatype "hint", nudging the fetch implementation to grab the correct endpoint. We can hash the output of that to fetch the same content by digest: ``` $ ./dist fetch docker.io/library/redis sha256:$(./dist fetch docker.io/library/redis latest mediatype:application/vnd.docker.distribution.manifest.v2+json | shasum -a256) ``` Note that the hint is now elided, since we have affixed the content to a particular hash. If you are not yet entertained, let's bring `jq` and `xargs` into the mix for maximum fun. The following incantation fetches the same manifest and downloads all layers into the convenience of `/dev/null`: ``` $ ./dist fetch docker.io/library/redis sha256:a027a470aa2b9b41cc2539847a97b8a14794ebd0a4c7c5d64e390df6bde56c73 | jq -r '.layers[] | .digest' | xargs -n1 -P10 ./dist fetch docker.io/library/redis > /dev/null ``` This is just the beginning. We should be able to centralize configuration around fetch to implement a number of distribution methodologies that have been challenging or impossible up to this point. The `locator`, mentioned earlier, is a schemaless URL that provides a host and path that can be used to resolve the remote. By dispatching on this common identifier, we should be able to support almost any protocol and discovery mechanism imaginable. When this is more solidified, we can roll these up into higher-level operations that can be orchestrated through the `dist` tool or via GRPC. What a time to be alive! Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-01-20 03:03:44 +00:00
// set of hints and the provided object id. URLs are returned in the order of
// most to least likely succeed.
cmd/dist, remotes: simplify resolution flow After receiving feedback during containerd summit walk through of the pull POC, we found that the resolution flow for names was out of place. We could see this present in awkward places where we were trying to re-resolve whether something was a digest or a tag and extra retries to various endpoints. By centering this problem around, "what do we write in the metadata store?", the following interface comes about: ``` Resolve(ctx context.Context, ref string) (name string, desc ocispec.Descriptor, fetcher Fetcher, err error) ``` The above takes an "opaque" reference (we'll get to this later) and returns the canonical name for the object, a content description of the object and a `Fetcher` that can be used to retrieve the object and its child resources. We can write `name` into the metadata store, pointing at the descriptor. Descisions about discovery, trust, provenance, distribution are completely abstracted away from the pulling code. A first response to such a monstrosity is "that is a lot of return arguments". When we look at the actual, we can see that in practice, the usage pattern works well, albeit we don't quite demonstrate the utility of `name`, which will be more apparent later. Designs that allowed separate resolution of the `Fetcher` and the return of a collected object were considered. Let's give this a chance before we go refactoring this further. With this change, we introduce a reference package with helps for remotes to decompose "docker-esque" references into consituent components, without arbitrarily enforcing those opinions on the backend. Utlimately, the name and the reference used to qualify that name are completely opaque to containerd. Obviously, implementors will need to show some candor in following some conventions, but the possibilities are fairly wide. Structurally, we still maintain the concept of the locator and object but the interpretation is up to the resolver. For the most part, the `dist` tool operates exactly the same, except objects can be fetched with a reference: ``` dist fetch docker.io/library/redis:latest ``` The above should work well with a running containerd instance. I recommend giving this a try with `fetch-object`, as well. With `fetch-object`, it is easy for one to better understand the intricacies of the OCI/Docker image formats. Ultimately, this serves the main purpose of the elusive "metadata store". Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-03-08 00:51:08 +00:00
func getV2URLPaths(desc ocispec.Descriptor) ([]string, error) {
cmd/dist: POC implementation of dist fetch With this changeset we introduce several new things. The first is the top-level dist command. This is a toolkit that implements various distribution primitives, such as fetching, unpacking and ingesting. The first component to this is a simple `fetch` command. It is a low-level command that takes a "remote", identified by a `locator`, and an object identifier. Keyed by the locator, this tool can identify a remote implementation to fetch the content and write it back to standard out. By allowing this to be the unit of pluggability in fetching content, we can have quite a bit of flexibility in how we retrieve images. The current `fetch` implementation provides anonymous access to docker hub images, through the namespace `docker.io`. As an example, one can fetch the manifest for `redis` with the following command: ``` $ ./dist fetch docker.io/library/redis latest mediatype:application/vnd.docker.distribution.manifest.v2+json ``` Note that we have provided a mediatype "hint", nudging the fetch implementation to grab the correct endpoint. We can hash the output of that to fetch the same content by digest: ``` $ ./dist fetch docker.io/library/redis sha256:$(./dist fetch docker.io/library/redis latest mediatype:application/vnd.docker.distribution.manifest.v2+json | shasum -a256) ``` Note that the hint is now elided, since we have affixed the content to a particular hash. If you are not yet entertained, let's bring `jq` and `xargs` into the mix for maximum fun. The following incantation fetches the same manifest and downloads all layers into the convenience of `/dev/null`: ``` $ ./dist fetch docker.io/library/redis sha256:a027a470aa2b9b41cc2539847a97b8a14794ebd0a4c7c5d64e390df6bde56c73 | jq -r '.layers[] | .digest' | xargs -n1 -P10 ./dist fetch docker.io/library/redis > /dev/null ``` This is just the beginning. We should be able to centralize configuration around fetch to implement a number of distribution methodologies that have been challenging or impossible up to this point. The `locator`, mentioned earlier, is a schemaless URL that provides a host and path that can be used to resolve the remote. By dispatching on this common identifier, we should be able to support almost any protocol and discovery mechanism imaginable. When this is more solidified, we can roll these up into higher-level operations that can be orchestrated through the `dist` tool or via GRPC. What a time to be alive! Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-01-20 03:03:44 +00:00
var urls []string
cmd/dist, remotes: simplify resolution flow After receiving feedback during containerd summit walk through of the pull POC, we found that the resolution flow for names was out of place. We could see this present in awkward places where we were trying to re-resolve whether something was a digest or a tag and extra retries to various endpoints. By centering this problem around, "what do we write in the metadata store?", the following interface comes about: ``` Resolve(ctx context.Context, ref string) (name string, desc ocispec.Descriptor, fetcher Fetcher, err error) ``` The above takes an "opaque" reference (we'll get to this later) and returns the canonical name for the object, a content description of the object and a `Fetcher` that can be used to retrieve the object and its child resources. We can write `name` into the metadata store, pointing at the descriptor. Descisions about discovery, trust, provenance, distribution are completely abstracted away from the pulling code. A first response to such a monstrosity is "that is a lot of return arguments". When we look at the actual, we can see that in practice, the usage pattern works well, albeit we don't quite demonstrate the utility of `name`, which will be more apparent later. Designs that allowed separate resolution of the `Fetcher` and the return of a collected object were considered. Let's give this a chance before we go refactoring this further. With this change, we introduce a reference package with helps for remotes to decompose "docker-esque" references into consituent components, without arbitrarily enforcing those opinions on the backend. Utlimately, the name and the reference used to qualify that name are completely opaque to containerd. Obviously, implementors will need to show some candor in following some conventions, but the possibilities are fairly wide. Structurally, we still maintain the concept of the locator and object but the interpretation is up to the resolver. For the most part, the `dist` tool operates exactly the same, except objects can be fetched with a reference: ``` dist fetch docker.io/library/redis:latest ``` The above should work well with a running containerd instance. I recommend giving this a try with `fetch-object`, as well. With `fetch-object`, it is easy for one to better understand the intricacies of the OCI/Docker image formats. Ultimately, this serves the main purpose of the elusive "metadata store". Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-03-08 00:51:08 +00:00
switch desc.MediaType {
case MediaTypeDockerSchema2Manifest, MediaTypeDockerSchema2ManifestList,
ocispec.MediaTypeImageManifest, ocispec.MediaTypeImageIndex:
urls = append(urls, path.Join("manifests", desc.Digest.String()))
cmd/dist: POC implementation of dist fetch With this changeset we introduce several new things. The first is the top-level dist command. This is a toolkit that implements various distribution primitives, such as fetching, unpacking and ingesting. The first component to this is a simple `fetch` command. It is a low-level command that takes a "remote", identified by a `locator`, and an object identifier. Keyed by the locator, this tool can identify a remote implementation to fetch the content and write it back to standard out. By allowing this to be the unit of pluggability in fetching content, we can have quite a bit of flexibility in how we retrieve images. The current `fetch` implementation provides anonymous access to docker hub images, through the namespace `docker.io`. As an example, one can fetch the manifest for `redis` with the following command: ``` $ ./dist fetch docker.io/library/redis latest mediatype:application/vnd.docker.distribution.manifest.v2+json ``` Note that we have provided a mediatype "hint", nudging the fetch implementation to grab the correct endpoint. We can hash the output of that to fetch the same content by digest: ``` $ ./dist fetch docker.io/library/redis sha256:$(./dist fetch docker.io/library/redis latest mediatype:application/vnd.docker.distribution.manifest.v2+json | shasum -a256) ``` Note that the hint is now elided, since we have affixed the content to a particular hash. If you are not yet entertained, let's bring `jq` and `xargs` into the mix for maximum fun. The following incantation fetches the same manifest and downloads all layers into the convenience of `/dev/null`: ``` $ ./dist fetch docker.io/library/redis sha256:a027a470aa2b9b41cc2539847a97b8a14794ebd0a4c7c5d64e390df6bde56c73 | jq -r '.layers[] | .digest' | xargs -n1 -P10 ./dist fetch docker.io/library/redis > /dev/null ``` This is just the beginning. We should be able to centralize configuration around fetch to implement a number of distribution methodologies that have been challenging or impossible up to this point. The `locator`, mentioned earlier, is a schemaless URL that provides a host and path that can be used to resolve the remote. By dispatching on this common identifier, we should be able to support almost any protocol and discovery mechanism imaginable. When this is more solidified, we can roll these up into higher-level operations that can be orchestrated through the `dist` tool or via GRPC. What a time to be alive! Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-01-20 03:03:44 +00:00
}
cmd/dist, remotes: simplify resolution flow After receiving feedback during containerd summit walk through of the pull POC, we found that the resolution flow for names was out of place. We could see this present in awkward places where we were trying to re-resolve whether something was a digest or a tag and extra retries to various endpoints. By centering this problem around, "what do we write in the metadata store?", the following interface comes about: ``` Resolve(ctx context.Context, ref string) (name string, desc ocispec.Descriptor, fetcher Fetcher, err error) ``` The above takes an "opaque" reference (we'll get to this later) and returns the canonical name for the object, a content description of the object and a `Fetcher` that can be used to retrieve the object and its child resources. We can write `name` into the metadata store, pointing at the descriptor. Descisions about discovery, trust, provenance, distribution are completely abstracted away from the pulling code. A first response to such a monstrosity is "that is a lot of return arguments". When we look at the actual, we can see that in practice, the usage pattern works well, albeit we don't quite demonstrate the utility of `name`, which will be more apparent later. Designs that allowed separate resolution of the `Fetcher` and the return of a collected object were considered. Let's give this a chance before we go refactoring this further. With this change, we introduce a reference package with helps for remotes to decompose "docker-esque" references into consituent components, without arbitrarily enforcing those opinions on the backend. Utlimately, the name and the reference used to qualify that name are completely opaque to containerd. Obviously, implementors will need to show some candor in following some conventions, but the possibilities are fairly wide. Structurally, we still maintain the concept of the locator and object but the interpretation is up to the resolver. For the most part, the `dist` tool operates exactly the same, except objects can be fetched with a reference: ``` dist fetch docker.io/library/redis:latest ``` The above should work well with a running containerd instance. I recommend giving this a try with `fetch-object`, as well. With `fetch-object`, it is easy for one to better understand the intricacies of the OCI/Docker image formats. Ultimately, this serves the main purpose of the elusive "metadata store". Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-03-08 00:51:08 +00:00
// always fallback to attempting to get the object out of the blobs store.
urls = append(urls, path.Join("blobs", desc.Digest.String()))
cmd/dist: POC implementation of dist fetch With this changeset we introduce several new things. The first is the top-level dist command. This is a toolkit that implements various distribution primitives, such as fetching, unpacking and ingesting. The first component to this is a simple `fetch` command. It is a low-level command that takes a "remote", identified by a `locator`, and an object identifier. Keyed by the locator, this tool can identify a remote implementation to fetch the content and write it back to standard out. By allowing this to be the unit of pluggability in fetching content, we can have quite a bit of flexibility in how we retrieve images. The current `fetch` implementation provides anonymous access to docker hub images, through the namespace `docker.io`. As an example, one can fetch the manifest for `redis` with the following command: ``` $ ./dist fetch docker.io/library/redis latest mediatype:application/vnd.docker.distribution.manifest.v2+json ``` Note that we have provided a mediatype "hint", nudging the fetch implementation to grab the correct endpoint. We can hash the output of that to fetch the same content by digest: ``` $ ./dist fetch docker.io/library/redis sha256:$(./dist fetch docker.io/library/redis latest mediatype:application/vnd.docker.distribution.manifest.v2+json | shasum -a256) ``` Note that the hint is now elided, since we have affixed the content to a particular hash. If you are not yet entertained, let's bring `jq` and `xargs` into the mix for maximum fun. The following incantation fetches the same manifest and downloads all layers into the convenience of `/dev/null`: ``` $ ./dist fetch docker.io/library/redis sha256:a027a470aa2b9b41cc2539847a97b8a14794ebd0a4c7c5d64e390df6bde56c73 | jq -r '.layers[] | .digest' | xargs -n1 -P10 ./dist fetch docker.io/library/redis > /dev/null ``` This is just the beginning. We should be able to centralize configuration around fetch to implement a number of distribution methodologies that have been challenging or impossible up to this point. The `locator`, mentioned earlier, is a schemaless URL that provides a host and path that can be used to resolve the remote. By dispatching on this common identifier, we should be able to support almost any protocol and discovery mechanism imaginable. When this is more solidified, we can roll these up into higher-level operations that can be orchestrated through the `dist` tool or via GRPC. What a time to be alive! Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-01-20 03:03:44 +00:00
cmd/dist, remotes: simplify resolution flow After receiving feedback during containerd summit walk through of the pull POC, we found that the resolution flow for names was out of place. We could see this present in awkward places where we were trying to re-resolve whether something was a digest or a tag and extra retries to various endpoints. By centering this problem around, "what do we write in the metadata store?", the following interface comes about: ``` Resolve(ctx context.Context, ref string) (name string, desc ocispec.Descriptor, fetcher Fetcher, err error) ``` The above takes an "opaque" reference (we'll get to this later) and returns the canonical name for the object, a content description of the object and a `Fetcher` that can be used to retrieve the object and its child resources. We can write `name` into the metadata store, pointing at the descriptor. Descisions about discovery, trust, provenance, distribution are completely abstracted away from the pulling code. A first response to such a monstrosity is "that is a lot of return arguments". When we look at the actual, we can see that in practice, the usage pattern works well, albeit we don't quite demonstrate the utility of `name`, which will be more apparent later. Designs that allowed separate resolution of the `Fetcher` and the return of a collected object were considered. Let's give this a chance before we go refactoring this further. With this change, we introduce a reference package with helps for remotes to decompose "docker-esque" references into consituent components, without arbitrarily enforcing those opinions on the backend. Utlimately, the name and the reference used to qualify that name are completely opaque to containerd. Obviously, implementors will need to show some candor in following some conventions, but the possibilities are fairly wide. Structurally, we still maintain the concept of the locator and object but the interpretation is up to the resolver. For the most part, the `dist` tool operates exactly the same, except objects can be fetched with a reference: ``` dist fetch docker.io/library/redis:latest ``` The above should work well with a running containerd instance. I recommend giving this a try with `fetch-object`, as well. With `fetch-object`, it is easy for one to better understand the intricacies of the OCI/Docker image formats. Ultimately, this serves the main purpose of the elusive "metadata store". Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-03-08 00:51:08 +00:00
return urls, nil
cmd/dist: POC implementation of dist fetch With this changeset we introduce several new things. The first is the top-level dist command. This is a toolkit that implements various distribution primitives, such as fetching, unpacking and ingesting. The first component to this is a simple `fetch` command. It is a low-level command that takes a "remote", identified by a `locator`, and an object identifier. Keyed by the locator, this tool can identify a remote implementation to fetch the content and write it back to standard out. By allowing this to be the unit of pluggability in fetching content, we can have quite a bit of flexibility in how we retrieve images. The current `fetch` implementation provides anonymous access to docker hub images, through the namespace `docker.io`. As an example, one can fetch the manifest for `redis` with the following command: ``` $ ./dist fetch docker.io/library/redis latest mediatype:application/vnd.docker.distribution.manifest.v2+json ``` Note that we have provided a mediatype "hint", nudging the fetch implementation to grab the correct endpoint. We can hash the output of that to fetch the same content by digest: ``` $ ./dist fetch docker.io/library/redis sha256:$(./dist fetch docker.io/library/redis latest mediatype:application/vnd.docker.distribution.manifest.v2+json | shasum -a256) ``` Note that the hint is now elided, since we have affixed the content to a particular hash. If you are not yet entertained, let's bring `jq` and `xargs` into the mix for maximum fun. The following incantation fetches the same manifest and downloads all layers into the convenience of `/dev/null`: ``` $ ./dist fetch docker.io/library/redis sha256:a027a470aa2b9b41cc2539847a97b8a14794ebd0a4c7c5d64e390df6bde56c73 | jq -r '.layers[] | .digest' | xargs -n1 -P10 ./dist fetch docker.io/library/redis > /dev/null ``` This is just the beginning. We should be able to centralize configuration around fetch to implement a number of distribution methodologies that have been challenging or impossible up to this point. The `locator`, mentioned earlier, is a schemaless URL that provides a host and path that can be used to resolve the remote. By dispatching on this common identifier, we should be able to support almost any protocol and discovery mechanism imaginable. When this is more solidified, we can roll these up into higher-level operations that can be orchestrated through the `dist` tool or via GRPC. What a time to be alive! Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-01-20 03:03:44 +00:00
}