Commit graph

25 commits

Author SHA1 Message Date
Stephen J Day
831f68fd71
cmd/dist, remotes: simplify resolution flow
After receiving feedback during containerd summit walk through of the
pull POC, we found that the resolution flow for names was out of place.
We could see this present in awkward places where we were trying to
re-resolve whether something was a digest or a tag and extra retries to
various endpoints.

By centering this problem around, "what do we write in the metadata
store?", the following interface comes about:

```
Resolve(ctx context.Context, ref string) (name string, desc ocispec.Descriptor, fetcher Fetcher, err error)
```

The above takes an "opaque" reference (we'll get to this later) and
returns the canonical name for the object, a content description of the
object and a `Fetcher` that can be used to retrieve the object and its
child resources. We can write `name` into the metadata store, pointing
at the descriptor. Descisions about discovery, trust, provenance,
distribution are completely abstracted away from the pulling code.

A first response to such a monstrosity is "that is a lot of return
arguments". When we look at the actual, we can see that in practice, the
usage pattern works well, albeit we don't quite demonstrate the utility
of `name`, which will be more apparent later. Designs that allowed
separate resolution of the `Fetcher` and the return of a collected
object were considered. Let's give this a chance before we go
refactoring this further.

With this change, we introduce a reference package with helps for
remotes to decompose "docker-esque" references into consituent
components, without arbitrarily enforcing those opinions on the backend.
Utlimately, the name and the reference used to qualify that name are
completely opaque to containerd. Obviously, implementors will need to
show some candor in following some conventions, but the possibilities
are fairly wide. Structurally, we still maintain the concept of the
locator and object but the interpretation is up to the resolver.

For the most part, the `dist` tool operates exactly the same, except
objects can be fetched with a reference:

```
dist fetch docker.io/library/redis:latest
```

The above should work well with a running containerd instance. I
recommend giving this a try with `fetch-object`, as well. With
`fetch-object`, it is easy for one to better understand the intricacies
of the OCI/Docker image formats.

Ultimately, this serves the main purpose of the elusive "metadata
store".

Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-03-08 16:46:13 -08:00
Stephen J Day
55a1b4eff8 cmd/dist: implement fetch prototype
With the rename of fetch to fetch-object, we now introduce the `fetch`
command. It will fetch all of the resources required for an image into
the content store. We'll still need to follow this up with metadata
registration but this is a good start.

Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-03-02 17:36:01 -08:00
Stephen J Day
ea9389d4c5
cmd/dist: default mediatypes to oci and docker
To make using the `fetch-object` for demonstrations much easier, the
mediatypes are defaulted when a non-digest object identifier is
provided. We also add support for OCI mediatypes, although they are
mostly unavailable.

Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-03-02 16:50:32 -08:00
Stephen J Day
6ab6cdce71
cmd/dist: change fetch to fetch-object command
To allow us to differentiate from fetching an image, fetch a part of an
image and pulling an image, we now call the `fetch` command the
`fetch-object` command. We can now introduce a command that does the
complete image fetch without creating snapshots, allowing `pull` to
perform the entire process.

Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-03-02 13:50:09 -08:00
Stephen J Day
5da4e1d0d2 services/content: move service client into package
Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-02-28 17:12:24 -08:00
Stephen J Day
d61d0b5aef
cmd/dist: add global connect-timeout for GRPC
Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-02-28 16:43:08 -08:00
Stephen J Day
706c629354
api/services/content: define delete method
Allow deletion of content over the GRPC interface. For now, we are going
with a model that conducts reference management outside of the content
store, in the metadata store but this design is valid either way.

Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-02-27 20:06:29 -08:00
Stephen J Day
2e0c92b168
cmd/dist/fetch: address subtle concurrency bug
When using the fetcher concurrently, the loop modifying the closed
`base` parameter was causing urls from different digests to be returned
randomly. We copy the the value and then modify it to make it work
correctly.

Luckily, we are using content addressable storage or this would have
been undetectable.

Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-02-24 18:31:26 -08:00
Michael Crosby
e04df4e3e5 Merge pull request #571 from stevvooe/use-init-func
cmd/dist: consistently replace version string
2017-02-24 16:33:13 -08:00
Stephen J Day
1cdf9dc834
cmd/dist: consistently replace version string
A previous PR placed the version string replacement in the `init`
function in the other commands. This makes this same change consistently
in the `dist` tool.

Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-02-24 16:09:19 -08:00
Daniel, Dao Quang Minh
660783cb00 Merge pull request #548 from fate-grand-order/fixed
Use errors.New() directly to output the error message
2017-02-23 11:06:04 +00:00
Phil Estes
a463ba33fc Merge pull request #561 from stevvooe/correct-versioning
version: finish version setup
2017-02-22 13:48:05 -08:00
Stephen J Day
c062a85782
content: cleanup service and interfaces
After implementing pull, a few changes are required to the content store
interface to make sure that the implementation works smoothly.
Specifically, we work to make sure the predeclaration path for digests
works the same between remote and local writers. Before, we were
hesitent to require the the size and digest up front, but it became
clear that having this provided significant benefit.

There are also several cleanups related to naming. We now call the
expected digest `Expected` consistently across the board and `Total` is
used to mark the expected size.

This whole effort comes together to provide a very smooth status
reporting workflow for image pull and push. This will be more obvious
when the bulk of pull code lands.

There are a few other changes to make `content.WriteBlob` more broadly
useful. In accordance with addition for predeclaring expected size when
getting a `Writer`, `WriteBlob` now supports this fully. It will also
resume downloads if provided an `io.Seeker` or `io.ReaderAt`. Coupled
with the `httpReadSeeker` from `docker/distribution`, we should only be
a lines of code away from resumable downloads.

Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-02-22 13:30:01 -08:00
Stephen J Day
935144fadd
version: finish version setup
This setup will now correctly set the version number from the git tag.
When using `--version`, we will see the binary name, the package it was
built from and a git hash based on the tag:

```console
$./bin/dist -v
./bin/dist github.com/docker/containerd 0b45d91.m
```

Note that in the above example, if we set a tag of `v1.0.0-dev`, that
will show up in the version number, as follows:

```console
$./bin/dist -v
./bin/dist github.com/docker/containerd v1.0.0-dev
```

Once commits are made past that tag, the version number will be
expressed relative to that tag and include a git hash:

```console
$./bin/dist -v
./bin/dist github.com/docker/containerd v1.0.0-dev-1-g7953e96.m
```

Some these examples include a `.m` postfix. This indicates that the
binary was build from a source tree with local modifications.

We can add a dev tag to start getting 1.0 version numbers for test
builds.

Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-02-22 13:16:06 -08:00
fate-grand-order
08405824ad Use errors.New() directly to output the error message
Signed-off-by: fate-grand-order <chenjg@harmonycloud.cn>
2017-02-22 10:53:16 +08:00
Stephen J Day
e6efb397cf
cmd/dist: port commands over to use GRPC content store
Following from the rest of the work in this branch, we now are porting
the dist command to work directly against the containerd content API.

Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-02-21 13:10:31 -08:00
Stephen J Day
621164bc84
content: refactor content store for API
After iterating on the GRPC API, the changes required for the actual
implementation are now included in the content store. The begin change
is the move to a single, atomic `Ingester.Writer` method for locking
content ingestion on a key. From this, comes several new interface
definitions.

The main benefit here is the clarification between `Status` and `Info`
that came out of the GPRC API. `Status` tells the status of a write,
whereas `Info` is for querying metadata about various blobs.

Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-02-21 13:10:22 -08:00
Michael Crosby
1d08c7bc5c Merge pull request #549 from fate-grand-order/typo
correct misspell in cmd/dist/fetch.go and events/transaction.go
2017-02-21 11:14:15 -08:00
fate-grand-order
3626ee7b77 correct misspell in cmd/dist/fetch.go and events/transaction.go
Signed-off-by: fate-grand-order <chenjg@harmonycloud.cn>
2017-02-21 20:24:04 +08:00
Derek McGowan
6443891a7d Update log lines to use containerd log package
Removed unused requires root test function and updated
tar requires function to use lookup method.

Signed-off-by: Derek McGowan <derek@mcgstyle.net> (github: dmcgowan)
2017-02-17 11:50:49 -08:00
Derek McGowan
f0a43e72cd Update layer apply to use containerd archive
Signed-off-by: Derek McGowan <derek@mcgstyle.net> (github: dmcgowan)
2017-02-17 11:50:49 -08:00
fate-grand-order
af86cd4d2f Use error.New () directly to output the error message
Signed-off-by: fate-grand-order <chenjg@harmonycloud.cn>
2017-02-10 14:31:49 +08:00
Stephen J Day
3e0238612b
dist: provide apply command to build rootfs
This changeset adds the simple apply command. It consumes a tar layer
and applies that layer to the specified directory. For the most part, it
is a direct call into Docker's `pkg/archive.ApplyLayer`.

The following demonstrates unpacking the wordpress rootfs into a local
directory `wordpress`:

```
$ ./dist fetch docker.io/library/wordpress 4.5 mediatype:application/vnd.docker.distribution.manifest.v2+json | \
    jq -r '.layers[] | "sudo ./dist apply ./wordpress < $(./dist path -n "+.digest+")"' | xargs -I{} -n1 sh -c "{}"
```

Note that you should have fetched the layers into the local content
store before running the above. Alternatively, you can just read the
manifest from the content store, rather than fetching it. We use fetch
above to avoid having to lookup the manifest digest for our demo.

This tool has a long way to go. We still need to incorporate
snapshotting, as well as the ability to calculate the `ChainID` under
subsequent unpacking. Once we have some tools to play around with
snapshotting, we'll be able to incorporate our `rootfs.ApplyLayer`
algorithm that will get us a lot closer to a production worthy system.

Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-01-27 11:00:29 -08:00
Stephen J Day
f9cd9be61a
dist: expand functionality of the dist tool
With this change, we add the following commands to the dist tool:

- `ingest`: verify and accept content into storage
- `active`: display active ingest processes
- `list`: list content in storage
- `path`: provide a path to a blob by digest
- `delete`: remove a piece of content from storage

We demonstrate the utility with the following shell pipeline:

```
$ ./dist fetch docker.io/library/redis latest mediatype:application/vnd.docker.distribution.manifest.v2+json | \
    jq -r '.layers[] | "./dist fetch docker.io/library/redis "+.digest + "| ./dist ingest --expected-digest "+.digest+" --expected-size "+(.size | tostring) +" docker.io/library/redis@"+.digest' | xargs -I{} -P10 -n1 sh -c "{}"
```

The above fetches a manifest, pipes it to jq, which assembles a shell
pipeline to ingest each layer into the content store. Because the
transactions are keyed by their digest, concurrent downloads and
downloads of repeated content are ignored. Each process is then executed
parallel using xargs.

Put shortly, this is a parallel layer download.

In a separate shell session, could monitor the active downloads with the
following:

```
$ watch -n0.2 ./dist active
```

For now, the content is downloaded into `.content` in the current
working directory. To watch the contents of this directory, you can use
the following:

```
$ watch -n0.2 tree .content
```

This will help to understand what is going on internally.

To get access to the layers, you can use the path command:

```
$./dist path sha256:010c454d55e53059beaba4044116ea4636f8dd8181e975d893931c7e7204fffa
sha256:010c454d55e53059beaba4044116ea4636f8dd8181e975d893931c7e7204fffa /home/sjd/go/src/github.com/docker/containerd/.content/blobs/sha256/010c454d55e53059beaba4044116ea4636f8dd8181e975d893931c7e7204fffa
```

When you are done, you can clear out the content with the classic xargs
pipeline:

```
$ ./dist list -q | xargs ./dist delete
```

Note that this is mostly a POC. Things like failed downloads and
abandoned download cleanup aren't quite handled. We'll probably make
adjustments around how content store transactions are handled to address
this.

From here, we'll build out full image pull and create tooling to get
runtime bundles from the fetched content.

Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-01-27 10:29:10 -08:00
Stephen J Day
19eecaab12
cmd/dist: POC implementation of dist fetch
With this changeset we introduce several new things. The first is the
top-level dist command. This is a toolkit that implements various
distribution primitives, such as fetching, unpacking and ingesting.

The first component to this is a simple `fetch` command. It is a
low-level command that takes a "remote", identified by a `locator`, and
an object identifier. Keyed by the locator, this tool can identify a
remote implementation to fetch the content and write it back to standard
out. By allowing this to be the unit of pluggability in fetching
content, we can have quite a bit of flexibility in how we retrieve
images.

The current `fetch` implementation provides anonymous access to docker
hub images, through the namespace `docker.io`. As an example, one can
fetch the manifest for `redis` with the following command:

```
$ ./dist fetch docker.io/library/redis latest mediatype:application/vnd.docker.distribution.manifest.v2+json
```

Note that we have provided a mediatype "hint", nudging the fetch
implementation to grab the correct endpoint. We can hash the output of
that to fetch the same content by digest:

```
$ ./dist fetch docker.io/library/redis sha256:$(./dist fetch docker.io/library/redis latest mediatype:application/vnd.docker.distribution.manifest.v2+json | shasum -a256)
```

Note that the hint is now elided, since we have affixed the content to a
particular hash.

If you are not yet entertained, let's bring `jq` and `xargs` into the
mix for maximum fun. The following incantation fetches the same manifest
and downloads all layers into the convenience of `/dev/null`:

```
$ ./dist fetch docker.io/library/redis sha256:a027a470aa2b9b41cc2539847a97b8a14794ebd0a4c7c5d64e390df6bde56c73 | jq -r '.layers[] | .digest' | xargs -n1 -P10 ./dist fetch docker.io/library/redis > /dev/null
```

This is just the beginning. We should be able to centralize
configuration around fetch to implement a number of distribution
methodologies that have been challenging or impossible up to this point.
The `locator`, mentioned earlier, is a schemaless URL that provides a
host and path that can be used to resolve the remote. By dispatching on
this common identifier, we should be able to support almost any protocol
and discovery mechanism imaginable.

When this is more solidified, we can roll these up into higher-level
operations that can be orchestrated through the `dist` tool or via GRPC.

What a time to be alive!

Signed-off-by: Stephen J Day <stephen.day@docker.com>
2017-01-23 13:27:07 -08:00