containerd

Author	SHA1	Message	Date
Stephen J Day	e53539c58f	cmd/dist, cmd/ctr: end to end image pull With this changeset, we now have a proof of concept of end to end pull. Up to this point, the relationship between subsystems has been somewhat theoretical. We now leverage fetching, the snapshot drivers, the rootfs service, image metadata and the execution service, validating the proposed model for containerd. There are a few caveats, including the need to move some of the access into GRPC services, but the basic components are there. The first command we will cover here is `dist pull`. This is the analog of `docker pull` and `git pull`. It performs a full resource fetch for an image and unpacks the root filesystem into the snapshot drivers. An example follows: ``` console $ sudo ./bin/dist pull docker.io/library/redis:latest docker.io/library/redis:latest: resolved \|++++++++++++++++++++++++++++++++++++++\| manifest-sha256:4c8fb09e8d634ab823b1c125e64f0e1ceaf216025aa38283ea1b42997f1e8059: done \|++++++++++++++++++++++++++++++++++++++\| layer-sha256:3b281f2bcae3b25c701d53a219924fffe79bdb74385340b73a539ed4020999c4: done \|++++++++++++++++++++++++++++++++++++++\| config-sha256:e4a35914679d05d25e2fccfd310fde1aa59ffbbf1b0b9d36f7b03db5ca0311b0: done \|++++++++++++++++++++++++++++++++++++++\| layer-sha256:4b7726832aec75f0a742266c7190c4d2217492722dfd603406208eaa902648d8: done \|++++++++++++++++++++++++++++++++++++++\| layer-sha256:338a7133395941c85087522582af182d2f6477dbf54ba769cb24ec4fd91d728f: done \|++++++++++++++++++++++++++++++++++++++\| layer-sha256:83f12ff60ff1132d1e59845e26c41968406b4176c1a85a50506c954696b21570: done \|++++++++++++++++++++++++++++++++++++++\| layer-sha256:693502eb7dfbc6b94964ae66ebc72d3e32facd981c72995b09794f1e87bac184: done \|++++++++++++++++++++++++++++++++++++++\| layer-sha256:622732cddc347afc9360b4b04b46c6f758191a1dc73d007f95548658847ee67e: done \|++++++++++++++++++++++++++++++++++++++\| layer-sha256:19a7e34366a6f558336c364693df538c38307484b729a36fede76432789f084f: done 
\|++++++++++++++++++++++++++++++++++++++\| elapsed: 1.6 s total: 0.0 B (0.0 B/s) INFO[0001] unpacking rootfs ``` Note that we haven't integrated rootfs unpacking into the status output, but we pretty much have what is in docker today (:P). We can see the result of our pull with the following: ```console $ sudo ./bin/dist images REF TYPE DIGEST SIZE docker.io/library/redis:latest application/vnd.docker.distribution.manifest.v2+json sha256:4c8fb09e8d634ab823b1c125e64f0e1ceaf216025aa38283ea1b42997f1e8059 1.8 kB ``` The above shows that we have an image called "docker.io/library/redis:latest" mapped to the given digest marked with a specific format. We get the size of the manifest right now, not the full image, but we can add more as we need it. For the most part, this is all that is needed, but a few tweaks to the model for naming may need to be added. Specifically, we may want to index under a few different names, including those qualified by hash or matched by tag versions. We can do more work in this area as we develop the metadata store. The name shown above can then be used to run the actual container image. We can do this with the following command: ```console $ sudo ./bin/ctr run --id foo docker.io/library/redis:latest /usr/local/bin/redis-server 1:C 17 Mar 17:20:25.316 # Warning: no config file specified, using the default config. In order to specify a config file use /usr/local/bin/redis-server /path/to/redis.conf 1:M 17 Mar 17:20:25.317 * Increased maximum number of open files to 10032 (it was originally set to 1024). _._ _.-``__ ''-._ _.-`` `. `_. ''-._ Redis 3.2.8 (00000000/0) 64 bit .-`` .-```. 
```\/ _.,_ ''-._ ( ' , .-` \| `, ) Running in standalone mode \|`-._`-...-` __...-.``-._\|'` _.-'\| Port: 6379 \| `-._ `._ / _.-' \| PID: 1 `-._ `-._ `-./ _.-' _.-' \|`-._`-._ `-.__.-' _.-'_.-'\| \| `-._`-._ _.-'_.-' \| http://redis.io `-._ `-._`-.__.-'_.-' _.-' \|`-._`-._ `-.__.-' _.-'_.-'\| \| `-._`-._ _.-'_.-' \| `-._ `-._`-.__.-'_.-' _.-' `-._ `-.__.-' _.-' `-._ _.-' `-.__.-' 1:M 17 Mar 17:20:25.326 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128. 1:M 17 Mar 17:20:25.326 # Server started, Redis version 3.2.8 1:M 17 Mar 17:20:25.326 # WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect. 1:M 17 Mar 17:20:25.326 # WARNING you have Transparent Huge Pages (THP) support enabled in your kernel. This will create latency and memory usage issues with Redis. To fix this issue run the command 'echo never > /sys/kernel/mm/transparent_hugepage/enabled' as root, and add it to your /etc/rc.local in order to retain the setting after a reboot. Redis must be restarted after THP is disabled. 1:M 17 Mar 17:20:25.326 * The server is now ready to accept connections on port 6379 ``` Wow! So, now we are running `redis`! There are still a few things to work out. Notice that we have to specify the command as part of the arguments to `ctr run`. This is because we are not yet reading the image config and converting it to an OCI runtime config. With the base laid in this PR, adding such functionality should be straightforward. While this is a _little_ messy, this is great progress. It should be easy sailing from here. Signed-off-by: Stephen J Day <stephen.day@docker.com>	2017-03-21 13:08:23 -07:00
Stephen J Day	5a3151eefc	cmd/dist, image, remotes: introduce image handlers With this PR, we introduce the concept of image handlers. They support walking a tree of image resource descriptors for doing various tasks related to processing them. Handlers can be dispatched sequentially or in parallel and can be stacked for various effects. The main functionality we introduce here is parameterized fetch without coupling format resolution to the process itself. Two important handlers, `remotes.FetchHandler` and `image.ChildrenHandler` can be composed to implement recursive fetch with full status reporting. The approach can also be modified to filter based on platform or other constraints, unlocking a lot of possibilities. This also includes some light refactoring in the fetch command, in preparation for submission of end to end pull. Signed-off-by: Stephen J Day <stephen.day@docker.com>	2017-03-17 15:47:50 -07:00
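The handler idea described in commit 5a3151eefc can be illustrated with a minimal Go sketch. Everything below is an assumption made for illustration: the `Descriptor`, `Handler`, `Handlers`, `Walk` and `visitOrder` names are hypothetical simplifications, not the signatures of `remotes.FetchHandler` or `image.ChildrenHandler`, and only sequential dispatch is shown. A "record" handler stands in for fetching, and a "children" handler reads child descriptors from an in-memory map.

```go
package main

import "fmt"

// Descriptor is a simplified, hypothetical stand-in for an image resource descriptor.
type Descriptor struct {
	MediaType string
	Digest    string
}

// Handler processes one descriptor and returns the children to visit next.
type Handler func(desc Descriptor) ([]Descriptor, error)

// Handlers stacks several handlers into one. Each handler runs in order and
// the children they return are concatenated.
func Handlers(hs ...Handler) Handler {
	return func(desc Descriptor) ([]Descriptor, error) {
		var children []Descriptor
		for _, h := range hs {
			ch, err := h(desc)
			if err != nil {
				return nil, err
			}
			children = append(children, ch...)
		}
		return children, nil
	}
}

// Walk applies h to desc and then, sequentially, to every descendant h reports.
func Walk(h Handler, desc Descriptor) error {
	children, err := h(desc)
	if err != nil {
		return err
	}
	for _, c := range children {
		if err := Walk(h, c); err != nil {
			return err
		}
	}
	return nil
}

// visitOrder walks an in-memory tree (keyed by parent digest) and returns the
// digests in the order they were visited.
func visitOrder(tree map[string][]Descriptor, root Descriptor) ([]string, error) {
	var seen []string
	// record stands in for a fetch step: it only notes the digest.
	record := func(desc Descriptor) ([]Descriptor, error) {
		seen = append(seen, desc.Digest)
		return nil, nil
	}
	// children looks up the child descriptors of the current node.
	children := func(desc Descriptor) ([]Descriptor, error) {
		return tree[desc.Digest], nil
	}
	err := Walk(Handlers(record, children), root)
	return seen, err
}

func main() {
	tree := map[string][]Descriptor{
		"manifest": {
			{MediaType: "config", Digest: "config"},
			{MediaType: "layer", Digest: "layer-1"},
			{MediaType: "layer", Digest: "layer-2"},
		},
	}
	seen, err := visitOrder(tree, Descriptor{MediaType: "manifest", Digest: "manifest"})
	if err != nil {
		panic(err)
	}
	fmt.Println(seen)
}
```

Stacking the two handlers keeps the "what to do with a descriptor" step separate from the "where are its children" step, which is the decoupling the commit message describes. A filter on platform or media type could be added as a third handler in the same way.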
Stephen Day	bb3fbded9c	Merge pull request #632 from dmcgowan/rootfs-fixes Fix rootfs digest computation	2017-03-16 12:04:49 -07:00
Akihiro Suda	6089c1525b	new package: compression (ported from docker/pkg/archive) Signed-off-by: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>	2017-03-16 05:29:27 +00:00
Derek McGowan	4492a2cee3	Fix rootfs digest computation Compute digest from uncompressed archive. Properly propagate error on unpack. Rename dist cmd commands to match command name. Signed-off-by: Derek McGowan <derek@mcgstyle.net> (github: dmcgowan)	2017-03-15 17:17:25 -07:00
Derek McGowan	212efa578a	Remove get function from rootfs The service can use the snapshotter directly to get the rootfs. Removed debug line for mount response. Signed-off-by: Derek McGowan <derek@mcgstyle.net> (github: dmcgowan)	2017-03-15 16:32:21 -07:00
Derek McGowan	b1bc82726f	Rename prepare to unpack and init to prepare Unpack and prepare better map to the actions done by rootfs. Signed-off-by: Derek McGowan <derek@mcgstyle.net> (github: dmcgowan)	2017-03-15 16:32:21 -07:00
Derek McGowan	3a20dd41d5	Add init subcommand to rootfs Init command gets the mounts for a given chain id and outputs a mount command. Signed-off-by: Derek McGowan <derek@mcgstyle.net> (github: dmcgowan)	2017-03-15 16:32:21 -07:00
Derek McGowan	38a6f90f2b	Add rootfs command to dist Command allows preparing a rootfs from a manifest hash Signed-off-by: Derek McGowan <derek@mcgstyle.net> (github: dmcgowan)	2017-03-15 16:32:21 -07:00
Stephen J Day	831f68fd71	cmd/dist, remotes: simplify resolution flow After receiving feedback during containerd summit walk through of the pull POC, we found that the resolution flow for names was out of place. We could see this present in awkward places where we were trying to re-resolve whether something was a digest or a tag and extra retries to various endpoints. By centering this problem around, "what do we write in the metadata store?", the following interface comes about: ``` Resolve(ctx context.Context, ref string) (name string, desc ocispec.Descriptor, fetcher Fetcher, err error) ``` The above takes an "opaque" reference (we'll get to this later) and returns the canonical name for the object, a content description of the object and a `Fetcher` that can be used to retrieve the object and its child resources. We can write `name` into the metadata store, pointing at the descriptor. Decisions about discovery, trust, provenance, distribution are completely abstracted away from the pulling code. A first response to such a monstrosity is "that is a lot of return arguments". When we look at the actual code, we can see that in practice, the usage pattern works well, albeit we don't quite demonstrate the utility of `name`, which will be more apparent later. Designs that allowed separate resolution of the `Fetcher` and the return of a collected object were considered. Let's give this a chance before we go refactoring this further. With this change, we introduce a reference package with helpers for remotes to decompose "docker-esque" references into constituent components, without arbitrarily enforcing those opinions on the backend. Ultimately, the name and the reference used to qualify that name are completely opaque to containerd. Obviously, implementors will need to show some candor in following some conventions, but the possibilities are fairly wide. Structurally, we still maintain the concept of the locator and object but the interpretation is up to the resolver. 
For the most part, the `dist` tool operates exactly the same, except objects can be fetched with a reference: ``` dist fetch docker.io/library/redis:latest ``` The above should work well with a running containerd instance. I recommend giving this a try with `fetch-object`, as well. With `fetch-object`, it is easy for one to better understand the intricacies of the OCI/Docker image formats. Ultimately, this serves the main purpose of the elusive "metadata store". Signed-off-by: Stephen J Day <stephen.day@docker.com>	2017-03-08 16:46:13 -08:00
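To make the locator/object split from commit 831f68fd71 concrete, here is a small Go sketch of how a "docker-esque" reference might be decomposed. This is an assumption made for illustration only: `splitRef` is a hypothetical helper, not the API of the reference package introduced by the commit, and the parsing rules ("@" before a digest, the last ":" after the last "/" before a tag) are a simplification of common Docker naming conventions.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// splitRef is a hypothetical helper that splits a reference into a locator
// (host and path) and an object (tag or digest). It is a sketch, not the
// behaviour of containerd's reference package.
func splitRef(ref string) (locator, object string, err error) {
	if ref == "" {
		return "", "", errors.New("empty reference")
	}
	// A digest-qualified reference uses "@" as the separator.
	if i := strings.Index(ref, "@"); i >= 0 {
		return ref[:i], ref[i+1:], nil
	}
	// A tag follows the last ":" only when that colon comes after the last
	// "/", so a port in the host part is not mistaken for a tag.
	if i := strings.LastIndex(ref, ":"); i > strings.LastIndex(ref, "/") {
		return ref[:i], ref[i+1:], nil
	}
	// No object given; the resolver decides what that means.
	return ref, "", nil
}

func main() {
	refs := []string{
		"docker.io/library/redis:latest",
		"docker.io/library/redis@sha256:abc",
		"localhost:5000/redis",
	}
	for _, ref := range refs {
		locator, object, err := splitRef(ref)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%q -> locator=%q object=%q\n", ref, locator, object)
	}
}
```

Because the commit stresses that names and references stay opaque to containerd, a helper like this would belong in a resolver implementation rather than in the pulling code.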
Stephen J Day	55a1b4eff8	cmd/dist: implement fetch prototype With the rename of fetch to fetch-object, we now introduce the `fetch` command. It will fetch all of the resources required for an image into the content store. We'll still need to follow this up with metadata registration but this is a good start. Signed-off-by: Stephen J Day <stephen.day@docker.com>	2017-03-02 17:36:01 -08:00
Stephen J Day	ea9389d4c5	cmd/dist: default mediatypes to oci and docker To make using the `fetch-object` for demonstrations much easier, the mediatypes are defaulted when a non-digest object identifier is provided. We also add support for OCI mediatypes, although they are mostly unavailable. Signed-off-by: Stephen J Day <stephen.day@docker.com>	2017-03-02 16:50:32 -08:00
Stephen J Day	6ab6cdce71	cmd/dist: change fetch to fetch-object command To allow us to differentiate from fetching an image, fetch a part of an image and pulling an image, we now call the `fetch` command the `fetch-object` command. We can now introduce a command that does the complete image fetch without creating snapshots, allowing `pull` to perform the entire process. Signed-off-by: Stephen J Day <stephen.day@docker.com>	2017-03-02 13:50:09 -08:00
Stephen J Day	5da4e1d0d2	services/content: move service client into package Signed-off-by: Stephen J Day <stephen.day@docker.com>	2017-02-28 17:12:24 -08:00
Stephen J Day	d61d0b5aef	cmd/dist: add global connect-timeout for GRPC Signed-off-by: Stephen J Day <stephen.day@docker.com>	2017-02-28 16:43:08 -08:00
Stephen J Day	706c629354	api/services/content: define delete method Allow deletion of content over the GRPC interface. For now, we are going with a model that conducts reference management outside of the content store, in the metadata store but this design is valid either way. Signed-off-by: Stephen J Day <stephen.day@docker.com>	2017-02-27 20:06:29 -08:00
Stephen J Day	2e0c92b168	cmd/dist/fetch: address subtle concurrency bug When using the fetcher concurrently, the loop modifying the closed `base` parameter was causing urls from different digests to be returned randomly. We copy the value and then modify it to make it work correctly. Luckily, we are using content addressable storage or this would have been undetectable. Signed-off-by: Stephen J Day <stephen.day@docker.com>	2017-02-24 18:31:26 -08:00
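The class of bug fixed in commit 2e0c92b168 can be sketched in Go as follows. This is not the code from `cmd/dist/fetch`: the `newFetchURL` and `resolveAll` names, the `blobs` path segment and the `registry.example` host are all hypothetical. The sketch only shows the general remedy the message names, which is to copy a value captured by a closure before modifying it so that concurrent callers never share a mutated `base`.

```go
package main

import (
	"fmt"
	"net/url"
	"path"
	"sync"
)

// newFetchURL returns a closure that builds the URL for a digest under base.
// The closure copies the captured base before changing its path, so calls
// running concurrently cannot observe each other's modifications.
func newFetchURL(rawBase string) (func(digest string) string, error) {
	base, err := url.Parse(rawBase)
	if err != nil {
		return nil, err
	}
	return func(digest string) string {
		u := *base // copy the value; the shared base is never written
		u.Path = path.Join(u.Path, "blobs", digest)
		return u.String()
	}, nil
}

// resolveAll calls fetchURL concurrently, once per digest, and collects the results.
func resolveAll(fetchURL func(string) string, digests []string) map[string]string {
	var (
		mu  sync.Mutex
		wg  sync.WaitGroup
		out = make(map[string]string, len(digests))
	)
	for _, d := range digests {
		wg.Add(1)
		go func(d string) {
			defer wg.Done()
			u := fetchURL(d)
			mu.Lock()
			out[d] = u
			mu.Unlock()
		}(d)
	}
	wg.Wait()
	return out
}

func main() {
	fetchURL, err := newFetchURL("https://registry.example/v2/library/redis")
	if err != nil {
		panic(err)
	}
	for d, u := range resolveAll(fetchURL, []string{"aaa", "bbb"}) {
		fmt.Println(d, "->", u)
	}
}
```

Had the closure written to `base.Path` directly, two goroutines could interleave and return a URL built for the other's digest, which matches the symptom the commit describes.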
Michael Crosby	e04df4e3e5	Merge pull request #571 from stevvooe/use-init-func cmd/dist: consistently replace version string	2017-02-24 16:33:13 -08:00
Stephen J Day	1cdf9dc834	cmd/dist: consistently replace version string A previous PR placed the version string replacement in the `init` function in the other commands. This makes this same change consistently in the `dist` tool. Signed-off-by: Stephen J Day <stephen.day@docker.com>	2017-02-24 16:09:19 -08:00
Daniel, Dao Quang Minh	660783cb00	Merge pull request #548 from fate-grand-order/fixed Use errors.New() directly to output the error message	2017-02-23 11:06:04 +00:00
Phil Estes	a463ba33fc	Merge pull request #561 from stevvooe/correct-versioning version: finish version setup	2017-02-22 13:48:05 -08:00
Stephen J Day	c062a85782	content: cleanup service and interfaces After implementing pull, a few changes are required to the content store interface to make sure that the implementation works smoothly. Specifically, we work to make sure the predeclaration path for digests works the same between remote and local writers. Before, we were hesitant to require the size and digest up front, but it became clear that having this provided significant benefit. There are also several cleanups related to naming. We now call the expected digest `Expected` consistently across the board and `Total` is used to mark the expected size. This whole effort comes together to provide a very smooth status reporting workflow for image pull and push. This will be more obvious when the bulk of pull code lands. There are a few other changes to make `content.WriteBlob` more broadly useful. In accordance with addition for predeclaring expected size when getting a `Writer`, `WriteBlob` now supports this fully. It will also resume downloads if provided an `io.Seeker` or `io.ReaderAt`. Coupled with the `httpReadSeeker` from `docker/distribution`, we should only be a few lines of code away from resumable downloads. Signed-off-by: Stephen J Day <stephen.day@docker.com>	2017-02-22 13:30:01 -08:00
Stephen J Day	935144fadd	version: finish version setup This setup will now correctly set the version number from the git tag. When using `--version`, we will see the binary name, the package it was built from and a git hash based on the tag: ```console $./bin/dist -v ./bin/dist github.com/docker/containerd 0b45d91.m ``` Note that in the above example, if we set a tag of `v1.0.0-dev`, that will show up in the version number, as follows: ```console $./bin/dist -v ./bin/dist github.com/docker/containerd v1.0.0-dev ``` Once commits are made past that tag, the version number will be expressed relative to that tag and include a git hash: ```console $./bin/dist -v ./bin/dist github.com/docker/containerd v1.0.0-dev-1-g7953e96.m ``` Some of these examples include a `.m` postfix. This indicates that the binary was built from a source tree with local modifications. We can add a dev tag to start getting 1.0 version numbers for test builds. Signed-off-by: Stephen J Day <stephen.day@docker.com>	2017-02-22 13:16:06 -08:00
fate-grand-order	08405824ad	Use errors.New() directly to output the error message Signed-off-by: fate-grand-order <chenjg@harmonycloud.cn>	2017-02-22 10:53:16 +08:00
Stephen J Day	e6efb397cf	cmd/dist: port commands over to use GRPC content store Following from the rest of the work in this branch, we now are porting the dist command to work directly against the containerd content API. Signed-off-by: Stephen J Day <stephen.day@docker.com>	2017-02-21 13:10:31 -08:00
Stephen J Day	621164bc84	content: refactor content store for API After iterating on the GRPC API, the changes required for the actual implementation are now included in the content store. The main change is the move to a single, atomic `Ingester.Writer` method for locking content ingestion on a key. From this, comes several new interface definitions. The main benefit here is the clarification between `Status` and `Info` that came out of the GRPC API. `Status` tells the status of a write, whereas `Info` is for querying metadata about various blobs. Signed-off-by: Stephen J Day <stephen.day@docker.com>	2017-02-21 13:10:22 -08:00
Michael Crosby	1d08c7bc5c	Merge pull request #549 from fate-grand-order/typo correct misspell in cmd/dist/fetch.go and events/transaction.go	2017-02-21 11:14:15 -08:00
fate-grand-order	3626ee7b77	correct misspell in cmd/dist/fetch.go and events/transaction.go Signed-off-by: fate-grand-order <chenjg@harmonycloud.cn>	2017-02-21 20:24:04 +08:00
Derek McGowan	6443891a7d	Update log lines to use containerd log package Removed unused requires root test function and updated tar requires function to use lookup method. Signed-off-by: Derek McGowan <derek@mcgstyle.net> (github: dmcgowan)	2017-02-17 11:50:49 -08:00
Derek McGowan	f0a43e72cd	Update layer apply to use containerd archive Signed-off-by: Derek McGowan <derek@mcgstyle.net> (github: dmcgowan)	2017-02-17 11:50:49 -08:00
fate-grand-order	af86cd4d2f	Use errors.New() directly to output the error message Signed-off-by: fate-grand-order <chenjg@harmonycloud.cn>	2017-02-10 14:31:49 +08:00
Stephen J Day	3e0238612b	dist: provide apply command to build rootfs This changeset adds the simple apply command. It consumes a tar layer and applies that layer to the specified directory. For the most part, it is a direct call into Docker's `pkg/archive.ApplyLayer`. The following demonstrates unpacking the wordpress rootfs into a local directory `wordpress`: ``` $ ./dist fetch docker.io/library/wordpress 4.5 mediatype:application/vnd.docker.distribution.manifest.v2+json \| \ jq -r '.layers[] \| "sudo ./dist apply ./wordpress < $(./dist path -n "+.digest+")"' \| xargs -I{} -n1 sh -c "{}" ``` Note that you should have fetched the layers into the local content store before running the above. Alternatively, you can just read the manifest from the content store, rather than fetching it. We use fetch above to avoid having to lookup the manifest digest for our demo. This tool has a long way to go. We still need to incorporate snapshotting, as well as the ability to calculate the `ChainID` under subsequent unpacking. Once we have some tools to play around with snapshotting, we'll be able to incorporate our `rootfs.ApplyLayer` algorithm that will get us a lot closer to a production worthy system. Signed-off-by: Stephen J Day <stephen.day@docker.com>	2017-01-27 11:00:29 -08:00
Stephen J Day	f9cd9be61a	dist: expand functionality of the dist tool With this change, we add the following commands to the dist tool: - `ingest`: verify and accept content into storage - `active`: display active ingest processes - `list`: list content in storage - `path`: provide a path to a blob by digest - `delete`: remove a piece of content from storage We demonstrate the utility with the following shell pipeline: ``` $ ./dist fetch docker.io/library/redis latest mediatype:application/vnd.docker.distribution.manifest.v2+json \| \ jq -r '.layers[] \| "./dist fetch docker.io/library/redis "+.digest + "\| ./dist ingest --expected-digest "+.digest+" --expected-size "+(.size \| tostring) +" docker.io/library/redis@"+.digest' \| xargs -I{} -P10 -n1 sh -c "{}" ``` The above fetches a manifest, pipes it to jq, which assembles a shell pipeline to ingest each layer into the content store. Because the transactions are keyed by their digest, concurrent downloads and downloads of repeated content are ignored. Each process is then executed in parallel using xargs. Put shortly, this is a parallel layer download. In a separate shell session, you could monitor the active downloads with the following: ``` $ watch -n0.2 ./dist active ``` For now, the content is downloaded into `.content` in the current working directory. To watch the contents of this directory, you can use the following: ``` $ watch -n0.2 tree .content ``` This will help to understand what is going on internally. 
To get access to the layers, you can use the path command: ``` $./dist path sha256:010c454d55e53059beaba4044116ea4636f8dd8181e975d893931c7e7204fffa sha256:010c454d55e53059beaba4044116ea4636f8dd8181e975d893931c7e7204fffa /home/sjd/go/src/github.com/docker/containerd/.content/blobs/sha256/010c454d55e53059beaba4044116ea4636f8dd8181e975d893931c7e7204fffa ``` When you are done, you can clear out the content with the classic xargs pipeline: ``` $ ./dist list -q \| xargs ./dist delete ``` Note that this is mostly a POC. Things like failed downloads and abandoned download cleanup aren't quite handled. We'll probably make adjustments around how content store transactions are handled to address this. From here, we'll build out full image pull and create tooling to get runtime bundles from the fetched content. Signed-off-by: Stephen J Day <stephen.day@docker.com>	2017-01-27 10:29:10 -08:00
Stephen J Day	19eecaab12	cmd/dist: POC implementation of dist fetch With this changeset we introduce several new things. The first is the top-level dist command. This is a toolkit that implements various distribution primitives, such as fetching, unpacking and ingesting. The first component to this is a simple `fetch` command. It is a low-level command that takes a "remote", identified by a `locator`, and an object identifier. Keyed by the locator, this tool can identify a remote implementation to fetch the content and write it back to standard out. By allowing this to be the unit of pluggability in fetching content, we can have quite a bit of flexibility in how we retrieve images. The current `fetch` implementation provides anonymous access to docker hub images, through the namespace `docker.io`. As an example, one can fetch the manifest for `redis` with the following command: ``` $ ./dist fetch docker.io/library/redis latest mediatype:application/vnd.docker.distribution.manifest.v2+json ``` Note that we have provided a mediatype "hint", nudging the fetch implementation to grab the correct endpoint. We can hash the output of that to fetch the same content by digest: ``` $ ./dist fetch docker.io/library/redis sha256:$(./dist fetch docker.io/library/redis latest mediatype:application/vnd.docker.distribution.manifest.v2+json \| shasum -a256) ``` Note that the hint is now elided, since we have affixed the content to a particular hash. If you are not yet entertained, let's bring `jq` and `xargs` into the mix for maximum fun. The following incantation fetches the same manifest and downloads all layers into the convenience of `/dev/null`: ``` $ ./dist fetch docker.io/library/redis sha256:a027a470aa2b9b41cc2539847a97b8a14794ebd0a4c7c5d64e390df6bde56c73 \| jq -r '.layers[] \| .digest' \| xargs -n1 -P10 ./dist fetch docker.io/library/redis > /dev/null ``` This is just the beginning. 
We should be able to centralize configuration around fetch to implement a number of distribution methodologies that have been challenging or impossible up to this point. The `locator`, mentioned earlier, is a schemaless URL that provides a host and path that can be used to resolve the remote. By dispatching on this common identifier, we should be able to support almost any protocol and discovery mechanism imaginable. When this is more solidified, we can roll these up into higher-level operations that can be orchestrated through the `dist` tool or via GRPC. What a time to be alive! Signed-off-by: Stephen J Day <stephen.day@docker.com>	2017-01-23 13:27:07 -08:00
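Commit 19eecaab12 fetches content by the hash of its own bytes. A minimal Go sketch of that content-addressing step follows, assuming the `sha256:<hex>` digest form seen in the log above. The `digestOf` and `verify` helpers are hypothetical names introduced for illustration and are not part of the `dist` tool.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// digestOf returns the digest of content in "sha256:<hex>" form.
func digestOf(content []byte) string {
	sum := sha256.Sum256(content)
	return "sha256:" + hex.EncodeToString(sum[:])
}

// verify reports an error when content does not hash to the expected digest.
func verify(content []byte, expected string) error {
	if got := digestOf(content); got != expected {
		return fmt.Errorf("digest mismatch: got %s, expected %s", got, expected)
	}
	return nil
}

func main() {
	manifest := []byte(`{"layers":[]}`) // placeholder bytes, not a real manifest
	d := digestOf(manifest)
	fmt.Println(d)
	// Content fetched by digest can be checked against the digest requested.
	fmt.Println(verify(manifest, d) == nil)
}
```

Checking fetched bytes against the digest used to request them is what lets a mediatype hint be dropped once a reference is pinned to a hash, as the commit message notes.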

34 commits