From ff0ddaa28e48dd721ed5e1e6be734f9669fac0ba Mon Sep 17 00:00:00 2001
From: Stephen J Day
Date: Fri, 27 Jan 2017 12:04:23 -0800
Subject: [PATCH] reports: development report for 2017-01-27

Signed-off-by: Stephen J Day
---
 reports/2017-01-27.md | 256 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 256 insertions(+)
 create mode 100644 reports/2017-01-27.md

diff --git a/reports/2017-01-27.md b/reports/2017-01-27.md
new file mode 100644
index 0000000..c9438c3
--- /dev/null
+++ b/reports/2017-01-27.md
@@ -0,0 +1,256 @@

# Development Report for Jan 27, 2017

This week we made a lot of progress on tools for working with local content
storage and image distribution. These parts are critical in forming an
end-to-end proof of concept that takes docker/oci images and turns them into
bundles.

We have also defined a new GRPC protocol for interacting with the
container-shim, which is used for robust container management.

## Maintainers

* https://github.com/docker/containerd/pull/473

Derek McGowan will be joining the containerd team as a maintainer. His
extensive experience in graphdrivers and distribution will be invaluable to the
containerd project.

## Shim over GRPC

* https://github.com/docker/containerd/pull/462

```
NAME:
   containerd-shim -
                    __        _                       __           __    _
  _________  ____  / /_____ _(_)___  ___  _________/ /  _____/ /_  (_)___ ___
 / ___/ __ \/ __ \/ __/ __ `/ / __ \/ _ \/ ___/ __  /_____/ ___/ __ \/ / __ `__ \
/ /__/ /_/ / / / / /_/ /_/ / / / / /  __/ /  / /_/ /_____(__  ) / / / / / / / / /
\___/\____/_/ /_/\__/\__,_/_/_/ /_/\___/_/   \__,_/ /____/_/ /_/_/_/ /_/ /_/

shim for container lifecycle and reconnection


USAGE:
   containerd-shim [global options] command [command options] [arguments...]

VERSION:
   1.0.0

COMMANDS:
     help, h  Shows a list of commands or help for one command

GLOBAL OPTIONS:
   --debug        enable debug output in logs
   --help, -h     show help
   --version, -v  print the version

```

This week we completed work on porting the shim over to GRPC. This gives us a
more robust way to interface with the shim, and it allows us to have one shim
per container where previously we had one shim per process. This drastically
reduces the memory usage for exec processes.

We also had a lot of code in the containerd core for syncing with the shims
during execution. We needed ways to signal whether the shim was running and
whether the container had been created, and to surface any errors from
creating and then starting the container's process. Getting this
synchronization right was hard and required a lot of code. With the new flow,
it is just function calls via RPC.

```proto
service Shim {
    rpc Create(CreateRequest) returns (CreateResponse);
    rpc Start(StartRequest) returns (google.protobuf.Empty);
    rpc Delete(DeleteRequest) returns (DeleteResponse);
    rpc Exec(ExecRequest) returns (ExecResponse);
    rpc Pty(PtyRequest) returns (google.protobuf.Empty);
    rpc Events(EventsRequest) returns (stream Event);
    rpc State(StateRequest) returns (StateResponse);
}
```

The GRPC service lets us decouple the shim's lifecycle from the container's,
while still getting synchronous feedback from shim errors when the container
fails to create, start, or exec.

The overhead of adding GRPC to the shim is actually lower than that of the
initial implementation. We previously had a few pipes for controlling pty
master resizing and delivering exit events; these have all been replaced by
one unix socket. Unix sockets are cheap and fast, and we reduce our open fd
count by not relying on multiple fifos.
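To make "function calls via RPC" concrete, here is a rough sketch of what
driving the shim over its socket might look like from Go. The socket path,
stub import path, and request fields are illustrative assumptions (they are
not specified in this report); the stubs are presumed to be generated from the
service definition above:

```go
package main

import (
	"context"
	"fmt"
	"net"
	"time"

	"google.golang.org/grpc"

	shim "github.com/docker/containerd/api/shim" // hypothetical stub package
)

func main() {
	// One unix socket per shim replaces the pipes and fifos used by the
	// previous implementation. The path here is an assumption.
	conn, err := grpc.Dial("/run/containerd/shim.sock", grpc.WithInsecure(),
		grpc.WithDialer(func(addr string, timeout time.Duration) (net.Conn, error) {
			return net.DialTimeout("unix", addr, timeout)
		}))
	if err != nil {
		panic(err)
	}
	defer conn.Close()

	client := shim.NewShimClient(conn)
	ctx := context.Background()

	// Create, then start, the container. Failures surface synchronously as
	// RPC errors instead of through side-channel synchronization.
	if _, err := client.Create(ctx, &shim.CreateRequest{ /* id, bundle, ... */ }); err != nil {
		panic(err)
	}
	if _, err := client.Start(ctx, &shim.StartRequest{}); err != nil {
		panic(err)
	}

	// Stream events, such as process exits, over the same socket.
	events, err := client.Events(ctx, &shim.EventsRequest{})
	if err != nil {
		panic(err)
	}
	for {
		event, err := events.Recv()
		if err != nil {
			break
		}
		fmt.Printf("event: %+v\n", event)
	}
}
```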
We also added a subcommand to the `ctr` command for testing and interfacing
with the shim. You can interact with a shim directly via `ctr shim` to get
events, start containers, and start exec processes.

## Distribution Tool

* https://github.com/docker/containerd/pull/452
* https://github.com/docker/containerd/pull/472
* https://github.com/docker/containerd/pull/474

Last week, @stevvooe committed the first parts of the distribution tool. The
main component provided there was the `dist fetch` command. This has been
followed up by several other low-level commands that interact with content
resolution and local storage, and which can be used together to work with
parts of images.

With these changes, we add the following commands to the dist tool:

- `ingest`: verify and accept content into storage
- `active`: display active ingest processes
- `list`: list content in storage
- `path`: provide a path to a blob by digest
- `delete`: remove a piece of content from storage
- `apply`: apply a layer to a directory

When this is more solidified, we can roll these up into higher-level
operations that can be orchestrated through the `dist` tool or via GRPC.

As part of the _Development Report_, we thought it was a good idea to show
these tools in depth. Specifically, we can show how to go from an image
locator to a root filesystem with the current suite of commands.

### Fetching Image Resources

The first component added to the `dist` tool is the `fetch` command. It is a
low-level command for fetching image resources, such as manifests and layers.
It operates around the concept of `remotes`. Objects are fetched by providing
a `locator` and an object identifier. The `locator`, roughly analogous to an
image name or repository, is a schema-less URL. The following is an example of
a `locator`:

```
docker.io/library/redis
```

When we say the `locator` is a "schema-less URL", we mean that it starts with
a hostname and has a path representing some image repository. While the
hostname may represent an actual location, we can also pass it through
arbitrary resolution systems to get the actual location. In that sense, it
acts like a namespace.

In practice, the `locator` can be used to resolve a `remote`. Object
identifiers are then passed to this remote, along with hints, which are then
mapped to the specific protocol and retrieved. By dispatching on this common
identifier, we should be able to support almost any protocol and discovery
mechanism imaginable.

The actual `fetch` command currently provides anonymous access to Docker Hub
images, keyed by the `locator` namespace `docker.io`. With a `locator`,
`identifier`, and `hint`, the correct protocol and endpoints are resolved and
the resource is printed to stdout. As an example, one can fetch the manifest
for `redis` with the following command:

```
$ ./dist fetch docker.io/library/redis latest mediatype:application/vnd.docker.distribution.manifest.v2+json
```

Note that we have provided a mediatype "hint", nudging the fetch
implementation to grab the correct endpoint.
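The locator/identifier/hint dispatch described above can be summarized with a
small Go sketch. The interface names here (`Resolver`, `Remote`) are
illustrative assumptions rather than the tool's actual API:

```go
package remotes // hypothetical package, for illustration only

import (
	"context"
	"io"
)

// Resolver maps a locator, such as "docker.io/library/redis", to a
// Remote for that namespace. The locator need not be a real network
// location; arbitrary resolution systems can sit behind this call.
type Resolver interface {
	Resolve(ctx context.Context, locator string) (Remote, error)
}

// Remote retrieves objects by identifier. The identifier may be a tag
// such as "latest" or a digest such as "sha256:...", and optional hints
// (for example, a mediatype) nudge the implementation toward the right
// protocol and endpoint.
type Remote interface {
	Fetch(ctx context.Context, identifier string, hints ...string) (io.ReadCloser, error)
}
```

Anything that can implement these two calls, from a registry protocol to a
local mirror, could then plug in behind `fetch`.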
We can hash the output of that `fetch` command to retrieve the same content by
digest:

```
$ ./dist fetch docker.io/library/redis sha256:$(./dist fetch docker.io/library/redis latest mediatype:application/vnd.docker.distribution.manifest.v2+json | shasum -a256)
```

The hint is now elided on the outer command, since we have affixed the content
to a particular hash. The above effectively fetches by tag and then by digest,
demonstrating their equivalence when interacting with a remote.

This is just the beginning. We should be able to centralize configuration
around fetch to implement a number of distribution methodologies that have
been challenging or impossible up to this point.

Keep reading to see how this is used with the other commands to fetch complete
images.

### Fetching all the layers of an image

If you are not yet entertained, let's bring `jq` and `xargs` into the mix for
maximum fun. Our first task will be to collect the layers into a local content
store with the `ingest` command.

The following incantation fetches the manifest and downloads each layer:

```
$ ./dist fetch docker.io/library/redis latest mediatype:application/vnd.docker.distribution.manifest.v2+json | \
    jq -r '.layers[] | "./dist fetch docker.io/library/redis "+.digest + "| ./dist ingest --expected-digest "+.digest+" --expected-size "+(.size | tostring) +" docker.io/library/redis@"+.digest' | xargs -I{} -P10 -n1 sh -c "{}"
```

The above fetches a manifest and pipes it to jq, which assembles a shell
pipeline to ingest each layer into the content store. Each process is then
executed in parallel using xargs. Because the transactions are keyed by their
digests, concurrent downloads and downloads of repeated content are ignored.
If you run the above command twice, it will not download the layers again,
because those blobs are already present in the content store.

What about status? Let's first remove our content so we can monitor a
download. `dist list` can be combined with xargs and `dist delete` to remove
that content:

```
$ ./dist list -q | xargs ./dist delete
```

In a separate shell session, you can monitor the active downloads with the
following:

```
$ watch -n0.2 ./dist active
```

For now, the content is downloaded into `.content` in the current working
directory. To watch the contents of this directory, you can use the following:

```
$ watch -n0.2 tree .content
```

Now, run the fetch pipeline from above. You'll see the active downloads, keyed
by locator and object, as well as the ingest transactions, with the resulting
blobs becoming available in the content store. This helps in understanding
what is going on internally.

### Getting to a rootfs

While we haven't yet integrated full snapshot support for layer application,
we can use the `dist apply` command to start building out a rootfs for
inspection and testing. We'll build up a similar pipeline to unpack the layers
and get an actual image rootfs.

To get access to the layers, you can use the `path` command:

```
$ ./dist path sha256:010c454d55e53059beaba4044116ea4636f8dd8181e975d893931c7e7204fffa
sha256:010c454d55e53059beaba4044116ea4636f8dd8181e975d893931c7e7204fffa /home/sjd/go/src/github.com/docker/containerd/.content/blobs/sha256/010c454d55e53059beaba4044116ea4636f8dd8181e975d893931c7e7204fffa
```

This returns a direct path to the blob to facilitate fast access.
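As the output above suggests, the store lays blobs out by digest under
`.content/blobs/<algorithm>/<hex>`, so the path is cheap to derive. A minimal
sketch, assuming that layout holds (the helper name is ours):

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// blobPath maps a digest to its location in the content store, assuming
// the ".content/blobs/<algorithm>/<hex>" layout shown above.
func blobPath(root, digest string) (string, error) {
	parts := strings.SplitN(digest, ":", 2)
	if len(parts) != 2 {
		return "", fmt.Errorf("invalid digest %q", digest)
	}
	return filepath.Join(root, "blobs", parts[0], parts[1]), nil
}

func main() {
	p, err := blobPath(".content",
		"sha256:010c454d55e53059beaba4044116ea4636f8dd8181e975d893931c7e7204fffa")
	if err != nil {
		panic(err)
	}
	fmt.Println(p) // .content/blobs/sha256/010c454d...
}
```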
We can incorporate the `path` command into `apply` to get to a rootfs for
`redis`:

```
$ mkdir redis-rootfs
$ ./dist fetch docker.io/library/redis latest mediatype:application/vnd.docker.distribution.manifest.v2+json | \
    jq -r '.layers[] | "sudo ./dist apply ./redis-rootfs < $(./dist path -q "+.digest+")"' | xargs -I{} -n1 sh -c "{}"
```

The above fetches the manifest, then passes each layer into the `dist apply`
command, resulting in the full redis container root filesystem. We do not do
this in parallel, since each layer must be applied sequentially. Also, note
that we have to run `apply` with `sudo`, since the layers typically have
resources with root ownership.

Alternatively, you can just read the manifest from the content store rather
than fetching it again. We use fetch above to avoid having to look up the
manifest digest for our demo.

Note that this is mostly a POC, and the tool has a long way to go. Things like
failed downloads and cleanup of abandoned downloads aren't quite handled.
We'll probably make adjustments around how content store transactions are
handled to address this. We still need to incorporate snapshotting, as well as
the ability to calculate the `ChainID` as layers are subsequently unpacked.
Once we have some tools to play around with snapshotting, we'll be able to
incorporate our `rootfs.ApplyLayer` algorithm, which will get us a lot closer
to a production-worthy system.

From here, we'll build out full image pull and create tooling to get runtime
bundles from the fetched content.
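For reference, the `ChainID` calculation mentioned above is defined by the OCI
image spec: the ChainID of a single layer is its DiffID, and each additional
layer hashes the previous ChainID together with its DiffID. A minimal sketch
of that computation (the helper name and sample values are ours):

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// chainID computes the ChainID for an ordered list of layer DiffIDs per
// the OCI image spec: ChainID(L0) = DiffID(L0), and
// ChainID(L0..Ln) = Digest(ChainID(L0..Ln-1) + " " + DiffID(Ln)).
func chainID(diffIDs []string) string {
	if len(diffIDs) == 0 {
		return ""
	}
	id := diffIDs[0]
	for _, diffID := range diffIDs[1:] {
		id = fmt.Sprintf("sha256:%x", sha256.Sum256([]byte(id+" "+diffID)))
	}
	return id
}

func main() {
	// Sample DiffIDs for illustration; real values come from the image
	// configuration's rootfs.diff_ids field.
	fmt.Println(chainID([]string{
		"sha256:5f70bf18a086007016e948b04aed3b82103a36bea41755b6cddfaf10ace3c6ef",
		"sha256:010c454d55e53059beaba4044116ea4636f8dd8181e975d893931c7e7204fffa",
	}))
}
```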