Further fixes for unicode handling in manifests

We were occasionally trying to compute schema 2 version 1 signatures on the *unicode* representation, which was failing the signature check. This PR adds a new wrapper type called `Bytes`, which all manifests must take in, and which handles the unicodes vs encoded utf-8 stuff in a central location. This PR also adds a test for the manifest that was breaking in production.
This commit is contained in:
Joseph Schorr 2019-01-08 20:49:00 -05:00
parent 05fa2bcbe0
commit 171c7e5238
28 changed files with 275 additions and 106 deletions

View file

@ -4,18 +4,14 @@ from image.docker.schema2 import (DOCKER_SCHEMA2_MANIFEST_CONTENT_TYPE,
DOCKER_SCHEMA2_MANIFESTLIST_CONTENT_TYPE)
from image.docker.schema2.manifest import DockerSchema2Manifest
from image.docker.schema2.list import DockerSchema2ManifestList
from util.bytes import Bytes
def parse_manifest_from_bytes(manifest_bytes, media_type, validate=True):
""" Parses and returns a manifest from the given bytes, for the given media type.
Raises a ManifestException if the parse fails for some reason.
"""
# NOTE: Docker sometimes pushed manifests encoded as utf-8, so decode them
# if we can. Otherwise, treat the string as already unicode encoded.
try:
manifest_bytes = manifest_bytes.decode('utf-8')
except:
pass
assert isinstance(manifest_bytes, Bytes)
if media_type == DOCKER_SCHEMA2_MANIFEST_CONTENT_TYPE:
return DockerSchema2Manifest(manifest_bytes)