This is semantically valid because Docker only uses the leaf layer as the image config when reading a V2_1 manifest
Fixes https://jira.coreos.com/browse/QUAY-1351
Old V1 code would occasionally do so, which was never correct, but we never hit it because we didn't (practically) share storage rows before. Now that we explicitly do, we were occasionally de-checksum-ing storage rows on V1 pushes (which are extremely rare as-is). This change makes sure we don't do that, and makes sure we always set a proper digest and checksum on storage rows.
This will mean we try to re-index other similar rows, but that's a no-op on the Clair side, and this update query is extremely heavy and easily interleaving locks
Addresses the slow query:
```
SELECT `candidates` . `id` FROM ( SELECT `t1` . `id` FROM `repositorybuild` AS `t1` WHERE ( ( ( `t1` . `phase` IN (...) ) OR ( `t1` . `started` < ? ) ) AND ( `t1` . `logs_archived` = ? ) ) LIMIT ? ) AS `candidates` ORDER BY `Rand` ( ) LIMIT ? OFFSET ?
```
While the cardinality on `logs_archived` will be low, it should also only be in the `false` state for a very small number of records, so it should make this query very, very fast.
Currently, we only have one (the shared empty layer), but this should make the blob lookups for repositories significantly faster, as we won't need to do the massive join.
- Implement logs model using Elasticsearch with tests
- Implement transition model using both elasticsearch and database model
- Add LOGS_MODEL configuration to choose which to use.
Co-authored-by: Sida Chen <sidchen@redhat.com>
Co-authored-by: Kenny Lee Sin Cheong <kenny.lee@redhat.com>
Schema version 1 manifests contain the tag name, and we have a check to ensure we don't point a tag at a manifest with the wrong name embedded. However, this also means that we cannot retarget to that manifest, which will break the UI once we get rid of legacy images.
This change means we can retarget to those manifests, and the OCI model does the work of rewriting the manifest when necessary.
Some customers are hitting this endpoint rapidly for repositories with many, many tags. This change drops the unnecessary joins, which should reduce database load somewhat.
- Adds the charset: utf-8 to all the manifest responses
- Makes sure we connect to MySQL in utf8mb4 mode, to ensure we can properly read and write 4-byte utf8 strings
- Adds tests for all of the above
We were occasionally trying to compute schema 2 version 1 signatures on the *unicode* representation, which was failing the signature check. This PR adds a new wrapper type called `Bytes`, which all manifests must take in, and which handles the unicodes vs encoded utf-8 stuff in a central location. This PR also adds a test for the manifest that was breaking in production.