Previous to this change, repositories were looked up unfiltered in six different queries, and then filtered using the permissions model, which issued a query per repository found, making search incredibly slow. Instead, we now lookup a chunk of repositories unfiltered and then filter them via a single query to the database. By layering the filtering on top of the lookup, each as queries, we can minimize the number of queries necessary, without (at the same time) using a super expensive join.
Other changes:
- Remove the 5 page pre-lookup on V1 search and simply return that there is one more page available, until there isn't. While technically not correct, it is much more efficient, and no one should be using pagination with V1 search anyway.
- Remove the lookup for repos without entries in the RAC table. Instead, we now add a new RAC entry when the repository is created for *the day before*, with count 0, so that it is immediately searchable
- Remove lookup of results with a matching namespace; these aren't very relevant anyway, and it overly complicates sorting
Adds code to ensure we never GC CAS paths that are shared amongst multiple ImageStorage rows, as well as an associated pair of tests to catch the positive and negative cases.
Gitlab doesn't send any commit information for tagging events (because... reasons), and so we have to perform the lookup ourselves to have full metadata.
Fixes#1467
This API is still (apparently) being used by the Docker CLI for `docker search` (why?!) and we therefore have customers expecting this to work the same way as the DockerHub.
With this change, if all entitlements are valid, we sort to show the entitlement that will expire the farthest in the future, as that defines the point at which the user must act before the license becomes invalid.
The `> 0` check fails if the code was found first in the query string, which can occasionally happen under tox due to the `PYTHONHASHSEED` var changing. We simply change to use a proper parse and check to avoid this issue entirely.
Moves all the external login services into a set of classes that share as much code as possible. These services are then registered on both the client and server, allowing us in the followup change to dynamically register new handlers
Fixes namespace validation to use the proper regex for checking length, as well as showing the proper messaging if the entered namespace is invalid
[Delivers #137830461]
Before this change, the queue code would check that none of the fields on the item to be claimed had changed between the time when the item was selected and the item is claimed. While this is a safe approach, it also causes quite a bit of lock contention in MySQL, because InnoDB will take a lock on *any* rows examined by the `where` clause of the `update`, even if they will ultimately thrown out due to other clauses (See: http://dev.mysql.com/doc/refman/5.7/en/innodb-locks-set.html: "A ..., an UPDATE, ... generally set record locks on every index record that is scanned in the processing of the SQL statement. It does not matter whether there are WHERE conditions in the statement that would exclude the row. InnoDB does not remember the exact WHERE condition, but only knows which index ranges were scanned").
As a result, we want to minimize the number of fields accessed in the `where` clause on an update to the QueueItem row. To do so, we introduce a new `state_id` column, which is updated on *every change* to the QueueItem rows with a unique, random value. We can then have the queue item claiming code simply check that the `state_id` column has not changed between the retrieval and claiming steps. This minimizes the number of columns being checked to two (`id` and `state_id`), and thus, should significantly reduce lock contention. Note that we can not (yet) reduce to just a single `state_id` column (which should work in theory), because we need to maintain backwards compatibility with existing items in the QueueItem table, which will be given empty `state_id` values when the migration in this change runs.
Also adds a number of tests for other queue operations that we want to make sure operate correctly following this change.
[Delivers #133632501]
If the feature is enabled and recaptcha keys are given in config, then a recaptcha box is displayed in the UI when creating a user and a recaptcha response code *must* be sent with the create API call for it to succeed.
Add support to GC to invoke a callback with the image+storages removed. Only images whose storage was also removed will be sent to the callback. This will be used by security scanning for its own GC in the followup change.
This ensures that even if security scanner pagination sends Old and New layer IDs on different pages, they will properly be handled across the entire notification.
Fixes https://www.pivotaltracker.com/story/show/136133657
Changes the security scanner code to raise exceptions now for non-successful operations. One of the new exceptions raised is MissingParentLayerException, which, when raised, will cause the security worker to perform a full rescan of all parent images for the current layer, before trying once more to scan the current layer. This should allow the system to be "self-healing" in the case where the security scanner engine somehow loses or corrupts a parent layer.
Currently, if a user tries to confirm an invite sent to them on an account with a mismatching email address, we simply redirect to the org (where they get a 403). This change ensures they get the proper error response message, and restyles the error page to be nicer.
Fixes#2227
Fixes https://www.pivotaltracker.com/story/show/136088507
The FakeSecurityScanner mocks out all calls that Quay is expected to make to the security scanner API, and returns faked data that can be adjusted by the calling test case
Following this change, anytime a layer is indexed by the security scanner, we only send notifications out if the layer previously had a security_indexed_engine value of `-1`, thus ensuring it has *never* been indexed previously. This will allow us to change to version of the security scanner upwards, and have all the images be re-indexed, without firing off notifications in a spammy manner.
Adds the missing field on the query_user calls, updates the external auth tests to ensure it is returned properly, and adds new end-to-end tests which call the external auth engines via the *API*, to ensure this doesn't break again
- Switches database schema creation to alembic, which solves the MySQL issue (and makes sure we test migrations as well)
- Adds a few time.sleep(1) to work around MySQL's second-precision issue when adding items to queues and then immediately retrieving them
- Disables the storage proxy tests when running against non-SQLite databases, as it causes failures with the multiple process and multiple transactions
- Changes initdb to support only populating the database, as well as fixing a few small items around the test data when working with non-SQLite data
Note that the test suite doesn't fully verify that each validation succeeds; rather, it ensures that the proper system (storage, security scanning, etc) is called with the configuration and returns at all (usually with an expected error). This should prevent us from forgetting to update these code paths when we change config-based systems. Longer term, we might want to have these tests stand up fake/mock versions of the endpoint services as well, for end-to-end testing.
Instead of having the Swift storage engine try to delete the empty chunk(s) synchronously, we simply queue them and have a worker come along after 30s to delete the empty chunks. This has a few key benefits: it is async (doesn't slow down the push code), helps deal with Swift's eventual consistency (less retries necessary) and is generic for other storage engines if/when they need this as well
When a user now logs in for the first time for any external auth (LDAP, JWT, Keystone, Github, Google, Dex), they will be presented with a confirmation screen that affords them the opportunity to change their Quay-assigned username.
Addresses most of the user issues around #74