Jimmy Zelinskie
64d0c5b675
data.queue: fix race condition
...
It's possible that multiple consumers will acquire a queue item if they
race on an expired item. To mitigate this, we check that the
processing_expires time hasn't been changed since we last read.
2016-07-14 15:34:22 -04:00
Jimmy Zelinskie
609f4fccd8
data.queue: simplify put method
2016-07-14 15:34:22 -04:00
Joseph Schorr
713ba3abaf
Further updates to the Prometheus client code
2016-07-01 14:16:51 -04:00
Jake Moshenko
668a8edc50
Refactor prometheus integration
...
Move prometheus to SaaS and make it a plugin
Move static callers to use metrics_queue plugin
Change local-docker to support different quay clone dirnames
Change prom_aggregator to use logrus
2016-07-01 14:16:50 -04:00
Matt Jibson
3d9acf2fff
Use prometheus as a metric backend
...
This entails writing a metric aggregation program since each worker has its
own memory, and thus own metrics because of python gunicorn. The python
client is a simple wrapper that makes web requests to it.
2016-07-01 14:16:50 -04:00
Jimmy Zelinskie
1f488acf12
data.queue: move name matching clause
2016-05-31 15:44:11 -04:00
Jimmy Zelinskie
26300d3c8e
data.queue: lint
2016-05-27 14:51:19 -04:00
Jimmy Zelinskie
8a5aa65d74
data.queue: limiting before order by rand
2016-05-27 14:44:30 -04:00
Jimmy Zelinskie
44b56ae2cf
queue: explicitly declare ordering requirement
...
This change defaults the ordering requirement of queue items to be off
and only enables it for the build manager. This should make the queries
for getting queueitems significantly faster for every other use case.
2016-05-27 14:44:30 -04:00
Joseph Schorr
f498e92d58
Implement against new Clair paginated notification system
2016-02-25 15:58:42 -05:00
Joseph Schorr
01723d5546
Catch other cases where the queue item has been removed
...
Fixes #1096
2015-12-22 15:58:51 -05:00
Matt Jibson
a994b367da
Refactor queue locking to not use select for update
...
The test suggests this works.
fixes #622
2015-11-03 11:32:28 -05:00
Matt Jibson
87cc3289a0
Remove transaction from metric reporting
2015-10-06 01:28:43 -04:00
Matt Jibson
4da66c1219
Move the metric put outside the transaction
2015-09-21 13:37:49 -04:00
Jimmy Zelinskie
2ff77df946
Merge pull request #518 from jzelinskie/fixmysqlssl
...
move UseThenDisconnect into queueworker
2015-09-21 13:35:35 -04:00
Jimmy Zelinskie
7c82e0b5b3
move UseThenDisconnect into queueworker
...
This makes the tests pass while maintaining the same behavior.
2015-09-21 13:34:12 -04:00
Jimmy Zelinskie
0de17627d5
Merge pull request #517 from jzelinskie/fixmysqlssl
...
close connections after getting queue metrics
2015-09-21 12:28:23 -04:00
Jimmy Zelinskie
98d6262a7f
close connections after getting queue metrics
2015-09-21 12:21:39 -04:00
Matt Jibson
bba1557437
Monitor queue adds and EC2 node starts
...
fixes #157
see #304
2015-09-18 16:21:16 -04:00
Matt Jibson
39dc4c7d8d
Monitor various sizes for queues
...
see #304
2015-09-14 15:57:08 -04:00
Joseph Schorr
96d5bbb155
Fix exceptions raised by the diffs worker
...
Fixes #465
2015-09-10 14:12:16 -04:00
Matt Jibson
fc671f3dde
Fix test_queue.py tests
...
This restores the reporter class as was before the metrics changes.
2015-08-17 17:22:46 -04:00
Matt Jibson
cfb6e884f2
Refactor metric collection
...
This change adds a generic queue onto which metrics can be pushed. A
separate module removes metrics from the queue and adds them to Cloudwatch.
Since these are now separate ideas, we can easily change the consumer from
Cloudwatch to anything else.
This change maintains near feature parity (the only change is there is now
just one queue instead of two - not a big deal).
2015-08-12 12:15:52 -04:00
Joseph Schorr
d480a204f5
Revert change to queue
2015-08-05 15:27:33 -04:00
Jake Moshenko
ed62339f89
Improve the performance of queue candidate queries.
2015-08-04 18:20:54 -04:00
Joseph Schorr
5f605b7cc8
Fix queue handling to remove the dependency from repobuild, and have a cancel method
2015-02-23 13:38:01 -05:00
Joseph Schorr
524705b88c
Get dashboard working and upgrade bootstrap. Note: the bootstrap fixes will be coming in the followup CL
2015-02-17 19:15:54 -05:00
Joseph Schorr
f84d1bad45
Handle internal errors in a better fashion: If a build would be marked as internal error, only do so if there are retries remaining. Otherwise, we mark it as failed (since it won't be rebuilt anyway)
2015-02-12 16:19:44 -05:00
Jake Moshenko
ce7033489b
Hopefully fix the deadlock in the queue.
2015-02-03 14:50:01 -05:00
Jake Moshenko
64750e31fc
Add the ability to select for update within transactions to fix some write after read hazards. Fix a bug in extend_processing.
2015-01-30 16:32:13 -05:00
Joseph Schorr
3872d29de9
Add a transaction around the extend_processing call
2015-01-29 18:40:41 -05:00
Jake Moshenko
cc70225043
Generalize the ephemeral build managers so that any manager may manage a builder spawned by any other manager.
2014-12-31 11:33:56 -05:00
Joseph Schorr
dbac8c7e3d
Fix build code:
...
- Fix issue with the queue_item in extend processing
- Add the new compiled docker binary with the lxc volume fix
2014-12-04 17:49:39 +01:00
Joseph Schorr
72d613614d
Merge branch 'bagger'
2014-12-01 12:48:59 -05:00
Jimmy Zelinskie
716d7a737b
Strip whitespace from ALL the things.
2014-11-24 16:07:38 -05:00
Joseph Schorr
b8e873b00b
Add support to the build system for tracking if/when the build manager crashes and make sure builds are restarted within a few minutes
2014-11-21 14:27:06 -05:00
Jake Moshenko
f4681f2c18
Merge branch 'master' into nomenclature
...
Conflicts:
test/data/test.db
2014-11-17 17:59:59 -05:00
Joseph Schorr
c06f57a6e7
Make sure builders close the db handle when no work comes in and make the metrics transaction smaller in scope
2014-10-24 11:40:02 -04:00
Jake Moshenko
e8b3d1cc4a
Phase 4 of the namespace to user migration: actually remove the column from the db and remove the dependence on serialized namespaces in the workers and queues
2014-10-01 14:23:46 -04:00
Joseph Schorr
f23038c6ee
Update the worker code to better handle exceptions, fix the utcdate issue and make sure we send the proper retry. Also updates notification workers to send JobExceptions rather than returning true or false
2014-09-22 12:52:57 -04:00
Jake Moshenko
0b6552d6cc
Fix the metrics so they are usable for scaling the workers down and up. Switch all datetimes which touch the database from now to utcnow. Fix the worker Dockerfile.
2014-05-23 14:16:26 -04:00
Jake Moshenko
f4c488f9b6
Fix the queue query for old jobs which won't run.
2014-05-22 13:50:06 -04:00
Jake Moshenko
f6726bd0a4
Merge branch 'ldapper'
...
Conflicts:
Dockerfile
app.py
data/database.py
endpoints/index.py
test/data/test.db
2014-05-22 12:13:41 -04:00
Jake Moshenko
d14798de1d
Add a queue capacity reporter plugin to the queue. Move the queue definitions to app. Add a cloudwatch reporter to the dockerfile build queue.
2014-05-21 19:50:37 -04:00
Jake Moshenko
8a3af93b8c
Improve the builder response to being terminated or dying.
2014-05-06 18:46:19 -04:00
jakedt
4b8217d4ad
Add config to allow for setting the queue names at runtime. Fix a bug in the data model.
2014-04-11 19:23:57 -04:00
jakedt
61a6db236f
Finish the implementation of local userfiles. Strip charsets from mimetypes in the build worker. Add canonical name ordering to the build queue. Port all queues to the canonical naming version.
2014-04-11 18:34:47 -04:00
jakedt
13dea98499
Prepare the build worker to support multiple tags and subdirectories. Change the build database config to accept a job config object instead of breaking out the parameters into independent blocks.
2014-02-24 16:11:23 -05:00
jakedt
e7064f1191
Fix the tests and the one bug that it highlighted.
2014-02-16 18:59:24 -05:00
yackob03
e1a4efe35c
Change the build queue identifier to disambiguate between old and new build jobs.
2014-02-12 12:51:30 -05:00