Commit graph

27 commits

Author SHA1 Message Date
Jake Moshenko
2f626f2691 Unify the database connection lifecycle across all workers 2015-12-04 15:51:53 -05:00
Joseph Schorr
407eaae137 WIP: Towards sec demo 2015-11-09 12:50:39 -05:00
Joseph Schorr
c3d7ef2ec4 Only start workers once setup is complete on the registry
Fixes #326
2015-08-07 13:44:14 -04:00
Joseph Schorr
14f511bb5a Make sure to set a default for Raven client
Fixes #327
2015-08-07 13:03:38 -04:00
Joseph Schorr
ac0cca2d90 Switch to a unified worker system
- Handles logging
- Handles reporting to Sentry
- Removes old code around serving a web endpoint (unused now)
2015-07-28 17:26:12 -04:00
Joseph Schorr
3872d29de9 Add a transaction around the extend_processing call 2015-01-29 18:40:41 -05:00
Joseph Schorr
dbac8c7e3d Fix build code:
- Fix issue with the queue_item in extend processing
- Add the new compiled docker binary with the lxc volume fix
2014-12-04 17:49:39 +01:00
Joseph Schorr
c06f57a6e7 Make sure builders close the db handle when no work comes in and make the metrics transaction smaller in scope 2014-10-24 11:40:02 -04:00
Jake Moshenko
1ccd6a9c5d Change the max_instances for the workers to only allow one parallel job execution. 2014-10-22 18:09:00 -04:00
Joseph Schorr
f23038c6ee Update the worker code to better handle exceptions, fix the utcdate issue and make sure we send the proper retry. Also updates notification workers to send JobExceptions rather than returning true or false 2014-09-22 12:52:57 -04:00
Joseph Schorr
463a3c55c3 Make worker error messages more descriptive 2014-08-27 19:02:53 -04:00
Joseph Schorr
510bbe7889 Add more check conditions for unhealthy workers and make the messaging better. 2014-08-26 12:41:43 -04:00
Joseph Schorr
728af56384 Make the watchdog in the build worker also requeue the current item if the worker has gone bad 2014-08-13 19:04:51 -04:00
Joseph Schorr
b9e9064af2 Only retry on unhealthy exceptions, not JobExceptions. 2014-08-10 18:28:20 -04:00
Jake Moshenko
0aa6e92b02 Finish porting the workers over to apscheduler 3.0 2014-08-01 18:38:02 -04:00
Jake Moshenko
6b38ddb9b6 Remove the GPLed loremipsum module. 2014-07-31 16:46:02 -04:00
Joseph Schorr
bab3a0949c Make sure completion marking is also under the lock 2014-07-30 18:45:40 -04:00
Joseph Schorr
4aec422e24 Add a lock around accessing the current queue item and make sure to report it as incomplete whenever the worker becomes unhealthy 2014-07-30 18:30:54 -04:00
Joseph Schorr
7e935f5a8c Make build workers report that they are unhealthy when we get an LXC error or a Docker connection issue 2014-07-30 17:54:58 -04:00
Jake Moshenko
74d1c4e6b0 Update the worker status endpoint to be ELB friendly. 2014-07-18 15:04:20 -04:00
Jake Moshenko
0b6552d6cc Fix the metrics so they are usable for scaling the workers down and up. Switch all datetimes which touch the database from now to utcnow. Fix the worker Dockerfile. 2014-05-23 14:16:26 -04:00
Jake Moshenko
cc47e77156 Upgrade to the 0.11.1 tutum version of docker. Package it as a Dockerfile using Docker in Docker. Add a status server option to the workers to utilize the new termination signal and status features of gantry. 2014-05-16 18:31:24 -04:00
Jake Moshenko
8a3af93b8c Improve the builder response to being terminated or dying. 2014-05-06 18:46:19 -04:00
jakedt
58dbb540a1 Run a worker task immediately when it starts. 2014-04-22 13:55:54 -04:00
jakedt
576fbe4f0d Switch over to phusion baseimage. Prevent everything from daemonizing and start it with runit under phusion. Make workers trap and handle sigint and sigterm. Extend the reservation to 1hr for dockerfilebuild. Update nginx to remove the dependency on libgd. Merge the requirements and requirements enterprise files. 2014-04-11 13:32:45 -04:00
jakedt
b95d3ec329 Add a watchdog timer to the build worker to kill a build step that takes more than 20 minutes. 2014-04-02 19:32:41 -04:00
yackob03
14263de7f8 Extract some boilerplate from the worker and create a base class. Port the diffs worker over to the base. 2013-11-15 15:50:20 -05:00