362 commits

Author SHA1 Message Date
Joseph Schorr
463a3c55c3 Make worker error messages more descriptive 2014-08-27 19:02:53 -04:00
Joseph Schorr
510bbe7889 Add more check conditions for unhealthy workers and make the messaging better. 2014-08-26 12:41:43 -04:00
Joseph Schorr
67905c277e Remove webhook worker 2014-08-25 19:13:40 -04:00
Joseph Schorr
daa43c3bb9 Add better messaging around pulling of base images when they fail due to invalid or missing credentials 2014-08-18 20:34:39 -04:00
Joseph Schorr
736af3165b Add a default message if the build pack MIME processor fails 2014-08-15 18:23:43 -04:00
Joseph Schorr
8681dd9cb9 Add a new exposed 'unpacking' phase to the build and make sure that the unzip/untar/etc always occurs under a try-except 2014-08-15 17:58:11 -04:00
Joseph Schorr
728af56384 Make the watchdog in the build worker also requeue the current item if the worker has gone bad 2014-08-13 19:04:51 -04:00
Joseph Schorr
b9e9064af2 Only retry on unhealthy exceptions, not JobException's. 2014-08-10 18:28:20 -04:00
Joseph Schorr
1b7379df29 Fix workers to not always be marked as unhealthy 2014-08-08 15:24:19 -04:00
Jake Moshenko
0372013f70 Merge remote-tracking branch 'origin/redalert'
Conflicts:
	app.py
2014-08-04 16:56:34 -04:00
Jake Moshenko
0aa6e92b02 Finish porting the workers over to apscheduler 3.0 2014-08-01 18:38:02 -04:00
Jake Moshenko
6b38ddb9b6 Remove the GPL'd loremipsum module. 2014-07-31 16:46:02 -04:00
Joseph Schorr
49801bc2c4 - Add web hook queue code back in. We'll remove it and turn it off after this CL goes to prod
- Make notification lookup always be by repo and its UUID, rather than the internal DB ID
- Add the init script for the notification worker
2014-07-31 13:30:54 -04:00
Joseph Schorr
bab3a0949c Make sure completion marking is also under the lock 2014-07-30 18:45:40 -04:00
Joseph Schorr
4aec422e24 Add a lock around accessing the current queue item and make sure to report it as incomplete whenever the worker becomes unhealthy 2014-07-30 18:30:54 -04:00
Joseph Schorr
7e935f5a8c Make build workers report that they are unhealthy when we get an LXC error or a Docker connection issue 2014-07-30 17:54:58 -04:00
Joseph Schorr
752efb9e0f Fix the spawn_notification to work in all cases and clean up some of the remaining code 2014-07-18 16:34:52 -04:00
Joseph Schorr
591cd020b8 Merge branch 'master' into redalert 2014-07-18 15:58:56 -04:00
Joseph Schorr
af31bde997 Add support for the remaining events to the frontend and the backend 2014-07-18 15:58:18 -04:00
Jake Moshenko
74d1c4e6b0 Update the worker status endpoint to be ELB friendly. 2014-07-18 15:04:20 -04:00
Joseph Schorr
8d7493cb86 Convert over to notifications system. Note this is incomplete 2014-07-17 22:51:58 -04:00
Jake Moshenko
cceb09d4f6 Remove some unused dependencies and update the rest. 2014-07-17 12:08:07 -04:00
Joseph Schorr
8b3659fefa Dockerfile build worker should not report inner JobException's twice 2014-07-11 12:05:52 -04:00
Joseph Schorr
9d1ae8ba87 FROM line check needs to be on the tuple result, not the join 2014-06-16 14:01:17 -04:00
Joseph Schorr
f795868b5b Handle the case where there is no FROM command in the Dockerfile 2014-06-13 16:56:48 -04:00
Jake Moshenko
e8355f301e Remove our deploy key from the workers/Readme which gets included in the Docker image. 2014-05-27 15:19:23 -04:00
Jake Moshenko
0b6552d6cc Fix the metrics so they are usable for scaling the workers down and up. Switch all datetimes which touch the database from now to utcnow. Fix the worker Dockerfile. 2014-05-23 14:16:26 -04:00
Jake Moshenko
d14798de1d Add a queue capacity reporter plugin to the queue. Move the queue definitions to app. Add a cloudwatch reporter to the dockerfile build queue. 2014-05-21 19:50:37 -04:00
Jake Moshenko
b8466169ac Integrate sentry with the build worker. 2014-05-19 13:50:45 -04:00
Jake Moshenko
212a4650f4 Rework the config to use runit logging. 2014-05-18 17:19:14 -04:00
Jake Moshenko
cc47e77156 Upgrade to the 0.11.1 tutum version of docker. Package it as a Dockerfile using Docker in Docker. Add a status server option to the workers to utilize the new termination signal and status features of gantry. 2014-05-16 18:31:24 -04:00
Jake Moshenko
c92ce54a37 Reduce a step in the worker bootstrap. 2014-05-13 17:44:45 -04:00
Jake Moshenko
bcb993a914 Set up the build logs to use our fake build logs on test and local. 2014-05-09 18:45:11 -04:00
Jake Moshenko
8a3af93b8c Improve the builder response to being terminated or dying. 2014-05-06 18:46:19 -04:00
Jake Moshenko
55f18a2ecf Add the missing uid translation range to the root user. 2014-05-01 17:54:59 -04:00
Jake Moshenko
ec282999bf Use the docker version which works with 14.04 lxc. 2014-05-01 17:24:58 -04:00
Jake Moshenko
32583a5675 First steps toward running the builder on trusty. 2014-05-01 15:39:33 -04:00
Jake Moshenko
b888c05bc4 Change the version of our docker binary because the public registry is blocking the tutum agent name. 2014-05-01 11:44:59 -04:00
Jake Moshenko
450928674b Use a new caching algorithm which can limit the size for the build nodes. Stop treating public images as special. Add a new phase to the builder for pulling. 2014-04-30 18:48:36 -04:00
jakedt
58dbb540a1 Run a worker task immediately when it starts. 2014-04-22 13:55:54 -04:00
jakedt
2bc3d24543 Update the build worker to remove all tags from expired images. 2014-04-18 18:36:11 -04:00
jakedt
0d8725e778 Update the instructions for starting and running the workers. 2014-04-17 16:18:53 -04:00
jakedt
0a9ee6c49f Bust the dockerfile build cache across repository lines. 2014-04-16 15:45:41 -04:00
jakedt
0827e0fbac Merge remote-tracking branch 'origin/master' into ncc1701
Conflicts:
	endpoints/web.py
	static/directives/signup-form.html
	static/js/app.js
	static/js/controllers.js
	static/partials/landing.html
	static/partials/view-repo.html
	test/data/test.db
2014-04-14 19:37:22 -04:00
jakedt
724fec1b74 Test third party repo images for public-ness in the builder. Always clean up private images that we don't know about before build. Pull the base image to refresh before every build. 2014-04-14 18:54:39 -04:00
jakedt
40f82a9d16 Work harder to reset the state of the docker env on the build worker. 2014-04-14 15:59:57 -04:00
jakedt
de18236358 Allow for caching of previous docker builds for 24 hours. 2014-04-14 15:21:05 -04:00
jakedt
61a6db236f Finish the implementation of local userfiles. Strip charsets from mimetypes in the build worker. Add canonical name ordering to the build queue. Port all queues to the canonical naming version. 2014-04-11 18:34:47 -04:00
jakedt
576fbe4f0d Switch over to phusion baseimage. Prevent everything from daemonizing and start it with runit under phusion. Make workers trap and handle sigint and sigterm. Extend the reservation to 1hr for dockerfilebuild. Update nginx to remove the dependency on libgd. Merge the requirements and requirements enterprise files. 2014-04-11 13:32:45 -04:00
jakedt
8fac0474b5 Get staging to run under docker on an EC2 host. 2014-04-10 18:30:09 -04:00