Fix some issues around state in the build managers
- Make sure to clean up the job if the executor could not be started
- Change the setup leeway to further ensure there isn't any crossover between the queue item timing out and the cleanup of the jobs
- Make the lock used for marking jobs as internal error extremely long, but also based on the execution ID. This should ensure we don't get duplicates while allowing different executions to be handled properly.
- Make sure to invoke the callback update for the queue before we run off to etcd; should reduce certain timeouts

Hopefully fixes #1836
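The lock change described above can be illustrated with a minimal sketch. It is not the code from this commit: `ExpiringLockTable`, `mark_internal_error_once`, the key layout, and the 24-hour TTL are all hypothetical, and an in-memory dict stands in for whatever distributed store the build manager actually uses. The point is only that keying the lock on the execution ID suppresses duplicate handling of one execution while leaving other executions of the same build free to proceed.

```python
import time


class ExpiringLockTable:
    """In-memory stand-in (hypothetical) for a distributed lock store with TTLs."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._expirations = {}  # lock key -> time at which the lock expires

    def try_acquire(self, key, ttl_seconds):
        """Return True if the lock was obtained, False if someone still holds it."""
        now = self._clock()
        expires_at = self._expirations.get(key)
        if expires_at is not None and expires_at > now:
            return False
        self._expirations[key] = now + ttl_seconds
        return True


# Hypothetical value: "extremely long" relative to the lifetime of one execution.
INTERNAL_ERROR_LOCK_TTL_SECONDS = 60 * 60 * 24


def mark_internal_error_once(locks, build_id, execution_id, mark_fn):
    """Call mark_fn at most once per (build, execution) pair while the lock lives."""
    key = "internal-error/%s/%s" % (build_id, execution_id)
    if not locks.try_acquire(key, INTERNAL_ERROR_LOCK_TTL_SECONDS):
        return False  # another worker already handled this execution
    mark_fn(build_id, execution_id)
    return True
```

With this shape, a second call for the same execution ID is a no-op, while a retry of the build under a new execution ID takes a different lock key and is handled normally.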
parent 949ceae4eb
commit f9f60b9faf
2 changed files with 24 additions and 13 deletions
@@ -17,15 +17,14 @@ from buildman.jobutil.buildstatus import StatusHandler
 from buildman.jobutil.buildjob import BuildJob, BuildJobLoadException
 from data import database
-from app import app, metric_queue
+from app import app

 logger = logging.getLogger(__name__)

 WORK_CHECK_TIMEOUT = 10
 TIMEOUT_PERIOD_MINUTES = 20
 JOB_TIMEOUT_SECONDS = 300
-SETUP_LEEWAY_SECONDS = 10
-MINIMUM_JOB_EXTENSION = timedelta(minutes=2)
+SETUP_LEEWAY_SECONDS = 30
+MINIMUM_JOB_EXTENSION = timedelta(minutes=1)

 HEARTBEAT_PERIOD_SEC = 30
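As a rough illustration of why the setup leeway matters, the sketch below shows one way the two changed constants could combine when reserving a queue item during executor setup. `queue_extension_seconds` and its `setup_time_seconds` parameter are hypothetical, and treating `MINIMUM_JOB_EXTENSION` as a floor on the reservation is an assumption; the diff shows only the constant values, not how they are used.

```python
from datetime import timedelta

# Values taken from the new side of the diff.
SETUP_LEEWAY_SECONDS = 30
MINIMUM_JOB_EXTENSION = timedelta(minutes=1)


def queue_extension_seconds(setup_time_seconds):
    """Seconds to keep the queue item reserved while an executor starts up.

    Assumption: the manager cleans up the job after setup_time_seconds, so
    adding the leeway keeps the queue item from timing out before (or at the
    same moment as) that cleanup runs.
    """
    requested = setup_time_seconds + SETUP_LEEWAY_SECONDS
    # Assumption: never reserve for less than the minimum extension.
    return max(requested, MINIMUM_JOB_EXTENSION.total_seconds())
```

Under these assumptions, a 300-second setup window reserves the queue item for 330 seconds, leaving a 30-second gap between the job cleanup and the queue item becoming available again.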