[Buildbot-devel] retries on SVN step

Wed Apr 13 19:31:57 UTC 2005

> Would it be possible to extend the BuildBot default SVN step to
> retry a few times?  or add a new SVNRetry step that will do this?

Yeah, that shouldn't be too hard. You'd want to implement it in
commands.SourceBase, on the slave-side, probably in doVC(), since that way
you could get the same benefit for all VC modes. The tricky part is deciding
when it makes sense to retry the operation, and when it doesn't. For
example, it's possible to get CVS into a state where updates fail, and the
only fix is to blow away your tree and check out a new one. But you'd have to
parse the stderr output to determine whether your 'cvs update' failure was
because of this problem or because the CVS server wasn't reachable.

Likewise, if the VC operation failed halfway through, the local tree might be
left in some bogus state. I just looked at the Twisted buildbot (the full-2.4
slave), and the SVN update is failing because the previous update was
interrupted by the slave disappearing. There's some sort of lock file left
over which will require either manual intervention or simply blowing away the
tree to recover. Any new code to retry VC operations would need a way to
tell how it ought to fall back.
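
As a rough sketch of the stderr-parsing idea (not existing buildbot code):
the function name, the three result constants, and above all the regex
patterns below are made up for illustration. Real CVS/SVN error text varies
by version and locale, so the patterns would need checking against actual
failure logs.

```python
import re

# Hypothetical result values, not part of buildbot.
RETRY_SAME = "retry-same"        # transient problem, same command may work
RETRY_CLOBBER = "retry-clobber"  # tree is damaged, delete it and check out
FATAL = "fatal"                  # unknown failure, do not retry

# Illustrative patterns only; verify against real tool output before use.
BROKEN_TREE = [re.compile(p, re.I) for p in (
    r"working copy .* locked",
    r"run 'svn cleanup'",
)]
TRANSIENT = [re.compile(p, re.I) for p in (
    r"connection refused",
    r"timed out",
    r"could not resolve",
)]

def classify_failure(stderr_text):
    """Guess how to recover from a failed VC command based on its stderr."""
    # Check for a damaged tree first: retrying the same command is useless then.
    if any(p.search(stderr_text) for p in BROKEN_TREE):
        return RETRY_CLOBBER
    if any(p.search(stderr_text) for p in TRANSIENT):
        return RETRY_SAME
    return FATAL
```

Anything the patterns don't recognize is treated as fatal, so an unknown
failure is reported rather than retried blindly.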

This could take the form of the doVCUpdate/doVCFull methods returning a value
to indicate that they failed but that a retry might be useful, perhaps one
value for "please try the same thing again in a few seconds", and a different
value for "please blow away the tree and try again". The individual VC modes
could parse their stdout/stderr to decide which values to return.
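
To make that concrete, here is a minimal synchronous sketch of the control
flow. It is not how SourceBase is written today: the real slave code is
asynchronous (Deferreds), and run_with_retries, the result constants, and
the do_update/do_full/clobber callables are all hypothetical stand-ins for
doVCUpdate, doVCFull, and whatever removes the tree.

```python
import time

# Hypothetical result values a VC step could return.
OK = "ok"
RETRY_SAME = "retry-same"        # "please try the same thing again"
RETRY_CLOBBER = "retry-clobber"  # "please blow away the tree and try again"
FATAL = "fatal"

def run_with_retries(do_update, do_full, clobber,
                     max_attempts=3, delay=5.0, sleep=time.sleep):
    """Run a VC operation, retrying or clobbering as its result requests."""
    action = do_update
    for attempt in range(1, max_attempts + 1):
        result = action()
        # Stop on success, on a non-retryable failure, or when out of tries.
        if result in (OK, FATAL) or attempt == max_attempts:
            return result
        if result == RETRY_CLOBBER:
            clobber()          # remove the damaged tree
            action = do_full   # from now on do a fresh checkout
        sleep(delay)           # give a transient problem time to clear
    return FATAL               # only reached if max_attempts < 1
```

Once a clobber has been requested, the loop sticks with the full checkout,
since an update makes no sense on an empty directory.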

Are transient network failures that common? And how transient are they, i.e.
how long should the slave wait before doing a retry to give the new VC
command a fighting chance of not hitting the same problem as the earlier one?
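
Without knowing the answer, one option is to avoid picking a single delay
and let it grow between attempts. The sketch below is just that idea; the
function name and the default numbers are arbitrary assumptions, not
measured values.

```python
def retry_delays(base=5.0, factor=2.0, cap=300.0, attempts=5):
    """Yield wait times in seconds that grow geometrically up to a cap."""
    delay = base
    for _ in range(attempts):
        yield min(delay, cap)
        delay *= factor

# Example: list(retry_delays(base=5, factor=2, cap=30, attempts=5))
# gives [5, 10, 20, 30, 30]
```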

thanks,
 -Brian