[Buildbot-devel] copying information between steps or schedulers
A.T.Hofkamp
a.t.hofkamp at tue.nl
Tue Mar 14 12:08:17 UTC 2006
Hello,
Brian Warner wrote:
> For this purpose, and some others which have been raised on the list before,
> I'm thinking that some sort of general-purpose "build attributes" might be
> appropriate. These would be a set of key-value pairs made available to all
> Steps, and persisted in the long-term BuildStatus object to be made available
> to status plugins. I like the "%(attrname)s" substitution syntax suggested
The syntax is not my invention, it is borrowed from string formatting in
Python. Note that it is not necessary to restrict yourself to such syntax; you
could eg have a key-value pair
( "%(revision)", "12345" )
and do a literal search/replace of the keys by their values.
In this way, a user can choose arbitrary syntax for his keys.
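A minimal sketch of that literal search/replace idea, in Python. The helper name substitute_literal and the "@BRANCH@" key are hypothetical, made up for illustration; nothing here is an existing buildbot API.

```python
def substitute_literal(text, attrs):
    """Replace every literal occurrence of each key in text by its value."""
    # Longest keys first, so that a key which is a prefix of another key
    # does not get replaced before the longer one has had its chance.
    for key in sorted(attrs, key=len, reverse=True):
        text = text.replace(key, attrs[key])
    return text


# The keys are plain strings, so any syntax the user likes will do.
attrs = {"%(revision)": "12345", "@BRANCH@": "trunk"}
result = substitute_literal("build-r%(revision).tar.gz", attrs)
print(result)
```

One caveat of this sketch: replacements are applied one after another, so a value that happens to contain another key would be substituted again.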
>
> s(ShellCommand, args=["mv", "source.tar.gz",
> SubstituteAttrs("build-r%(revision)s.tar.gz")])
>
> or maybe
>
> s(ShellCommand, args=["mv", "source.tar.gz",
> SubstituteAttrs("build-r%ss.tar.gz", "revision")])
>
>
> I think that you'd still have to write some code to set these attributes, but
I don't understand this line. Do you mean writing some Python derived class?
If so, how is the connection made from my make/sh/perl/whatever script to this
Python code?
In a previous mail I suggested to parse stdout for lines of a certain form
(defined as RE), and extract key/value pairs from them.
The difficulty here is that parsing stdout can result in unexpected matches if
one is not careful.
On the other hand, having an explicit definition of what to match does make
the program more transparent (instead of hiding such magic in a derived class).
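To illustrate the stdout-parsing idea, here is a rough sketch using Python's re module. The "BUILDBOT-ATTR key=value" line format is an assumption invented for this example; the point is only that a sufficiently distinctive prefix reduces the risk of unexpected matches.

```python
import re

# Hypothetical marker format: a line of the form "BUILDBOT-ATTR key=value".
# MULTILINE makes ^ and $ match at every line start and end.
ATTR_LINE = re.compile(r"^BUILDBOT-ATTR (\w+)=(.*)$", re.MULTILINE)


def extract_attrs(stdout_text):
    """Collect key/value pairs from all marker lines in the given output."""
    return dict(ATTR_LINE.findall(stdout_text))


sample = (
    "compiling...\n"
    "BUILDBOT-ATTR revision=12345\n"
    "note: revision=999 mentioned in passing\n"
    "BUILDBOT-ATTR tarball=source.tar.gz\n"
)
attrs = extract_attrs(sample)
print(attrs)
```

The script being run (make, sh, perl, whatever) only has to print such lines; the third sample line is ignored because it lacks the marker prefix.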
A problem related to the above approaches (%s substitution and stdout parsing)
is that information is limited to strings only as far as I can see, ie I don't
know how to pass on a list of files as argument for example, without adding
complicated pattern-expansions.
The example below is about a list of library files that I want to add to a
command, but it applies equally badly to a list of changed files of a commit.
Example:
attributes = { 'libfiles': [ 'file1', 'file2', 'file3' ] }
ShellCommand(command=["./mycommand", "--lib=%(libfiles)s"])
wanted output:
./mycommand --lib=file1 --lib=file2 --lib=file3
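The example can be tried directly in Python. Plain %-formatting just inserts the string form of the list. The two helpers below, expand_repeat and expand_join, are hypothetical names used only to sketch what an explicitly stated expansion rule might look like; they are not part of buildbot.

```python
attributes = {'libfiles': ['file1', 'file2', 'file3']}

# Plain %-formatting has no notion of "one argument per element":
plain = "--lib=%(libfiles)s" % attributes
print(plain)  # --lib=['file1', 'file2', 'file3']


def expand_repeat(template, name, values):
    """One argument per list element."""
    return [template % {name: value} for value in values]


def expand_join(template, name, values, sep=","):
    """A single argument with all elements joined by sep."""
    return [template % {name: sep.join(values)}]


libs = attributes['libfiles']
repeated = ["./mycommand"] + expand_repeat("--lib=%(libfiles)s", "libfiles", libs)
joined = ["./mycommand"] + expand_join("--lib=%(libfiles)s", "libfiles", libs)
print(repeated)
print(joined)
```

In both cases the choice between the two results has to be stated outside the format string itself.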
I believe there is not enough information in the input to get the wanted
output. I also don't see how to extend this %s approach for such cases without
introducing ad hoc special solutions (ie one can extend %s replacement to
behave as wanted in the example, but then how does one state to want
"./mycommand --lib=file1,file2,file3" for example?).
Another approach may be to use a file with key/value pairs (eg one key/value
at each line), or have a directory of files, each for a different key.
Since reading/writing files is a quite primitive operation supported by all
build-related programs, this approach may be more generally applicable.
Also, there are few inherent restrictions on the contents of key/value pairs.
If this (set of) attributes is kept in sync between slaves in some way, you
get file transfer for free (or, you implement file-transfer and you get
attribute transfer for free :-) ).
Last but not least, if you give the webserver access to this set of files, one
can serve files and inspect state (although I do not know whether that is
something one'd want to do).
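A minimal sketch of the one-key/value-per-line file variant, in Python. The "key=value" line format, the file name attrs.txt and the function names write_attrs and read_attrs are assumptions for illustration only.

```python
import os
import tempfile


def write_attrs(path, attrs):
    """Write one 'key=value' line per attribute."""
    with open(path, "w") as f:
        for key, value in attrs.items():
            f.write("%s=%s\n" % (key, value))


def read_attrs(path):
    """Read the file back; everything after the first '=' is the value."""
    attrs = {}
    with open(path) as f:
        for line in f:
            line = line.rstrip("\n")
            if not line:
                continue  # skip empty lines
            key, _, value = line.partition("=")
            attrs[key] = value
    return attrs


# Round trip through a temporary directory.
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "attrs.txt")
    write_attrs(path, {"revision": "12345", "libfiles": "file1 file2 file3"})
    loaded = read_attrs(path)
print(loaded)
```

In this sketch a value cannot contain a newline and a key cannot contain "="; the directory-of-files variant (one file per key) would not have the first restriction.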
> what exactly should be checked out. (and for forced builds of HEAD, the
> buildbot never actually learns what revision is used, unless we add some code
> to parse the output of the checkout/update step somehow).
pySVN is your friend :-)
(SVN is designed to have an API available so it is very easy to build scripts
that do things with a repository, pySVN is the Python version of that API).
That reduces the problem to convincing the entire world that SVN is the
solution to SCM problems....
Just kidding, ;-)
> In general, I'm of two minds on this sort of thing. When I'm wearing my
> "Continuous Integration" hat, I want to see the buildbot know as little as
> possible about the build process. In my mind, developers should only need to
> know how to do "make all; make test", or something equivalent, and the rest
> of the intelligence should be built into the tree somewhere, in the top-level
> Makefile, or scons script, or Ant recipe, or whatever. This is important
> because if it isn't checked in, it will be hard to keep in sync, hard to
> maintain, and eventually forgotten. From this point of view, the buildbot
> should be as dumb as possible.
I don't think I agree with the conclusion of your final sentence.
If 'make all; make test' would be enough, we wouldn't need buildbot.
What is missing here imho is the project and site configuration.
'make test' is designed to run tests in the general sense, but it assumes
things about its environment, eg where are the tests, how do I run them, how
is it indicated when a test fails, etc.
All such things are (I think) part of the project configuration. Other stuff
is the SCM used, organization of the repository, release policies, issue
tracking, etc. I do agree with you that such information should be managed as
part of the project code, eg by adding it to the repository.
Site configuration is about the structure of the site, like which machines
exist, where is what running, what user accounts are used, how is the file
system organized, etc.
Such site configuration is done only once, so where the information is
kept is not terribly important, unless the site is very big or there are a lot
of sites that need to be maintained, but in the latter case we are way off
topic on this list...
To make 'make test' work, a lot of these configuration settings need to be
decided. To make it work automagically using buildbot, buildbot needs to be
informed about these decisions such that it fits in the project and site
configuration.
Imho buildbot should not be dumb, it should be smart to allow easy entry of
site configuration, and flexible to obtain project configuration from the
project data, although I am worried that the latter is not always feasible.
> On the other hand, when I'm wearing my "Release Engineer" hat, I want to see
> the buildbot help with release automation, which frequently involves more
> complicated steps that those needed by mere developers. If the buildbot is a
> useful place to build installers/.deb-packages, upload them to a distribution
> directory, rebuild docs, update a website.. then sure, go for it.
>
> So these two views are sometimes in conflict.
I don't think so. A release engineer is almost by definition much more
involved in site configuration and much less in project configuration imho,
hence the shift in needed information.
> complicated features to the buildbot, to enable the automation features, but
> I'm also afraid that if they are available, people will be tempted to put the
> build intelligence in the buildbot's config, where it is generally invisible
> to developers who may need to do the same thing too.
I am not sure that this is a buildbot development problem. Preventing people
from doing stupid things should not be a project goal imho. I much more
believe that people should have access to powerful primitives and a very good
guide on how to use them effectively (and in a lot of cases, the latter is the
most difficult).
Sure, there will be people who ignore any advice and manage to shoot
themselves in the foot, but hey, if you want to spend your life that way....
> Anyway, the build attributes are an example of a feature that I'm inclined to
> put in, but am still concerned that it might be a bad idea in the long run.
I don't understand the core problem that we try to solve here. I experience a
need to communicate information between steps in a build and between different
builds (from an upstream scheduler to a downstream scheduler), but is that
really the problem or just a symptom of something larger?
I tend to think the latter, but I cannot put my finger on it.
Maybe an implicit assumption in buildbot is that it is designed for doing "svn
export; make; make test" at all slaves?
[ Sorry if I offended you with this question, that is not my intention.
I am just trying to figure out what buildbot rule I am breaking that causes
my need for communication and file transfer. Am I trying to do something
that I should not do? Is my project setup broken? (I would not be surprised)
]
> The other feature I'm considering that falls into this category is a pair of
> BuildSteps that move files between the master and the slave. The syntax would
> probably be something like:
If you use a (set of) files to contain the attributes, and you keep these in
sync between slaves, you get this implicitly.
The trouble with files is that you immediately also get the file management
problem, ie I need to implement procedures that remove files every now and
then otherwise I run out of disk space. At this moment that is covered
elsewhere, if you include file stuff in buildbot there will be a need for
cleaning up (in the form of another 'build'?).
> where all filenames are relative to the master/slave's respective basedirs.
> It would probably be useful to add a third command:
>
> s(MasterShellCommand, args=["scp", "foo.tar.gz", "website:latest.tar.gz"])
>
> so you could actually do something with the file once you'd copied it to the
> master.
Not sure. Why bother to copy it to the master then in the first place?
(wouldn't it be equally easy to copy the file from the last slave?)
I'd probably consider publishing files to be done using another slave-build.
This slave may run at the same machine as the buildmaster (if you consider
copying a file twice as something that should be avoided).
> thoughts?
Plenty, just not very organized, I am afraid.
Albert