[Buildbot-devel] Multi-master setup (or not)
Vasily
vasslitvinov at pisem.net
Wed Nov 26 06:25:10 UTC 2014
Hello all,
Just in case anybody is interested in my thoughts: from the user's point of
view I prefer Benoit's suggestion. "eight" was self-managing in the sense
that it didn't need utilities to be run from time to time to do the
bookkeeping, so achieving the same in "nine" looks appealing to me.
I understand that doing so would be harder than following what Pierre said.
I must also note that I'm no expert in either multi-master mode or
distributed systems in general; I'm speaking as a Buildbot user with quite a
few years of experience.
Thanks,
Vasily
On Nov 26, 2014 at 2:09, "Dustin J. Mitchell" <dustin at v.igoro.us>
wrote:
> The fact that Buildbot's configuration is code makes this tricky.
> Otherwise, we could just load the configuration into the DB and have
> all masters pull from there. Configuration as code is one of
> Buildbot's advantages over other tools, so I don't want to lose that.
>
> I'd love to have more of the smart people on this list looking at the
> problem and thinking about the right solution. There's time for some
> simple modifications to the model, like you suggest, to make it into
> nine. There's also an argument to be made (as Pierre did) that the
> model isn't fundamentally broken, but just has a buggy implementation
> and needs some utilities built up.
>
> In particular, I'd love to hear how other software has solved similar
> problems -- we've reinvented enough wheels already here at Buildbot
> HQ!
>
> Dustin
>
> On Tue, Nov 25, 2014 at 2:43 PM, Benoît Allard
> <benoit.allard at greenbone.net> wrote:
> > Hi there,
> >
> > [TL;DR: I propose introducing a supervisor to manage the master(s).]
> >
> > The build properties PRs are flowing in [0] (more reviews welcome!), so
> > it's time to start tackling the next big problem I have with the
> > current development branch, namely the absence of a hierarchy among
> > masters.
> >
> > Let me explain.
> >
> > I believed for years (really!) that I didn't need to bother with
> > this multi-master stuff: I don't have hundreds of slaves, nor more
> > than a dozen repositories to look after, so why should I care?
> > Well, that's what I thought until I realised that, even without using
> > this feature, it has bitten me quite a few times already since I started
> > experimenting with the current development branch.
> >
> > In the current development branch ('nine'), all the data is stored in
> > a common database. Each master (one in most cases) is responsible
> > for its own configuration (the master.cfg + dependencies), and as such
> > registers it in the db: its slaves, its builders, its schedulers, ...
> > The masters then populate the database with sourcestamps,
> > buildrequests, buildsets, builds, and all the rest.
> >
> > Nothing was wrong until I tried to reconfigure my master, and my old
> > builders (I had renamed some of them) were still shown on the
> > waterfall page. A few reconfigs/restarts later, half of that waterfall
> > page (and of the builder list) was taken up by builders that are not
> > defined anywhere in my configuration any more. I'm afraid to modify
> > my configuration any further! Looking further, old slaves (actually the
> > current one, but with a different username/password) are still present
> > (and linked to my master!), although they do not exist in any
> > configuration! The same goes for change sources. I guess you get the
> > picture.
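
The accumulation described above can be illustrated with a toy model. This is only a sketch under assumptions: the `SharedDB` class and its `register_builders` method are hypothetical and are not Buildbot's actual data API. They merely mimic a master that adds what it configures and never removes anything.

```python
class SharedDB:
    """Toy stand-in for the common database (hypothetical, not Buildbot's schema)."""

    def __init__(self):
        # builder name -> id of the master that last registered it
        self.builders = {}

    def register_builders(self, master_id, names):
        # A master only adds or updates its own builders; it never deletes,
        # because a missing name might belong to another (possibly offline) master.
        for name in names:
            self.builders[name] = master_id


db = SharedDB()
db.register_builders("master-1", ["linux", "windows"])

# Reconfig after renaming "linux" to "linux-x86_64".
current_config = ["linux-x86_64", "windows"]
db.register_builders("master-1", current_config)

# Entries that are in the database but no longer in the configuration.
stale = sorted(set(db.builders) - set(current_config))
print(sorted(db.builders))  # the old name is still listed next to the new one
print(stale)
```

In this model the old name stays in the database after the rename, which matches the behaviour seen on the waterfall page.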
> >
> > I didn't immediately realise the size of the problem I had run into. I
> > opened an issue [1], and expected an easy answer like ... "Yes, sure,
> > you just forgot to ..." or something similar. The answer I got was quite
> > different: it explained that this was a consequence of the current
> > design, that the slaves / builders / change sources / ... could have
> > switched master, or could belong to a master that is not up at
> > that moment, so no one is in a position to delete their entries from
> > the database. I had just hit a design flaw.
> >
> > A few days later, my SVNPoller stopped polling [2], and nothing could
> > bring it back to life: restart, reconfig, delete from configuration /
> > reinsert, nothing ... The point was that in the (common) database, the
> > poller was still marked as active on a master, so my (one and only)
> > master didn't try to start it! I had been hit by the same design problem.
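
The poller failure above can be pictured as a stale claim in a shared table. The following is a minimal sketch, assuming a simple "first claim wins" rule; the `ChangeSourceTable` class, its methods, and the master names are hypothetical and do not reflect Buildbot's real schema. How the leftover row came about is not stated in the message, so the sketch simply assumes one exists.

```python
class ChangeSourceTable:
    """Toy claim table (hypothetical; not Buildbot's real schema)."""

    def __init__(self):
        # change source name -> id of the master that claimed it
        self.claims = {}

    def try_claim(self, name, master_id):
        # A change source is started only by the master that wins the claim.
        if name in self.claims:
            return False
        self.claims[name] = master_id
        return True

    def release(self, name, master_id):
        # Would normally be called by the owning master on a clean shutdown.
        if self.claims.get(name) == master_id:
            del self.claims[name]


table = ChangeSourceTable()

# Assumption: an earlier claim was recorded and never released.
first = table.try_claim("svn-poller", "old-claim-owner")

# The only running master now tries to start the poller.
started = table.try_claim("svn-poller", "my-master")
print(first, started)  # the second claim fails, so the poller is never started
```

Restarting or reconfiguring "my-master" does not change the outcome in this model, because only the recorded owner is allowed to release the claim.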
> >
> > A few weeks later (now), I haven't met any other manifestation of this
> > problem. But I know it's still there ...
> >
> > Hope you got the picture now.
> >
> > The good news is, I have an idea how to solve it. I'm just not sure if
> > it's the best one, it involves quite a few modifications, and comes at a
> > price ...
> >
> > I've been wondering how other distributed systems handle this.
> >
> > Are there any other distributed systems that rely on a common database
> > and are able to tell active from inactive entries? I don't know, and so
> > far I've not met any. If you know of any, please speak up; I'd
> > be interested to know how they manage their data.
> >
> > Back in eight, the problem was not that big: the database was only
> > there to pass information from schedulers to builders. Neither
> > schedulers, nor builders, nor ... were put in the db; they belonged to
> > the private data of the master that was responsible for them. If that
> > master disappeared, so did that information. The old builders (and
> > builds) did not disappear from the disk, but they were no longer visible
> > in the web interface, as the master knew which information to show.
> >
> > My idea is quite simple (in theory). I believe the main problem is that
> > no one has authority over all the masters; hence I propose introducing a
> > 'supervisor' that would be the only one to know about the configuration
> > and would manage the master(s). The configuration would probably gain
> > some 'sections' (one per master), so that the supervisor knows which part
> > to send to which master. For instance, the master responsible for the web
> > interface would get a list of active identities (schedulers, slaves,
> > builders, ...) and just show them.
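
One possible reading of the supervisor idea is sketched below. Everything here is an assumption made for illustration: the sectioned `CONFIG` layout, the `Supervisor` class, and its method names are hypothetical and do not exist in Buildbot. The sketch only shows how a single owner of the configuration could hand out per-master sections and work out which identities are active.

```python
# Hypothetical sectioned configuration: one section per master.
CONFIG = {
    "build-master": {
        "builders": ["linux-x86_64", "windows"],
        "schedulers": ["nightly"],
    },
    "web-master": {"builders": [], "schedulers": []},
}


class Supervisor:
    """Sole owner of the configuration (sketch; all names are assumptions)."""

    def __init__(self, config):
        self.config = config

    def section_for(self, master_name):
        # The part of the configuration that would be sent to one master.
        return self.config[master_name]

    def active_identities(self, kind):
        # Union over all sections: everything configured on any master.
        return {
            name
            for section in self.config.values()
            for name in section.get(kind, [])
        }

    def stale_entries(self, kind, db_names):
        # Names present in the database but configured on no master.
        return set(db_names) - self.active_identities(kind)


supervisor = Supervisor(CONFIG)
db_builders = ["linux", "linux-x86_64", "windows"]

active = sorted(supervisor.active_identities("builders"))
stale = sorted(supervisor.stale_entries("builders", db_builders))
print(active)  # what the web master would be told to show
print(stale)   # what the supervisor could safely mark inactive
```

Because the supervisor sees every section, it can decide that an entry is unused without having to guess whether another master still owns it.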
> >
> > I'm convinced that this solution could completely solve the problem
> > I've identified. However, it's not an easy one: it involves quite a few
> > modifications (not **too** many; the goal is to keep it as small as
> > possible - KISS), and they come at a price, namely time ...
> >
> > Do you have another / better idea?
> >
> > Thanks for reading so far.
> >
> > Best Regards,
> > Ben.
> >
> > [0] #1380, #1382, #1384, #1385, #1886, #1887 (and a few more to come)
> > [1] TRAC-2959
> > [2] TRAC-3012
> >
> > _______________________________________________
> > Buildbot-devel mailing list
> > Buildbot-devel at lists.sourceforge.net
> > https://lists.sourceforge.net/lists/listinfo/buildbot-devel