[Buildbot-devel] Multi-master setup (or not)
Dustin J. Mitchell
dustin at v.igoro.us
Sun Nov 30 15:58:09 UTC 2014
Vasily, the master already does this housekeeping automatically,
periodically. What Ben's discussing is a way to do it on-demand, to
avoid waiting for the next periodic housekeeping run.
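
To make the distinction concrete, here is a minimal sketch, not Buildbot's actual code. It assumes each master records a heartbeat timestamp in the shared database; `MasterRecord`, `housekeeping` and the 600-second timeout are hypothetical names and values chosen for illustration. The periodic run and the on-demand run would call the same function, only the trigger differs.

```python
from dataclasses import dataclass, field


@dataclass
class MasterRecord:
    # Hypothetical row describing one master in the shared database.
    name: str
    last_heartbeat: float
    active: bool = True
    claimed: set = field(default_factory=set)  # names of builders, pollers, ...


def housekeeping(masters, now, timeout=600.0):
    """Mark masters with a stale heartbeat inactive and release their claims.

    Returns the released names so the caller can log them.
    """
    released = []
    for master in masters:
        if master.active and now - master.last_heartbeat > timeout:
            master.active = False
            released.extend(sorted(master.claimed))
            master.claimed.clear()
    return released


# Periodic use: a timer calls housekeeping(masters, time.time()) every N minutes.
# On-demand use: an operator command calls the very same function right away.
masters = [
    MasterRecord("master-a", last_heartbeat=1000.0, claimed={"builder-old"}),
    MasterRecord("master-b", last_heartbeat=1900.0, claimed={"builder-new"}),
]
released = housekeeping(masters, now=2000.0)
```

Here `master-a` has been silent for longer than the timeout, so its claims are released, while `master-b` is left alone.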
Dustin
On Wed, Nov 26, 2014 at 1:25 AM, Vasily <vasslitvinov at pisem.net> wrote:
> Hello all,
>
> Just in case anybody is interested in my thoughts, I like Benoit's
> suggestion more from the user's point of view - "eight" was self-managed in
> the sense that it didn't need utilities to be run from time to time to do
> the bookkeeping, so achieving the same in "nine" looks appealing to me.
>
> I understand that to do so would be harder than to follow what Pierre said.
>
> I must also note that I'm no expert in either multi-master mode or
> distributed systems overall; I'm speaking as a Buildbot user with quite a
> few years of experience.
>
> Thanks,
> Vasily
>
> On 26 Nov 2014 at 2:09, "Dustin J. Mitchell" <dustin at v.igoro.us>
> wrote:
>>
>> The fact that Buildbot's configuration is code makes this tricky.
>> Otherwise, we could just load the configuration into the DB and have
>> all masters pull from there. Configuration as code is one of
>> Buildbot's advantages over other tools, so I don't want to lose that.
>>
>> I'd love to have more of the smart people on this list looking at the
>> problem and thinking about the right solution. There's time for some
>> simple modifications to the model, like you suggest, to make it into
>> nine. There's also an argument to be made (as Pierre did) that the
>> model isn't fundamentally broken, but just has a buggy implementation
>> and needs some utilities built up.
>>
>> In particular, I'd love to hear how other software has solved similar
>> problems -- we've reinvented enough wheels already here at Buildbot
>> HQ!
>>
>> Dustin
>>
>> On Tue, Nov 25, 2014 at 2:43 PM, Benoît Allard
>> <benoit.allard at greenbone.net> wrote:
>> > Hi there,
>> >
>> > [TL;DR: I propose introducing a supervisor to manage the master(s).]
>> >
>> > The build properties PRs are flowing in [0] (more review welcome !), so
>> > it's time to start tackling the next bigger trouble I have with the
>> > current development branch, namely the absence of a hierarchy among masters.
>> >
>> > Let me explain.
>> >
>> > I believed for years (indeed !) that I had no need to bother
>> > with this multi-master stuff: I don't have hundreds of slaves, nor
>> > more than a dozen repositories to take care of, so why should I care ?
>> > Well, that's what I thought until I realised that even without using
>> > this feature, it has bitten me quite a few times already since I started
>> > experimenting with the current development branch.
>> >
>> > In the current development branch ('nine'), all the data is stored in
>> > a common database. Each master (one in most cases) is responsible
>> > for its own configuration (the master.cfg + dependencies), and as such
>> > registers it in the db: its slaves, its builders, its schedulers, ...
>> > The masters then populate the database with sourcestamps,
>> > buildrequests, buildsets, builds, and all the rest.
>> >
>> > Nothing was wrong until I tried to reconfigure my master, and my old
>> > builders (I had renamed some of them) were still to be seen on the
>> > waterfall page. A few reconfigs/restarts later, half of that waterfall
>> > page (and builder list) is taken up by builders that are not defined
>> > anywhere in my configuration any more. I'm afraid of further modifying
>> > my configuration ! Looking further, old slaves (actually the current
>> > one, but with a different username/password) are still present (and
>> > linked to my master !), although they don't exist in any configuration !
>> > Same for change sources; I guess you get the picture.
>> >
>> > I didn't immediately realise the size of the trouble I had met. I
>> > opened an issue [1], and expected an easy answer like ... "Yes, sure,
>> > you just forgot to ..." or something similar. The answer I got was quite
>> > different: it explained that this was a consequence of the current
>> > design, that the slaves / builders / change sources / ... could have
>> > switched master, or could have belonged to a master that is not up at
>> > that moment, so no one was in a position to delete their entries from
>> > the database. I had just hit a design flaw.
>> >
>> > A few days later, my SVNPoller stopped polling [2], and nothing could
>> > bring it back to life: restart, reconfig, delete from configuration /
>> > reinsert, nothing ... The point was that in the (common) database, the
>> > poller was still marked as active on a master, so my (one and only) master
>> > didn't try to start it ! I was hit by the same design trouble.
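
The stuck-poller situation described above can be sketched as a claim check. This is an illustration under assumptions, not how Buildbot implements it: `try_claim`, the `claims` mapping and the `active_masters` set are hypothetical stand-ins for database tables. The point is that a claim held by a master that is no longer alive is treated as stale and taken over, instead of blocking the poller forever.

```python
def try_claim(claims, active_masters, poller, master):
    """Claim `poller` for `master` unless a live master already holds it.

    `claims` maps poller name -> master name (hypothetical stand-in for a
    database table). Returns True when `master` should start the poller.
    """
    holder = claims.get(poller)
    if holder is not None and holder != master and holder in active_masters:
        return False  # another master that is alive is already polling
    claims[poller] = master  # free, stale, or already ours: take it
    return True


claims = {"svn-poller": "master-old"}  # leftover row from a master that is gone
active = {"master-new"}                # only the current master is alive

started = try_claim(claims, active, "svn-poller", "master-new")
blocked = try_claim(claims, active, "svn-poller", "master-other")
```

Without the `holder in active_masters` test, the leftover row would keep `master-new` from ever starting the poller, which matches the symptom reported above.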
>> >
>> > A few weeks later (now), I haven't met any other manifestation of this
>> > trouble. But I know it's still there ...
>> >
>> > Hope you got the picture now.
>> >
>> > The good news is, I have an idea how to solve it. I'm just not sure if
>> > it's the best one, it involves quite a few modifications, and comes at a
>> > price ...
>> >
>> > I've been wondering how other distributed systems handle this.
>> >
>> > Are there any other distributed systems that rely on a common database
>> > and are able to identify active vs. inactive stuff ? I don't know, and so
>> > far, I've not met any. If you know of any, please speak up, I'd
>> > be interested to know how they manage their data.
>> >
>> > Back in eight, the trouble was not that big: the database was only
>> > there to pass information from schedulers to builders. Neither
>> > schedulers, nor builders, nor ... were put in the db; they belonged to
>> > the personal data of the master that was responsible for them. If that
>> > master disappeared, so did that information. The old builders (and
>> > builds) did not disappear from the disk, but they were no longer visible
>> > in the web interface, as the master knew which information to show.
>> >
>> > My idea is quite simple (in theory). I believe the main trouble is that
>> > no one has authority over all the masters; hence I propose introducing a
>> > 'supervisor' that would be the only one to know about the configuration,
>> > and would manage the master(s). The configuration would probably gain some
>> > 'sections' (one per master), so that the supervisor knows what part to
>> > send to which master. For instance, the master responsible for the web
>> > interface would get a list of active identities (schedulers, slaves,
>> > builders, ...) and just show them.
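
A rough sketch of the supervisor idea, under stated assumptions: the sectioned `CONFIG` layout, the master names and the helper functions below are all hypothetical and only illustrate the proposal, they are not an existing Buildbot interface. The supervisor hands each master its own section, and the union of all sections gives the list of active identities, so anything in the database that is not in that list can be recognised as stale.

```python
# Hypothetical sectioned configuration: one entry per master.
CONFIG = {
    "build-master": {"builders": ["linux", "windows"], "schedulers": ["nightly"]},
    "poll-master": {"change_sources": ["svn-poller"]},
}


def section_for(config, master):
    """Return the part of the configuration the supervisor sends to `master`."""
    return config.get(master, {})


def active_identities(config):
    """Collect every configured name by kind, e.g. for the web-interface master."""
    active = {}
    for section in config.values():
        for kind, names in section.items():
            active.setdefault(kind, set()).update(names)
    return active


def stale_entries(db_rows, active):
    """Rows present in the database but absent from every section."""
    return sorted(
        (kind, name) for kind, name in db_rows
        if name not in active.get(kind, set())
    )


# (kind, name) pairs as they might sit in the shared database.
db_rows = [
    ("builders", "linux"),
    ("builders", "linux-renamed-away"),
    ("change_sources", "svn-poller"),
]
active = active_identities(CONFIG)
stale = stale_entries(db_rows, active)
```

In this sketch the renamed builder shows up in `stale` because no section mentions it any more, which is the decision no single master can make on its own in the current design.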
>> >
>> > I'm convinced that this solution could completely solve the trouble
>> > I've identified. However, it's not an easy one: it involves quite a few
>> > modifications (not **too** many, the goal is to keep it as small as
>> > possible - KISS), and they come at a price, namely time ...
>> >
>> > Do you have another / better idea ?
>> >
>> > Thanks for reading so far.
>> >
>> > Best Regards,
>> > Ben.
>> >
>> > [0] #1380, #1382, #1384, #1385, #1886, #1887 (and a few more to come)
>> > [1] TRAC-2959
>> > [2] TRAC-3012
>> >
>> >
>> > _______________________________________________
>> > Buildbot-devel mailing list
>> > Buildbot-devel at lists.sourceforge.net
>> > https://lists.sourceforge.net/lists/listinfo/buildbot-devel