[Buildbot-devel] MQ design plan and code review
Dustin J. Mitchell
dustin at v.igoro.us
Sun Jun 15 22:03:58 UTC 2014
(I think the email filtering was because of the images; I've doubled the
maximum message size.)
I'd like to avoid doing filtering on each master, because that means that
each master must process all messages, which won't scale well. If each
master creates queues which bind to the exchange only for the messages it
needs, then no master ever sees the total volume of messages. Keep in mind
that Buildbot will be sending messages on the order of once per second for
every active build, as log files grow, in addition to messages about
changes, new builds, and so on. In a large system with many masters, we're
easily looking at hundreds of messages per second.
This will, indeed, result in a lot of queues. I think there will be ways
to optimize that, but let's do so after we have the basic interface working.
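To make the binding idea concrete, here is a plain-Python sketch (not kombu
code, just an illustration of the broker's matching rules) of how an AMQP
topic exchange decides whether a binding pattern matches a routing key:
`*` matches exactly one dot-separated word, `#` matches zero or more.

```python
def topic_matches(pattern, routing_key):
    """AMQP-style topic match: '*' = exactly one word, '#' = zero or more."""
    words = pattern.split('.')
    key = routing_key.split('.')

    def match(i, j):
        if i == len(words):              # pattern exhausted: key must be too
            return j == len(key)
        if words[i] == '#':              # '#': try consuming 0..n key words
            return any(match(i + 1, jj) for jj in range(j, len(key) + 1))
        if j == len(key):                # key exhausted but pattern is not
            return False
        return words[i] in ('*', key[j]) and match(i + 1, j + 1)

    return match(0, 0)

# A master whose queues bind only to 'change.#' never receives build traffic:
assert topic_matches('change.#', 'change.1234.new')
assert not topic_matches('change.#', 'builds.7.log.stdio.newline')
```

Since the broker does this matching, a master that binds only the patterns
it needs never receives, let alone processes, the rest of the message
volume.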
As for ensuring delivery -- the target of a message's sender is the
exchange, and kombu basically supports this out of the box: if the publish
method succeeds, the message was sent to its target. (You need to enable
this by setting `confirm_publish`, but that's easy.)
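For reference, enabling publisher confirms in kombu is a one-line transport
option (the broker URL below is a placeholder); with it, publish() does not
return successfully until the broker has acknowledged the message:

```python
from kombu import Connection

# 'amqp://localhost//' is a placeholder broker URL, not Buildbot's config.
conn = Connection('amqp://localhost//',
                  transport_options={'confirm_publish': True})
```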
The issue about "persistence" is on the queue side. We have two types of
subscribers:
1. This type of subscriber would like to know about messages matching
certain patterns, but only for a little while. Once it stops listening for
those messages, the messages can be discarded. For example, a web UI that
is displaying a logfile would like to get messages when new lines arrive,
but once the web page is closed, any additional messages about logfile
lines can be discarded.
2. This type of subscriber would like to know about *every* message
matching a certain pattern, even if those messages are sent while the
subscriber is not active. The best example is a scheduler, which needs to
see a message for every change. If the master on which a scheduler is
running shuts down, then another master should be able to start up and
receive a message about the next change.
Type 1, in AMQP terms, is a queue declared with 'exclusive' set ("Exclusive
queues may only be accessed by the current connection, and are deleted when
that connection closes.")
Type 2 is what Buildbot calls a "persistent queue". This is a queue with a
well-known name (so that other schedulers can find the same queue) that is
not exclusive and does not auto-delete. It should also be durable.
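In kombu terms, the two queue types might be declared roughly like this
(the exchange and queue names here are illustrative, not Buildbot's actual
names):

```python
from kombu import Exchange, Queue

mq = Exchange('buildbot.mq', type='topic')

# Type 1: server-named, per-consumer queue; gone when the connection closes.
ui_queue = Queue(exchange=mq, routing_key='logs.#',
                 exclusive=True, auto_delete=True)

# Type 2: well-known name, survives consumer and broker restarts.
scheduler_queue = Queue('buildbot.scheduler.nightly', exchange=mq,
                        routing_key='change.#',
                        durable=True, exclusive=False, auto_delete=False)
```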
I'm glad we're hashing this out at the beginning, rather than in finished
code :)
On Sun, Jun 15, 2014 at 10:32 AM, Damon Wang <damon.devops at gmail.com> wrote:
> Hi,
>
> Thanks for replying! Your suggestions are very enlightening.
>
>
> > Actually, the first bit is false: in AMQP, if a message is sent to an
> > exchange which has no active queues, then the message is dropped.>
>
> > It's the *queues* that are persistent -- or not. And Buildbot needs
> > both. A persistent queue would be used by a scheduler, for example,
> > which needs to get a notification of *every* change, even one which
> > occurs while the scheduler is not connected to the MQ server. So that
> > MQ server needs to have a queue that stays around. But we need
> > non-persistent queues for connections from web clients, which come and
> > go all the time. We can't have the MQ server caching every message
> > sent since last Wednesday when I looked at the site with my laptop!>
>
> > Persistent queues should also be durable.
>
> Well, I didn't know that before, thanks a lot. I have a plan: we can use
> kombu's ensure method
> (http://kombu.readthedocs.org/en/latest/reference/kombu.html#kombu.Connection.ensure)
> to try our best to make sure the message is delivered to its target. I
> haven't experimented yet; if max_retries can't be infinite, we can
> deliver the message into a protected queue, check this queue
> periodically, and try to deliver it to the original target queue again.
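As a plain-Python sketch of the scheme described here -- bounded retries,
then parking the message in a protected queue for periodic redelivery --
with the broker call stubbed out (in practice a kombu ensure()-wrapped
publisher would play the role of `send`):

```python
from collections import deque

class ProtectedDelivery:
    """Sketch of 'retry, then park in a protected queue' delivery.

    `send` stands in for the real broker publish and should raise
    ConnectionError on failure; `protected` stands in for a durable
    protected queue.
    """
    def __init__(self, send, max_retries=3):
        self.send = send
        self.max_retries = max_retries
        self.protected = deque()

    def publish(self, message):
        for _ in range(self.max_retries):
            try:
                self.send(message)
                return True
            except ConnectionError:
                continue
        self.protected.append(message)   # park it for later redelivery
        return False

    def redeliver(self):
        """Called periodically: retry everything in the protected queue."""
        for _ in range(len(self.protected)):
            message = self.protected.popleft()
            self.publish(message)        # failure re-parks the message
```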
>
>
>
>
> Why use a queue to save these messages? We can make it durable, so we'll
> get these messages back after a restart. How do we keep the messages in
> the protected queue after we consume them? After we consume a message, if
> resending fails, we can publish it to the protected queue again.
>
>
> > It's the queue that performs the filtering. So a consumer which wants
> > messages matching "foo.#.bar.*.bing" would create a new queue, bound
> > to the exchange using that pattern.>
>
> > In other words, every consumer would have its own queue. I suspect
> > you're thinking of a different model, where all consumers would
> > subscribe to the same queue. That won't work for Buildbot.
>
> Well, my idea is not to subscribe to the same queue. My thought is that
> since there are so many routing keys, we can combine routing keys and
> filters like this: we divide messages into different queues by the first
> one or two words of the routing key, and do the filtering after consuming,
> just like:
>
>
>
>
> The advantage is a much smaller number of queues, but your idea may be
> better, after my reconsideration.
>
>
> > From a brief look, it seems like you're registering all of the queues
> > in advance. The queues need to be created and deleted as consumers
> > subscribe and stop their subscriptions, instead.
>
> > I'm not sure what the standardizeKey method is trying to accomplish -
> > it looks like it sometimes returns a value, and sometimes doesn't?
> > This is the sort of method for which unit tests can be *very* helpful!
>
> You can get its meaning from the last image: I use standardizeKey to
> convert an original routing key like "scheduler.$schedulerid.started" to
> "scheduler.#".
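Based on that description, the conversion might look something like this
(a guess at the intent, not the actual code under review):

```python
def standardize_key(key):
    """Keep the first word of a routing key and collapse the rest to
    AMQP's multi-word wildcard, e.g. 'scheduler.12.started' -> 'scheduler.#'.
    """
    return key.split('.', 1)[0] + '.#'
```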
>
> I will write test code as soon as possible!
>
> Regards,
> Wei
>
>
> 2014-06-15 6:15 GMT+08:00 Dustin J. Mitchell <dustin at v.igoro.us>:
>
>> On Fri, Jun 13, 2014 at 11:35 PM, Damon Wang <damon.devops at gmail.com>
>> wrote:
>> > First, I have some news: kombu doesn't support Qpid as we expected. If
>> > we want to keep our doc (http://docs.buildbot.net/nine/developer/mq.html)
>> > accurate, we must implement another MQ plugin using qpid-python.
>>
>> Then there's no need to support qpid. It's fine for Buildbot to
>> support what Kombu supports and no more.
>>
>> > Besides, we used persistence to make sure a message is sent to a
>> > consumer as soon as it becomes active; but if we use an MQ, the message
>> > will always be available in the MQ, so persistence can be ignored here.
>> > Our new MQ has another attribute -- durable: durable queues remain
>> > active when the server restarts. My design is that, by default, the
>> > exchange and queues are established when Buildbot starts and deleted
>> > when Buildbot shuts down. Durability helps when Buildbot stops
>> > abnormally and queues are not deleted as expected; then we can still
>> > get the messages which were not consumed.
>>
>> Actually, the first bit is false: in AMQP, if a message is sent to an
>> exchange which has no active queues, then the message is dropped.
>>
>> It's the *queues* that are persistent -- or not. And Buildbot needs
>> both. A persistent queue would be used by a scheduler, for example,
>> which needs to get a notification of *every* change, even one which
>> occurs while the scheduler is not connected to the MQ server. So that
>> MQ server needs to have a queue that stays around. But we need
>> non-persistent queues for connections from web clients, which come and
>> go all the time. We can't have the MQ server caching every message
>> sent since last Wednesday when I looked at the site with my laptop!
>>
>> Persistent queues should also be durable.
>>
>> > What's more, our former MQ used filters to make consumers get the right
>> > messages. A standard MQ often makes a consumer bind to an explicit
>> > queue and get all messages from that queue. Do we still need filters?
>> > This could be implemented by filtering on the message's routing_key
>> > after our consumer gets it from the queue, but I don't know whether it
>> > is necessary.
>>
>> It's the queue that performs the filtering. So a consumer which wants
>> messages matching "foo.#.bar.*.bing" would create a new queue, bound
>> to the exchange using that pattern.
>>
>> In other words, every consumer would have its own queue. I suspect
>> you're thinking of a different model, where all consumers would
>> subscribe to the same queue. That won't work for Buildbot.
>>
>> > About the serializer: we use JSON as the doc says, but does JSON
>> > support Python's datetime type? I find some messages contain time info
>> > like this: "'complete_at': datetime.datetime(2014, 6, 14, 1, 47, 39,
>> > tzinfo=<buildbot.util.UTC object at 0x2c8b7d0>)", but when I test in my
>> > environment, I get "datetime.datetime(2014, 6, 14, 1, 47, 39) is not
>> > JSON serializable". Has Buildbot extended JSON?
>>
>> The REST API uses a _toJson method to handle that issue:
>>
>> https://github.com/buildbot/buildbot/blob/nine/master/buildbot/www/rest.py#L427
>> that could be refactored a little bit so that the MQ code can use it, too.
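For reference, the usual pattern is a `default` hook passed to json.dumps
that converts datetimes into something JSON can represent -- epoch seconds
in this sketch, though Buildbot's actual _toJson may choose differently:

```python
import calendar
import datetime
import json

def to_json_default(obj):
    # Convert aware or naive datetimes to integer UTC epoch seconds;
    # utctimetuple() normalizes aware datetimes to UTC first.
    if isinstance(obj, datetime.datetime):
        return calendar.timegm(obj.utctimetuple())
    raise TypeError('%r is not JSON serializable' % (obj,))

doc = {'complete_at': datetime.datetime(2014, 6, 14, 1, 47, 39)}
encoded = json.dumps(doc, default=to_json_default)
```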
>>
>> > Last but not least, to consume messages asynchronously, our former MQ
>> > used Twisted's deferreds. I'm not very familiar with them, but after
>> > reading Twisted's docs I find it is not easy to use deferreds here.
>> > First, kombu consumes messages by something like
>> > "connection.drain_events"; this function returns when any consumer
>> > gets a message (event). Alternatively we could use the hub (kombu's
>> > async method). All of these operate on the connection, not on a
>> > certain queue or consumer, so making a thread to run the hub may be a
>> > solution. But I know Twisted's async I/O is famous; there must be a
>> > better solution than using a system thread or process. Please comment
>> > if you have any ideas.
>>
>> From a brief google around, it looks like this is not a common
>> solution. There is a tool called txamqp which can speak AMQP
>> directly, but of course that bypasses Kombu!
>>
>> I think that, to start, the best way to do this is to use a separate
>> thread, dedicated to Kombu. Later, we can work on a more natural way
>> to integrate the two.
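A minimal sketch of that separate-thread approach, with a stdlib queue
standing in for the kombu connection (in Buildbot the callback hand-off
would go back through Twisted, e.g. via reactor.callFromThread):

```python
import queue
import threading

class ConsumerThread:
    """Run a blocking drain loop in a dedicated thread, handing each
    message to a callback. `source.get()` stands in for kombu's
    connection.drain_events()."""

    def __init__(self, source, callback):
        self.source = source
        self.callback = callback
        self.thread = threading.Thread(target=self._loop, daemon=True)

    def start(self):
        self.thread.start()

    def stop(self):
        self.source.put(None)            # sentinel: shut the loop down
        self.thread.join()

    def _loop(self):
        while True:
            message = self.source.get()  # blocks, like drain_events()
            if message is None:
                return
            self.callback(message)
```

The point of the dedicated thread is that the blocking drain call never
ties up the Twisted reactor; only the (cheap) callback dispatch crosses
the thread boundary.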
>>
>> > The current code is on my github:
>> > https://github.com/MatheMatrix/Gist/blob/master/kumbu.test.buildbot.2.py
>> > Maybe I can make a pull request for review? Not for merge, only for
>> > the convenience of commenting.
>>
>> From a brief look, it seems like you're registering all of the queues
>> in advance. The queues need to be created and deleted as consumers
>> subscribe and stop their subscriptions, instead.
>>
>> I'm not sure what the standardizeKey method is trying to accomplish -
>> it looks like it sometimes returns a value, and sometimes doesn't?
>> This is the sort of method for which unit tests can be *very* helpful!
>>
>> Dustin
>>
>
>
-------------- next part --------------
[Attachments scrubbed: an HTML version of this message, plus two PNG images
(??2.png, 11142 bytes; ??1.png, 15964 bytes), available under
<http://buildbot.net/pipermail/devel/attachments/20140615/b1b25c8d/>]