[Buildbot-devel] MQ design plan and code review

Damon Wang damon.devops at gmail.com
Wed Jun 18 09:55:53 UTC 2014


Hi,

I have a question: since every master can connect to the one AMQP server,
will all masters see all of the queues?

Besides, I want to share my new design plan. Declaring all of the queues at
startup is indeed a bad idea, especially if we declare a lot of them. I did
that to make setting up consumers easy -- we could hand each consumer its
particular queue -- but how do we set up a consumer when the queues are not
declared at startup? My idea is to look the queue up first and create it if
it doesn't exist: make the queue list a dict keyed by routing_key, so we
simply fetch queues["builder.*.started"] and catch the KeyError when it is
missing.
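The lookup-or-create idea might be sketched like this (the names are made up
for illustration; a real `make_queue` factory would presumably declare a kombu
Queue bound to the exchange with the given routing key):

```python
# Minimal sketch of the lazy queue registry described above.
# `make_queue` is a hypothetical factory supplied by the caller.

class QueueRegistry:
    def __init__(self, make_queue):
        self._queues = {}              # routing_key -> queue object
        self._make_queue = make_queue

    def get(self, routing_key):
        try:
            return self._queues[routing_key]
        except KeyError:
            # First request for this pattern: create the queue now.
            queue = self._make_queue(routing_key)
            self._queues[routing_key] = queue
            return queue
```

Consumers would then just call `registry.get("builder.*.started")` and never
care whether the queue existed beforehand.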

Regards,
Damon


2014-06-17 22:19 GMT+08:00 Damon Wang <damon.devops at gmail.com>:

> Thanks for your hints and suggestions; getting the basic interface working
> is truly my first priority!
>
>
> 2014-06-16 6:03 GMT+08:00 Dustin J. Mitchell <dustin at v.igoro.us>:
>
> (I think the email filtering was because of the images.  I doubled the
>> maximum message size)
>>
>> I'd like to avoid doing filtering on each master, because that means that
>> each master must process all messages, which won't scale well.  If each
>> master creates queues which only bind to the exchange for messages they
>> need, then no master ever sees the total volume of messages.  Keep in mind
>> that Buildbot will be sending messages on the order of once per second for
>> every active build, as log files grow, in addition to messages about
>> changes, new builds, and so on.  In a large system with many masters, we're
>> looking at 100's of messages per second, easily.
>>
>> This will, indeed, result in a lot of queues.  I think there will be ways
>> to optimize that, but let's do so after we have the basic interface working.
>>
>> As for ensuring delivery: from the sender's perspective, a message's
>> target is the exchange, and kombu basically supports this out of the box --
>> if the publish method succeeds, the message was delivered to its target.
>> (You need to enable this by setting `confirm_publish`, but that's easy.)
>>
>> The issue about "persistence" is on the queue side.  We have two types of
>> subscribers:
>>
>> 1. This type of subscriber would like to know about messages matching
>> certain patterns, but only for a little while.  Once it stops listening for
>> those messages, the messages can be discarded.  For example, a web UI that
>> is displaying a logfile would like to get messages when new lines arrive,
>> but once the web page is closed, any additional messages about logfile
>> lines can be discarded.
>>
>> 2. This type of subscriber would like to know about *every* message
>> matching a certain pattern, even if those messages are sent while the
>> subscriber is not active.  The best example is a scheduler, which needs to
>> see a message for every change.  If the master on which a scheduler is
>> running shuts down, then another master should be able to start up and
>> receive a message about the next change.
>>
>> Type 1, in AMQP terms, is a queue declared with 'exclusive' set
>> ("Exclusive queues may only be accessed by the current connection, and are
>> deleted when that connection closes.")
>>
>> Type 2 is what Buildbot calls a "persistent queue".  This is a queue with
>> a well-known name (so that other schedulers can find the same queue) that
>> is not exclusive and does not auto-delete.  It should also be durable.
>>
>> I'm glad we're hashing this out at the beginning, rather than in finished
>> code :)
>>
>>
>>
>> On Sun, Jun 15, 2014 at 10:32 AM, Damon Wang <damon.devops at gmail.com>
>> wrote:
>>
>>> Hi,
>>>
>>> Thanks for replying! Your suggestions are very enlightening.
>>>
>>>
>>> > Actually, the first bit is false: in AMQP, if a message is sent to an
>>> > exchange which has no active queues, then the message is dropped.
>>>
>>> > It's the *queues* that are persistent -- or not.  And Buildbot needs
>>> > both.  A persistent queue would be used by a scheduler, for example,
>>> > which needs to get a notification of *every* change, even one which
>>> > occurs while the scheduler is not connected to the MQ server.  So that
>>> > MQ server needs to have a queue that stays around.  But we need
>>> > non-persistent queues for connections from web clients, which come and
>>> > go all the time.  We can't have the MQ server caching every message
>>> > sent since last Wednesday when I looked at the site with my laptop!
>>>
>>> > Persistent queues should also be durable.
>>>
>>> Well, I didn't know that before -- thanks a lot. My plan is that we can
>>> use kombu's ensure method
>>> (
>>> http://kombu.readthedocs.org/en/latest/reference/kombu.html#kombu.Connection.ensure
>>> )
>>> to do our best to make sure a message is delivered to its target. I
>>> haven't experimented yet; if max_retries can't be infinite, we can put
>>> the message into a protected queue, check that queue periodically, and
>>> try to deliver it to the original target queue again.
>>>
>>>
>>>>>>
>>> Why use a queue to hold these messages? We can make it durable, so we
>>> will still have them after a restart. How do we keep a message in the
>>> protected queue after we fetch it? After we consume it, if the resend
>>> fails, we can publish it to the protected queue again.
>>>
>>>
>>> > It's the queue that performs the filtering.  So a consumer which wants
>>> > messages matching "foo.#.bar.*.bing" would create a new queue, bound
>>> > to the exchange using that pattern.>
>>>
>>> > In other words, every consumer would have its own queue.  I suspect
>>> > you're thinking of a different model, where all consumers would
>>> > subscribe to the same queue.  That won't work for Buildbot.
>>>
>>> Well, my idea is not that everyone subscribes to the same queue. Since
>>> there are so many routing keys, we could combine routing keys and filters
>>> like this: divide messages into different queues by the first one or two
>>> words of the routing key, and filter after consuming, like this:
>>>
>>>
>>>>>>
>>> The advantage is a much smaller number of queues, but on reconsideration
>>> your idea may be better.
>>>
>>>
>>> > From a brief look, it seems like you're registering all of the queues
>>> > in advance.  The queues need to be created and deleted as consumers
>>> > subscribe and stop their subscriptions, instead.
>>>
>>> > I'm not sure what the standardizeKey method is trying to accomplish -
>>> > it looks like it sometimes returns a value, and sometimes doesn't?
>>> > This is the sort of method for which unit tests can be *very* helpful!
>>>
>>> You can see what it does in the last image: I use standardizeKey to
>>> convert an original routing key like "scheduler.$schedulerid.started"
>>> into "scheduler.#".
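Since standardizeKey itself isn't visible in this thread, here is a
hypothetical reconstruction of the conversion as described (the function name
and behavior are my guess at the intent, not the actual code):

```python
# Hypothetical sketch: keep the first word of the routing key
# and wildcard everything after it.
def standardize_key(routing_key):
    first_word = routing_key.split('.', 1)[0]
    return first_word + '.#'
```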
>>>
>>> I will write test code as soon as possible!
>>>
>>> Regards,
>>> Wei
>>>
>>>
>>> 2014-06-15 6:15 GMT+08:00 Dustin J. Mitchell <dustin at v.igoro.us>:
>>>
>>> On Fri, Jun 13, 2014 at 11:35 PM, Damon Wang <damon.devops at gmail.com>
>>>> wrote:
>>>> > First, I have some news: kombu doesn't support Qpid as we expected.
>>>> > If we want to keep our doc
>>>> > (http://docs.buildbot.net/nine/developer/mq.html) accurate, we must
>>>> > implement another MQ plugin with qpid-python.
>>>>
>>>> Then there's no need to support qpid.  It's fine for Buildbot to
>>>> support what Kombu supports and no more.
>>>>
>>>> > Besides, we used persistence to make sure a message is sent to a
>>>> > consumer as soon as it becomes active, but if we use an MQ, the
>>>> > message is always available in the MQ, so persistence can be ignored
>>>> > here. Our new MQ has another attribute -- durable: durable queues
>>>> > remain active when the server restarts. My design is that, by default,
>>>> > the exchange and queues are created when buildbot starts and deleted
>>>> > when buildbot shuts down; durability helps when buildbot stops
>>>> > abnormally, so the queues are not deleted as expected and we can still
>>>> > get the messages that were not consumed.
>>>>
>>>> Actually, the first bit is false: in AMQP, if a message is sent to an
>>>> exchange which has no active queues, then the message is dropped.
>>>>
>>>> It's the *queues* that are persistent -- or not.  And Buildbot needs
>>>> both.  A persistent queue would be used by a scheduler, for example,
>>>> which needs to get a notification of *every* change, even one which
>>>> occurs while the scheduler is not connected to the MQ server.  So that
>>>> MQ server needs to have a queue that stays around.  But we need
>>>> non-persistent queues for connections from web clients, which come and
>>>> go all the time.  We can't have the MQ server caching every message
>>>> sent since last Wednesday when I looked at the site with my laptop!
>>>>
>>>> Persistent queues should also be durable.
>>>>
>>>> > What's more, our former MQ used a filter to make sure each consumer
>>>> > got the right messages. A standard MQ usually has a consumer bind to
>>>> > an explicit queue and receive all messages from that queue. Do we
>>>> > still need the filter? It could be implemented by filtering on the
>>>> > message's routing_key after our consumer gets it from the queue, but I
>>>> > don't know whether that is necessary.
>>>>
>>>> It's the queue that performs the filtering.  So a consumer which wants
>>>> messages matching "foo.#.bar.*.bing" would create a new queue, bound
>>>> to the exchange using that pattern.
>>>>
>>>> In other words, every consumer would have its own queue.  I suspect
>>>> you're thinking of a different model, where all consumers would
>>>> subscribe to the same queue.  That won't work for Buildbot.
>>>>
>>>> > About the serializer: we use JSON as the doc says, but does JSON
>>>> > support Python's datetime type? I found that some messages contain
>>>> > time info like this: "'complete_at': datetime.datetime(2014, 6, 14, 1,
>>>> > 47, 39, tzinfo=<buildbot.util.UTC object at 0x2c8b7d0>)", but when I
>>>> > tested in my environment I got "datetime.datetime(2014, 6, 14, 1, 47,
>>>> > 39) is not JSON serializable". Does buildbot extend json?
>>>>
>>>> The REST API uses a _toJson method to handle that issue:
>>>>
>>>> https://github.com/buildbot/buildbot/blob/nine/master/buildbot/www/rest.py#L427
>>>> that could be refactored a little bit so that the MQ code can use it,
>>>> too.
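The shape of that hook can be sketched with json.dumps's default parameter --
a standalone illustration of the technique, not Buildbot's actual _toJson:

```python
import datetime
import json

def json_default(obj):
    # Encode datetimes as ISO 8601 strings; anything else is still an error.
    if isinstance(obj, datetime.datetime):
        return obj.isoformat()
    raise TypeError('%r is not JSON serializable' % (obj,))

encoded = json.dumps(
    {'complete_at': datetime.datetime(2014, 6, 14, 1, 47, 39)},
    default=json_default)
```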
>>>>
>>>> > Last but not least, to consume messages asynchronously, our former MQ
>>>> > used Twisted's Deferred. I'm not very familiar with it, but after
>>>> > reading Twisted's docs I find it is not easy to use Deferreds here.
>>>> > First, kombu consumes messages with something like
>>>> > connection.drain_events, which returns when any consumer gets a
>>>> > message (event), or with the Hub (kombu's async method); all of these
>>>> > operate on the connection, not on a particular queue or consumer. So
>>>> > running the Hub in a thread may be a solution, but Twisted's async I/O
>>>> > is famous, so there must be a better solution than using a system
>>>> > thread or process; please comment if you have any.
>>>>
>>>> From a brief google around, it looks like this is not a common
>>>> solution.  There is a tool called txamqp which can speak AMQP
>>>> directly, but of course that bypasses Kombu!
>>>>
>>>> I think that the best way to do this to start is to use a separate
>>>> thread, dedicated to Kombu.  Later, we can work on a more natural way
>>>> to integrate the two.
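A minimal sketch of the dedicated-thread approach (the function names are
made up; consumer callbacks registered on the connection would hand results
back to Twisted with reactor.callFromThread):

```python
import socket
import threading

def mq_loop(connection, stop_event, poll_timeout=1.0):
    # Blocking drain loop meant to run in its own daemon thread.
    # Consumer callbacks fire in this thread; they should pass any
    # work to the reactor via reactor.callFromThread.
    while not stop_event.is_set():
        try:
            connection.drain_events(timeout=poll_timeout)
        except socket.timeout:
            continue   # no message this interval; re-check the stop flag

def start_mq_thread(connection):
    stop_event = threading.Event()
    thread = threading.Thread(target=mq_loop,
                              args=(connection, stop_event))
    thread.daemon = True
    thread.start()
    return thread, stop_event
```

kombu's drain_events raises socket.timeout when the timeout expires, which is
what lets the loop poll the stop flag instead of blocking forever.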
>>>>
>>>> > The current code is on my GitHub:
>>>> > https://github.com/MatheMatrix/Gist/blob/master/kumbu.test.buildbot.2.py
>>>> > Maybe I can make a pull request for review? Not for merging, only for
>>>> > the convenience of commenting.
>>>>
>>>> From a brief look, it seems like you're registering all of the queues
>>>> in advance.  The queues need to be created and deleted as consumers
>>>> subscribe and stop their subscriptions, instead.
>>>>
>>>> I'm not sure what the standardizeKey method is trying to accomplish -
>>>> it looks like it sometimes returns a value, and sometimes doesn't?
>>>> This is the sort of method for which unit tests can be *very* helpful!
>>>>
>>>> Dustin
>>>>
>>>
>>>
>>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ??1.png
Type: image/png
Size: 15964 bytes
Desc: not available
URL: <http://buildbot.net/pipermail/devel/attachments/20140618/20e9f468/attachment.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ??2.png
Type: image/png
Size: 11142 bytes
Desc: not available
URL: <http://buildbot.net/pipermail/devel/attachments/20140618/20e9f468/attachment-0001.png>
