ARM buildbot not connecting

Elliot Saba staticfloat at gmail.com
Tue Oct 6 02:37:12 UTC 2015


I'm not seeing the same source code as what you are describing; perhaps
that's because I'm using version 0.8.12.

Using printf debugging, I can verify that:

1) The buildmaster is calling _get_info()
<https://github.com/buildbot/buildbot/blob/4f9f91531cffbff12d57e0bd9b3d8b8589779cf2/master/buildbot/buildslave/base.py#L416-L435>,
which invokes getSlaveInfo() on the buildslave side.

2) The buildslave is running remote_getSlaveInfo()
<https://github.com/buildbot/buildbot/blob/4f9f91531cffbff12d57e0bd9b3d8b8589779cf2/slave/buildslave/bot.py#L310-L328>
to completion.

3) The buildmaster is NOT running either _got_info()
<https://github.com/buildbot/buildbot/blob/4f9f91531cffbff12d57e0bd9b3d8b8589779cf2/master/buildbot/buildslave/base.py#L419-L427>
or _info_unavailable()
<https://github.com/buildbot/buildbot/blob/4f9f91531cffbff12d57e0bd9b3d8b8589779cf2/master/buildbot/buildslave/base.py#L429-L433>.
I'm not yet certain why that is happening, but this is what I've managed to
track down so far.
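
A minimal sketch of this kind of printf-style tracing follows. It is not
Buildbot code: `trace_calls` and `fake_get_slave_info` are hypothetical names
made up for illustration, and the sketch assumes the method of interest (for
example remote_getSlaveInfo() on the slave, or _got_info() and
_info_unavailable() on the master) can be temporarily decorated. It writes
with sys.stderr.write so that it behaves the same under Python 2 and Python 3.

```python
import functools
import sys


def trace_calls(write=sys.stderr.write):
    """Build a decorator that reports when a function is entered,
    when it returns, and when it raises.

    `write` is any callable taking one string; it defaults to stderr.
    """
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            write("ENTER %s\n" % func.__name__)
            try:
                result = func(*args, **kwargs)
            except Exception as exc:
                # Report the failure, then let it propagate unchanged.
                write("RAISE %s: %r\n" % (func.__name__, exc))
                raise
            write("EXIT  %s -> %r\n" % (func.__name__, result))
            return result
        return wrapper
    return decorate


# Hypothetical stand-in for the method being debugged.
@trace_calls()
def fake_get_slave_info():
    return {"host": "example"}


if __name__ == "__main__":
    fake_get_slave_info()
```

One caveat, assuming the traced method returns a Twisted Deferred: the EXIT
line only shows that the Deferred object was returned, not that it has fired,
so the callback and errback each need their own trace line.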
-E

On Mon, Oct 5, 2015 at 6:04 PM, Dustin J. Mitchell <dustin at v.igoro.us>
wrote:

> You can see a little more detail about the communication patterns here:
>   http://docs.buildbot.net/current/developer/master-slave.html
>
> No, it all occurs over a single TCP connection from the buildslave to the
> buildmaster on port 9989.
>
> master/buildbot/buildslave/manager.py
> 126             yield conn.remotePrint(message="attached")
> 127             info = yield conn.remoteGetSlaveInfo()
> 128             log.msg("Got slaveinfo from '%s'" % buildslaveName)
>
> line 126 is what results in the "message from master: attached" log on the
> buildslave, so you know that is being executed.  But apparently the
> `remoteGetSlaveInfo` call never completes.
>
> So either that RPC call is never reaching the buildslave, or no response
> is ever reaching the buildmaster.  You could distinguish those
> possibilities by editing buildslave/base.py to add some `print` statements
> to `remote_getSlaveInfo`.  At a guess, I wonder if
> `multiprocessing.cpu_count` hangs on the ARM platform for some reason.
>
> Dustin
>
>
> On Mon, Oct 5, 2015 at 3:53 PM, Elliot Saba <staticfloat at gmail.com> wrote:
>
>> To give a small update on this, it looks like some kind of failure is
>> going on in the initial communication between master and slave.  Contrast
>> the twistd.log file when a working slave connects:
>>
>> 2015-10-05 19:49:06+0000 [Broker,15,aaa.bbb.ccc.ddd] slave
>> 'centos6.7-x64' attaching from IPv4Address(TCP, 'aaa.bbb.ccc.ddd', 53629)
>> 2015-10-05 19:49:06+0000 [Broker,15,aaa.bbb.ccc.ddd] Got slaveinfo from
>> 'centos6.7-x64'
>> 2015-10-05 19:49:06+0000 [Broker,15,aaa.bbb.ccc.ddd] Starting buildslave
>> keepalive timer for 'centos6.7-x64'
>> 2015-10-05 19:49:06+0000 [Broker,15,aaa.bbb.ccc.ddd] bot attached
>> 2015-10-05 19:49:06+0000 [Broker,15,aaa.bbb.ccc.ddd] Buildslave
>> centos6.7-x64 attached to nuke_centos6.7-x64
>> 2015-10-05 19:49:06+0000 [Broker,15,aaa.bbb.ccc.ddd] Buildslave
>> centos6.7-x64 attached to nightly_cxx64
>> 2015-10-05 19:49:06+0000 [Broker,15,aaa.bbb.ccc.ddd] Buildslave
>> centos6.7-x64 attached to clean_centos6.7-x64
>>
>> Versus when my new ARM slave connects:
>> 2015-10-05 19:49:16+0000 [Broker,16,www.xxx.yyy.zzz] slave
>> 'ubuntu14.04-armv7l' attaching from IPv4Address(TCP, 'www.xxx.yyy.zzz',
>> 47946)
>>
>> It never says "Got slaveinfo from ubuntu14.04-armv7l", which seems to me
>> like there's a problem with duplex communication.  Does anyone know if I
>> need to open ports other than the default buildmaster port of 9989?  Do I
>> need to open ports on the buildslave side?
>> -E
>>
>> On Fri, Oct 2, 2015 at 11:47 PM, nachaat hassis <nachaat05 at yahoo.fr>
>> wrote:
>>
>>> The fact that the slave is an arm-board does not matter.
>>> I have many arm- and armhf- slaves and they are working fine.
>>>
>>> Sent from my iPhone
>>>
>>> On 03.10.2015 at 08:21, Elliot Saba <staticfloat at gmail.com> wrote:
>>>
>>> Hello all,
>>>
>>> I have an ubuntu14.04 ARM machine that I'm trying to connect to my
>>> buildmaster, and although both twistd.log files show the connection being
>>> made, the web UI always says that the buildslave is disconnected, and no
>>> jobs are ever dispatched to the buildslave.
>>>
>>> In slightly more detail, my buildmaster has logs of the form:
>>>
>>> 2015-10-03 06:16:42+0000 [Broker,12,www.xxx.yyy.zzz] slave
>>> 'ubuntu14.04-armv7l' attaching from IPv4Address(TCP, 'aaa.bbb.ccc.ddd',
>>> 47418)
>>>
>>> And my buildslave shows:
>>>
>>> 2015-10-03 06:16:41+0000 [-] Connecting to buildbot.e.ip.saba.us:9989
>>> 2015-10-03 06:16:41+0000 [Broker,client] message from master: attached
>>>
>>> You can see the buildslave on the webui here
>>> <http://buildbot.e.ip.saba.us:8010/builders/package_tarballarm7vl>.
>>> What can I do to figure out why this is happening?  Could this have
>>> something to do with the fact that the buildslave is an ARM machine?
>>>
>>> Thanks,
>>> -Elliot
>>>
>>> _______________________________________________
>>> users mailing list
>>> users at buildbot.net
>>> https://lists.buildbot.net/mailman/listinfo/users
>>>
>>>
>>
>
>