ARM buildbot not connecting

Nachaat Hassis nachaat05 at yahoo.fr
Mon Oct 12 20:36:28 UTC 2015


Could you please run ls -la on the buildslave's info directory and post the
output here? Since Buildbot could not retrieve the slave info, there may be a
problem with the files in the info directory.
Did you try launching your buildslave with sudo?
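
If ls output is awkward to paste, a small standalone script can collect the same details. This is only a sketch: the function name inspect_info_dir is made up for illustration, and it assumes the info directory is the "info" subdirectory of the buildslave base directory. It needs Python 3.3 or later for stat.filemode, so run it separately from the buildslave itself.

```python
import os
import stat


def inspect_info_dir(info_dir):
    """Return one (name, mode, size, readable) tuple per entry in info_dir.

    Hypothetical helper, not part of Buildbot. It reports roughly what
    `ls -la` shows, plus whether the current user can read each entry.
    """
    entries = []
    for name in sorted(os.listdir(info_dir)):
        path = os.path.join(info_dir, name)
        st = os.lstat(path)  # lstat: describe symlinks themselves, do not follow them
        entries.append((
            name,
            stat.filemode(st.st_mode),  # e.g. '-rw-r--r--'
            st.st_size,                 # size in bytes
            os.access(path, os.R_OK),   # readable by the user running this script?
        ))
    return entries
```

Run it as the same user that starts the buildslave, for example with print(inspect_info_dir("info")) from the buildslave base directory. An entry that is not readable, or that is unexpectedly large, would be the first thing to look at.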

Best regards.

2015-10-12 22:12 GMT+02:00 Elliot Saba <staticfloat at gmail.com>:

> So is there anything else I can do here to see what is going wrong?  This
> error is quite mysterious, and I'd really like to be able to use this
> machine with Buildbot.
> -E
>
> On Tue, Oct 6, 2015 at 11:19 AM, Dustin J. Mitchell <dustin at v.igoro.us>
> wrote:
>
>> No, sorry, I shouldn't have assumed you were.
>>
>> Dustin
>>
>> On Tue, Oct 6, 2015 at 12:54 PM, Elliot Saba <staticfloat at gmail.com>
>> wrote:
>>
>>> Should I be using v0.9 instead of v0.8.12?
>>> -E
>>>
>>> On Mon, Oct 5, 2015 at 7:37 PM, Elliot Saba <staticfloat at gmail.com>
>>> wrote:
>>>
>>>> I'm not seeing the same source code as what you are describing, perhaps
>>>> that's because I'm using version 0.8.12.
>>>>
>>>> Using printf debugging, I can verify that:
>>>>
>>>> 1) The buildmaster is calling _get_info()
>>>> <https://github.com/buildbot/buildbot/blob/4f9f91531cffbff12d57e0bd9b3d8b8589779cf2/master/buildbot/buildslave/base.py#L416-L435>,
>>>> which invokes getSlaveInfo() on the buildslave side.
>>>>
>>>> 2) The buildslave is running remote_getSlaveInfo()
>>>> <https://github.com/buildbot/buildbot/blob/4f9f91531cffbff12d57e0bd9b3d8b8589779cf2/slave/buildslave/bot.py#L310-L328> to
>>>> completion.
>>>>
>>>> 3) The buildmaster is NOT running either _got_info()
>>>> <https://github.com/buildbot/buildbot/blob/4f9f91531cffbff12d57e0bd9b3d8b8589779cf2/master/buildbot/buildslave/base.py#L419-L427> or
>>>> _info_unavailable()
>>>> <https://github.com/buildbot/buildbot/blob/4f9f91531cffbff12d57e0bd9b3d8b8589779cf2/master/buildbot/buildslave/base.py#L429-L433>.
>>>> I'm not yet certain why that is happening, but this is what I've managed to
>>>> track down so far.
>>>> -E
>>>>
>>>> On Mon, Oct 5, 2015 at 6:04 PM, Dustin J. Mitchell <dustin at v.igoro.us>
>>>> wrote:
>>>>
>>>>> You can see a little more detail about the communication patterns here:
>>>>>   http://docs.buildbot.net/current/developer/master-slave.html
>>>>>
>>>>> No, it all occurs over a single TCP connection from the buildslave to
>>>>> the buildmaster on port 9989.
>>>>>
>>>>> master/buildbot/buildslave/manager.py
>>>>> 126             yield conn.remotePrint(message="attached")
>>>>> 127             info = yield conn.remoteGetSlaveInfo()
>>>>> 128             log.msg("Got slaveinfo from '%s'" % buildslaveName)
>>>>>
>>>>> line 126 is what results in the "message from master: attached" log on
>>>>> the buildslave, so you know that is being executed.  But apparently the
>>>>> `remoteGetSlaveInfo` call never completes.
>>>>>
>>>>> So either that RPC call is never reaching the buildslave, or no
>>>>> response is ever reaching the buildmaster.  You could distinguish those
>>>>> possibilities by editing buildslave/base.py to add some `print` statements
>>>>> to `remote_getSlaveInfo`.  At a guess, I wonder if
>>>>> `multiprocessing.cpu_count` hangs on the ARM platform for some reason.
>>>>>
>>>>> Dustin
>>>>>
>>>>>
>>>>> On Mon, Oct 5, 2015 at 3:53 PM, Elliot Saba <staticfloat at gmail.com>
>>>>> wrote:
>>>>>
>>>>>> To give a small update on this, it looks like some kind of failure is
>>>>>> going on in the initial communication between master and slave.  Contrast
>>>>>> the twistd.log file when a working slave connects:
>>>>>>
>>>>>> 2015-10-05 19:49:06+0000 [Broker,15,aaa.bbb.ccc.ddd] slave
>>>>>> 'centos6.7-x64' attaching from IPv4Address(TCP, 'aaa.bbb.ccc.ddd', 53629)
>>>>>> 2015-10-05 19:49:06+0000 [Broker,15,aaa.bbb.ccc.ddd] Got slaveinfo
>>>>>> from 'centos6.7-x64'
>>>>>> 2015-10-05 19:49:06+0000 [Broker,15,aaa.bbb.ccc.ddd] Starting
>>>>>> buildslave keepalive timer for 'centos6.7-x64'
>>>>>> 2015-10-05 19:49:06+0000 [Broker,15,aaa.bbb.ccc.ddd] bot attached
>>>>>> 2015-10-05 19:49:06+0000 [Broker,15,aaa.bbb.ccc.ddd] Buildslave
>>>>>> centos6.7-x64 attached to nuke_centos6.7-x64
>>>>>> 2015-10-05 19:49:06+0000 [Broker,15,aaa.bbb.ccc.ddd] Buildslave
>>>>>> centos6.7-x64 attached to nightly_cxx64
>>>>>> 2015-10-05 19:49:06+0000 [Broker,15,aaa.bbb.ccc.ddd] Buildslave
>>>>>> centos6.7-x64 attached to clean_centos6.7-x64
>>>>>>
>>>>>> Versus when my new ARM slave connects:
>>>>>> 2015-10-05 19:49:16+0000 [Broker,16,www.xxx.yyy.zzz] slave
>>>>>> 'ubuntu14.04-armv7l' attaching from IPv4Address(TCP, 'www.xxx.yyy.zzz',
>>>>>> 47946)
>>>>>>
>>>>>> It never says "Got slaveinfo from ubuntu14.04-armv7l", which seems
>>>>>> to me like there's a problem with duplex communication.  Does anyone know
>>>>>> if I need to open ports other than the default buildmaster port of 9989?
>>>>>> Do I need to open ports on the buildslave side?
>>>>>> -E
>>>>>>
>>>>>> On Fri, Oct 2, 2015 at 11:47 PM, nachaat hassis <nachaat05 at yahoo.fr>
>>>>>> wrote:
>>>>>>
>>>>>>> The fact that the slave is an arm-board does not matter.
>>>>>>> I have many arm- and armhf- slaves and they are working fine.
>>>>>>>
>>>>>>> Sent from my iPhone
>>>>>>>
>>>>>>> On 03.10.2015 at 08:21, Elliot Saba <staticfloat at gmail.com> wrote:
>>>>>>>
>>>>>>> Hello all,
>>>>>>>
>>>>>>> I have an ubuntu14.04 ARM machine that I'm trying to connect to my
>>>>>>> buildmaster, and although both twistd.log files show the connection being
>>>>>>> made, the web UI always says that the buildslave is disconnected, and no
>>>>>>> jobs are ever dispatched to the buildslave.
>>>>>>>
>>>>>>> In slightly more detail, my buildmaster has logs of the form:
>>>>>>>
>>>>>>> 2015-10-03 06:16:42+0000 [Broker,12,www.xxx.yyy.zzz] slave
>>>>>>> 'ubuntu14.04-armv7l' attaching from IPv4Address(TCP, 'aaa.bbb.ccc.ddd',
>>>>>>> 47418)
>>>>>>>
>>>>>>> And my buildslave shows:
>>>>>>>
>>>>>>> 2015-10-03 06:16:41+0000 [-] Connecting to
>>>>>>> buildbot.e.ip.saba.us:9989
>>>>>>> 2015-10-03 06:16:41+0000 [Broker,client] message from master:
>>>>>>> attached
>>>>>>>
>>>>>>> You can see the buildslave on the webui here
>>>>>>> <http://buildbot.e.ip.saba.us:8010/builders/package_tarballarm7vl>,
>>>>>>> what can I do to figure out why this is happening?  Could this have
>>>>>>> something to do with the fact that the buildslave is an ARM machine?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> -Elliot
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> users mailing list
>>>>>>> users at buildbot.net
>>>>>>> https://lists.buildbot.net/mailman/listinfo/users
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>