[Buildbot-commits] [Buildbot] #2757: Use chardet on incoming bytestrings
Buildbot trac
trac at buildbot.net
Tue Apr 15 14:07:54 UTC 2014
#2757: Use chardet on incoming bytestrings
------------------------+----------------------
Reporter:  dustin       |      Owner:
    Type:  enhancement  |     Status:  new
Priority:  major        |  Milestone:  0.9.+
 Version:  0.8.8        |   Keywords:  encoding
------------------------+----------------------
There's a library, chardet, which can do a reasonable job of guessing the
charset of a bytestring.
There are a number of places in Buildbot where incoming data is a
bytestring. Most of those allow the user to specify an encoding, and
default to UTF-8. For example, change sources generally get bytestrings
for commit comments, authors, and so on.
When no encoding is configured, it may be more convenient for users if we
detect the character encoding of these strings dynamically. This would
amount to "doing the right thing" when possible, while still letting users
supply an explicit encoding when detection guesses wrong.
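
A minimal sketch of that behaviour, written for Python 3 rather than the
Python 2 that Buildbot 0.8.8 targets. The helper name `decode_incoming` and
its parameters are hypothetical, not existing Buildbot API; the only chardet
call used is `chardet.detect`, and chardet is treated as an optional
dependency so the UTF-8 default still applies without it.

```python
try:
    import chardet  # third-party; optional in this sketch
except ImportError:
    chardet = None


def decode_incoming(data, encoding=None, fallback="utf-8"):
    """Hypothetical helper: turn an incoming value into text.

    An explicit `encoding` always wins. Otherwise chardet's guess is
    tried, and `fallback` (with replacement characters) is the last resort.
    """
    if isinstance(data, str):
        return data  # already text, nothing to decode
    if encoding is not None:
        return data.decode(encoding)  # user-supplied encoding, strict
    if chardet is not None:
        guess = chardet.detect(data).get("encoding")
        if guess:
            try:
                return data.decode(guess)
            except (UnicodeDecodeError, LookupError):
                pass  # bad guess or unknown codec: use the fallback
    return data.decode(fallback, errors="replace")
```

A change source could then call `decode_incoming(comments)` in the default
case and `decode_incoming(comments, encoding="latin-1")` when the user has
configured an encoding.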
Chardet would also be useful in the `ascii2unicode` method, which
currently only accepts ASCII bytestrings. Then a little mojibake is the
unlikely worst case, rather than an exception.
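
As a rough illustration of "mojibake rather than an exception", here is a
Python 3 sketch of a lenient variant. It is not the actual `ascii2unicode`
implementation: the name `ascii2unicode_lenient`, the `detect` parameter, and
the Latin-1 last resort are all assumptions made for the example. `detect` is
expected to behave like `chardet.detect`, returning a dict with an
`"encoding"` key.

```python
def ascii2unicode_lenient(x, detect=None):
    """Hypothetical lenient decoder that never raises on odd bytes.

    `detect` is an optional callable shaped like chardet.detect.
    """
    if x is None or isinstance(x, str):
        return x
    try:
        return x.decode("ascii")  # the strict, common case
    except UnicodeDecodeError:
        pass
    guess = detect(x).get("encoding") if detect is not None else None
    if guess:
        try:
            return x.decode(guess)
        except (UnicodeDecodeError, LookupError):
            pass  # wrong guess or unknown codec name
    # Latin-1 maps every byte to a code point, so this cannot fail;
    # the result may be mojibake, but it is never an exception.
    return x.decode("latin-1")
```

With chardet installed, the call would look like
`ascii2unicode_lenient(data, detect=chardet.detect)`.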
--
Ticket URL: <http://trac.buildbot.net/ticket/2757>
Buildbot <http://buildbot.net/>
Buildbot: build/test automation