3716: Diff crashing with non-utf8 characters

msunde
david
david
Jan. 14, 2015
What version are you running?
2.0.12


What's the URL of the page containing the problem?
Any URL that involves a diff on files containing non-utf8 characters.


What steps will reproduce the problem?
1. Create new review request.
2. Browse the repository.
3. Create a review request from an existing change.
4. The new review request page comes up without the diff information.
5. The problem is also reproducible when uploading a patch file.


What is the expected output? What do you see instead?
Patch and diff files should be decoded using the encoding configured in the repository's advanced settings in Review Board. In this case, that setting is ISO-8859-1. Instead, the diff is decoded as UTF-8 and fails (see the error from the logs below).
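
To illustrate the expected behavior, here is a minimal sketch. It is not Review Board's code: `decode_diff` and its `encoding_setting` parameter are hypothetical names, and treating the setting as a comma-separated list of encodings tried in order is an assumption.

```python
def decode_diff(raw_diff, encoding_setting='iso-8859-1'):
    """Decode raw diff bytes using the repository's configured encoding(s).

    Hypothetical helper: encoding_setting stands in for the repository's
    advanced encoding setting; a comma-separated list is assumed.
    """
    encodings = [e.strip() for e in encoding_setting.split(',') if e.strip()]

    for encoding in encodings:
        try:
            return raw_diff.decode(encoding)
        except UnicodeDecodeError:
            # These bytes are not valid in this encoding; try the next one.
            continue

    raise ValueError('Could not decode diff with any of: %s'
                     % ', '.join(encodings))
```

With a setting of `utf-8, iso-8859-1`, a diff line containing the byte E9 would fail the UTF-8 attempt and then decode cleanly as ISO-8859-1 instead of aborting the whole request.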


What operating system are you using? What browser?
The server is running the latest version of Amazon Linux.
The browser is the latest version of Google Chrome.


Please provide any additional information below.

The files checked into Subversion do not have any svn:mime-type properties set, so we are relying on this Review Board advanced setting to specify the character encoding.

The specific character that is causing the problem is: é
This character is encoded as the single byte E9 in ISO-8859-1, which is how it appears in the repository and in the svn diff output.
In UTF-8 it is encoded as the two bytes "C3 A9", so the lone E9 byte cannot be decoded as UTF-8, which is why Review Board fails.
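
The mismatch can be reproduced outside Review Board. The following is a minimal Python sketch, not Review Board code; the sample text `caf\u00e9 au lait` is made up for illustration and is not taken from the repository.

```python
# Illustration only: the sample text is hypothetical.
e_acute = u'\u00e9'  # the e-acute character from the report

latin1_bytes = e_acute.encode('iso-8859-1')  # single byte: E9
utf8_bytes = e_acute.encode('utf-8')         # two bytes: C3 A9

# A line as Subversion would return it for a file stored as ISO-8859-1.
diff_line = u'caf\u00e9 au lait'.encode('iso-8859-1')

try:
    diff_line.decode('utf-8')
    utf8_decode_failed = False
except UnicodeDecodeError:
    # In UTF-8, 0xE9 is the lead byte of a three-byte sequence, but the
    # next byte here (a space) is not a continuation byte.
    utf8_decode_failed = True

# Decoding with the encoding configured on the repository succeeds.
decoded = diff_line.decode('iso-8859-1')
```

This is the same failure as in the log below: the UTF-8 codec reports an invalid continuation byte after 0xe9.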


The error from the logs:

23:12:21	ERROR	
 - Failed to generate diff using pysvn for revisions 14128:14129 for path https://subversion.devfactory.com/repos/AureaGCE_Generix: 'utf8' codec can't decode byte 0xe9 in position 534: invalid continuation byte
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/ReviewBoard-2.0.12-py2.6.egg/reviewboard/scmtools/svn/pysvn.py", line 269, in diff
    diff_options=['-u']).decode('utf-8')
  File "/usr/lib64/python2.6/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xe9 in position 534: invalid continuation byte
23:12:21	ERROR	
 - Unable to update new review request from commit ID 14129: Unable to get diff revisions 14128 through 14129: 'utf8' codec can't decode byte 0xe9 in position 534: invalid continuation byte
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/ReviewBoard-2.0.12-py2.6.egg/reviewboard/reviews/managers.py", line 146, in create
    review_request.update_from_commit_id(commit_id)
  File "/usr/lib/python2.6/site-packages/ReviewBoard-2.0.12-py2.6.egg/reviewboard/reviews/models/base_review_request_details.py", line 195, in update_from_commit_id
    self.update_from_committed_change(commit_id)
  File "/usr/lib/python2.6/site-packages/ReviewBoard-2.0.12-py2.6.egg/reviewboard/reviews/models/base_review_request_details.py", line 243, in update_from_committed_change
    commit = self.repository.get_change(commit_id)
  File "/usr/lib/python2.6/site-packages/ReviewBoard-2.0.12-py2.6.egg/reviewboard/scmtools/models.py", line 435, in get_change
    return self.get_scmtool().get_change(revision)
  File "/usr/lib/python2.6/site-packages/ReviewBoard-2.0.12-py2.6.egg/reviewboard/scmtools/svn/__init__.py", line 221, in get_change
    raise self.normalize_error(e)
  File "/usr/lib/python2.6/site-packages/ReviewBoard-2.0.12-py2.6.egg/reviewboard/scmtools/svn/__init__.py", line 312, in normalize_error
    raise SCMError(e)
SCMError: Unable to get diff revisions 14128 through 14129: 'utf8' codec can't decode byte 0xe9 in position 534: invalid continuation byte
#1 msunde
This problem is preventing French users from trying out Review Board.
#2 david
  • +PendingReview
  • +Component-SCMTools
  • +david
#3 david
Fixed in release-2.0.x (ad8ccf1). Thanks!
  • -PendingReview
    +Fixed
#4 msunde
FYI, I applied the change locally and it is working. Thanks for the quick turnaround.