3707: Non-ASCII Unicode characters cause email notifications not to be sent

erwaman
Jan. 7, 2015
What version are you running?
2.0.11

What's the URL of the page containing the problem?
Any Review Request

What steps will reproduce the problem?
1. Publish a Review Request with code that contains non-ASCII characters and add some Reviewers.
2. Comment on a line containing non-ASCII characters.
3. Publish your comment.

(Alternatively, set your browser to a language that uses many non-ASCII characters, such as Chinese. Email notifications will then not be sent for any comment, because the month name gets rendered in Chinese, which is non-ASCII.)

What is the expected output? What do you see instead?
An email notification should be sent to the reviewers. However, no email notification is sent.

What operating system are you using? What browser?
I can reproduce the issue on both RHEL 6.4 and OS X 10.9.4 in both Chrome and Firefox.


Please provide any additional information below.
I reported this bug to my company's Tools team and an engineer responded saying he believes this commit - https://github.com/reviewboard/reviewboard/commit/eea73a5cab0a5c935f08456b3531e9d39629797f - caused the problem.
#1 erwaman
The Tools team engineer just let me know that he meant this commit: https://github.com/reviewboard/reviewboard/commit/c31bdf5136c389c4fc29aaa311f1d3e509890c1e
#2 chipx86
We haven't put out any releases with that commit yet. That commit's scheduled for 2.0.12.
#3 erwaman
The Tools team engineer said they manually applied that commit. Have you noticed this issue in your testing with that commit applied? If so, any fix?
#4 chipx86
We haven't yet, but it's possible there is a problem there. Can you give us a repro case, along with information on your database encoding settings?
  • +NeedInfo
#5 erwaman
Publish a Review Request with some áccéntéd wórds. Then comment on the line with the áccéntéd wórds and publish your review. In my case, no email notification is sent, even though one should be. I will ask the Tools team engineer tomorrow to provide more DB information.
#6 jaspe******@gmai***** (Google Code)
The database encoding is UTF-8. If that is the recommended setting, we can provide a patch: we have a fix in the works that is only pending testing on our production instance.
#7 chipx86
What happens if you change this line:

    kwargs['continuation_ws'] = ' '

to:

    kwargs['continuation_ws'] = b' '
#8 jaspe******@gmai***** (Google Code)
Changing it to b' ' also appears to fix this issue. Could you explain why you chose this? To my understanding, the "b" prefix is ignored in Python 2 (it only matters when upgrading to Python 3). The specific change we made was to use "kwargs['continuation_ws'] = u' '.encode('utf8')", which has not exhibited issues in the past few days.
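For reference, the two fixes quoted in this thread should evaluate to the same byte string. A minimal sketch, assuming nothing about Review Board beyond those two expressions (the variable names are hypothetical and only used for illustration):

```python
from __future__ import unicode_literals  # has no effect on Python 3

# Hypothetical names; the right-hand sides are the two expressions from the thread.
reporter_fix = u' '.encode('utf8')   # encode a one-space text string to bytes
maintainer_fix = b' '                # byte-string literal, unaffected by unicode_literals

# Both are byte strings (str on Python 2, bytes on Python 3) holding one space.
same_value = reporter_fix == maintainer_fix
same_type = type(reporter_fix) is type(maintainer_fix)

print(same_value, same_type)
```

If both checks hold, the choice between the two spellings is stylistic; the byte literal is simply shorter.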
#9 david
It's usually ignored, but because we're working to make things compatible with Python 3, we've imported unicode_literals from __future__, which makes ' ' a unicode object. We temporarily reverted the offending change from 2.0.12, and I'm about to push a corrected version.
  • -NeedInfo
    +Confirmed
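The effect of the unicode_literals import can be checked in isolation. The sketch below is an illustration only, not code from Review Board: the variable and helper names are hypothetical, and the thread does not show the exact code path that failed. It relies on the documented Python 2 behaviour of implicitly decoding a byte string with the ASCII codec when it is combined with a unicode object; the helper performs that decode explicitly so the example also runs on Python 3.

```python
from __future__ import unicode_literals

# With unicode_literals, an unprefixed literal is text; a b'' literal stays bytes.
plain_literal = ' '
bytes_literal = b' '

text_type = type(u'')  # unicode on Python 2, str on Python 3

plain_is_text = isinstance(plain_literal, text_type)
bytes_is_text = isinstance(bytes_literal, text_type)


def ascii_decode_fails(raw):
    """Return True if decoding `raw` as ASCII raises UnicodeDecodeError.

    Hypothetical helper: it makes explicit the decode that Python 2 performs
    implicitly when a byte string is mixed with a unicode object.
    """
    try:
        raw.decode('ascii')
    except UnicodeDecodeError:
        return True
    return False


# UTF-8 bytes for an accented word like the one in the repro case.
accented_bytes = '\u00e1cc\u00e9nt\u00e9d'.encode('utf-8')
ascii_bytes = b'accented'

print(plain_is_text, bytes_is_text)
print(ascii_decode_fails(accented_bytes), ascii_decode_fails(ascii_bytes))
```

Under this assumption, ASCII-only content passes through unnoticed while accented content fails, which would match the symptom reported above.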
#10 david
Fixed in release-2.0.x (306dd9a). Thanks!
  • -Confirmed
    +Fixed