3707: Non-ASCII Unicode characters cause email notifications not to be sent
- Fixed
- Review Board
erwaman
Jan. 7, 2015 |
What version are you running?
2.0.11

What's the URL of the page containing the problem?
Any Review Request

What steps will reproduce the problem?
1. Publish a Review Request with code that contains non-ASCII characters and add some Reviewers.
2. Comment on a line containing non-ASCII characters.
3. Publish your comment.
(Alternatively, set your browser to a language with lots of non-ASCII characters, like Chinese, and email notifications will not be sent for any comment because the month gets rendered in Chinese (non-ASCII).)

What is the expected output? What do you see instead?
An email notification should be sent to the reviewers. However, no email notification is sent.

What operating system are you using? What browser?
I can reproduce the issue on both RHEL 6.4 and OS X 10.9.4 in both Chrome and Firefox.

Please provide any additional information below.
I reported this bug to my company's Tools team and an engineer responded saying he believes this commit - https://github.com/reviewboard/reviewboard/commit/eea73a5cab0a5c935f08456b3531e9d39629797f - caused the problem.
The Tools team engineer just let me know that he meant this commit: https://github.com/reviewboard/reviewboard/commit/c31bdf5136c389c4fc29aaa311f1d3e509890c1e
The Tools team engineer said they manually applied that commit. Have you noticed this issue in your testing with that commit applied? If so, any fix?
We haven't yet, but it's possible there is a problem there. Can you give us a repro case, along with information on your database encoding settings?
-
+ NeedInfo
Publish a Review Request with some áccéntéd wórds. Then comment on the line with the áccéntéd wórds and publish your review. In my case, no email notification is sent, even though one should be. I will ask the Tools team engineer tomorrow to provide more DB information.
The database encoding is UTF-8. If this is the recommended setting, we can provide a patch, as we have a fix in the works that is just pending testing on our production instance.
What happens if you change this line: kwargs['continuation_ws'] = ' ' to: kwargs['continuation_ws'] = b' '
Changing it to b' ' also appears to fix this issue. Could you explain why you chose this? To my understanding, the "b" prefix is ignored in Python 2 (it only matters when upgrading to Python 3). The specific change we made was to use "kwargs['continuation_ws'] = u' '.encode('utf8')", which has not exhibited issues in the past few days.
It's usually ignored, but because we're working to make things compatible with Python 3, we've imported unicode_literals from __future__, which makes ' ' a unicode object. We temporarily reverted the offending change from 2.0.12, and I'm about to push a corrected version.
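The difference between the three spellings discussed above can be illustrated with a small sketch. This is not Review Board's code: the helper name literal_kinds is hypothetical, and u' ' is written out explicitly to stand in for what a bare ' ' becomes once unicode_literals is imported, so the snippet behaves the same on Python 2.7 and Python 3.

```python
def literal_kinds():
    """Report which of the spellings from this thread are byte strings.

    Hypothetical helper, for illustration only.
    """
    # Stand-in for a bare ' ' in a module that imports unicode_literals:
    # the literal is text (unicode), not bytes.
    plain = u' '
    # The b prefix always yields a byte string, with or without
    # unicode_literals.
    prefixed = b' '
    # Encoding text explicitly also yields a byte string.
    encoded = u' '.encode('utf8')
    return {
        'plain': isinstance(plain, bytes),
        'prefixed': isinstance(prefixed, bytes),
        'encoded': isinstance(encoded, bytes),
    }


if __name__ == '__main__':
    print(literal_kinds())
    # Both fixes proposed in the thread produce the same byte string.
    print(b' ' == u' '.encode('utf8'))
```

Under these assumptions, b' ' and u' '.encode('utf8') are interchangeable for a single ASCII space, which matches the observation that either change makes the problem go away.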
-
- NeedInfo + Confirmed