|
|
Created:
7 years, 10 months ago by agable Modified:
7 years, 10 months ago CC:
chromium-reviews, cmp+cc_chromium.org Visibility:
Public. |
DescriptionBetter unicode output processing.
BUG=Console crashing when commit messages contain unicode.
Committed: http://src.chromium.org/viewvc/chrome?view=rev&revision=180756
Patch Set 1 #
Total comments: 4
Patch Set 2 : Brackets. #
Total comments: 2
Patch Set 3 : Fix Python 2.6 encode keyword error. #Messages
Total messages: 16 (0 generated)
Quick fix to force everything that gets processed by jinja2 through an extra round of unicode-to-ascii conversion. Already uploaded (i.e. TBR'd) and serving, fixes the 500s we were seeing this morning.
Can this be tested? On Tue, Feb 5, 2013 at 9:29 AM, <agable@chromium.org> wrote: > Reviewers: Marc-Antoine Ruel, Nico, > > Message: > Quick fix to force everything that gets processed by jinja2 through an extra > round of unicode-to-ascii conversion. Already uploaded (i.e. TBR'd) and > serving, > fixes the 500s we were seeing this morning. > > Description: > Better unicode output processing. > > BUG=Console crashing when commit messages contain unicode. > > > Please review this at https://chromiumcodereview.appspot.com/12211015/ > > SVN Base: svn://svn.chromium.org/chrome/trunk/tools/chromium-build > > Affected files: > M app.py > > > Index: app.py > diff --git a/app.py b/app.py > index > e634f28ac88874b13cc5c1c36f21aa3fc355e723..374823c21d42d969e502bd1d35bf4c844758f256 > 100644 > --- a/app.py > +++ b/app.py > @@ -298,7 +298,8 @@ class ConsoleData(object): > > @staticmethod > def ContentsToHtml(contents): > - return ''.join([str(content) for content in contents]) > + return ''.join([unicode(content).encode('ascii', errors='replace') > + for content in contents]) > > @property > def last_row(self): > @@ -369,15 +370,17 @@ class ConsoleData(object): > attrvalue = re.sub(r'^(\S+).*', r'\1', attrvalue) > if attrvalue == 'DevRev': > revision = cells[0] > - self.SawRevision(revision=revision.findAll('a')[0].contents[0]) > - self.SetLink(revlink=revision.findAll('a')[0].attrs[0][1]) > + self.SawRevision(self.ContentsToHtml( > + revision.findAll('a')[0].contents[0])) > + > self.SetLink(self.ContentsToHtml(revision.findAll('a')[0].attrs[0][1])) > nameparts = cells[1].contents > - self.SetName(who=re.sub(r'^\s+(.*)\s*$', > - r'\1', > - self.ContentsToHtml(nameparts))) > + self.SetName(re.sub(r'^\s+(.*)\s*$', > + r'\1', > + self.ContentsToHtml(nameparts))) > for i, bs in enumerate(cells[2:]): > - > self.SetStatus(category=self.category_order[self.lastMasterSeen][i], > - status=str(bs.findAll('table', > recursive=False)[0])) > + self.SetStatus(self.category_order[self.lastMasterSeen][i], > + self.ContentsToHtml(bs.findAll('table', > + > recursive=False)[0])) > if attrvalue == 'DevComment': > self.SetComment(comment=self.ContentsToHtml(cells[0].contents)) > if attrvalue == 'DevDetails': > >
This entire merging flow (using the ConsoleData object, etc) is going to be dead code and gone in a week or so. I have new tests being brought in for the new code, which is more unicode-safe. On Tue, Feb 5, 2013 at 9:31 AM, Nico Weber <thakis@chromium.org> wrote: > Can this be tested? > > On Tue, Feb 5, 2013 at 9:29 AM, <agable@chromium.org> wrote: > > Reviewers: Marc-Antoine Ruel, Nico, > > > > Message: > > Quick fix to force everything that gets processed by jinja2 through an > extra > > round of unicode-to-ascii conversion. Already uploaded (i.e. TBR'd) and > > serving, > > fixes the 500s we were seeing this morning. > > > > Description: > > Better unicode output processing. > > > > BUG=Console crashing when commit messages contain unicode. > > > > > > Please review this at https://chromiumcodereview.appspot.com/12211015/ > > > > SVN Base: svn://svn.chromium.org/chrome/trunk/tools/chromium-build > > > > Affected files: > > M app.py > > > > > > Index: app.py > > diff --git a/app.py b/app.py > > index > > > e634f28ac88874b13cc5c1c36f21aa3fc355e723..374823c21d42d969e502bd1d35bf4c844758f256 > > 100644 > > --- a/app.py > > +++ b/app.py > > @@ -298,7 +298,8 @@ class ConsoleData(object): > > > > @staticmethod > > def ContentsToHtml(contents): > > - return ''.join([str(content) for content in contents]) > > + return ''.join([unicode(content).encode('ascii', errors='replace') > > + for content in contents]) > > > > @property > > def last_row(self): > > @@ -369,15 +370,17 @@ class ConsoleData(object): > > attrvalue = re.sub(r'^(\S+).*', r'\1', attrvalue) > > if attrvalue == 'DevRev': > > revision = cells[0] > > - self.SawRevision(revision=revision.findAll('a')[0].contents[0]) > > - self.SetLink(revlink=revision.findAll('a')[0].attrs[0][1]) > > + self.SawRevision(self.ContentsToHtml( > > + revision.findAll('a')[0].contents[0])) > > + > > self.SetLink(self.ContentsToHtml(revision.findAll('a')[0].attrs[0][1])) > > nameparts = cells[1].contents > > - self.SetName(who=re.sub(r'^\s+(.*)\s*$', > > - r'\1', > > - self.ContentsToHtml(nameparts))) > > + self.SetName(re.sub(r'^\s+(.*)\s*$', > > + r'\1', > > + self.ContentsToHtml(nameparts))) > > for i, bs in enumerate(cells[2:]): > > - > > self.SetStatus(category=self.category_order[self.lastMasterSeen][i], > > - status=str(bs.findAll('table', > > recursive=False)[0])) > > + self.SetStatus(self.category_order[self.lastMasterSeen][i], > > + self.ContentsToHtml(bs.findAll('table', > > + > > recursive=False)[0])) > > if attrvalue == 'DevComment': > > self.SetComment(comment=self.ContentsToHtml(cells[0].contents)) > > if attrvalue == 'DevDetails': > > > > >
I'll leave this to maruel then. On Tue, Feb 5, 2013 at 9:34 AM, Aaron Gable <agable@chromium.org> wrote: > This entire merging flow (using the ConsoleData object, etc) is going to be > dead code and gone in a week or so. I have new tests being brought in for > the new code, which is more unicode-safe. > > > On Tue, Feb 5, 2013 at 9:31 AM, Nico Weber <thakis@chromium.org> wrote: >> >> Can this be tested? >> >> On Tue, Feb 5, 2013 at 9:29 AM, <agable@chromium.org> wrote: >> > Reviewers: Marc-Antoine Ruel, Nico, >> > >> > Message: >> > Quick fix to force everything that gets processed by jinja2 through an >> > extra >> > round of unicode-to-ascii conversion. Already uploaded (i.e. TBR'd) and >> > serving, >> > fixes the 500s we were seeing this morning. >> > >> > Description: >> > Better unicode output processing. >> > >> > BUG=Console crashing when commit messages contain unicode. >> > >> > >> > Please review this at https://chromiumcodereview.appspot.com/12211015/ >> > >> > SVN Base: svn://svn.chromium.org/chrome/trunk/tools/chromium-build >> > >> > Affected files: >> > M app.py >> > >> > >> > Index: app.py >> > diff --git a/app.py b/app.py >> > index >> > >> > e634f28ac88874b13cc5c1c36f21aa3fc355e723..374823c21d42d969e502bd1d35bf4c844758f256 >> > 100644 >> > --- a/app.py >> > +++ b/app.py >> > @@ -298,7 +298,8 @@ class ConsoleData(object): >> > >> > @staticmethod >> > def ContentsToHtml(contents): >> > - return ''.join([str(content) for content in contents]) >> > + return ''.join([unicode(content).encode('ascii', errors='replace') >> > + for content in contents]) >> > >> > @property >> > def last_row(self): >> > @@ -369,15 +370,17 @@ class ConsoleData(object): >> > attrvalue = re.sub(r'^(\S+).*', r'\1', attrvalue) >> > if attrvalue == 'DevRev': >> > revision = cells[0] >> > - self.SawRevision(revision=revision.findAll('a')[0].contents[0]) >> > - self.SetLink(revlink=revision.findAll('a')[0].attrs[0][1]) >> > + self.SawRevision(self.ContentsToHtml( >> > + revision.findAll('a')[0].contents[0])) >> > + >> > self.SetLink(self.ContentsToHtml(revision.findAll('a')[0].attrs[0][1])) >> > nameparts = cells[1].contents >> > - self.SetName(who=re.sub(r'^\s+(.*)\s*$', >> > - r'\1', >> > - self.ContentsToHtml(nameparts))) >> > + self.SetName(re.sub(r'^\s+(.*)\s*$', >> > + r'\1', >> > + self.ContentsToHtml(nameparts))) >> > for i, bs in enumerate(cells[2:]): >> > - >> > self.SetStatus(category=self.category_order[self.lastMasterSeen][i], >> > - status=str(bs.findAll('table', >> > recursive=False)[0])) >> > + self.SetStatus(self.category_order[self.lastMasterSeen][i], >> > + self.ContentsToHtml(bs.findAll('table', >> > + >> > recursive=False)[0])) >> > if attrvalue == 'DevComment': >> > self.SetComment(comment=self.ContentsToHtml(cells[0].contents)) >> > if attrvalue == 'DevDetails': >> > >> > > >
https://chromiumcodereview.appspot.com/12211015/diff/1/app.py File app.py (right): https://chromiumcodereview.appspot.com/12211015/diff/1/app.py#newcode301 app.py:301: return ''.join([unicode(content).encode('ascii', errors='replace') Why encode to ascii and not utf-8? No need for [].
https://chromiumcodereview.appspot.com/12211015/diff/1/app.py File app.py (right): https://chromiumcodereview.appspot.com/12211015/diff/1/app.py#newcode301 app.py:301: return ''.join([unicode(content).encode('ascii', errors='replace') On 2013/02/05 18:11:56, Marc-Antoine Ruel wrote: > Why encode to ascii and not utf-8? > No need for []. Because when you encode to utf-8 jinja2 still breaks while rendering the template. It wants ascii. Looking into how to change that for the longer-term fix in the new backend.
https://chromiumcodereview.appspot.com/12211015/diff/1/app.py File app.py (right): https://chromiumcodereview.appspot.com/12211015/diff/1/app.py#newcode301 app.py:301: return ''.join([unicode(content).encode('ascii', errors='replace') On 2013/02/05 18:17:37, agable wrote: > On 2013/02/05 18:11:56, Marc-Antoine Ruel wrote: > > Why encode to ascii and not utf-8? > > No need for []. > > Because when you encode to utf-8 jinja2 still breaks while rendering the > template. It wants ascii. Looking into how to change that for the longer-term > fix in the new backend. Probably because jinja2 wants non-encoded unicode instances? Which would make sense. [] are still not needed.
PTAL https://chromiumcodereview.appspot.com/12211015/diff/1/app.py File app.py (right): https://chromiumcodereview.appspot.com/12211015/diff/1/app.py#newcode301 app.py:301: return ''.join([unicode(content).encode('ascii', errors='replace') On 2013/02/05 18:20:18, Marc-Antoine Ruel wrote: > On 2013/02/05 18:17:37, agable wrote: > > On 2013/02/05 18:11:56, Marc-Antoine Ruel wrote: > > > Why encode to ascii and not utf-8? > > > No need for []. > > > > Because when you encode to utf-8 jinja2 still breaks while rendering the > > template. It wants ascii. Looking into how to change that for the longer-term > > fix in the new backend. > > Probably because jinja2 wants non-encoded unicode instances? Which would make > sense. > > [] are still not needed. Done.
https://chromiumcodereview.appspot.com/12211015/diff/2002/app.py File app.py (right): https://chromiumcodereview.appspot.com/12211015/diff/2002/app.py#newcode299 app.py:299: @staticmethod I think it'd be better to not make it a static function. It'd make the call sites shorter.
https://chromiumcodereview.appspot.com/12211015/diff/2002/app.py File app.py (right): https://chromiumcodereview.appspot.com/12211015/diff/2002/app.py#newcode299 app.py:299: @staticmethod On 2013/02/05 18:37:30, Marc-Antoine Ruel wrote: > I think it'd be better to not make it a static function. It'd make the call > sites shorter. It's logically owned by this class, yet doesn't reference self. I could tell pylint to ignore it (which always makes me sad), I could move it and change all of its callsites too... or I could submit my big merger.py CL and delete this code entirely :)
On 2013/02/05 18:48:45, agable wrote: > It's logically owned by this class, yet doesn't reference self. It's logical only in the way you use it but it's a agnostic algorithm in itself. Decouple the algorithm from the use case and you'll be enlightened. #ingress. > I could tell > pylint to ignore it (which always makes me sad), I could move it and change all > of its callsites too... or I could submit my big merger.py CL and delete this > code entirely :) Refactoring FTW IMHO. Anyhow, lgtm as-is.
Any other day I'd say Refactoring FTW too, but this code has already been razed to the ground on my local branch and that's a much nicer refactoring IMHO :) Thanks.
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/agable@chromium.org/12211015/2002
Presubmit check for 12211015-2002 failed and returned exit status 1. Running presubmit commit checks ... Running pylint on 8 files. ** Presubmit ERRORS ** nosetests failed! Command nosetests --with-gae --gae-lib-root=/b/google_appengine returned non-zero exit status 1 .........EE...... ====================================================================== ERROR: test_console_merger (app_test.ConsoleTestCase) ---------------------------------------------------------------------- Traceback (most recent call last): File "/b/commit-queue/workdir/tools/chromium-build/app_test.py", line 359, in test_console_merger num_rows_to_merge=1) File "/b/commit-queue/workdir/tools/chromium-build/app.py", line 452, in console_merger mergedconsole.AddRow(row_data) File "/b/commit-queue/workdir/tools/chromium-build/app.py", line 353, in AddRow self.SetName(self.ContentsToHtml(name)) File "/b/commit-queue/workdir/tools/chromium-build/app.py", line 302, in ContentsToHtml for content in contents) File "/b/commit-queue/workdir/tools/chromium-build/app.py", line 302, in <genexpr> for content in contents) TypeError: encode() takes no keyword arguments -------------------- >> begin captured logging << -------------------- root: DEBUG: save_page: content size is 32464 root: INFO: Saved and cached page with localpath surroundings root: DEBUG: save_page: content size is 196 root: INFO: Saved and cached page with localpath chromium.linux/console/categories root: DEBUG: save_page: content size is 3667 root: INFO: Saved and cached page with localpath chromium.linux/console/summary root: INFO: Saved and cached row with localpath chromium.linux/console/171895 root: DEBUG: content for latest_rev found in memcache root: ERROR: get_and_cache_rowdata('latest_rev'): no matching localpath in datastore root: DEBUG: save_page: content size is 0 root: INFO: Saved and cached page with localpath chromium.mac/console/categories root: DEBUG: save_page: content size is 4738 root: INFO: Saved and cached page with localpath chromium.mac/console/summary root: INFO: Saved and cached row with localpath chromium.mac/console/171895 root: DEBUG: content for latest_rev found in memcache root: DEBUG: save_page: content size is 0 root: INFO: Saved and cached page with localpath chromium.win/console/categories root: DEBUG: save_page: content size is 7712 root: INFO: Saved and cached page with localpath chromium.win/console/summary root: INFO: Saved and cached row with localpath chromium.win/console/171895 root: DEBUG: content for latest_rev found in memcache root: DEBUG: save_page: content size is 335 root: INFO: Saved and cached page with localpath chromium.memory/console/categories root: DEBUG: save_page: content size is 4796 root: INFO: Saved and cached page with localpath chromium.memory/console/summary root: INFO: Saved and cached row with localpath chromium.memory/console/171895 root: DEBUG: content for latest_rev found in memcache root: DEBUG: save_page: content size is 66811 root: INFO: Saved and cached page with localpath merged_output root: DEBUG: content for surroundings found in memcache root: DEBUG: content for surroundings found in cache root: DEBUG: content for latest_rev found in memcache root: DEBUG: content for chromium.linux/console/summary found in memcache root: DEBUG: content for chromium.linux/console/summary found in cache root: DEBUG: content for chromium.linux/console/categories found in memcache root: DEBUG: content for chromium.linux/console/categories found in cache root: DEBUG: content for chromium.linux/console/171895 found in memcache root: DEBUG: SawRevision: guessing at row ordering --------------------- >> end captured logging << --------------------- ====================================================================== ERROR: test_console_merger_splitrevs (app_test.ConsoleTestCase) ---------------------------------------------------------------------- Traceback (most recent call last): File "/b/commit-queue/workdir/tools/chromium-build/app_test.py", line 409, in test_console_merger_splitrevs num_rows_to_merge=1) File "/b/commit-queue/workdir/tools/chromium-build/app.py", line 452, in console_merger mergedconsole.AddRow(row_data) File "/b/commit-queue/workdir/tools/chromium-build/app.py", line 353, in AddRow self.SetName(self.ContentsToHtml(name)) File "/b/commit-queue/workdir/tools/chromium-build/app.py", line 302, in ContentsToHtml for content in contents) File "/b/commit-queue/workdir/tools/chromium-build/app.py", line 302, in <genexpr> for content in contents) TypeError: encode() takes no keyword arguments -------------------- >> begin captured logging << -------------------- root: DEBUG: save_page: content size is 32464 root: INFO: Saved and cached page with localpath surroundings root: DEBUG: save_page: content size is 196 root: INFO: Saved and cached page with localpath chromium.linux/console/categories root: DEBUG: save_page: content size is 3667 root: INFO: Saved and cached page with localpath chromium.linux/console/summary root: INFO: Saved and cached row with localpath chromium.linux/console/171895 root: DEBUG: content for latest_rev found in memcache root: ERROR: get_and_cache_rowdata('latest_rev'): no matching localpath in datastore root: DEBUG: save_page: content size is 0 root: INFO: Saved and cached page with localpath chromium.mac/console/categories root: DEBUG: save_page: content size is 4738 root: INFO: Saved and cached page with localpath chromium.mac/console/summary root: INFO: Saved and cached row with localpath chromium.mac/console/171895 root: DEBUG: content for latest_rev found in memcache root: DEBUG: save_page: content size is 54266 root: INFO: Saved and cached page with localpath merged_output root: DEBUG: content for surroundings found in memcache root: DEBUG: content for surroundings found in cache root: DEBUG: content for latest_rev found in memcache root: DEBUG: content for chromium.linux/console/summary found in memcache root: DEBUG: content for chromium.linux/console/summary found in cache root: DEBUG: content for chromium.linux/console/categories found in memcache root: DEBUG: content for chromium.linux/console/categories found in cache root: DEBUG: content for chromium.linux/console/171895 found in memcache root: DEBUG: SawRevision: guessing at row ordering --------------------- >> end captured logging << --------------------- ---------------------------------------------------------------------- Ran 17 tests in 5.004s FAILED (errors=2) Presubmit checks took 10.9s to calculate. Was the presubmit check useful? Please send feedback & hate mail to maruel@chromium.org!
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/agable@chromium.org/12211015/12001
Message was sent while issue was closed.
Change committed as 180756 |