Chromium Code Reviews
chromiumcodereview-hr@appspot.gserviceaccount.com (chromiumcodereview-hr) | Please choose your nickname with Settings | Help | Chromium Project | Gerrit Changes | Sign out
(278)

Issue 9961009: Regexp: Unify the lookahead code for the BoyerMoore-like skipping (Closed)

Created:
8 years, 8 months ago by Erik Corry
Modified:
8 years, 8 months ago
Reviewers:
ulan
CC:
v8-dev
Visibility:
Public.

Description

Regexp: Unify the lookahead code for the BoyerMoore-like skipping and the lookahead code for simplifying \b. Also cache lookahead results on the nodes, fix a memory leak and remove some character class code we are no longer using. Committed: https://code.google.com/p/v8/source/detail?r=11345

Patch Set 1 #

Patch Set 2 : #

Total comments: 18
Unified diffs Side-by-side diffs Delta from patch set Stats (+421 lines, -869 lines) Patch
M src/jsregexp.h View 1 23 chunks +161 lines, -182 lines 2 comments Download
M src/jsregexp.cc View 1 31 chunks +258 lines, -614 lines 16 comments Download
M test/cctest/test-regexp.cc View 1 4 chunks +2 lines, -73 lines 0 comments Download

Messages

Total messages: 3 (0 generated)
Erik Corry
8 years, 8 months ago (2012-04-04 14:42:28 UTC) #1
ulan
LGTM with comments. http://codereview.chromium.org/9961009/diff/1005/src/jsregexp.cc File src/jsregexp.cc (right): http://codereview.chromium.org/9961009/diff/1005/src/jsregexp.cc#newcode111 src/jsregexp.cc:111: static ContainedInLattice Combine(ContainedInLattice a, bool inside) ...
8 years, 8 months ago (2012-04-16 13:52:22 UTC) #2
Erik Corry
8 years, 8 months ago (2012-04-17 07:51:45 UTC) #3
http://codereview.chromium.org/9961009/diff/1005/src/jsregexp.cc
File src/jsregexp.cc (right):

http://codereview.chromium.org/9961009/diff/1005/src/jsregexp.cc#newcode111
src/jsregexp.cc:111: static ContainedInLattice Combine(ContainedInLattice a,
bool inside) {
On 2012/04/16 13:52:22, ulan wrote:
> Passing kLatticeIn/kLatticeOut instead of bool would make it possible to write
> ContainedInLattice Combine(ContainedInLattice a, ContainedInLattice b) {
>   return static_cast<ContainedInLattice>(a | b);
> }

Done.

http://codereview.chromium.org/9961009/diff/1005/src/jsregexp.cc#newcode133
src/jsregexp.cc:133: if (last <= new_range.from() && ranges[i] > new_range.to())
{
On 2012/04/16 13:52:22, ulan wrote:
> So ranges[i] is not inclusive, but new_range.to() is inclusive. It's a bit
> confusing, maybe worthwhile adding a comment.
> 
> Nit, I would reverse comparison:
> last <= new_range.from() && new_range.to() < ranges[i]

Done.

http://codereview.chromium.org/9961009/diff/1005/src/jsregexp.cc#newcode136
src/jsregexp.cc:136: // Otherwise it
On 2012/04/16 13:52:22, ulan wrote:
> Trailing space.

Done.

http://codereview.chromium.org/9961009/diff/1005/src/jsregexp.cc#newcode2192
src/jsregexp.cc:2192: if (offset == 0) set_bm_info(not_at_start, bm);
On 2012/04/16 13:52:22, ulan wrote:
> As discussed offline, it probably better to move this into a separate function
> with comments.

Done.

http://codereview.chromium.org/9961009/diff/1005/src/jsregexp.cc#newcode2764
src/jsregexp.cc:2764: if (bm->at(0)->is_word()) next_is_word_character =
Trace::TRUE;
On 2012/04/16 13:52:22, ulan wrote:
> Consider writing the two lines above as:
> next_is_word_character = bm->at(0)->is_word() ? Trace::TRUE : Trace::FALSE;

Not the same, since it could be neither known to be a word, nor known to be a
non-word.

http://codereview.chromium.org/9961009/diff/1005/src/jsregexp.cc#newcode3286
src/jsregexp.cc:3286: static const int kSpaceRanges2[] = { '\t', '\r' + 1, ' ',
' ' + 1, 0x00A0,
On 2012/04/16 13:52:22, ulan wrote:
> Consider adding a static assert that the range counts are odd or a comment
that
> 0x10000 is dummy. Maybe I missed something, but it looks like 0x10000 is never
> used in the code below, because all loops do (i+=2) and go while i < count-1.
> 
> I didn't verify that the ranges have correct values.

AddRange jsregexp.cc:118 uses all the elements in the table.
ASSERTS added

http://codereview.chromium.org/9961009/diff/1005/src/jsregexp.cc#newcode3300
src/jsregexp.cc:3300: static const int kLineTerminatorRange2Count =
ARRAY_SIZE(kLineTerminatorRanges2);
On 2012/04/16 13:52:22, ulan wrote:
> Long line above and too many new lines below.

Done.

http://codereview.chromium.org/9961009/diff/1005/src/jsregexp.cc#newcode3364
src/jsregexp.cc:3364: max_number_of_chars < 32;
On 2012/04/16 13:52:22, ulan wrote:
> Now 32 is more magical :)

Done.

http://codereview.chromium.org/9961009/diff/1005/src/jsregexp.h
File src/jsregexp.h (right):

http://codereview.chromium.org/9961009/diff/1005/src/jsregexp.h#newcode1203
src/jsregexp.h:1203: static const int kMask = kMapSize -1 ;
On 2012/04/16 13:52:22, ulan wrote:
> Should be kMapSize - 1;

Done.

Powered by Google App Engine
This is Rietveld 408576698