|
|
Created:
7 years, 3 months ago by pstanek Modified:
7 years, 3 months ago CC:
blink-reviews, dglazkov+blink, eae+blinkwatch, adamk+blink_chromium.org Base URL:
https://chromium.googlesource.com/chromium/blink.git@master Visibility:
Public. |
DescriptionaddMarker() optimizations.
Essentially, DocumentMarkerController maintained a list
(WTF::Vector) of markers. And linearly traversed that list
to coalesce markers. If the marker count is large, that's
not optimal.
To improve things the list has been split into couple by the
type of a marker and linear traverse has been replaced with
the binary search (std::lower_bound() & std::upper_bound()).
Also traversing is not performed if not needed. After this
PerformanceTests/Interactive/spellcheck-paste-huge-text.html
shows 7-12% improvement.
BUG=279293
Committed: https://src.chromium.org/viewvc/blink?view=rev&revision=158288
Patch Set 1 #
Total comments: 28
Patch Set 2 : Addressing review comments #Patch Set 3 : Rebase & split marker list by marker type. #
Total comments: 42
Patch Set 4 : More review commments (style, additions: upper/lower bounding where easily possible, creating the l… #
Total comments: 1
Patch Set 5 : Style #Patch Set 6 : Compilation fix #
Messages
Total messages: 51 (0 generated)
No separate test for this as Interactive\spellcheck-paste-huge-text.html shows some improvement.
Mind sharing the performance improvement numbers? (I think the patch is valuable even without significant improvement, since it cleans up the code, but I'm curious) https://codereview.chromium.org/23728006/diff/1/Source/core/dom/DocumentMarke... File Source/core/dom/DocumentMarkerController.cpp (right): https://codereview.chromium.org/23728006/diff/1/Source/core/dom/DocumentMarke... Source/core/dom/DocumentMarkerController.cpp:136: static bool startsFurther(const DocumentMarker& lhv, const DocumentMarker& rhv) Chromium preference is putting helper functions into an anonymous namespace instead of declaring them static - but I'll defer final judgment to the Blink reviewer. morrita1? tony? https://codereview.chromium.org/23728006/diff/1/Source/core/dom/DocumentMarke... Source/core/dom/DocumentMarkerController.cpp:164: if (list->last().endOffset() >= newMarker.startOffset()) { Isn't that covered by lower_bound already? https://codereview.chromium.org/23728006/diff/1/Source/core/dom/DocumentMarke... Source/core/dom/DocumentMarkerController.cpp:169: if (pos != list->end()) AFAIK, insert works for pos == list->end()-list->begin() too, so no need for the else branch https://codereview.chromium.org/23728006/diff/1/Source/core/dom/DocumentMarke... Source/core/dom/DocumentMarkerController.cpp:184: void DocumentMarkerController::mergeOverlapping(MarkerList* list, DocumentMarker& toInsert) I think this can be simplified - find the first overlapping marker (lower_bound), then keep merging with the following markers as long as they overlap. (And if we had separate lists for each marker type, we could remove the whole sequence and replace with the merged one. But unfortunately, we don't. We _could_ do in-place moving during the merge and a final remove of a range, but I doubt it buys us much)
https://codereview.chromium.org/23728006/diff/1/Source/core/dom/DocumentMarke... File Source/core/dom/DocumentMarkerController.cpp (right): https://codereview.chromium.org/23728006/diff/1/Source/core/dom/DocumentMarke... Source/core/dom/DocumentMarkerController.cpp:136: static bool startsFurther(const DocumentMarker& lhv, const DocumentMarker& rhv) On 2013/09/13 18:05:48, groby wrote: > Chromium preference is putting helper functions into an anonymous namespace > instead of declaring them static - but I'll defer final judgment to the Blink > reviewer. morrita1? tony? Most blink code uses static functions, but I don't think there's a rule one way or another. I don't have a strong preference.
https://codereview.chromium.org/23728006/diff/1/Source/core/dom/DocumentMarke... File Source/core/dom/DocumentMarkerController.cpp (right): https://codereview.chromium.org/23728006/diff/1/Source/core/dom/DocumentMarke... Source/core/dom/DocumentMarkerController.cpp:164: if (list->last().endOffset() >= newMarker.startOffset()) { On 2013/09/13 18:05:48, groby wrote: > Isn't that covered by lower_bound already? Yes, but the point was to avoid searching if no merging is needed. https://codereview.chromium.org/23728006/diff/1/Source/core/dom/DocumentMarke... Source/core/dom/DocumentMarkerController.cpp:184: void DocumentMarkerController::mergeOverlapping(MarkerList* list, DocumentMarker& toInsert) On 2013/09/13 18:05:48, groby wrote: > I think this can be simplified - find the first overlapping marker > (lower_bound), then keep merging with the following markers as long as they > overlap. This is exactly what it does. upper_bound is used to find the stop place in order to stop as soon as possible. Then lower_bound is done up to this place (taking the marker type into account) and later all overlapping markers are removed from the list and merged into toInsert which is inserted later. > > (And if we had separate lists for each marker type, we could remove the whole > sequence and replace with the merged one. But unfortunately, we don't. We > _could_ do in-place moving during the merge and a final remove of a range, but I > doubt it buys us much)
On 2013/09/13 18:05:47, groby wrote: > Mind sharing the performance improvement numbers? > > (I think the patch is valuable even without significant improvement, since it > cleans up the code, but I'm curious) > The TC shows 7-10% improvement here. > https://codereview.chromium.org/23728006/diff/1/Source/core/dom/DocumentMarke... > File Source/core/dom/DocumentMarkerController.cpp (right): > > https://codereview.chromium.org/23728006/diff/1/Source/core/dom/DocumentMarke... > Source/core/dom/DocumentMarkerController.cpp:136: static bool > startsFurther(const DocumentMarker& lhv, const DocumentMarker& rhv) > Chromium preference is putting helper functions into an anonymous namespace > instead of declaring them static - but I'll defer final judgment to the Blink > reviewer. morrita1? tony? > > https://codereview.chromium.org/23728006/diff/1/Source/core/dom/DocumentMarke... > Source/core/dom/DocumentMarkerController.cpp:164: if (list->last().endOffset() > >= newMarker.startOffset()) { > Isn't that covered by lower_bound already? > > https://codereview.chromium.org/23728006/diff/1/Source/core/dom/DocumentMarke... > Source/core/dom/DocumentMarkerController.cpp:169: if (pos != list->end()) > AFAIK, insert works for pos == list->end()-list->begin() too, so no need for the > else branch > > https://codereview.chromium.org/23728006/diff/1/Source/core/dom/DocumentMarke... > Source/core/dom/DocumentMarkerController.cpp:184: void > DocumentMarkerController::mergeOverlapping(MarkerList* list, DocumentMarker& > toInsert) > I think this can be simplified - find the first overlapping marker > (lower_bound), then keep merging with the following markers as long as they > overlap. > > (And if we had separate lists for each marker type, we could remove the whole > sequence and replace with the merged one. But unfortunately, we don't. We > _could_ do in-place moving during the merge and a final remove of a range, but I > doubt it buys us much)
https://codereview.chromium.org/23728006/diff/1/Source/core/dom/DocumentMarke... File Source/core/dom/DocumentMarkerController.cpp (right): https://codereview.chromium.org/23728006/diff/1/Source/core/dom/DocumentMarke... Source/core/dom/DocumentMarkerController.cpp:164: if (list->last().endOffset() >= newMarker.startOffset()) { You probably want '>', though. if last.end == new.start, we probably want to merge the two? Definitely should be '>', though - if last.end == new.start, we probably want to merge them :) On 2013/09/14 08:18:50, pstanek wrote: > On 2013/09/13 18:05:48, groby wrote: > > Isn't that covered by lower_bound already? > > Yes, but the point was to avoid searching if no merging is needed. https://codereview.chromium.org/23728006/diff/1/Source/core/dom/DocumentMarke... Source/core/dom/DocumentMarkerController.cpp:184: void DocumentMarkerController::mergeOverlapping(MarkerList* list, DocumentMarker& toInsert) Caveat:I might be missing something and all this might be wrong :) But let me try to sketch out what I meant. first = lower_bound(); first = list->insert(first, RenderedDocumentMarker(toInsert)); for(i = first + 1; i != list->end() && !startsFurther(*first, *i); ) if (toInsert.type == i->type) { *first = merge(*first, *i); // merge just takes min(start), max(end), can be done in-place list->remove(i-list->begin()) } else { ++i; } } This would * remove the need to scan lower_bound for the right type * remove the special case for the first item * remove the call to upper_bound * shorten the code quite a bit. In case I did miss something and we need to take the longer approach, I've added feedback to that too. On 2013/09/14 08:18:50, pstanek wrote: > On 2013/09/13 18:05:48, groby wrote: > > I think this can be simplified - find the first overlapping marker > > (lower_bound), then keep merging with the following markers as long as they > > overlap. > > This is exactly what it does. upper_bound is used to find the stop place in > order to stop as soon as possible. Then lower_bound is done up to this place > (taking the marker type into account) and later all overlapping markers are > removed from the list and merged into toInsert which is inserted later. > > > > > (And if we had separate lists for each marker type, we could remove the whole > > sequence and replace with the merged one. But unfortunately, we don't. We > > _could_ do in-place moving during the merge and a final remove of a range, but > I > > doubt it buys us much) > https://codereview.chromium.org/23728006/diff/1/Source/core/dom/DocumentMarke... Source/core/dom/DocumentMarkerController.cpp:211: DocumentMarker marker = list->at(i); Calling at() will bounds-check each element - iterators would prevent that. https://codereview.chromium.org/23728006/diff/1/Source/core/dom/DocumentMarke... Source/core/dom/DocumentMarkerController.cpp:218: break; Why break here? Couldn't we potentially have more than one marker that need to be merged? I realize this mirrors the old code, but the comment seemed to indicate that we _could_ merge more than one marker. I think the code is right and the comment is in need of an update.
https://codereview.chromium.org/23728006/diff/1/Source/core/dom/DocumentMarke... File Source/core/dom/DocumentMarkerController.cpp (right): https://codereview.chromium.org/23728006/diff/1/Source/core/dom/DocumentMarke... Source/core/dom/DocumentMarkerController.cpp:141: static bool doesNotOverlap(const DocumentMarker& lhv, const DocumentMarker& rhv) I just woke up at 4am to realize this does not work :) doesNotOverlap assumes the list is sorted by start _and_ end offset, but it's only sorted by start offset. If the list only contained one marker type, the merging would assure we could compare end & start offsets - but with multiple types we cannot.
https://codereview.chromium.org/23728006/diff/1/Source/core/dom/DocumentMarke... File Source/core/dom/DocumentMarkerController.cpp (right): https://codereview.chromium.org/23728006/diff/1/Source/core/dom/DocumentMarke... Source/core/dom/DocumentMarkerController.cpp:141: static bool doesNotOverlap(const DocumentMarker& lhv, const DocumentMarker& rhv) On 2013/09/17 11:52:42, groby wrote: > I just woke up at 4am to realize this does not work :) doesNotOverlap assumes > the list is sorted by start _and_ end offset, but it's only sorted by start > offset. > > If the list only contained one marker type, the merging would assure we could > compare end & start offsets - but with multiple types we cannot. I thought about the same but I convinced myself it's ok as far as do-while checks the type (which it does) but I might be wrong.
https://codereview.chromium.org/23728006/diff/1/Source/core/dom/DocumentMarke... File Source/core/dom/DocumentMarkerController.cpp (right): https://codereview.chromium.org/23728006/diff/1/Source/core/dom/DocumentMarke... Source/core/dom/DocumentMarkerController.cpp:164: if (list->last().endOffset() >= newMarker.startOffset()) { On 2013/09/17 02:23:47, groby wrote: > You probably want '>', though. if last.end == new.start, we probably want to > merge the two? > > Definitely should be '>', though - if last.end == new.start, we probably want to > merge them :) But this if controls entering merging block so we'll merge if last().endOffset >= toInsert.startOffset() > On 2013/09/14 08:18:50, pstanek wrote: > > On 2013/09/13 18:05:48, groby wrote: > > > Isn't that covered by lower_bound already? > > > > Yes, but the point was to avoid searching if no merging is needed. >
Sorry for this long-ish review process :( https://codereview.chromium.org/23728006/diff/1/Source/core/dom/DocumentMarke... File Source/core/dom/DocumentMarkerController.cpp (right): https://codereview.chromium.org/23728006/diff/1/Source/core/dom/DocumentMarke... Source/core/dom/DocumentMarkerController.cpp:141: static bool doesNotOverlap(const DocumentMarker& lhv, const DocumentMarker& rhv) Do-while is fine - but in line 194, you use doesNotOverlap for lower_bound. I fear that we'll need some sort of linear search after all. Maybe lower_bound for a starting point, then scan to the left for merge candidates if it's a type mismatch. On 2013/09/17 12:46:30, pstanek wrote: > On 2013/09/17 11:52:42, groby wrote: > > I just woke up at 4am to realize this does not work :) doesNotOverlap assumes > > the list is sorted by start _and_ end offset, but it's only sorted by start > > offset. > > > > If the list only contained one marker type, the merging would assure we could > > compare end & start offsets - but with multiple types we cannot. > > I thought about the same but I convinced myself it's ok as far as do-while > checks the type (which it does) but I might be wrong. >
https://codereview.chromium.org/23728006/diff/1/Source/core/dom/DocumentMarke... File Source/core/dom/DocumentMarkerController.cpp (right): https://codereview.chromium.org/23728006/diff/1/Source/core/dom/DocumentMarke... Source/core/dom/DocumentMarkerController.cpp:164: if (list->last().endOffset() >= newMarker.startOffset()) { Sigh. I misread that, sorry. Done. On 2013/09/17 13:54:00, pstanek wrote: > On 2013/09/17 02:23:47, groby wrote: > > You probably want '>', though. if last.end == new.start, we probably want to > > merge the two? > > > > Definitely should be '>', though - if last.end == new.start, we probably want > to > > merge them :) > > But this if controls entering merging block so we'll merge if last().endOffset > >= toInsert.startOffset() > > > On 2013/09/14 08:18:50, pstanek wrote: > > > On 2013/09/13 18:05:48, groby wrote: > > > > Isn't that covered by lower_bound already? > > > > > > Yes, but the point was to avoid searching if no merging is needed. > > >
https://codereview.chromium.org/23728006/diff/1/Source/core/dom/DocumentMarke... File Source/core/dom/DocumentMarkerController.cpp (right): https://codereview.chromium.org/23728006/diff/1/Source/core/dom/DocumentMarke... Source/core/dom/DocumentMarkerController.cpp:141: static bool doesNotOverlap(const DocumentMarker& lhv, const DocumentMarker& rhv) But with lower_bound() we find the first overlapping marker regardless its type, but if its type is different then the one's to insert we get back to searching from the next position. If the container contained one type only do while wouldn't be needed. On 2013/09/17 14:20:26, groby wrote: > Do-while is fine - but in line 194, you use doesNotOverlap for lower_bound. > > I fear that we'll need some sort of linear search after all. Maybe lower_bound > for a starting point, then scan to the left for merge candidates if it's a type > mismatch. > > > > > On 2013/09/17 12:46:30, pstanek wrote: > > On 2013/09/17 11:52:42, groby wrote: > > > I just woke up at 4am to realize this does not work :) doesNotOverlap > assumes > > > the list is sorted by start _and_ end offset, but it's only sorted by start > > > offset. > > > > > > If the list only contained one marker type, the merging would assure we > could > > > compare end & start offsets - but with multiple types we cannot. > > > > I thought about the same but I convinced myself it's ok as far as do-while > > checks the type (which it does) but I might be wrong. > > >
https://codereview.chromium.org/23728006/diff/1/Source/core/dom/DocumentMarke... File Source/core/dom/DocumentMarkerController.cpp (right): https://codereview.chromium.org/23728006/diff/1/Source/core/dom/DocumentMarke... Source/core/dom/DocumentMarkerController.cpp:184: void DocumentMarkerController::mergeOverlapping(MarkerList* list, DocumentMarker& toInsert) I'll try to combine your approach with the original code. Why not just using yours? Because the original code merges one marker from each side only and there is different stop condition for different sides startsFurther() when merging to the left and i->startOffset() > inserted->endOffset() when merging to the right. On 2013/09/17 02:23:47, groby wrote: > Caveat:I might be missing something and all this might be wrong :) > > But let me try to sketch out what I meant. > > first = lower_bound(); > first = list->insert(first, RenderedDocumentMarker(toInsert)); > > for(i = first + 1; i != list->end() && !startsFurther(*first, *i); ) > if (toInsert.type == i->type) { > *first = merge(*first, *i); // merge just takes min(start), max(end), can > be done in-place > list->remove(i-list->begin()) > } else { > ++i; > } > } > > This would > * remove the need to scan lower_bound for the right type > * remove the special case for the first item > * remove the call to upper_bound > * shorten the code quite a bit. > > In case I did miss something and we need to take the longer approach, I've added > feedback to that too. > > On 2013/09/14 08:18:50, pstanek wrote: > > On 2013/09/13 18:05:48, groby wrote: > > > I think this can be simplified - find the first overlapping marker > > > (lower_bound), then keep merging with the following markers as long as they > > > overlap. > > > > This is exactly what it does. upper_bound is used to find the stop place in > > order to stop as soon as possible. Then lower_bound is done up to this place > > (taking the marker type into account) and later all overlapping markers are > > removed from the list and merged into toInsert which is inserted later. > > > > > > > > (And if we had separate lists for each marker type, we could remove the > whole > > > sequence and replace with the merged one. But unfortunately, we don't. We > > > _could_ do in-place moving during the merge and a final remove of a range, > but > > I > > > doubt it buys us much) > > >
https://codereview.chromium.org/23728006/diff/1/Source/core/dom/DocumentMarke... File Source/core/dom/DocumentMarkerController.cpp (right): https://codereview.chromium.org/23728006/diff/1/Source/core/dom/DocumentMarke... Source/core/dom/DocumentMarkerController.cpp:169: if (pos != list->end()) On 2013/09/13 18:05:48, groby wrote: > AFAIK, insert works for pos == list->end()-list->begin() too, so no need for the > else branch Done. https://codereview.chromium.org/23728006/diff/1/Source/core/dom/DocumentMarke... Source/core/dom/DocumentMarkerController.cpp:184: void DocumentMarkerController::mergeOverlapping(MarkerList* list, DocumentMarker& toInsert) On 2013/09/17 02:23:47, groby wrote: > Caveat:I might be missing something and all this might be wrong :) > > But let me try to sketch out what I meant. > > first = lower_bound(); > first = list->insert(first, RenderedDocumentMarker(toInsert)); > > for(i = first + 1; i != list->end() && !startsFurther(*first, *i); ) > if (toInsert.type == i->type) { > *first = merge(*first, *i); // merge just takes min(start), max(end), can > be done in-place > list->remove(i-list->begin()) > } else { > ++i; > } > } > > This would > * remove the need to scan lower_bound for the right type > * remove the special case for the first item > * remove the call to upper_bound > * shorten the code quite a bit. > > In case I did miss something and we need to take the longer approach, I've added > feedback to that too. > > On 2013/09/14 08:18:50, pstanek wrote: > > On 2013/09/13 18:05:48, groby wrote: > > > I think this can be simplified - find the first overlapping marker > > > (lower_bound), then keep merging with the following markers as long as they > > > overlap. > > > > This is exactly what it does. upper_bound is used to find the stop place in > > order to stop as soon as possible. Then lower_bound is done up to this place > > (taking the marker type into account) and later all overlapping markers are > > removed from the list and merged into toInsert which is inserted later. > > > > > > > > (And if we had separate lists for each marker type, we could remove the > whole > > > sequence and replace with the merged one. But unfortunately, we don't. We > > > _could_ do in-place moving during the merge and a final remove of a range, > but > > I > > > doubt it buys us much) > > > Done. (at least partially) https://codereview.chromium.org/23728006/diff/1/Source/core/dom/DocumentMarke... Source/core/dom/DocumentMarkerController.cpp:211: DocumentMarker marker = list->at(i); On 2013/09/17 02:23:47, groby wrote: > Calling at() will bounds-check each element - iterators would prevent that. Done. https://codereview.chromium.org/23728006/diff/1/Source/core/dom/DocumentMarke... Source/core/dom/DocumentMarkerController.cpp:218: break; On 2013/09/17 02:23:47, groby wrote: > Why break here? Couldn't we potentially have more than one marker that need to > be merged? I realize this mirrors the old code, but the comment seemed to > indicate that we _could_ merge more than one marker. > > I think the code is right and the comment is in need of an update. Done.
https://codereview.chromium.org/23728006/diff/1/Source/core/dom/DocumentMarke... File Source/core/dom/DocumentMarkerController.cpp (right): https://codereview.chromium.org/23728006/diff/1/Source/core/dom/DocumentMarke... Source/core/dom/DocumentMarkerController.cpp:141: static bool doesNotOverlap(const DocumentMarker& lhv, const DocumentMarker& rhv) Yes, but the vector is unsorted with respect to endOffset, so lower_bound will return bad results. Example (marker types are X/O, then start offset, then end offset) existing: X, 2, 8 O, 3, 4 X, 12, 20 Now try to insert (X, 7, 10) Clearly, it _should_ start with overlapping = 0. Instead, it starts with overlapping =2, never merging (X, 2, 8) with the new marker. And by clearly, I mean I had to write a sample to convince myself I wasn't crazy :) See http://pastebin.com/eBqYj2hq Change the end offset on the 'O' marker to 10, it's still a well-ordered list, and now overlapping == 0 On 2013/09/17 14:48:05, pstanek wrote: > But with lower_bound() we find the first overlapping marker regardless its type, > but if its type is different then the one's to insert we get back to searching > from the next position. If the container contained one type only do while > wouldn't be needed. > > On 2013/09/17 14:20:26, groby wrote: > > Do-while is fine - but in line 194, you use doesNotOverlap for lower_bound. > > > > I fear that we'll need some sort of linear search after all. Maybe lower_bound > > for a starting point, then scan to the left for merge candidates if it's a > type > > mismatch. > > > > > > > > > > On 2013/09/17 12:46:30, pstanek wrote: > > > On 2013/09/17 11:52:42, groby wrote: > > > > I just woke up at 4am to realize this does not work :) doesNotOverlap > > assumes > > > > the list is sorted by start _and_ end offset, but it's only sorted by > start > > > > offset. > > > > > > > > If the list only contained one marker type, the merging would assure we > > could > > > > compare end & start offsets - but with multiple types we cannot. > > > > > > I thought about the same but I convinced myself it's ok as far as do-while > > > checks the type (which it does) but I might be wrong. > > > > > >
https://codereview.chromium.org/23728006/diff/1/Source/core/dom/DocumentMarke... File Source/core/dom/DocumentMarkerController.cpp (right): https://codereview.chromium.org/23728006/diff/1/Source/core/dom/DocumentMarke... Source/core/dom/DocumentMarkerController.cpp:141: static bool doesNotOverlap(const DocumentMarker& lhv, const DocumentMarker& rhv) Ah :/ On 2013/09/17 17:14:09, groby wrote: > Yes, but the vector is unsorted with respect to endOffset, so lower_bound will > return bad results. Example (marker types are X/O, then start offset, then end > offset) > > existing: > X, 2, 8 > O, 3, 4 > X, 12, 20 > > Now try to insert (X, 7, 10) > > Clearly, it _should_ start with overlapping = 0. Instead, it starts with > overlapping =2, never merging (X, 2, 8) with the new marker. > > And by clearly, I mean I had to write a sample to convince myself I wasn't crazy > :) See http://pastebin.com/eBqYj2hq > > Change the end offset on the 'O' marker to 10, it's still a well-ordered list, > and now overlapping == 0 > > > > > On 2013/09/17 14:48:05, pstanek wrote: > > But with lower_bound() we find the first overlapping marker regardless its > type, > > but if its type is different then the one's to insert we get back to searching > > from the next position. If the container contained one type only do while > > wouldn't be needed. > > > > On 2013/09/17 14:20:26, groby wrote: > > > Do-while is fine - but in line 194, you use doesNotOverlap for lower_bound. > > > > > > I fear that we'll need some sort of linear search after all. Maybe > lower_bound > > > for a starting point, then scan to the left for merge candidates if it's a > > type > > > mismatch. > > > > > > > > > > > > > > > On 2013/09/17 12:46:30, pstanek wrote: > > > > On 2013/09/17 11:52:42, groby wrote: > > > > > I just woke up at 4am to realize this does not work :) doesNotOverlap > > > assumes > > > > > the list is sorted by start _and_ end offset, but it's only sorted by > > start > > > > > offset. > > > > > > > > > > If the list only contained one marker type, the merging would assure we > > > could > > > > > compare end & start offsets - but with multiple types we cannot. > > > > > > > > I thought about the same but I convinced myself it's ok as far as do-while > > > > checks the type (which it does) but I might be wrong. > > > > > > > > > >
https://codereview.chromium.org/23728006/diff/1/Source/core/dom/DocumentMarke... File Source/core/dom/DocumentMarkerController.cpp (right): https://codereview.chromium.org/23728006/diff/1/Source/core/dom/DocumentMarke... Source/core/dom/DocumentMarkerController.cpp:141: static bool doesNotOverlap(const DocumentMarker& lhv, const DocumentMarker& rhv) Then to make this efficient markers would have to be kept separated by type. Otherwise there is no good starting point as all starting earlier has to be checked unless I'm missing something. On 2013/09/17 17:30:41, pstanek wrote: > Ah :/ > > On 2013/09/17 17:14:09, groby wrote: > > Yes, but the vector is unsorted with respect to endOffset, so lower_bound will > > return bad results. Example (marker types are X/O, then start offset, then end > > offset) > > > > existing: > > X, 2, 8 > > O, 3, 4 > > X, 12, 20 > > > > Now try to insert (X, 7, 10) > > > > Clearly, it _should_ start with overlapping = 0. Instead, it starts with > > overlapping =2, never merging (X, 2, 8) with the new marker. > > > > And by clearly, I mean I had to write a sample to convince myself I wasn't > crazy > > :) See http://pastebin.com/eBqYj2hq > > > > Change the end offset on the 'O' marker to 10, it's still a well-ordered list, > > and now overlapping == 0 > > > > > > > > > > On 2013/09/17 14:48:05, pstanek wrote: > > > But with lower_bound() we find the first overlapping marker regardless its > > type, > > > but if its type is different then the one's to insert we get back to > searching > > > from the next position. If the container contained one type only do while > > > wouldn't be needed. > > > > > > On 2013/09/17 14:20:26, groby wrote: > > > > Do-while is fine - but in line 194, you use doesNotOverlap for > lower_bound. > > > > > > > > I fear that we'll need some sort of linear search after all. Maybe > > lower_bound > > > > for a starting point, then scan to the left for merge candidates if it's a > > > type > > > > mismatch. > > > > > > > > > > > > > > > > > > > > On 2013/09/17 12:46:30, pstanek wrote: > > > > > On 2013/09/17 11:52:42, groby wrote: > > > > > > I just woke up at 4am to realize this does not work :) doesNotOverlap > > > > assumes > > > > > > the list is sorted by start _and_ end offset, but it's only sorted by > > > start > > > > > > offset. > > > > > > > > > > > > If the list only contained one marker type, the merging would assure > we > > > > could > > > > > > compare end & start offsets - but with multiple types we cannot. > > > > > > > > > > I thought about the same but I convinced myself it's ok as far as > do-while > > > > > checks the type (which it does) but I might be wrong. > > > > > > > > > > > > > > >
https://codereview.chromium.org/23728006/diff/1/Source/core/dom/DocumentMarke... File Source/core/dom/DocumentMarkerController.cpp (right): https://codereview.chromium.org/23728006/diff/1/Source/core/dom/DocumentMarke... Source/core/dom/DocumentMarkerController.cpp:141: static bool doesNotOverlap(const DocumentMarker& lhv, const DocumentMarker& rhv) Separating them by type would certainly be helpful :) If you don't want to do that - there are other functions that might rely on mixed markers - there might be another way. (Or skip it, if the 10% improvement isn't worth it) Let me check the assumptions: - We are guaranteed that we can insert before the first node that has node.start() >= toInsert.start), since nodes are ordered by start. - We are guaranteed that we're at most merging one node "to the left", since all nodes are already merged. At most one node can start earlier than the insertion node. (That part is a linear scan, though ) - We are guaranteed that we can stop merging once node.start() > insert.end(), since nodes are ordered by start *and* insert.start() must be <= insert.end(). That seems to translate into basically your structure, except we can't use lower_bound to find the overlapping one, and we use lower_bound for finding the start of the overlapping search. Alternatively, you can go for a single while loop by finding the end-node (node.start() > insert.end()) and merging to the left. Does that make sense? Or are there holes in this? On 2013/09/17 17:39:45, pstanek wrote: > Then to make this efficient markers would have to be kept separated by type. > Otherwise there is no good starting point as all starting earlier has to be > checked unless I'm missing something. > > On 2013/09/17 17:30:41, pstanek wrote: > > Ah :/ > > > > On 2013/09/17 17:14:09, groby wrote: > > > Yes, but the vector is unsorted with respect to endOffset, so lower_bound > will > > > return bad results. Example (marker types are X/O, then start offset, then > end > > > offset) > > > > > > existing: > > > X, 2, 8 > > > O, 3, 4 > > > X, 12, 20 > > > > > > Now try to insert (X, 7, 10) > > > > > > Clearly, it _should_ start with overlapping = 0. Instead, it starts with > > > overlapping =2, never merging (X, 2, 8) with the new marker. > > > > > > And by clearly, I mean I had to write a sample to convince myself I wasn't > > crazy > > > :) See http://pastebin.com/eBqYj2hq > > > > > > Change the end offset on the 'O' marker to 10, it's still a well-ordered > list, > > > and now overlapping == 0 > > > > > > > > > > > > > > > On 2013/09/17 14:48:05, pstanek wrote: > > > > But with lower_bound() we find the first overlapping marker regardless its > > > type, > > > > but if its type is different then the one's to insert we get back to > > searching > > > > from the next position. If the container contained one type only do while > > > > wouldn't be needed. > > > > > > > > On 2013/09/17 14:20:26, groby wrote: > > > > > Do-while is fine - but in line 194, you use doesNotOverlap for > > lower_bound. > > > > > > > > > > I fear that we'll need some sort of linear search after all. Maybe > > > lower_bound > > > > > for a starting point, then scan to the left for merge candidates if it's > a > > > > type > > > > > mismatch. > > > > > > > > > > > > > > > > > > > > > > > > > On 2013/09/17 12:46:30, pstanek wrote: > > > > > > On 2013/09/17 11:52:42, groby wrote: > > > > > > > I just woke up at 4am to realize this does not work :) > doesNotOverlap > > > > > assumes > > > > > > > the list is sorted by start _and_ end offset, but it's only sorted > by > > > > start > > > > > > > offset. > > > > > > > > > > > > > > If the list only contained one marker type, the merging would assure > > we > > > > > could > > > > > > > compare end & start offsets - but with multiple types we cannot. > > > > > > > > > > > > I thought about the same but I convinced myself it's ok as far as > > do-while > > > > > > checks the type (which it does) but I might be wrong. > > > > > > > > > > > > > > > > > > > > >
https://codereview.chromium.org/23728006/diff/1/Source/core/dom/DocumentMarke... File Source/core/dom/DocumentMarkerController.cpp (right): https://codereview.chromium.org/23728006/diff/1/Source/core/dom/DocumentMarke... Source/core/dom/DocumentMarkerController.cpp:141: static bool doesNotOverlap(const DocumentMarker& lhv, const DocumentMarker& rhv) Yes. The assumptions seem to be right but I fear using the linear search is a step back to the starting point performance wise. I'll take a look at separating markers how much work would that be. Note however that since the perceived performance is OK atm this gets a lot lower priority for me now. On 2013/09/17 18:39:37, groby wrote: > Separating them by type would certainly be helpful :) > > If you don't want to do that - there are other functions that might rely on > mixed markers - there might be another way. (Or skip it, if the 10% improvement > isn't worth it) > > Let me check the assumptions: > > - We are guaranteed that we can insert before the first node that has > node.start() >= toInsert.start), since nodes are ordered by start. > - We are guaranteed that we're at most merging one node "to the left", since all > nodes are already merged. At most one node can start earlier than the insertion > node. (That part is a linear scan, though ) > - We are guaranteed that we can stop merging once node.start() > insert.end(), > since nodes are ordered by start *and* insert.start() must be <= insert.end(). > > That seems to translate into basically your structure, except we can't use > lower_bound to find the overlapping one, and we use lower_bound for finding the > start of the overlapping search. > > Alternatively, you can go for a single while loop by finding the end-node > (node.start() > insert.end()) and merging to the left. > > Does that make sense? Or are there holes in this? > > On 2013/09/17 17:39:45, pstanek wrote: > > Then to make this efficient markers would have to be kept separated by type. > > Otherwise there is no good starting point as all starting earlier has to be > > checked unless I'm missing something. > > > > On 2013/09/17 17:30:41, pstanek wrote: > > > Ah :/ > > > > > > On 2013/09/17 17:14:09, groby wrote: > > > > Yes, but the vector is unsorted with respect to endOffset, so lower_bound > > will > > > > return bad results. Example (marker types are X/O, then start offset, then > > end > > > > offset) > > > > > > > > existing: > > > > X, 2, 8 > > > > O, 3, 4 > > > > X, 12, 20 > > > > > > > > Now try to insert (X, 7, 10) > > > > > > > > Clearly, it _should_ start with overlapping = 0. Instead, it starts with > > > > overlapping =2, never merging (X, 2, 8) with the new marker. > > > > > > > > And by clearly, I mean I had to write a sample to convince myself I wasn't > > > crazy > > > > :) See http://pastebin.com/eBqYj2hq > > > > > > > > Change the end offset on the 'O' marker to 10, it's still a well-ordered > > list, > > > > and now overlapping == 0 > > > > > > > > > > > > > > > > > > > > On 2013/09/17 14:48:05, pstanek wrote: > > > > > But with lower_bound() we find the first overlapping marker regardless > its > > > > type, > > > > > but if its type is different then the one's to insert we get back to > > > searching > > > > > from the next position. If the container contained one type only do > while > > > > > wouldn't be needed. > > > > > > > > > > On 2013/09/17 14:20:26, groby wrote: > > > > > > Do-while is fine - but in line 194, you use doesNotOverlap for > > > lower_bound. > > > > > > > > > > > > I fear that we'll need some sort of linear search after all. Maybe > > > > lower_bound > > > > > > for a starting point, then scan to the left for merge candidates if > it's > > a > > > > > type > > > > > > mismatch. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On 2013/09/17 12:46:30, pstanek wrote: > > > > > > > On 2013/09/17 11:52:42, groby wrote: > > > > > > > > I just woke up at 4am to realize this does not work :) > > doesNotOverlap > > > > > > assumes > > > > > > > > the list is sorted by start _and_ end offset, but it's only sorted > > by > > > > > start > > > > > > > > offset. > > > > > > > > > > > > > > > > If the list only contained one marker type, the merging would > assure > > > we > > > > > > could > > > > > > > > compare end & start offsets - but with multiple types we cannot. > > > > > > > > > > > > > > I thought about the same but I convinced myself it's ok as far as > > > do-while > > > > > > > checks the type (which it does) but I might be wrong. > > > > > > > > > > > > > > > > > > > > > > > > > > > >
https://codereview.chromium.org/23728006/diff/1/Source/core/dom/DocumentMarke... File Source/core/dom/DocumentMarkerController.cpp (right): https://codereview.chromium.org/23728006/diff/1/Source/core/dom/DocumentMarke... Source/core/dom/DocumentMarkerController.cpp:141: static bool doesNotOverlap(const DocumentMarker& lhv, const DocumentMarker& rhv) Postponing this seems fine to me. (If you do, you might want to close this CL and update the bug to reflect that status) On 2013/09/18 10:49:46, pstanek wrote: > Yes. The assumptions seem to be right but I fear using the linear search is a > step back to the starting point performance wise. I'll take a look at separating > markers how much work would that be. Note however that since the perceived > performance is OK atm this gets a lot lower priority for me now. > > On 2013/09/17 18:39:37, groby wrote: > > Separating them by type would certainly be helpful :) > > > > If you don't want to do that - there are other functions that might rely on > > mixed markers - there might be another way. (Or skip it, if the 10% > improvement > > isn't worth it) > > > > Let me check the assumptions: > > > > - We are guaranteed that we can insert before the first node that has > > node.start() >= toInsert.start), since nodes are ordered by start. > > - We are guaranteed that we're at most merging one node "to the left", since > all > > nodes are already merged. At most one node can start earlier than the > insertion > > node. (That part is a linear scan, though ) > > - We are guaranteed that we can stop merging once node.start() > insert.end(), > > since nodes are ordered by start *and* insert.start() must be <= insert.end(). > > > > That seems to translate into basically your structure, except we can't use > > lower_bound to find the overlapping one, and we use lower_bound for finding > the > > start of the overlapping search. > > > > Alternatively, you can go for a single while loop by finding the end-node > > (node.start() > insert.end()) and merging to the left. > > > > Does that make sense? Or are there holes in this? > > > > On 2013/09/17 17:39:45, pstanek wrote: > > > Then to make this efficient markers would have to be kept separated by type. > > > Otherwise there is no good starting point as all starting earlier has to be > > > checked unless I'm missing something. > > > > > > On 2013/09/17 17:30:41, pstanek wrote: > > > > Ah :/ > > > > > > > > On 2013/09/17 17:14:09, groby wrote: > > > > > Yes, but the vector is unsorted with respect to endOffset, so > lower_bound > > > will > > > > > return bad results. Example (marker types are X/O, then start offset, > then > > > end > > > > > offset) > > > > > > > > > > existing: > > > > > X, 2, 8 > > > > > O, 3, 4 > > > > > X, 12, 20 > > > > > > > > > > Now try to insert (X, 7, 10) > > > > > > > > > > Clearly, it _should_ start with overlapping = 0. Instead, it starts with > > > > > overlapping =2, never merging (X, 2, 8) with the new marker. > > > > > > > > > > And by clearly, I mean I had to write a sample to convince myself I > wasn't > > > > crazy > > > > > :) See http://pastebin.com/eBqYj2hq > > > > > > > > > > Change the end offset on the 'O' marker to 10, it's still a well-ordered > > > list, > > > > > and now overlapping == 0 > > > > > > > > > > > > > > > > > > > > > > > > > On 2013/09/17 14:48:05, pstanek wrote: > > > > > > But with lower_bound() we find the first overlapping marker regardless > > its > > > > > type, > > > > > > but if its type is different then the one's to insert we get back to > > > > searching > > > > > > from the next position. If the container contained one type only do > > while > > > > > > wouldn't be needed. > > > > > > > > > > > > On 2013/09/17 14:20:26, groby wrote: > > > > > > > Do-while is fine - but in line 194, you use doesNotOverlap for > > > > lower_bound. > > > > > > > > > > > > > > I fear that we'll need some sort of linear search after all. Maybe > > > > > lower_bound > > > > > > > for a starting point, then scan to the left for merge candidates if > > it's > > > a > > > > > > type > > > > > > > mismatch. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On 2013/09/17 12:46:30, pstanek wrote: > > > > > > > > On 2013/09/17 11:52:42, groby wrote: > > > > > > > > > I just woke up at 4am to realize this does not work :) > > > doesNotOverlap > > > > > > > assumes > > > > > > > > > the list is sorted by start _and_ end offset, but it's only > sorted > > > by > > > > > > start > > > > > > > > > offset. > > > > > > > > > > > > > > > > > > If the list only contained one marker type, the merging would > > assure > > > > we > > > > > > > could > > > > > > > > > compare end & start offsets - but with multiple types we cannot. > > > > > > > > > > > > > > > > I thought about the same but I convinced myself it's ok as far as > > > > do-while > > > > > > > > checks the type (which it does) but I might be wrong. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >
I did it a go... On 2013/09/18 18:24:08, groby wrote: > https://codereview.chromium.org/23728006/diff/1/Source/core/dom/DocumentMarke... > File Source/core/dom/DocumentMarkerController.cpp (right): > > https://codereview.chromium.org/23728006/diff/1/Source/core/dom/DocumentMarke... > Source/core/dom/DocumentMarkerController.cpp:141: static bool > doesNotOverlap(const DocumentMarker& lhv, const DocumentMarker& rhv) > Postponing this seems fine to me. (If you do, you might want to close this CL > and update the bug to reflect that status) > > On 2013/09/18 10:49:46, pstanek wrote: > > Yes. The assumptions seem to be right but I fear using the linear search is a > > step back to the starting point performance wise. I'll take a look at > separating > > markers how much work would that be. Note however that since the perceived > > performance is OK atm this gets a lot lower priority for me now. > > > > On 2013/09/17 18:39:37, groby wrote: > > > Separating them by type would certainly be helpful :) > > > > > > If you don't want to do that - there are other functions that might rely on > > > mixed markers - there might be another way. (Or skip it, if the 10% > > improvement > > > isn't worth it) > > > > > > Let me check the assumptions: > > > > > > - We are guaranteed that we can insert before the first node that has > > > node.start() >= toInsert.start), since nodes are ordered by start. > > > - We are guaranteed that we're at most merging one node "to the left", since > > all > > > nodes are already merged. At most one node can start earlier than the > > insertion > > > node. (That part is a linear scan, though ) > > > - We are guaranteed that we can stop merging once node.start() > > insert.end(), > > > since nodes are ordered by start *and* insert.start() must be <= > insert.end(). > > > > > > That seems to translate into basically your structure, except we can't use > > > lower_bound to find the overlapping one, and we use lower_bound for finding > > the > > > start of the overlapping search. > > > > > > Alternatively, you can go for a single while loop by finding the end-node > > > (node.start() > insert.end()) and merging to the left. > > > > > > Does that make sense? Or are there holes in this? > > > > > > On 2013/09/17 17:39:45, pstanek wrote: > > > > Then to make this efficient markers would have to be kept separated by > type. > > > > Otherwise there is no good starting point as all starting earlier has to > be > > > > checked unless I'm missing something. > > > > > > > > On 2013/09/17 17:30:41, pstanek wrote: > > > > > Ah :/ > > > > > > > > > > On 2013/09/17 17:14:09, groby wrote: > > > > > > Yes, but the vector is unsorted with respect to endOffset, so > > lower_bound > > > > will > > > > > > return bad results. Example (marker types are X/O, then start offset, > > then > > > > end > > > > > > offset) > > > > > > > > > > > > existing: > > > > > > X, 2, 8 > > > > > > O, 3, 4 > > > > > > X, 12, 20 > > > > > > > > > > > > Now try to insert (X, 7, 10) > > > > > > > > > > > > Clearly, it _should_ start with overlapping = 0. Instead, it starts > with > > > > > > overlapping =2, never merging (X, 2, 8) with the new marker. > > > > > > > > > > > > And by clearly, I mean I had to write a sample to convince myself I > > wasn't > > > > > crazy > > > > > > :) See http://pastebin.com/eBqYj2hq > > > > > > > > > > > > Change the end offset on the 'O' marker to 10, it's still a > well-ordered > > > > list, > > > > > > and now overlapping == 0 > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On 2013/09/17 14:48:05, pstanek wrote: > > > > > > > But with lower_bound() we find the first overlapping marker > regardless > > > its > > > > > > type, > > > > > > > but if its type is different then the one's to insert we get back to > > > > > searching > > > > > > > from the next position. If the container contained one type only do > > > while > > > > > > > wouldn't be needed. > > > > > > > > > > > > > > On 2013/09/17 14:20:26, groby wrote: > > > > > > > > Do-while is fine - but in line 194, you use doesNotOverlap for > > > > > lower_bound. > > > > > > > > > > > > > > > > I fear that we'll need some sort of linear search after all. Maybe > > > > > > lower_bound > > > > > > > > for a starting point, then scan to the left for merge candidates > if > > > it's > > > > a > > > > > > > type > > > > > > > > mismatch. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On 2013/09/17 12:46:30, pstanek wrote: > > > > > > > > > On 2013/09/17 11:52:42, groby wrote: > > > > > > > > > > I just woke up at 4am to realize this does not work :) > > > > doesNotOverlap > > > > > > > > assumes > > > > > > > > > > the list is sorted by start _and_ end offset, but it's only > > sorted > > > > by > > > > > > > start > > > > > > > > > > offset. > > > > > > > > > > > > > > > > > > > > If the list only contained one marker type, the merging would > > > assure > > > > > we > > > > > > > > could > > > > > > > > > > compare end & start offsets - but with multiple types we > cannot. > > > > > > > > > > > > > > > > > > I thought about the same but I convinced myself it's ok as far > as > > > > > do-while > > > > > > > > > checks the type (which it does) but I might be wrong. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >
This did turn out pretty awesome - thank you! If you'd like to finish this up soon-ish, feel free to file bug for the upper/lower-boundification of functions that aren't yet, as well as for the "create list on demand". It'd be lovely to see those fixes, but it certainly wouldn't be crucial. https://codereview.chromium.org/23728006/diff/26001/Source/core/dom/DocumentM... File Source/core/dom/DocumentMarker.h (right): https://codereview.chromium.org/23728006/diff/26001/Source/core/dom/DocumentM... Source/core/dom/DocumentMarker.h:42: SpellingMarkerIndex, Personal nit: I prefer a "= 0" after the first enum, just to make clear we rely on the value of the enum https://codereview.chromium.org/23728006/diff/26001/Source/core/dom/DocumentM... File Source/core/dom/DocumentMarkerController.cpp (right): https://codereview.chromium.org/23728006/diff/26001/Source/core/dom/DocumentM... Source/core/dom/DocumentMarkerController.cpp:177: for (size_t markerListIndex = 0; markerListIndex < DocumentMarker::MarkerTypeIndexesCount; ++markerListIndex) { I'd suggest only adding a MarkerList for the needed index. https://codereview.chromium.org/23728006/diff/26001/Source/core/dom/DocumentM... Source/core/dom/DocumentMarkerController.cpp:194: list->append(RenderedDocumentMarker(toInsert)); I'd combine that into the list->empty branch() i.e. if( list->isEmpty() || list->last().endOffset() >= newMarker.startOffset()) { list->append... https://codereview.chromium.org/23728006/diff/26001/Source/core/dom/DocumentM... Source/core/dom/DocumentMarkerController.cpp:197: Side note: I'm surprised this doesn't set docDirty https://codereview.chromium.org/23728006/diff/26001/Source/core/dom/DocumentM... Source/core/dom/DocumentMarkerController.cpp:208: MarkerList::iterator firstOverlapping = list->begin(); I think this can now be simplified significantly. Since we know all markers for a given type are merged, we're sorted by startOffset AND endOffset So I think this would work: * find lower bound for toInsert.start < node.end - all nodes that potentially need merging come after that point * insert toInsert at that point * keep merging until merged.end < node.start If you want to be more efficient, determine if the node after the insertion point overlaps with toInsert at all. If it doesn't, insert and be done. If it does, merge that node in place with toInsert, then continue merging right. (That means one insertion less in the merge case, and with that, one bulk element move less) This should result in _much_ shorter code https://codereview.chromium.org/23728006/diff/26001/Source/core/dom/DocumentM... Source/core/dom/DocumentMarkerController.cpp:266: if (marker.endOffset() < startOffset) We can lower_bound this. https://codereview.chromium.org/23728006/diff/26001/Source/core/dom/DocumentM... Source/core/dom/DocumentMarkerController.cpp:307: if (!markerTypes.contains(list->begin()->type())) { Personal nit: No need for curlies on single-line branches. https://codereview.chromium.org/23728006/diff/26001/Source/core/dom/DocumentM... Source/core/dom/DocumentMarkerController.cpp:319: if (marker.endOffset() <= startOffset) { I suppose we could upper_bound this, saving us iteration. https://codereview.chromium.org/23728006/diff/26001/Source/core/dom/DocumentM... Source/core/dom/DocumentMarkerController.cpp:353: if (emptyListsCount == DocumentMarker::MarkerTypeIndexesCount) { If we do ad-hoc creation for individual lists, we need to add the list->isEmpty check back, too. https://codereview.chromium.org/23728006/diff/26001/Source/core/dom/DocumentM... Source/core/dom/DocumentMarkerController.cpp:401: for (size_t i = 0; i < list->size(); ++i) { nit: no curlies on single-line statements. https://codereview.chromium.org/23728006/diff/26001/Source/core/dom/DocumentM... Source/core/dom/DocumentMarkerController.cpp:409: Vector<DocumentMarker*> DocumentMarkerController::markers() I'm wondering if the callers of this expect the markers' sort order to be across types. https://codereview.chromium.org/23728006/diff/26001/Source/core/dom/DocumentM... Source/core/dom/DocumentMarkerController.cpp:416: for (size_t j = 0; j < list->size(); ++j) I believe Vector has appendVector, which does just what this loop does. https://codereview.chromium.org/23728006/diff/26001/Source/core/dom/DocumentM... Source/core/dom/DocumentMarkerController.cpp:423: Vector<DocumentMarker*> DocumentMarkerController::markersInRange(Range* range, DocumentMarker::MarkerTypes markerTypes) Same as above: Do they need to be sorted? https://codereview.chromium.org/23728006/diff/26001/Source/core/dom/DocumentM... Source/core/dom/DocumentMarkerController.cpp:443: if (node == startContainer && marker->endOffset() <= static_cast<unsigned>(range->startOffset())) I suppose we could do a lower bounds check and just use appendRange, that'd simplify this function. https://codereview.chromium.org/23728006/diff/26001/Source/core/dom/DocumentM... Source/core/dom/DocumentMarkerController.cpp:573: renderer->repaint(); I believe this needs a break - otherwise, this can trigger multiple repaints. https://codereview.chromium.org/23728006/diff/26001/Source/core/dom/DocumentM... Source/core/dom/DocumentMarkerController.cpp:609: if (marker.startOffset() >= startOffset) { upper_bound https://codereview.chromium.org/23728006/diff/26001/Source/core/dom/DocumentM... Source/core/dom/DocumentMarkerController.cpp:660: if (marker.endOffset() <= startOffset || marker.type() != DocumentMarker::TextMatch) upper_bound https://codereview.chromium.org/23728006/diff/26001/Source/core/dom/DocumentM... Source/core/dom/DocumentMarkerController.cpp:692: if (node == startContainer && marker->endOffset() <= static_cast<unsigned>(range->startOffset())) upper/lower_bound
https://codereview.chromium.org/23728006/diff/26001/Source/core/dom/DocumentM... File Source/core/dom/DocumentMarkerController.cpp (right): https://codereview.chromium.org/23728006/diff/26001/Source/core/dom/DocumentM... Source/core/dom/DocumentMarkerController.cpp:409: Vector<DocumentMarker*> DocumentMarkerController::markers() On 2013/09/19 18:17:09, groby wrote: > I'm wondering if the callers of this expect the markers' sort order to be across > types. Good point. it seems they has to. https://codereview.chromium.org/23728006/diff/26001/Source/core/dom/DocumentM... Source/core/dom/DocumentMarkerController.cpp:416: for (size_t j = 0; j < list->size(); ++j) On 2013/09/19 18:17:09, groby wrote: > I believe Vector has appendVector, which does just what this loop does. Type mismatch. MarkerList is Vector<DocumentMarker> where the result is Vector<DocumentMarker*>. https://codereview.chromium.org/23728006/diff/26001/Source/core/dom/DocumentM... Source/core/dom/DocumentMarkerController.cpp:443: if (node == startContainer && marker->endOffset() <= static_cast<unsigned>(range->startOffset())) On 2013/09/19 18:17:09, groby wrote: > I suppose we could do a lower bounds check and just use appendRange, that'd > simplify this function. This calls makersFor() returning one vector with all markers so it'd have to rewritten completely. https://codereview.chromium.org/23728006/diff/26001/Source/core/dom/DocumentM... Source/core/dom/DocumentMarkerController.cpp:692: if (node == startContainer && marker->endOffset() <= static_cast<unsigned>(range->startOffset())) On 2013/09/19 18:17:09, groby wrote: > upper/lower_bound markersFor() again
https://codereview.chromium.org/23728006/diff/26001/Source/core/dom/DocumentM... File Source/core/dom/DocumentMarkerController.cpp (right): https://codereview.chromium.org/23728006/diff/26001/Source/core/dom/DocumentM... Source/core/dom/DocumentMarkerController.cpp:660: if (marker.endOffset() <= startOffset || marker.type() != DocumentMarker::TextMatch) On 2013/09/19 18:17:09, groby wrote: > upper_bound Oh. this does anything for TextMatch markers only so it may be simplified a lot.
On 2013/09/20 15:17:32, pstanek wrote: > https://codereview.chromium.org/23728006/diff/26001/Source/core/dom/DocumentM... > File Source/core/dom/DocumentMarkerController.cpp (right): > > https://codereview.chromium.org/23728006/diff/26001/Source/core/dom/DocumentM... > Source/core/dom/DocumentMarkerController.cpp:660: if (marker.endOffset() <= > startOffset || marker.type() != DocumentMarker::TextMatch) > On 2013/09/19 18:17:09, groby wrote: > > upper_bound > > Oh. this does anything for TextMatch markers only so it may be simplified a lot. Well. Maybe not so much... :)
https://codereview.chromium.org/23728006/diff/26001/Source/core/dom/DocumentM... File Source/core/dom/DocumentMarkerController.cpp (right): https://codereview.chromium.org/23728006/diff/26001/Source/core/dom/DocumentM... Source/core/dom/DocumentMarkerController.cpp:409: Vector<DocumentMarker*> DocumentMarkerController::markers() A quick code search actually reveals no callers? On 2013/09/20 15:07:40, pstanek wrote: > On 2013/09/19 18:17:09, groby wrote: > > I'm wondering if the callers of this expect the markers' sort order to be > across > > types. > > Good point. it seems they has to. https://codereview.chromium.org/23728006/diff/26001/Source/core/dom/DocumentM... Source/core/dom/DocumentMarkerController.cpp:423: Vector<DocumentMarker*> DocumentMarkerController::markersInRange(Range* range, DocumentMarker::MarkerTypes markerTypes) WebFrameImpl::replaceMisspelledRange does - but maybe you'd want the sort there. SpellCheckRequest::create _might_ need that. Probably not, but ping rouslan. (Actually, just checked - it doesn't) And HitTestResult::isMisspelled pains me because it doesn't - all it wants to know if there's _any_ marker in the range. Expensive way to test :) On 2013/09/19 18:17:09, groby wrote: > Same as above: Do they need to be sorted? https://codereview.chromium.org/23728006/diff/26001/Source/core/dom/DocumentM... Source/core/dom/DocumentMarkerController.cpp:443: if (node == startContainer && marker->endOffset() <= static_cast<unsigned>(range->startOffset())) Yes, that's why I suggested filing those as separate bugs :) markersFor is an interesting beast anyways - it seems most calls to it want only a range of markers, so that should be part of the API... As I said, it's better off separate. On 2013/09/20 15:07:40, pstanek wrote: > On 2013/09/19 18:17:09, groby wrote: > > I suppose we could do a lower bounds check and just use appendRange, that'd > > simplify this function. > > This calls makersFor() returning one vector with all markers so it'd have to > rewritten completely. https://codereview.chromium.org/23728006/diff/26001/Source/core/dom/DocumentM... Source/core/dom/DocumentMarkerController.cpp:660: if (marker.endOffset() <= startOffset || marker.type() != DocumentMarker::TextMatch) Well, a tiny bit at least :) On 2013/09/20 15:17:33, pstanek wrote: > On 2013/09/19 18:17:09, groby wrote: > > upper_bound > > Oh. this does anything for TextMatch markers only so it may be simplified a lot.
https://codereview.chromium.org/23728006/diff/26001/Source/core/dom/DocumentM... File Source/core/dom/DocumentMarkerController.cpp (right): https://codereview.chromium.org/23728006/diff/26001/Source/core/dom/DocumentM... Source/core/dom/DocumentMarkerController.cpp:609: if (marker.startOffset() >= startOffset) { On 2013/09/19 18:17:09, groby wrote: > upper_bound "Unlike lower_bound, the value pointed by the iterator returned by this function cannot be equivalent to val, only greater."
https://codereview.chromium.org/23728006/diff/26001/Source/core/dom/DocumentM... File Source/core/dom/DocumentMarkerController.cpp (right): https://codereview.chromium.org/23728006/diff/26001/Source/core/dom/DocumentM... Source/core/dom/DocumentMarkerController.cpp:177: for (size_t markerListIndex = 0; markerListIndex < DocumentMarker::MarkerTypeIndexesCount; ++markerListIndex) { On 2013/09/19 18:17:09, groby wrote: > I'd suggest only adding a MarkerList for the needed index. Done. https://codereview.chromium.org/23728006/diff/26001/Source/core/dom/DocumentM... Source/core/dom/DocumentMarkerController.cpp:194: list->append(RenderedDocumentMarker(toInsert)); On 2013/09/19 18:17:09, groby wrote: > I'd combine that into the list->empty branch() i.e. > if( list->isEmpty() || list->last().endOffset() >= newMarker.startOffset()) { > list->append... > > > Done. https://codereview.chromium.org/23728006/diff/26001/Source/core/dom/DocumentM... Source/core/dom/DocumentMarkerController.cpp:208: MarkerList::iterator firstOverlapping = list->begin(); On 2013/09/19 18:17:09, groby wrote: > I think this can now be simplified significantly. Since we know all markers for > a given type are merged, we're sorted by startOffset AND endOffset > > So I think this would work: > > * find lower bound for toInsert.start < node.end - all nodes that potentially > need merging come after that point > * insert toInsert at that point > * keep merging until merged.end < node.start > > If you want to be more efficient, determine if the node after the insertion > point overlaps with toInsert at all. If it doesn't, insert and be done. If it > does, merge that node in place with toInsert, then continue merging right. (That > means one insertion less in the merge case, and with that, one bulk element move > less) > > This should result in _much_ shorter code Done. https://codereview.chromium.org/23728006/diff/26001/Source/core/dom/DocumentM... Source/core/dom/DocumentMarkerController.cpp:266: if (marker.endOffset() < startOffset) On 2013/09/19 18:17:09, groby wrote: > We can lower_bound this. Done. https://codereview.chromium.org/23728006/diff/26001/Source/core/dom/DocumentM... Source/core/dom/DocumentMarkerController.cpp:307: if (!markerTypes.contains(list->begin()->type())) { On 2013/09/19 18:17:09, groby wrote: > Personal nit: No need for curlies on single-line branches. Done. https://codereview.chromium.org/23728006/diff/26001/Source/core/dom/DocumentM... Source/core/dom/DocumentMarkerController.cpp:319: if (marker.endOffset() <= startOffset) { On 2013/09/19 18:17:09, groby wrote: > I suppose we could upper_bound this, saving us iteration. Done. https://codereview.chromium.org/23728006/diff/26001/Source/core/dom/DocumentM... Source/core/dom/DocumentMarkerController.cpp:353: if (emptyListsCount == DocumentMarker::MarkerTypeIndexesCount) { On 2013/09/19 18:17:09, groby wrote: > If we do ad-hoc creation for individual lists, we need to add the list->isEmpty > check back, too. Done. https://codereview.chromium.org/23728006/diff/26001/Source/core/dom/DocumentM... Source/core/dom/DocumentMarkerController.cpp:401: for (size_t i = 0; i < list->size(); ++i) { On 2013/09/19 18:17:09, groby wrote: > nit: no curlies on single-line statements. Done. https://codereview.chromium.org/23728006/diff/26001/Source/core/dom/DocumentM... Source/core/dom/DocumentMarkerController.cpp:409: Vector<DocumentMarker*> DocumentMarkerController::markers() On 2013/09/19 18:17:09, groby wrote: > I'm wondering if the callers of this expect the markers' sort order to be across > types. Done. https://codereview.chromium.org/23728006/diff/26001/Source/core/dom/DocumentM... Source/core/dom/DocumentMarkerController.cpp:423: Vector<DocumentMarker*> DocumentMarkerController::markersInRange(Range* range, DocumentMarker::MarkerTypes markerTypes) On 2013/09/19 18:17:09, groby wrote: > Same as above: Do they need to be sorted? Done. https://codereview.chromium.org/23728006/diff/26001/Source/core/dom/DocumentM... Source/core/dom/DocumentMarkerController.cpp:573: renderer->repaint(); On 2013/09/19 18:17:09, groby wrote: > I believe this needs a break - otherwise, this can trigger multiple repaints. Done. https://codereview.chromium.org/23728006/diff/26001/Source/core/dom/DocumentM... Source/core/dom/DocumentMarkerController.cpp:609: if (marker.startOffset() >= startOffset) { On 2013/09/20 18:56:00, pstanek wrote: > On 2013/09/19 18:17:09, groby wrote: > > upper_bound > > "Unlike lower_bound, the value pointed by the iterator returned by this function > cannot be equivalent to val, only greater." Done. https://codereview.chromium.org/23728006/diff/26001/Source/core/dom/DocumentM... Source/core/dom/DocumentMarkerController.cpp:660: if (marker.endOffset() <= startOffset || marker.type() != DocumentMarker::TextMatch) On 2013/09/19 18:17:09, groby wrote: > upper_bound Done.
https://codereview.chromium.org/23728006/diff/26001/Source/core/dom/DocumentM... File Source/core/dom/DocumentMarker.h (right): https://codereview.chromium.org/23728006/diff/26001/Source/core/dom/DocumentM... Source/core/dom/DocumentMarker.h:42: SpellingMarkerIndex, On 2013/09/19 18:17:09, groby wrote: > Personal nit: I prefer a "= 0" after the first enum, just to make clear we rely > on the value of the enum Done.
LGTM - thank you for doing this work! morrita1, tony, rouslan: PTAL.
From spellchecking perspective, there seems to be no changes, so lgtm. What are the performance improvements?
On 2013/09/14 08:19:48, pstanek wrote: > The TC shows 7-10% improvement here. You should put this in the description. Also, please avoid lines longer than 80 characters in the description. (I personally prefer 74 character lines.)
I trust groby's review and only checked the style. LGTM. https://codereview.chromium.org/23728006/diff/37001/Source/core/dom/DocumentM... File Source/core/dom/DocumentMarkerController.cpp (right): https://codereview.chromium.org/23728006/diff/37001/Source/core/dom/DocumentM... Source/core/dom/DocumentMarkerController.cpp:48: default: Nit: We normally avoid default cases and enumerate all values. This means you get a compiler error when adding a new enum value. I would add DocumentMarker::MarkerTypeIndexesCount at the end and put an ASSERT_NOT_REACHED() there (you can return any of the index values).
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/pstanek@opera.com/23728006/48001
Sorry for I got bad news for ya. Compile failed with a clobber build on mac_layout. http://build.chromium.org/p/tryserver.chromium/buildstatus?builder=mac_layout... Your code is likely broken or HEAD is junk. Please ensure your code is not broken then alert the build sheriffs. Look at the try server FAQ for more details.
On 2013/09/23 19:34:03, tony wrote: > I trust groby's review and only checked the style. LGTM. > > https://codereview.chromium.org/23728006/diff/37001/Source/core/dom/DocumentM... > File Source/core/dom/DocumentMarkerController.cpp (right): > > https://codereview.chromium.org/23728006/diff/37001/Source/core/dom/DocumentM... > Source/core/dom/DocumentMarkerController.cpp:48: default: > Nit: We normally avoid default cases and enumerate all values. This means you > get a compiler error when adding a new enum value. I would add > DocumentMarker::MarkerTypeIndexesCount at the end and put an > ASSERT_NOT_REACHED() there (you can return any of the index values). I did it and it introduced compilation error on mac (VS2012 took it without complains this is why I didn't catch that) because DocumentMarker::MarkerTypeIndexesCount is not a part of DocumentMarker::MarkerType. So I simply removed that case and default. PTAL and send to integration if looks ok.
LGTM. You're right. I was confusing MarkerTypeIndex and MarkerType.
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/pstanek@opera.com/23728006/54001
Retried try job too often on win_blink_rel for step(s) webkit_tests http://build.chromium.org/p/tryserver.chromium/buildstatus?builder=win_blink_...
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/pstanek@opera.com/23728006/54001
Retried try job too often on win_blink_rel for step(s) webkit_tests http://build.chromium.org/p/tryserver.chromium/buildstatus?builder=win_blink_...
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/pstanek@opera.com/23728006/54001
On 2013/09/24 17:42:40, I haz the power (commit-bot) wrote: > Retried try job too often on win_blink_rel for step(s) webkit_tests > http://build.chromium.org/p/tryserver.chromium/buildstatus?builder=win_blink_... I wish I knew what's failing here...
Retried try job too often on win_blink_rel for step(s) webkit_tests http://build.chromium.org/p/tryserver.chromium/buildstatus?builder=win_blink_...
On 2013/09/24 17:50:59, I haz the power (commit-bot) wrote: > Retried try job too often on win_blink_rel for step(s) webkit_tests > http://build.chromium.org/p/tryserver.chromium/buildstatus?builder=win_blink_... Did I break anything? Some python script fails but I have no idea whether it's related to code changes...
On 2013/09/24 17:57:18, pstanek wrote: > On 2013/09/24 17:50:59, I haz the power (commit-bot) wrote: > > Retried try job too often on win_blink_rel for step(s) webkit_tests > > > http://build.chromium.org/p/tryserver.chromium/buildstatus?builder=win_blink_... > > Did I break anything? Some python script fails but I have no idea whether it's > related to code changes... Merge master and try again. If this happens again, ask on #blink IRC channel.
On 2013/09/24 17:57:18, pstanek wrote: > On 2013/09/24 17:50:59, I haz the power (commit-bot) wrote: > > Retried try job too often on win_blink_rel for step(s) webkit_tests > > > http://build.chromium.org/p/tryserver.chromium/buildstatus?builder=win_blink_... > > Did I break anything? Some python script fails but I have no idea whether it's > related to code changes... I don't think it's related to your change. I will try again in a few hours. Maybe the bots will be healthier then.
On 2013/09/24 17:59:48, tony wrote: > On 2013/09/24 17:57:18, pstanek wrote: > > On 2013/09/24 17:50:59, I haz the power (commit-bot) wrote: > > > Retried try job too often on win_blink_rel for step(s) webkit_tests > > > > > > http://build.chromium.org/p/tryserver.chromium/buildstatus?builder=win_blink_... > > > > Did I break anything? Some python script fails but I have no idea whether it's > > related to code changes... > > I don't think it's related to your change. I will try again in a few hours. > Maybe the bots will be healthier then. Great. Thx.
I don't think it's you. Occasionally, the try servers are in a bad mood. If you want to find out details, hop on #chromium on IRC - that's where the sheriffs and troopers hang out. Or just try again in an hour or so :(
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/pstanek@opera.com/23728006/54001
Message was sent while issue was closed.
Change committed as 158288 |