Chromium Code Reviews
chromiumcodereview-hr@appspot.gserviceaccount.com (chromiumcodereview-hr) | Please choose your nickname with Settings | Help | Chromium Project | Gerrit Changes | Sign out
(297)

Issue 9655003: Gather word-start Information to Aid in Scoring. (Closed)

Created:
8 years, 9 months ago by mrossetti
Modified:
8 years, 9 months ago
Reviewers:
brettw, Peter Kasting, sky
CC:
chromium-reviews, brettw-cc_chromium.org
Visibility:
Public.

Description

Gather word-start Information to Aid in Scoring. This change implements word start gathering and provides those word starts to the scoring routine but does not yet implement consideration of those word starts in the actual score calculation. BUG=NONE TEST=Added and updated unit tests. Committed: https://src.chromium.org/viewvc/chrome?view=rev&revision=126661

Patch Set 1 #

Total comments: 11

Patch Set 2 : #

Unified diffs Side-by-side diffs Delta from patch set Stats (+441 lines, -139 lines) Patch
M chrome/browser/history/in_memory_url_index_cache.proto View 1 chunk +20 lines, -8 lines 0 comments Download
M chrome/browser/history/in_memory_url_index_types.h View 1 5 chunks +27 lines, -4 lines 0 comments Download
M chrome/browser/history/in_memory_url_index_types.cc View 1 3 chunks +30 lines, -10 lines 0 comments Download
M chrome/browser/history/in_memory_url_index_types_unittest.cc View 1 4 chunks +70 lines, -30 lines 0 comments Download
M chrome/browser/history/in_memory_url_index_unittest.cc View 1 6 chunks +149 lines, -67 lines 0 comments Download
M chrome/browser/history/url_index_private_data.h View 1 8 chunks +28 lines, -3 lines 0 comments Download
M chrome/browser/history/url_index_private_data.cc View 1 16 chunks +117 lines, -17 lines 0 comments Download

Messages

Total messages: 4 (0 generated)
mrossetti
pkasting, please give the basic logic of collecting word starts and the unit tests a ...
8 years, 9 months ago (2012-03-08 21:26:40 UTC) #1
Peter Kasting
LGTM https://chromiumcodereview.appspot.com/9655003/diff/1/chrome/browser/history/in_memory_url_index_types.cc File chrome/browser/history/in_memory_url_index_types.cc (right): https://chromiumcodereview.appspot.com/9655003/diff/1/chrome/browser/history/in_memory_url_index_types.cc#newcode136 chrome/browser/history/in_memory_url_index_types.cc:136: string16 original_word(iter.GetString()); Nit: Shorter: if (trimmed & TRIM_LEADING) ...
8 years, 9 months ago (2012-03-09 02:37:43 UTC) #2
sky
Rubber stamp LGTM from me
8 years, 9 months ago (2012-03-09 16:56:07 UTC) #3
mrossetti
8 years, 9 months ago (2012-03-14 23:23:49 UTC) #4
Cleaning out draft comments.

http://codereview.chromium.org/9655003/diff/1/chrome/browser/history/in_memor...
File chrome/browser/history/in_memory_url_index_types.cc (right):

http://codereview.chromium.org/9655003/diff/1/chrome/browser/history/in_memor...
chrome/browser/history/in_memory_url_index_types.cc:136: string16
original_word(iter.GetString());
Done! That's a great approach.

On 2012/03/09 02:37:43, Peter Kasting wrote:
> Nit: Shorter:
> 
>   if (trimmed & TRIM_LEADING)
>     word_start += iter.GetString().find(word);
> 
> I'm a little concerned about the time cost of the find() call.  I wonder if it
> doesn't make sense to do something more akin to:
> 
>   string16 word(iter.GetString());
>   size_t initial_whitespace = 0;
>   if (break_on_space) {
>     string16 trimmed_word;
>     TrimWhitespace(word, TRIM_LEADING, &trimmed_word);
>     initial_whitespace = word.length() - trimmed_word.length();
>     TrimWhitespace(trimmed_word, TRIM_TRAILING, &word);
>   }
> 
> I don't know whether this makes much practical difference.

http://codereview.chromium.org/9655003/diff/1/chrome/browser/history/in_memor...
File chrome/browser/history/in_memory_url_index_types.h (right):

http://codereview.chromium.org/9655003/diff/1/chrome/browser/history/in_memor...
chrome/browser/history/in_memory_url_index_types.h:91: // is not NULL then
clears and pushes the word starts onto |word_starts|.
On 2012/03/09 02:37:43, Peter Kasting wrote:
> Nit: You might want to change "word starts" to something more descriptive,
e.g.
> "offsets within |uni_string| at which each word starts".  Also, if that's a
> correct description, then you should be using size_t for these instead of int.

Comment updated to be more clear. Also changed to size_t. I'm also introducing a
type (WordStarts) so that we will only have to change this in one place if a
change is called for in the future.

http://codereview.chromium.org/9655003/diff/1/chrome/browser/history/in_memor...
File chrome/browser/history/in_memory_url_index_types_unittest.cc (right):

http://codereview.chromium.org/9655003/diff/1/chrome/browser/history/in_memor...
chrome/browser/history/in_memory_url_index_types_unittest.cc:18: const int*
expected, size_t expected_size, const T& actual) {
On 2012/03/09 02:37:43, Peter Kasting wrote:
> Nit: One arg per line if they don't all fit on the first line

Done.

http://codereview.chromium.org/9655003/diff/1/chrome/browser/history/in_memor...
chrome/browser/history/in_memory_url_index_types_unittest.cc:46: int
expected_starts_a[] = {0, 7, 11, 18, 23, 31, 35};
On 2012/03/09 02:37:43, Peter Kasting wrote:
> Nit: Use size_t in all these too

Done.

http://codereview.chromium.org/9655003/diff/1/chrome/browser/history/url_inde...
File chrome/browser/history/url_index_private_data.cc (right):

http://codereview.chromium.org/9655003/diff/1/chrome/browser/history/url_inde...
chrome/browser/history/url_index_private_data.cc:1048: // allow testing of cache
file upgrading in ReadFromFile().
On 2012/03/09 02:37:43, Peter Kasting wrote:
> Nit: Is this the best way of doing this?  For most other versioned data
storage
> migration tests, we keep a copy of an old version in chrome/test/data and then
> write code that directly opens and migrates it to check that it does what we
> want.  This avoids being as intrusive to the real code that saves these.

I hear you, and understand. Since saving using protobuf is going to be replaced
in the near future (for some definition of 'near') and because this is pretty
complex data I'm going to leave it as-is for now and see how difficult it would
be to take the proper approach once this caching has moved over to SQLite. I've
added a TODO to remind me of this.

Powered by Google App Engine
This is Rietveld 408576698