|
|
Created:
8 years, 4 months ago by scottmg Modified:
8 years, 4 months ago CC:
chromium-reviews, joi+watch-content_chromium.org, darin-cc_chromium.org, jam Base URL:
svn://svn.chromium.org/chrome/trunk/src Visibility:
Public. |
DescriptionOnly track http and https in duplicate resource handler metrics. Specifically, exclude data and blob urls because they can be generated by content. This avoids a slow leak with generated URLs.
R=gavinp@chromium.org,frankwang@google.com
BUG=114570
Committed: http://src.chromium.org/viewvc/chrome?view=rev&revision=151452
Patch Set 1 #Patch Set 2 : . #
Total comments: 2
Patch Set 3 : indent #Patch Set 4 : only handle http and https #Patch Set 5 : . #Patch Set 6 : . #
Total comments: 6
Patch Set 7 : early out of hash calculation #Patch Set 8 : rebase #Patch Set 9 : cosnt #
Messages
Total messages: 31 (0 generated)
The point of the original patch was to track all resources going into the disk cache. For the URLs you ignore, do those resources not get stored in the disk cache? http://chromiumcodereview.appspot.com/10824236/diff/2001/content/browser/rend... File content/browser/renderer_host/duplicate_content_resource_handler.cc (right): http://chromiumcodereview.appspot.com/10824236/diff/2001/content/browser/rend... content/browser/renderer_host/duplicate_content_resource_handler.cc:108: url_spec.length() + bytes_read_); nit: indentation
http://chromiumcodereview.appspot.com/10824236/diff/2001/content/browser/rend... File content/browser/renderer_host/duplicate_content_resource_handler.cc (right): http://chromiumcodereview.appspot.com/10824236/diff/2001/content/browser/rend... content/browser/renderer_host/duplicate_content_resource_handler.cc:95: if (StartsWithASCII(url_spec, "data:", false) || So since data URLs will usually have the same URL for the same content , the total cost from these should be about 8 bytes per unique data URL, right? For the blob: URLs, I wonder. Frank, I think none of these resources are being retrieved from the web. So if your goal is to track web retrievals, this exclusion makes sense. What I don't quite understand is how many of these there are in practice.
On 2012/08/09 17:28:04, gavinp wrote: > http://chromiumcodereview.appspot.com/10824236/diff/2001/content/browser/rend... > File content/browser/renderer_host/duplicate_content_resource_handler.cc > (right): > > http://chromiumcodereview.appspot.com/10824236/diff/2001/content/browser/rend... > content/browser/renderer_host/duplicate_content_resource_handler.cc:95: if > (StartsWithASCII(url_spec, "data:", false) || > So since data URLs will usually have the same URL for the same content , the > total cost from these should be about 8 bytes per unique data URL, right? Yeah, I'm ambivalent on data: urls. If a page generated a data: url from a canvas that was drawn on there could be a lot maybe? But as you say, it's only ~8-16 bytes/unique URLs in that case. > > For the blob: URLs, I wonder. > > Frank, I think none of these resources are being retrieved from the web. So if > your goal is to track web retrievals, this exclusion makes sense. What I don't > quite understand is how many of these there are in practice. I don't quite understand why blob urls need to have a guid in them, it appears to have something to do with origin security. If you would prefer to track the data in blob urls too, WebKit contains the required URL extraction code (to pull the host/port/raw data out) but it seemed non-trivial to use it here. So please feel free to prepare a more suitable patch in that case. :)
Actually, can you just narrow the included URLs to http and https?
On 2012/08/09 17:41:19, frankwang wrote: > Actually, can you just narrow the included URLs to http and https? lgtm if we just only track http and https.
LGTM if it tracks only http and https.
+sky for OWNERS
http://chromiumcodereview.appspot.com/10824236/diff/2002/content/browser/rend... File content/browser/renderer_host/duplicate_content_resource_handler.cc (right): http://chromiumcodereview.appspot.com/10824236/diff/2002/content/browser/rend... content/browser/renderer_host/duplicate_content_resource_handler.cc:107: MH_UINT32 hashed_with_url = PMurHash32_Result( nice cleanup.
I'm not familiar with this code to do a review. Maybe John can do it (I added him).
I'm sorry, I just noticed this late in the game: but we shouldn't calculate the Murmur3 for non http/https requests. We know this at the first time we receive data, right? http://codereview.chromium.org/10824236/diff/2002/content/browser/renderer_ho... File content/browser/renderer_host/duplicate_content_resource_handler.cc (right): http://codereview.chromium.org/10824236/diff/2002/content/browser/renderer_ho... content/browser/renderer_host/duplicate_content_resource_handler.cc:95: if (!request_->url().SchemeIs("http") && !request_->url().SchemeIs("https")) { I'm taking my LGTM back. This code calculates the hash on every resource, when it could save some time by just setting a bool early on.
http://codereview.chromium.org/10824236/diff/2002/content/browser/renderer_ho... File content/browser/renderer_host/duplicate_content_resource_handler.cc (right): http://codereview.chromium.org/10824236/diff/2002/content/browser/renderer_ho... content/browser/renderer_host/duplicate_content_resource_handler.cc:95: if (!request_->url().SchemeIs("http") && !request_->url().SchemeIs("https")) { On 2012/08/09 20:20:45, gavinp wrote: > I'm taking my LGTM back. This code calculates the hash on every resource, when > it could save some time by just setting a bool early on. Sorry, you lost me? There's a return here, so if the request is not http and not https we early out.
http://codereview.chromium.org/10824236/diff/2002/content/browser/renderer_ho... File content/browser/renderer_host/duplicate_content_resource_handler.cc (right): http://codereview.chromium.org/10824236/diff/2002/content/browser/renderer_ho... content/browser/renderer_host/duplicate_content_resource_handler.cc:75: read_buffer_->data(), bytes_read); The hash is calculated incrementally here. If the request isn't http or https, isn't before this is called the time to stop? http://codereview.chromium.org/10824236/diff/2002/content/browser/renderer_ho... content/browser/renderer_host/duplicate_content_resource_handler.cc:95: if (!request_->url().SchemeIs("http") && !request_->url().SchemeIs("https")) { See my comment on line 74, above. On 2012/08/09 20:24:56, scottmg wrote: > On 2012/08/09 20:20:45, gavinp wrote: > > I'm taking my LGTM back. This code calculates the hash on every resource, when > > it could save some time by just setting a bool early on. > > Sorry, you lost me? There's a return here, so if the request is not http and not > https we early out.
http://codereview.chromium.org/10824236/diff/2002/content/browser/renderer_ho... File content/browser/renderer_host/duplicate_content_resource_handler.cc (right): http://codereview.chromium.org/10824236/diff/2002/content/browser/renderer_ho... content/browser/renderer_host/duplicate_content_resource_handler.cc:75: read_buffer_->data(), bytes_read); On 2012/08/12 16:17:14, gavinp wrote: > The hash is calculated incrementally here. If the request isn't http or https, > isn't before this is called the time to stop? I missed this too and agree with Gavin. I also take back my L G T M. The URLRequest is passed in from the resource dispatcher. To save performance, you should have a check here to pass through to the next handler if it isn't http or https.
PTAL. I'm not sure that I like the addition of extra state, but I defer to my reviewers. :)
LGTM
jam for OWNERS
LGTM. Thanks for humouring us.
rubberstamp lgtm
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/scottmg@chromium.org/10824236/3003
Failed to apply patch for content/browser/renderer_host/duplicate_content_resource_handler.cc: While running patch -p1 --forward --force; patching file content/browser/renderer_host/duplicate_content_resource_handler.cc Hunk #4 FAILED at 99. 1 out of 4 hunks FAILED -- saving rejects to file content/browser/renderer_host/duplicate_content_resource_handler.cc.rej
On 2012/08/13 20:44:45, I haz the power (commit-bot) wrote: > Failed to apply patch for > content/browser/renderer_host/duplicate_content_resource_handler.cc: > While running patch -p1 --forward --force; > patching file > content/browser/renderer_host/duplicate_content_resource_handler.cc > Hunk #4 FAILED at 99. > 1 out of 4 hunks FAILED -- saving rejects to file > content/browser/renderer_host/duplicate_content_resource_handler.cc.rej I just committed a patch. You might need to rebase.
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/scottmg@chromium.org/10824236/2003
Try job failure for 10824236-2003 (retry) on android for steps "compile, build" (clobber build). It's a second try, previously, steps "compile, build" failed. http://build.chromium.org/p/tryserver.chromium/buildstatus?builder=android&nu...
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/scottmg@chromium.org/10824236/2004
Try job failure for 10824236-2004 (retry) on win_rel for step "runhooks". It's a second try, previously, step "interactive_ui_tests" failed. http://build.chromium.org/p/tryserver.chromium/buildstatus?builder=win_rel&nu...
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/scottmg@chromium.org/10824236/2004
Try job failure for 10824236-2004 (retry) (retry) on linux_chromeos for steps "browser_tests, content_browsertests". It's a second try, previously, steps "browser_tests, content_browsertests" failed. http://build.chromium.org/p/tryserver.chromium/buildstatus?builder=linux_chro...
On 2012/08/14 04:43:50, I haz the power (commit-bot) wrote: > Try job failure for 10824236-2004 (retry) (retry) on linux_chromeos for steps > "browser_tests, content_browsertests". > It's a second try, previously, steps "browser_tests, content_browsertests" > failed. > http://build.chromium.org/p/tryserver.chromium/buildstatus?builder=linux_chro... It just doesn't like it wants to be committed today.
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/scottmg@chromium.org/10824236/2004
On 2012/08/14 04:46:28, frankwang wrote: > On 2012/08/14 04:43:50, I haz the power (commit-bot) wrote: > > Try job failure for 10824236-2004 (retry) (retry) on linux_chromeos for steps > > "browser_tests, content_browsertests". > > It's a second try, previously, steps "browser_tests, content_browsertests" > > failed. > > > http://build.chromium.org/p/tryserver.chromium/buildstatus?builder=linux_chro... > It just doesn't like it wants to be committed today. It doesn't seem like it wants to be committed today. I tried it again to see if it will go through.
Change committed as 151452 |