|
|
Created:
7 years, 3 months ago by clamy Modified:
7 years, 3 months ago CC:
chromium-reviews, cbentzel+watch_chromium.org, gavinp+disk_chromium.org Base URL:
https://chromium.googlesource.com/chromium/src.git@master Visibility:
Public. |
DescriptionSimpleCache: merge the first and second stream in one file
This CL merges the two first streams in a single file in order to reduce the
disk space consumption of the Simple Cache.
BUG=173398
Committed: https://src.chromium.org/viewvc/chrome?view=rev&revision=223989
Patch Set 1 : #
Total comments: 27
Patch Set 2 : Switched to GrowableIOBuffer #
Total comments: 31
Patch Set 3 : Stream 0 size in footer #
Total comments: 32
Patch Set 4 : Removed functions from simple_util #
Total comments: 52
Patch Set 5 : Changes in the crc32 handling #
Total comments: 16
Patch Set 6 : Addressed Egor's comments #Patch Set 7 : Addressed gavin's comments #
Messages
Total messages: 35 (0 generated)
gavinp, pasko: PTAL This is a first version of the stream merge. Concerning tests, there are no new unit tests yet. Some of the current tests will also need to be modified: many only use stream 0. Since the implementation of each stream is now different, I believe they should be modified to run on each stream. However, I think this belongs to another CL. Wdyt? I also needed to modify a few tests that imply a corruption on disk to use stream 1, stream 0 being unusable unless the entry is closed because the data is kept in memory.
Sorry, I am reading this slowly. As usual, I find it difficult to jump across multiple functions to follow codepaths through numerous thread hops and state changes. But, sadly, now I don't have any suggestion on how to make it significantly easier to read. The code overall looks reasonable. Thanks! One high-level note I have is to try to reuse the same IOBuffer at all times in SimpleEntryImpl. From a bird's eye view it looks like a cleaner strategy. Without that the transfer of ownership, reallocation, copying manually only adds extra mental complexity for a benefit that I did not discover yet. See smaller notes below, did not look through everything I wanted yet, will have another pass tomorrow. https://codereview.chromium.org/23983005/diff/7001/net/disk_cache/simple/simp... File net/disk_cache/simple/simple_entry_format.h (right): https://codereview.chromium.org/23983005/diff/7001/net/disk_cache/simple/simp... net/disk_cache/simple/simple_entry_format.h:31: // - a SimpleFileHeader. Please clarify in the description whether the size for stream2 is stored in SimpleFileHeader.stream_0_size or SimpleFileHeader.stream_1_size, and what the other value should be. https://codereview.chromium.org/23983005/diff/7001/net/disk_cache/simple/simp... File net/disk_cache/simple/simple_entry_impl.cc (right): https://codereview.chromium.org/23983005/diff/7001/net/disk_cache/simple/simp... net/disk_cache/simple/simple_entry_impl.cc:409: return WriteStream0Data(buf, offset, buf_len, truncate); This is an optimization over copying the contents of the buffer twice in the optimistic write. That is neat. However, reading this and verifying correctness is not trivial. I would recommend disabling optimistic operations on stream 0 below, this would merge this codepath with the main one at the cost of an extra push/pop at entry_operations_. I think this cost is small. I would even argue that just dropping these lines and going with double copying is nicer because it's simpler. https://codereview.chromium.org/23983005/diff/7001/net/disk_cache/simple/simp... net/disk_cache/simple/simple_entry_impl.cc:786: // Stream 0 data is kept in memory, so can be read immediately. s/so can be// we do not have an option to post a task to worker pool to perform the operation, so the comment should not read as if this is an optimization. https://codereview.chromium.org/23983005/diff/7001/net/disk_cache/simple/simp... net/disk_cache/simple/simple_entry_impl.cc:869: DCHECK_EQ(STATE_READY, state_); this DCHECK belongs better to a place _before_ we do anything with data https://codereview.chromium.org/23983005/diff/7001/net/disk_cache/simple/simp... net/disk_cache/simple/simple_entry_impl.cc:1076: stream_index, completion_callback, *entry_stat, result.Pass()); probably .Pass() would be more stylish https://codereview.chromium.org/23983005/diff/7001/net/disk_cache/simple/simp... net/disk_cache/simple/simple_entry_impl.cc:1262: // note writes to stream 0 that violate those assumptions. nit: not sure: s/note/support/ ? https://codereview.chromium.org/23983005/diff/7001/net/disk_cache/simple/simp... net/disk_cache/simple/simple_entry_impl.cc:1266: stream_0_data_ = new IOBuffer(buf_len); please use a GrowableIOBuffer, it might be less optimal in some cases, but certainly shorter to write and easier to read. https://codereview.chromium.org/23983005/diff/7001/net/disk_cache/simple/simp... File net/disk_cache/simple/simple_entry_impl.h (right): https://codereview.chromium.org/23983005/diff/7001/net/disk_cache/simple/simp... net/disk_cache/simple/simple_entry_impl.h:243: int WriteStream0Data(net::IOBuffer* buf, I find this name confusing. I would prefer CopyStream0Data() to have an extra hint that we are not writing it to disk but rathey copy it and going to write it back on Close(). https://codereview.chromium.org/23983005/diff/7001/net/disk_cache/simple/simp... net/disk_cache/simple/simple_entry_impl.h:298: scoped_refptr<net::IOBuffer> stream_0_data_; We should have an explanation why stream 0 is special and what the strategy for it is and why it is like that. I think this would be the best place to explain it. https://codereview.chromium.org/23983005/diff/7001/net/disk_cache/simple/simp... File net/disk_cache/simple/simple_synchronous_entry.cc (right): https://codereview.chromium.org/23983005/diff/7001/net/disk_cache/simple/simp... net/disk_cache/simple/simple_synchronous_entry.cc:167: result(net::OK) { please initialize the new field here https://codereview.chromium.org/23983005/diff/7001/net/disk_cache/simple/simp... net/disk_cache/simple/simple_synchronous_entry.cc:712: header.key_hash = base::Hash(key_); why not entry_hash_? https://codereview.chromium.org/23983005/diff/7001/net/disk_cache/simple/simp... net/disk_cache/simple/simple_synchronous_entry.cc:720: DLOG(WARNING) << "Could not write headers to cache entry."; 'headers' may be misread as HTTP headers, better refer to something like 'cache file header' https://codereview.chromium.org/23983005/diff/7001/net/disk_cache/simple/simp... File net/disk_cache/simple/simple_synchronous_entry.h (right): https://codereview.chromium.org/23983005/diff/7001/net/disk_cache/simple/simp... net/disk_cache/simple/simple_synchronous_entry.h:158: // Allocates and fill a buffer with stream 0 data in |stream_0_data|, then s/fill/fills/ s/checks it/checks its/
Thanks Egor! It's true that the interactions between SimpleEntryImplementation and SimpleSynchronousEntry are not always easy to follow. https://chromiumcodereview.appspot.com/23983005/diff/7001/net/disk_cache/simp... File net/disk_cache/simple/simple_entry_format.h (right): https://chromiumcodereview.appspot.com/23983005/diff/7001/net/disk_cache/simp... net/disk_cache/simple/simple_entry_format.h:31: // - a SimpleFileHeader. On 2013/09/09 20:01:22, pasko wrote: > Please clarify in the description whether the size for stream2 is stored in > SimpleFileHeader.stream_0_size or SimpleFileHeader.stream_1_size, and what the > other value should be. Done. https://chromiumcodereview.appspot.com/23983005/diff/7001/net/disk_cache/simp... File net/disk_cache/simple/simple_entry_impl.cc (right): https://chromiumcodereview.appspot.com/23983005/diff/7001/net/disk_cache/simp... net/disk_cache/simple/simple_entry_impl.cc:409: return WriteStream0Data(buf, offset, buf_len, truncate); On 2013/09/09 20:01:22, pasko wrote: > This is an optimization over copying the contents of the buffer twice in the > optimistic write. That is neat. However, reading this and verifying correctness > is not trivial. > > I would recommend disabling optimistic operations on stream 0 below, this would > merge this codepath with the main one at the cost of an extra push/pop at > entry_operations_. I think this cost is small. > > I would even argue that just dropping these lines and going with double copying > is nicer because it's simpler. Done. https://chromiumcodereview.appspot.com/23983005/diff/7001/net/disk_cache/simp... net/disk_cache/simple/simple_entry_impl.cc:786: // Stream 0 data is kept in memory, so can be read immediately. On 2013/09/09 20:01:22, pasko wrote: > s/so can be// > > we do not have an option to post a task to worker pool to perform the operation, > so the comment should not read as if this is an optimization. Done. https://chromiumcodereview.appspot.com/23983005/diff/7001/net/disk_cache/simp... net/disk_cache/simple/simple_entry_impl.cc:869: DCHECK_EQ(STATE_READY, state_); On 2013/09/09 20:01:22, pasko wrote: > this DCHECK belongs better to a place _before_ we do anything with data Done. https://chromiumcodereview.appspot.com/23983005/diff/7001/net/disk_cache/simp... net/disk_cache/simple/simple_entry_impl.cc:1076: stream_index, completion_callback, *entry_stat, result.Pass()); On 2013/09/09 20:01:22, pasko wrote: > probably .Pass() would be more stylish EntryOperationComplete expects a const SimpleEntryStat&. https://chromiumcodereview.appspot.com/23983005/diff/7001/net/disk_cache/simp... net/disk_cache/simple/simple_entry_impl.cc:1262: // note writes to stream 0 that violate those assumptions. On 2013/09/09 20:01:22, pasko wrote: > nit: not sure: s/note/support/ ? Done. https://chromiumcodereview.appspot.com/23983005/diff/7001/net/disk_cache/simp... net/disk_cache/simple/simple_entry_impl.cc:1266: stream_0_data_ = new IOBuffer(buf_len); On 2013/09/09 20:01:22, pasko wrote: > please use a GrowableIOBuffer, it might be less optimal in some cases, but > certainly shorter to write and easier to read. Done. https://chromiumcodereview.appspot.com/23983005/diff/7001/net/disk_cache/simp... File net/disk_cache/simple/simple_entry_impl.h (right): https://chromiumcodereview.appspot.com/23983005/diff/7001/net/disk_cache/simp... net/disk_cache/simple/simple_entry_impl.h:243: int WriteStream0Data(net::IOBuffer* buf, On 2013/09/09 20:01:22, pasko wrote: > I find this name confusing. I would prefer CopyStream0Data() to have an extra > hint that we are not writing it to disk but rathey copy it and going to write it > back on Close(). Done. https://chromiumcodereview.appspot.com/23983005/diff/7001/net/disk_cache/simp... net/disk_cache/simple/simple_entry_impl.h:298: scoped_refptr<net::IOBuffer> stream_0_data_; On 2013/09/09 20:01:22, pasko wrote: > We should have an explanation why stream 0 is special and what the strategy for > it is and why it is like that. I think this would be the best place to explain > it. Done. https://chromiumcodereview.appspot.com/23983005/diff/7001/net/disk_cache/simp... File net/disk_cache/simple/simple_synchronous_entry.cc (right): https://chromiumcodereview.appspot.com/23983005/diff/7001/net/disk_cache/simp... net/disk_cache/simple/simple_synchronous_entry.cc:167: result(net::OK) { On 2013/09/09 20:01:22, pasko wrote: > please initialize the new field here I thought we were supposed to omit default constructors in list initialization. The default constructor initializes the pointer to NULL, which is what I want. https://chromiumcodereview.appspot.com/23983005/diff/7001/net/disk_cache/simp... net/disk_cache/simple/simple_synchronous_entry.cc:712: header.key_hash = base::Hash(key_); On 2013/09/09 20:01:22, pasko wrote: > why not entry_hash_? Strangely, key_hash is defined as an int32 in SimpleFileHeader, but entry_hash_ is an int64. Using entry_hash_ causes all entries to fail because of a hash mismatch, while using base::Hash works. I suspect some conversion error of some sort. Wdt? https://chromiumcodereview.appspot.com/23983005/diff/7001/net/disk_cache/simp... net/disk_cache/simple/simple_synchronous_entry.cc:720: DLOG(WARNING) << "Could not write headers to cache entry."; On 2013/09/09 20:01:22, pasko wrote: > 'headers' may be misread as HTTP headers, better refer to something like 'cache > file header' Done. https://chromiumcodereview.appspot.com/23983005/diff/7001/net/disk_cache/simp... File net/disk_cache/simple/simple_synchronous_entry.h (right): https://chromiumcodereview.appspot.com/23983005/diff/7001/net/disk_cache/simp... net/disk_cache/simple/simple_synchronous_entry.h:158: // Allocates and fill a buffer with stream 0 data in |stream_0_data|, then On 2013/09/09 20:01:22, pasko wrote: > s/fill/fills/ > s/checks it/checks its/ Done.
Overall looks good. I like keeping the buffer in the simple_entry_impl, makes for nice zero latency IO. My only question is about the file format... https://codereview.chromium.org/23983005/diff/20001/net/disk_cache/simple/sim... File net/disk_cache/simple/simple_entry_format.h (right): https://codereview.chromium.org/23983005/diff/20001/net/disk_cache/simple/sim... net/disk_cache/simple/simple_entry_format.h:49: uint32 stream_0_size; I'm not sure that storing explicit sizes is great. Now, what do we do if the file size and the stream sizes are mutually inconsistent? We can avoid this by storing offsets. And the offset of stream 1 is pretty easy to store: it's 0. As well, storing the stream 0 size means we're mutating this header every time that stream 0 changes size. I propose: we store the stream 0 offset in a signed int. Negative means from EOF, positive means from beginning. The first time we write the header, we don't know stream 1 size, but we do know stream 0, so we write a negative offset. The first time we update the headers, we can write in a positive offset (we know the stream 1 size), and then we aren't rewriting the header ever. That's more complex though than just always storing the size of stream 1 here (might as well always make it negative, just so the upgrade is easier if we do the tricky thing). And it avoids all these weird inconsistencies from the explicit sizes. https://codereview.chromium.org/23983005/diff/20001/net/disk_cache/simple/sim... File net/disk_cache/simple/simple_entry_impl.cc (right): https://codereview.chromium.org/23983005/diff/20001/net/disk_cache/simple/sim... net/disk_cache/simple/simple_entry_impl.cc:1267: int SimpleEntryImpl::CopyStream0Data(net::IOBuffer* buf, Does stream 0 need to support anything than a truncating write of length 0 at offset 0, and a write of length X at offset 0 to a stream already of length 0? Other than for unit tests, I mean. https://codereview.chromium.org/23983005/diff/20001/net/disk_cache/simple/sim... net/disk_cache/simple/simple_entry_impl.cc:1289: const int fill_size = buffer_size - data_size_[0]; const int fill_size = (truncate || offset <= data_size_[0]) ? 0 : offset - data_size_[0]; Or is that too complicated? https://codereview.chromium.org/23983005/diff/20001/net/disk_cache/simple/sim... File net/disk_cache/simple/simple_entry_impl.h (right): https://codereview.chromium.org/23983005/diff/20001/net/disk_cache/simple/sim... net/disk_cache/simple/simple_entry_impl.h:314: // stream 1 on disk (to save diskspace). This strategy allows stream 1 to Stream 1 probably never needs to change size, though.
https://chromiumcodereview.appspot.com/23983005/diff/20001/net/disk_cache/sim... File net/disk_cache/simple/simple_entry_format.h (right): https://chromiumcodereview.appspot.com/23983005/diff/20001/net/disk_cache/sim... net/disk_cache/simple/simple_entry_format.h:49: uint32 stream_0_size; On 2013/09/10 22:20:49, gavinp wrote: > I'm not sure that storing explicit sizes is great. Now, what do we do if the > file size and the stream sizes are mutually inconsistent? > > We can avoid this by storing offsets. And the offset of stream 1 is pretty easy > to store: it's 0. > > As well, storing the stream 0 size means we're mutating this header every time > that stream 0 changes size. > > I propose: we store the stream 0 offset in a signed int. Negative means from > EOF, positive means from beginning. The first time we write the header, we don't > know stream 1 size, but we do know stream 0, so we write a negative offset. The > first time we update the headers, we can write in a positive offset (we know the > stream 1 size), and then we aren't rewriting the header ever. > > That's more complex though than just always storing the size of stream 1 here > (might as well always make it negative, just so the upgrade is easier if we do > the tricky thing). And it avoids all these weird inconsistencies from the > explicit sizes. I don't really like the idea of having an offset taken from different things based on its sign. I find it too complicated, not to mention that the first time we write the header we don't know the size of stream 0 (since it is in Create). The other time (on Close) we know the size of both streams, so which one we choose matters little. I would also prefer to store the size explicitly, rather than an offset, since we already store the key size. As to whether store stream 0 or stream 1 size, if we expect stream 1 size to vary less, it would be better to store stream 1. Then rewrite the header only if stream 1 size varies. https://chromiumcodereview.appspot.com/23983005/diff/20001/net/disk_cache/sim... File net/disk_cache/simple/simple_entry_impl.cc (right): https://chromiumcodereview.appspot.com/23983005/diff/20001/net/disk_cache/sim... net/disk_cache/simple/simple_entry_impl.cc:1267: int SimpleEntryImpl::CopyStream0Data(net::IOBuffer* buf, On 2013/09/10 22:20:49, gavinp wrote: > Does stream 0 need to support anything than a truncating write of length 0 at > offset 0, and a write of length X at offset 0 to a stream already of length 0? > > Other than for unit tests, I mean. Well on HTTP cache, according to UMA other cases never happened in two weeks. But I am not sure we can drop support of the other types of write (wouldn't we be "breaking the contract of the API"?). https://chromiumcodereview.appspot.com/23983005/diff/20001/net/disk_cache/sim... net/disk_cache/simple/simple_entry_impl.cc:1289: const int fill_size = buffer_size - data_size_[0]; On 2013/09/10 22:20:49, gavinp wrote: > const int fill_size = (truncate || offset <= data_size_[0]) ? 0 : offset - > data_size_[0]; > > Or is that too complicated? With a comment such as // The extension until offset needs to be zeroed it should be ok (otherwise I needed a drawing to see what this was about). And I think the formula should be: const int fill_size = offset <= data_size_[0] ? 0 : offset -data_size_[0]; https://chromiumcodereview.appspot.com/23983005/diff/20001/net/disk_cache/sim... File net/disk_cache/simple/simple_entry_impl.h (right): https://chromiumcodereview.appspot.com/23983005/diff/20001/net/disk_cache/sim... net/disk_cache/simple/simple_entry_impl.h:314: // stream 1 on disk (to save diskspace). This strategy allows stream 1 to On 2013/09/10 22:20:49, gavinp wrote: > Stream 1 probably never needs to change size, though. Probably not, but the unit tests test it.
https://chromiumcodereview.appspot.com/23983005/diff/20001/net/disk_cache/sim... File net/disk_cache/simple/simple_entry_format.h (right): https://chromiumcodereview.appspot.com/23983005/diff/20001/net/disk_cache/sim... net/disk_cache/simple/simple_entry_format.h:49: uint32 stream_0_size; On 2013/09/11 12:46:21, clamy wrote: > On 2013/09/10 22:20:49, gavinp wrote: > > I'm not sure that storing explicit sizes is great. Now, what do we do if the > > file size and the stream sizes are mutually inconsistent? > > > > We can avoid this by storing offsets. And the offset of stream 1 is pretty > easy > > to store: it's 0. > > > > As well, storing the stream 0 size means we're mutating this header every time > > that stream 0 changes size. > > > > I propose: we store the stream 0 offset in a signed int. Negative means from > > EOF, positive means from beginning. The first time we write the header, we > don't > > know stream 1 size, but we do know stream 0, so we write a negative offset. > The > > first time we update the headers, we can write in a positive offset (we know > the > > stream 1 size), and then we aren't rewriting the header ever. > > > > That's more complex though than just always storing the size of stream 1 here > > (might as well always make it negative, just so the upgrade is easier if we do > > the tricky thing). And it avoids all these weird inconsistencies from the > > explicit sizes. > > I don't really like the idea of having an offset taken from different things > based on its sign. I find it too complicated, not to mention that the first time > we write the header we don't know the size of stream 0 (since it is in Create). > The other time (on Close) we know the size of both streams, so which one we > choose matters little. I would also prefer to store the size explicitly, rather > than an offset, since we already store the key size. > > As to whether store stream 0 or stream 1 size, if we expect stream 1 size to > vary less, it would be better to store stream 1. Then rewrite the header only if > stream 1 size varies. If we write stream 1 size, we're rewriting the header every time we close an entry. The most common case for a cache entry is to never be used, so unfortunate to create extra IO for that case. The offset from different things based on sign probably is too complicated; sure. Could we store the size of stream 0 in the footer?
https://chromiumcodereview.appspot.com/23983005/diff/7001/net/disk_cache/simp... File net/disk_cache/simple/simple_entry_format.h (right): https://chromiumcodereview.appspot.com/23983005/diff/7001/net/disk_cache/simp... net/disk_cache/simple/simple_entry_format.h:47: uint32 stream_0_size; Why not just store the offset of the stream that comes second in the file (stream 0)? https://chromiumcodereview.appspot.com/23983005/diff/20001/net/disk_cache/sim... File net/disk_cache/simple/simple_entry_impl.cc (right): https://chromiumcodereview.appspot.com/23983005/diff/20001/net/disk_cache/sim... net/disk_cache/simple/simple_entry_impl.cc:857: // Postask prevents creating a loop when calling the callback directly. nit: PostTask
https://codereview.chromium.org/23983005/diff/20001/net/disk_cache/backend_un... File net/disk_cache/backend_unittest.cc (right): https://codereview.chromium.org/23983005/diff/20001/net/disk_cache/backend_un... net/disk_cache/backend_unittest.cc:3191: // Write an invalid header on stream 1. Let's mention stream 0 here as well to be less cryptic: // Write an invalid header for stream 0 and stream 1. https://codereview.chromium.org/23983005/diff/20001/net/disk_cache/simple/sim... File net/disk_cache/simple/simple_entry_impl.cc (right): https://codereview.chromium.org/23983005/diff/20001/net/disk_cache/simple/sim... net/disk_cache/simple/simple_entry_impl.cc:1257: RecordReadResult(cache_type_, READ_RESULT_SUCCESS); READ_RESULT_SYNC_READ_FAILURE https://codereview.chromium.org/23983005/diff/20001/net/disk_cache/simple/sim... net/disk_cache/simple/simple_entry_impl.cc:1271: // Currently, Simple Cache is only used for HTTP, which stores the headers in "only used for HTTP" will have to be changed soon. Better rephrase to say that stream 0 is used only for HTTP headers, all other clients are encouraged to use stream 1. https://codereview.chromium.org/23983005/diff/20001/net/disk_cache/simple/sim... net/disk_cache/simple/simple_entry_impl.cc:1277: have_written_[0] = true; I am somewhat annoyed to observe the zero index in data_size_[0] along this function, can we add a temporary and initialize it here: data_size = data_size_[0] ? https://codereview.chromium.org/23983005/diff/20001/net/disk_cache/simple/sim... File net/disk_cache/simple/simple_synchronous_entry.cc (right): https://codereview.chromium.org/23983005/diff/20001/net/disk_cache/simple/sim... net/disk_cache/simple/simple_synchronous_entry.cc:627: out_entry_stat->data_size[0] = header.stream_0_size; Keeping in mind what index is for what would be hard for an unfamiliar person. How about encapsulating all operations on data_size_ in SimpleEntryStat? I can see these operations regarding streams: GetDataSize(stream_index) GetFileName(entry_hash, stream_index) GetOffsetInFile(stream_index, offset) GetEOFOffsetInFile(stream_index) and we need some operations to work on files: iteration: 0..kSimpleEntryFileCount GetClusterLoss(file_index) ApplyDataSizesFromFileSize(file_index, file_size) and maybe something else I forgot :) Some functions may be used only once beyond tests, but the code responsible for file naming on the disk and indexing in the file array would be local in one place making it easier to follow than magic juggling with data_size[0] and data_size[1]. https://codereview.chromium.org/23983005/diff/20001/net/disk_cache/simple/sim... File net/disk_cache/simple/simple_util.cc (right): https://codereview.chromium.org/23983005/diff/20001/net/disk_cache/simple/sim... net/disk_cache/simple/simple_util.cc:86: return GetEntryHashKeyAsHexString(key) + base::StringPrintf("_%1d", index); it seems natural to replace "index" to "GetFileIndexFromStreamIndex(index)" in this line. Same with GetFilenameFromEntryHashAndIndex() https://codereview.chromium.org/23983005/diff/20001/net/disk_cache/simple/sim... net/disk_cache/simple/simple_util.cc:102: return (stream_index == 2) ? 1 : 0; I would prefer to leave stream 2 with the "_2" suffix in the file name. This backwards-compatibl-ish property helps in some situations like "a simple_cache_dump_cache" tool that understands both old and new formats while it seemingly does not have any downsides.. https://codereview.chromium.org/23983005/diff/20001/net/disk_cache/simple/sim... net/disk_cache/simple/simple_util.cc:105: int GetMaximumDataOffset(int file_index, const int data_size[]) { Naming is fun. This name suggests (at least to me) that the function should return the offset of the maximum byte of the data, while it returns the offset of first non-data byte (well, counting the 0th EOF record as "data"). So how do we call it? How about GetLastEOFRecordOffset()?
Thanks for the review guys! This new version has the size of stream 0 stored in the EOF record, so no extraneous write of the header. However, the open path is more complicated. https://chromiumcodereview.appspot.com/23983005/diff/20001/net/disk_cache/bac... File net/disk_cache/backend_unittest.cc (right): https://chromiumcodereview.appspot.com/23983005/diff/20001/net/disk_cache/bac... net/disk_cache/backend_unittest.cc:3191: // Write an invalid header on stream 1. On 2013/09/13 19:09:21, pasko wrote: > Let's mention stream 0 here as well to be less cryptic: > // Write an invalid header for stream 0 and stream 1. Done. https://chromiumcodereview.appspot.com/23983005/diff/20001/net/disk_cache/sim... File net/disk_cache/simple/simple_entry_format.h (right): https://chromiumcodereview.appspot.com/23983005/diff/20001/net/disk_cache/sim... net/disk_cache/simple/simple_entry_format.h:49: uint32 stream_0_size; On 2013/09/11 14:22:10, gavinp wrote: > On 2013/09/11 12:46:21, clamy wrote: > > On 2013/09/10 22:20:49, gavinp wrote: > > > I'm not sure that storing explicit sizes is great. Now, what do we do if the > > > file size and the stream sizes are mutually inconsistent? > > > > > > We can avoid this by storing offsets. And the offset of stream 1 is pretty > > easy > > > to store: it's 0. > > > > > > As well, storing the stream 0 size means we're mutating this header every > time > > > that stream 0 changes size. > > > > > > I propose: we store the stream 0 offset in a signed int. Negative means from > > > EOF, positive means from beginning. The first time we write the header, we > > don't > > > know stream 1 size, but we do know stream 0, so we write a negative offset. > > The > > > first time we update the headers, we can write in a positive offset (we know > > the > > > stream 1 size), and then we aren't rewriting the header ever. > > > > > > That's more complex though than just always storing the size of stream 1 > here > > > (might as well always make it negative, just so the upgrade is easier if we > do > > > the tricky thing). And it avoids all these weird inconsistencies from the > > > explicit sizes. > > > > I don't really like the idea of having an offset taken from different things > > based on its sign. I find it too complicated, not to mention that the first > time > > we write the header we don't know the size of stream 0 (since it is in > Create). > > The other time (on Close) we know the size of both streams, so which one we > > choose matters little. I would also prefer to store the size explicitly, > rather > > than an offset, since we already store the key size. > > > > As to whether store stream 0 or stream 1 size, if we expect stream 1 size to > > vary less, it would be better to store stream 1. Then rewrite the header only > if > > stream 1 size varies. > > If we write stream 1 size, we're rewriting the header every time we close an > entry. The most common case for a cache entry is to never be used, so > unfortunate to create extra IO for that case. > > The offset from different things based on sign probably is too complicated; > sure. > > Could we store the size of stream 0 in the footer? > Done. https://chromiumcodereview.appspot.com/23983005/diff/20001/net/disk_cache/sim... File net/disk_cache/simple/simple_entry_impl.cc (right): https://chromiumcodereview.appspot.com/23983005/diff/20001/net/disk_cache/sim... net/disk_cache/simple/simple_entry_impl.cc:857: // Postask prevents creating a loop when calling the callback directly. On 2013/09/12 22:05:56, ttuttle wrote: > nit: PostTask Done. https://chromiumcodereview.appspot.com/23983005/diff/20001/net/disk_cache/sim... net/disk_cache/simple/simple_entry_impl.cc:1257: RecordReadResult(cache_type_, READ_RESULT_SUCCESS); On 2013/09/13 19:09:21, pasko wrote: > READ_RESULT_SYNC_READ_FAILURE Done. https://chromiumcodereview.appspot.com/23983005/diff/20001/net/disk_cache/sim... net/disk_cache/simple/simple_entry_impl.cc:1271: // Currently, Simple Cache is only used for HTTP, which stores the headers in On 2013/09/13 19:09:21, pasko wrote: > "only used for HTTP" will have to be changed soon. Better rephrase to say that > stream 0 is used only for HTTP headers, all other clients are encouraged to use > stream 1. Done. https://chromiumcodereview.appspot.com/23983005/diff/20001/net/disk_cache/sim... net/disk_cache/simple/simple_entry_impl.cc:1277: have_written_[0] = true; On 2013/09/13 19:09:21, pasko wrote: > I am somewhat annoyed to observe the zero index in data_size_[0] along this > function, can we add a temporary and initialize it here: > data_size = data_size_[0] > ? Done. https://chromiumcodereview.appspot.com/23983005/diff/20001/net/disk_cache/sim... File net/disk_cache/simple/simple_synchronous_entry.cc (right): https://chromiumcodereview.appspot.com/23983005/diff/20001/net/disk_cache/sim... net/disk_cache/simple/simple_synchronous_entry.cc:627: out_entry_stat->data_size[0] = header.stream_0_size; On 2013/09/13 19:09:21, pasko wrote: > Keeping in mind what index is for what would be hard for an unfamiliar person. > > How about encapsulating all operations on data_size_ in SimpleEntryStat? > > I can see these operations regarding streams: > GetDataSize(stream_index) > GetFileName(entry_hash, stream_index) > GetOffsetInFile(stream_index, offset) > GetEOFOffsetInFile(stream_index) > > and we need some operations to work on files: > iteration: 0..kSimpleEntryFileCount > GetClusterLoss(file_index) > ApplyDataSizesFromFileSize(file_index, file_size) > > and maybe something else I forgot :) > > Some functions may be used only once beyond tests, but the code responsible for > file naming on the disk and indexing in the file array would be local in one > place making it easier to follow than magic juggling with data_size[0] and > data_size[1]. - I am not sure about the GetDataSize(stream_index)? Is it so much clearer than data_size[stream_index]? - GetFileName(entry_hash, stream_index) does not touch data_size, and would be a duplicate of functions in simple_util if they are split in two versions. If not, well it would be better in simple_util anyway. I implemented the two others. https://chromiumcodereview.appspot.com/23983005/diff/20001/net/disk_cache/sim... File net/disk_cache/simple/simple_util.cc (right): https://chromiumcodereview.appspot.com/23983005/diff/20001/net/disk_cache/sim... net/disk_cache/simple/simple_util.cc:86: return GetEntryHashKeyAsHexString(key) + base::StringPrintf("_%1d", index); On 2013/09/13 19:09:21, pasko wrote: > it seems natural to replace "index" to "GetFileIndexFromStreamIndex(index)" in > this line. > > Same with GetFilenameFromEntryHashAndIndex() When creating the platform files, we are iterating on file index directly (ie we only do 2 iterations instead of 3), while calling this function. So we cannot replace the GetFileIndexFromStreamIndex in the current situation without duplicating this function in a version that works with the stream index, and one that works with the file index. Is that desirable? https://chromiumcodereview.appspot.com/23983005/diff/20001/net/disk_cache/sim... net/disk_cache/simple/simple_util.cc:102: return (stream_index == 2) ? 1 : 0; On 2013/09/13 19:09:21, pasko wrote: > I would prefer to leave stream 2 with the "_2" suffix in the file name. This > backwards-compatibl-ish property helps in some situations like "a > simple_cache_dump_cache" tool that understands both old and new formats while it > seemingly does not have any downsides.. File index is being used as an offset in an array of 2 entries, therefore it has to be strictly smaller than two. We can imagine a function that would convert the file index in memory to a file name number on disk, but I think it is a bit too much for an hypothetic backward compatibility. https://chromiumcodereview.appspot.com/23983005/diff/20001/net/disk_cache/sim... net/disk_cache/simple/simple_util.cc:105: int GetMaximumDataOffset(int file_index, const int data_size[]) { On 2013/09/13 19:09:21, pasko wrote: > Naming is fun. This name suggests (at least to me) that the function should > return the offset of the maximum byte of the data, while it returns the offset > of first non-data byte (well, counting the 0th EOF record as "data"). > > So how do we call it? How about GetLastEOFRecordOffset()? Done.
https://chromiumcodereview.appspot.com/23983005/diff/37001/net/disk_cache/sim... File net/disk_cache/simple/simple_entry_impl.h (left): https://chromiumcodereview.appspot.com/23983005/diff/37001/net/disk_cache/sim... net/disk_cache/simple/simple_entry_impl.h:289: // If |have_written_[index]| is true, we have written to the stream |index|. This comment is no longer strictly true; we set have_written_[0] if we write to stream 1, since we will need to rewrite stream 1 also. https://chromiumcodereview.appspot.com/23983005/diff/37001/net/disk_cache/sim... File net/disk_cache/simple/simple_util.cc (right): https://chromiumcodereview.appspot.com/23983005/diff/37001/net/disk_cache/sim... net/disk_cache/simple/simple_util.cc:107: return data_size[0] + data_size[1] + sizeof(SimpleFileEOF); This looks like it's vulnerable to an integer overflow if someone manages to cache a 2+ GB resource in stream 1 -- the stream 0 offset could easily end up negative. (For that matter, caching something that is INT_MAX could still make the offset of the last SimpleFileHeader's worth of the resource negative.)
https://chromiumcodereview.appspot.com/23983005/diff/37001/net/disk_cache/sim... File net/disk_cache/simple/simple_util.cc (right): https://chromiumcodereview.appspot.com/23983005/diff/37001/net/disk_cache/sim... net/disk_cache/simple/simple_util.cc:107: return data_size[0] + data_size[1] + sizeof(SimpleFileEOF); On 2013/09/16 20:01:52, ttuttle wrote: > This looks like it's vulnerable to an integer overflow if someone manages to > cache a 2+ GB resource in stream 1 -- the stream 0 offset could easily end up > negative. (For that matter, caching something that is INT_MAX could still make > the offset of the last SimpleFileHeader's worth of the resource negative.) This should be fine. Currently Write operations will start failing after reaching 1/8 of the cache size, which itself is itself an int32, so this leaves a maximum of 256MiB of data per entry stream.
answering previous comments now, a more in-depth review to arrive later https://chromiumcodereview.appspot.com/23983005/diff/20001/net/disk_cache/sim... File net/disk_cache/simple/simple_util.cc (right): https://chromiumcodereview.appspot.com/23983005/diff/20001/net/disk_cache/sim... net/disk_cache/simple/simple_util.cc:86: return GetEntryHashKeyAsHexString(key) + base::StringPrintf("_%1d", index); On 2013/09/16 15:01:17, clamy wrote: > On 2013/09/13 19:09:21, pasko wrote: > > it seems natural to replace "index" to "GetFileIndexFromStreamIndex(index)" in > > this line. > > > > Same with GetFilenameFromEntryHashAndIndex() > > When creating the platform files, we are iterating on file index directly (ie we > only do 2 iterations instead of 3), while calling this function. So we cannot > replace the GetFileIndexFromStreamIndex in the current situation without > duplicating this function in a version that works with the stream index, and one > that works with the file index. Is that desirable? makes sense just discovered: this function is used only in tests, please move it to simple_test_util.cc also, since "index" is more ambiguous now than it used to be, please rename all arguments named "index" in simple_util to become "file_index" or "stream_index", there are just 2 remaining. https://chromiumcodereview.appspot.com/23983005/diff/20001/net/disk_cache/sim... net/disk_cache/simple/simple_util.cc:102: return (stream_index == 2) ? 1 : 0; On 2013/09/16 15:01:17, clamy wrote: > On 2013/09/13 19:09:21, pasko wrote: > > I would prefer to leave stream 2 with the "_2" suffix in the file name. This > > backwards-compatibl-ish property helps in some situations like "a > > simple_cache_dump_cache" tool that understands both old and new formats while > it > > seemingly does not have any downsides.. > > File index is being used as an offset in an array of 2 entries, therefore it has > to be strictly smaller than two. We can imagine a function that would convert > the file index in memory to a file name number on disk, but I think it is a bit > too much for an hypothetic backward compatibility. Agreed. https://chromiumcodereview.appspot.com/23983005/diff/37001/net/disk_cache/sim... File net/disk_cache/simple/simple_util.h (right): https://chromiumcodereview.appspot.com/23983005/diff/37001/net/disk_cache/sim... net/disk_cache/simple/simple_util.h:73: int stream_1_size); Originally I did not like this stream_1_size in the utility function because it is error-prone for callers to add this parameter, this could be avoided by encapsulating the stream <-> file mapping details in a compact class. Out of that came the suggestion to advance the SimpleEntryStat. Apparently something did not work out to the original goal, I guess the roadblock was the necessity to make the class available wider and replace the simple_util in parts? Well, by itself it is rather a positive move.
https://codereview.chromium.org/23983005/diff/37001/net/disk_cache/simple/sim... File net/disk_cache/simple/simple_synchronous_entry.cc (right): https://codereview.chromium.org/23983005/diff/37001/net/disk_cache/simple/sim... net/disk_cache/simple/simple_synchronous_entry.cc:409: if (!stream_0_data) { What's happening in this case?
https://codereview.chromium.org/23983005/diff/20001/net/disk_cache/simple/sim... File net/disk_cache/simple/simple_util.cc (right): https://codereview.chromium.org/23983005/diff/20001/net/disk_cache/simple/sim... net/disk_cache/simple/simple_util.cc:86: return GetEntryHashKeyAsHexString(key) + base::StringPrintf("_%1d", index); On 2013/09/17 11:29:01, pasko wrote: > On 2013/09/16 15:01:17, clamy wrote: > > On 2013/09/13 19:09:21, pasko wrote: > > > it seems natural to replace "index" to "GetFileIndexFromStreamIndex(index)" > in > > > this line. > > > > > > Same with GetFilenameFromEntryHashAndIndex() > > > > When creating the platform files, we are iterating on file index directly (ie > we > > only do 2 iterations instead of 3), while calling this function. So we cannot > > replace the GetFileIndexFromStreamIndex in the current situation without > > duplicating this function in a version that works with the stream index, and > one > > that works with the file index. Is that desirable? > > makes sense > > just discovered: this function is used only in tests, please move it to > simple_test_util.cc > > also, since "index" is more ambiguous now than it used to be, please rename all > arguments named "index" in simple_util to become "file_index" or "stream_index", > there are just 2 remaining. Done. https://codereview.chromium.org/23983005/diff/37001/net/disk_cache/simple/sim... File net/disk_cache/simple/simple_entry_impl.h (left): https://codereview.chromium.org/23983005/diff/37001/net/disk_cache/simple/sim... net/disk_cache/simple/simple_entry_impl.h:289: // If |have_written_[index]| is true, we have written to the stream |index|. On 2013/09/16 20:01:52, ttuttle wrote: > This comment is no longer strictly true; we set have_written_[0] if we write to > stream 1, since we will need to rewrite stream 1 also. Done. https://codereview.chromium.org/23983005/diff/37001/net/disk_cache/simple/sim... File net/disk_cache/simple/simple_synchronous_entry.cc (right): https://codereview.chromium.org/23983005/diff/37001/net/disk_cache/simple/sim... net/disk_cache/simple/simple_synchronous_entry.cc:409: if (!stream_0_data) { On 2013/09/17 13:27:11, gavinp wrote: > What's happening in this case? If we were supposed to have data and we don't, it's an error. We are not supposed to though, so I could replace it with a DCHECK. https://codereview.chromium.org/23983005/diff/37001/net/disk_cache/simple/sim... File net/disk_cache/simple/simple_util.h (right): https://codereview.chromium.org/23983005/diff/37001/net/disk_cache/simple/sim... net/disk_cache/simple/simple_util.h:73: int stream_1_size); On 2013/09/17 11:29:01, pasko wrote: > Originally I did not like this stream_1_size in the utility function because it > is error-prone for callers to add this parameter, this could be avoided by > encapsulating the stream <-> file mapping details in a compact class. Out of > that came the suggestion to advance the SimpleEntryStat. Apparently something > did not work out to the original goal, I guess the roadblock was the necessity > to make the class available wider and replace the simple_util in parts? Well, by > itself it is rather a positive move. Done.
more thoughts, looking further https://chromiumcodereview.appspot.com/23983005/diff/37001/net/disk_cache/sim... File net/disk_cache/simple/simple_entry_impl.cc (right): https://chromiumcodereview.appspot.com/23983005/diff/37001/net/disk_cache/sim... net/disk_cache/simple/simple_entry_impl.cc:807: buf_len = std::min(buf_len, GetDataSize(stream_index) - offset); we recompute the buf_len here, and in ReadStream0Data(), can we just instead assume in ReadStream0Data() that buf_len does not overflow? Saves one line and moves this line above a little. https://chromiumcodereview.appspot.com/23983005/diff/37001/net/disk_cache/sim... net/disk_cache/simple/simple_entry_impl.cc:872: // PostTask prevents creating a loop when calling the callback directly. nit: this comment is not necessary, we consistently do this sort of thing all the time. It is first commented out on PostClientCallback() which should be enough. https://chromiumcodereview.appspot.com/23983005/diff/37001/net/disk_cache/sim... net/disk_cache/simple/simple_entry_impl.cc:915: have_written_[0] = true; this means we would loose the CRC on stream 0 after reading and not changing the stream https://chromiumcodereview.appspot.com/23983005/diff/37001/net/disk_cache/sim... net/disk_cache/simple/simple_entry_impl.cc:976: crc_check_state_[0] = CRC_CHECK_DONE; this should also update the crc32s_end_offset_[0], otherwise we would loose the CRC in the EOF of stream 0. I would be extremely thankful if you wrote a test that: opens writes stream 0 and 1 sequentially closes opens appends stream 1 closes verifies that both streams still have CRC on disk https://chromiumcodereview.appspot.com/23983005/diff/37001/net/disk_cache/sim... net/disk_cache/simple/simple_entry_impl.cc:1286: // and size changes of the headers. Also, supports writes to stream 0 that nit: "the size and size changes"? I do not see the purpose of the sentence: "Also, supports writes to stream 0 that violate those assumptions." Probably you mean s/supports/support/? It would be more clear it it were written to say that generally the contract requires correctl behavior with other access patterns. Recently I've come to absolutely *love* the phrase "access patterns", so deep. https://chromiumcodereview.appspot.com/23983005/diff/37001/net/disk_cache/sim... net/disk_cache/simple/simple_entry_impl.cc:1292: int data_size = data_size_[0]; In this file we tried to remain consistent on always calling GetDataSize(x) when reading data_size_[x] for all x. It is not much more readable, just more consistent. Please maintain this consistency. https://chromiumcodereview.appspot.com/23983005/diff/37001/net/disk_cache/sim... File net/disk_cache/simple/simple_entry_impl.h (right): https://chromiumcodereview.appspot.com/23983005/diff/37001/net/disk_cache/sim... net/disk_cache/simple/simple_entry_impl.h:261: bool truncate); since this function does not change the state of |buf|, it should be a const reference. Also the comment is confusing, one might think that "the buffer kept in memory" is |buf|. I would rephrase it like this: "Copies data from |buf| to the internal in-memory buffer for stream 0." nit: it would be useful to add a clarification that |truncate| if set to true causes the target buffer to be truncated at |offset| before being written. https://chromiumcodereview.appspot.com/23983005/diff/37001/net/disk_cache/sim... net/disk_cache/simple/simple_entry_impl.h:323: // Stream 0 is kept in memory because it is stored in the same file than s/than/as/ s/(to save diskspace)// <-- since we also save on file the amount of open file descriptors, which is arguably more important https://chromiumcodereview.appspot.com/23983005/diff/37001/net/disk_cache/sim... File net/disk_cache/simple/simple_index_file.cc (right): https://chromiumcodereview.appspot.com/23983005/diff/37001/net/disk_cache/sim... net/disk_cache/simple/simple_index_file.cc:120: // Summing up the total size of the entry through all the *_[0-2] files this comment needs to be updated, there is no longer [0-2] https://chromiumcodereview.appspot.com/23983005/diff/37001/net/disk_cache/sim... net/disk_cache/simple/simple_index_file.cc:415: COMPILE_ASSERT(kSimpleEntryFileCount == 2, Please remove the assertion and the TODO. The assertion does not make sense (there is no "format" for file entries any more, we just go through all files that start with something looking like a key, with a length of filename equal to exact number). The TODO will never get done, obviously. Also I like how the word "Fix" is entertaining in the context.
https://codereview.chromium.org/23983005/diff/37001/net/disk_cache/simple/sim... File net/disk_cache/simple/simple_synchronous_entry.cc (right): https://codereview.chromium.org/23983005/diff/37001/net/disk_cache/simple/sim... net/disk_cache/simple/simple_synchronous_entry.cc:409: if (!stream_0_data) { On 2013/09/17 13:59:42, clamy wrote: > On 2013/09/17 13:27:11, gavinp wrote: > > What's happening in this case? > > If we were supposed to have data and we don't, it's an error. We are not > supposed to though, so I could replace it with a DCHECK. In practice we always have data in stream 0, but isn't it legal to have nothing?
https://codereview.chromium.org/23983005/diff/37001/net/disk_cache/simple/sim... File net/disk_cache/simple/simple_synchronous_entry.cc (right): https://codereview.chromium.org/23983005/diff/37001/net/disk_cache/simple/sim... net/disk_cache/simple/simple_synchronous_entry.cc:409: if (!stream_0_data) { On 2013/09/17 14:25:33, gavinp wrote: > On 2013/09/17 13:59:42, clamy wrote: > > On 2013/09/17 13:27:11, gavinp wrote: > > > What's happening in this case? > > > > If we were supposed to have data and we don't, it's an error. We are not > > supposed to though, so I could replace it with a DCHECK. > > In practice we always have data in stream 0, but isn't it legal to have nothing? If we get non 0 size in data_size[0], then we expect to have something in stream 0, but not actual data is provided. Wouldn't that be non legal ?
https://codereview.chromium.org/23983005/diff/37001/net/disk_cache/simple/sim... File net/disk_cache/simple/simple_synchronous_entry.cc (right): https://codereview.chromium.org/23983005/diff/37001/net/disk_cache/simple/sim... net/disk_cache/simple/simple_synchronous_entry.cc:409: if (!stream_0_data) { On 2013/09/17 14:29:48, clamy wrote: > On 2013/09/17 14:25:33, gavinp wrote: > > On 2013/09/17 13:59:42, clamy wrote: > > > On 2013/09/17 13:27:11, gavinp wrote: > > > > What's happening in this case? > > > > > > If we were supposed to have data and we don't, it's an error. We are not > > > supposed to though, so I could replace it with a DCHECK. > > > > In practice we always have data in stream 0, but isn't it legal to have > nothing? > > If we get non 0 size in data_size[0], then we expect to have something in stream > 0, but not actual data is provided. Wouldn't that be non legal ? Aha, yup. So you were right the first time, this should be a DCHECK. We shouldn't check for impossible conditions. https://codereview.chromium.org/23983005/diff/50001/net/disk_cache/simple/sim... File net/disk_cache/simple/simple_entry_format.h (right): https://codereview.chromium.org/23983005/diff/50001/net/disk_cache/simple/sim... net/disk_cache/simple/simple_entry_format.h:60: uint32 stream_size; This probably is adding another 32 bits of padding, but that's OK, since we memset this whole struct to 0 in the constructor for precisely this reason. https://codereview.chromium.org/23983005/diff/50001/net/disk_cache/simple/sim... File net/disk_cache/simple/simple_entry_impl.cc (right): https://codereview.chromium.org/23983005/diff/50001/net/disk_cache/simple/sim... net/disk_cache/simple/simple_entry_impl.cc:799: // Since stream 0 data is kept in memory, it is read immediately. Why not catch this case in ReadData, so we can return synchronously there? Saves a posttask for the callback which seems like a performance win. Same for WriteData? https://codereview.chromium.org/23983005/diff/50001/net/disk_cache/simple/sim... File net/disk_cache/simple/simple_entry_impl.h (right): https://codereview.chromium.org/23983005/diff/50001/net/disk_cache/simple/sim... net/disk_cache/simple/simple_entry_impl.h:325: // stream 1 on disk (to save diskspace). This strategy allows stream 1 to Nit: disk space is two words. https://codereview.chromium.org/23983005/diff/50001/net/disk_cache/simple/sim... File net/disk_cache/simple/simple_synchronous_entry.cc (right): https://codereview.chromium.org/23983005/diff/50001/net/disk_cache/simple/sim... net/disk_cache/simple/simple_synchronous_entry.cc:314: const int64 file_offset = out_entry_stat->GetOffsetInFile( That doesn't look like an output parameter! Perhaps rename it to "entry_stat" ? Is this necessary, though? Is it legal to call ReadData for stream 0? If we make that illegal, we can get away without this parameter being in/out.
I'd need to look more closely at synchronous entry yet, and tests https://chromiumcodereview.appspot.com/23983005/diff/50001/net/disk_cache/sim... File net/disk_cache/simple/simple_entry_impl.cc (right): https://chromiumcodereview.appspot.com/23983005/diff/50001/net/disk_cache/sim... net/disk_cache/simple/simple_entry_impl.cc:870: // Since stream 0 data is kept in memory, it will be written immediatly. add extra empty line above please, I just like when a block gets commented and preceded by an empty line. Moving DCHECK below the comment is also possible https://chromiumcodereview.appspot.com/23983005/diff/50001/net/disk_cache/sim... net/disk_cache/simple/simple_entry_impl.cc:1273: memcpy(buf->data(), stream_0_data_->data() + offset, read_size); Seems like a null deref with stream_0_data_ if a Read is issued immediately after an optimistic Create. Why should we be having the uninitialized state with stream_0_data_ at all? It would be safer to just always have a growable buffer in place. Destroyed when the last refcount goes away. https://chromiumcodereview.appspot.com/23983005/diff/50001/net/disk_cache/sim... net/disk_cache/simple/simple_entry_impl.cc:1280: int SimpleEntryImpl::CopyStream0Data(net::IOBuffer* buf, hm, I dunno, it seems it would be a tiny bit less confusing to me to have it named as SetStream0Data() or MutateStream0Data(). Copy is a bit generic and raises questions like "From where and to where?" and "At which circumstance? Is it on reading or writing?". I am picky on naming. Sorry. https://chromiumcodereview.appspot.com/23983005/diff/50001/net/disk_cache/sim... net/disk_cache/simple/simple_entry_impl.cc:1303: // If |stream_0_data_| was extended. the extension until offset need to be s/./,/ s/need/needs/ s/zeroed/zero-filled/ In other words: "If |stream_0_data_| was extended, the extension until offset needs to be zero-filled." https://chromiumcodereview.appspot.com/23983005/diff/50001/net/disk_cache/sim... net/disk_cache/simple/simple_entry_impl.cc:1308: if (buf) why this check? does it happen in practice? I don't see this guard at line 1296 on similar occasion. https://chromiumcodereview.appspot.com/23983005/diff/50001/net/disk_cache/sim... net/disk_cache/simple/simple_entry_impl.cc:1318: data_size_[0]); this needs updating crc32s_end_offset_, I think it makes sense to reuse the code that does it: AdvanceCrc(buffer, length, stream_index) // maybe offset in the buffer as well, but seems not needed now this should update crc32s_[stream_index] and crc32s_end_offset_[stream_index] https://chromiumcodereview.appspot.com/23983005/diff/50001/net/disk_cache/sim... File net/disk_cache/simple/simple_synchronous_entry.cc (right): https://chromiumcodereview.appspot.com/23983005/diff/50001/net/disk_cache/sim... net/disk_cache/simple/simple_synchronous_entry.cc:144: using simple_util::ConvertEntryHashKeyToHexString; this one is not used https://chromiumcodereview.appspot.com/23983005/diff/50001/net/disk_cache/sim... net/disk_cache/simple/simple_synchronous_entry.cc:151: SimpleEntryStat::SimpleEntryStat() {} we do not need this constructor https://chromiumcodereview.appspot.com/23983005/diff/50001/net/disk_cache/sim... net/disk_cache/simple/simple_synchronous_entry.cc:314: const int64 file_offset = out_entry_stat->GetOffsetInFile( On 2013/09/17 15:07:43, gavinp wrote: > That doesn't look like an output parameter! Perhaps rename it to "entry_stat" ? > > Is this necessary, though? Is it legal to call ReadData for stream 0? If we make > that illegal, we can get away without this parameter being in/out. seems like a cool simple strategy to assert against reading from stream 0 https://chromiumcodereview.appspot.com/23983005/diff/50001/net/disk_cache/sim... File net/disk_cache/simple/simple_synchronous_entry.h (right): https://chromiumcodereview.appspot.com/23983005/diff/50001/net/disk_cache/sim... net/disk_cache/simple/simple_synchronous_entry.h:30: struct SimpleEntryStat { I am sorry to bother you with dumb work, but now it's a class, with new naming of the fields and such https://chromiumcodereview.appspot.com/23983005/diff/50001/net/disk_cache/sim... File net/disk_cache/simple/simple_util.h (right): https://chromiumcodereview.appspot.com/23983005/diff/50001/net/disk_cache/sim... net/disk_cache/simple/simple_util.h:60: // Given the stream index, returns the number of the file the stream is stored nit: please add an empty line above
https://chromiumcodereview.appspot.com/23983005/diff/50001/net/disk_cache/sim... File net/disk_cache/simple/simple_entry_impl.cc (right): https://chromiumcodereview.appspot.com/23983005/diff/50001/net/disk_cache/sim... net/disk_cache/simple/simple_entry_impl.cc:1280: int SimpleEntryImpl::CopyStream0Data(net::IOBuffer* buf, On 2013/09/17 16:40:52, pasko wrote: > hm, I dunno, it seems it would be a tiny bit less confusing to me to have it > named as SetStream0Data() or MutateStream0Data(). Copy is a bit generic and > raises questions like "From where and to where?" and "At which circumstance? Is > it on reading or writing?". > > I > am > picky > on > naming. > Sorry. I'd be OK with these being "ReadDataFromBuffer" and "WriteDataToBuffer", and even better if the buffer was an argument, and they were in the anonymous namespace?
https://chromiumcodereview.appspot.com/23983005/diff/50001/net/disk_cache/sim... File net/disk_cache/simple/simple_entry_impl.cc (right): https://chromiumcodereview.appspot.com/23983005/diff/50001/net/disk_cache/sim... net/disk_cache/simple/simple_entry_impl.cc:1280: int SimpleEntryImpl::CopyStream0Data(net::IOBuffer* buf, On 2013/09/17 17:09:04, gavinp wrote: > On 2013/09/17 16:40:52, pasko wrote: > > hm, I dunno, it seems it would be a tiny bit less confusing to me to have it > > named as SetStream0Data() or MutateStream0Data(). Copy is a bit generic and > > raises questions like "From where and to where?" and "At which circumstance? > Is > > it on reading or writing?". > > > > I > > am > > picky > > on > > naming. > > Sorry. > > I'd be OK with these being "ReadDataFromBuffer" and "WriteDataToBuffer", and > even better if the buffer was an argument, and they were in the anonymous > namespace? I don't think it is a good idea since the method uses a few more object members: have_written_, cache_type_, data_size_, crc32s_, crc32s_end_offset_.
only minor proposals in the synchronous entry. Looks as it should be. almost done, I'll look at tests now, my current understanding is that we can add tests later as followup changes, so we are very near https://chromiumcodereview.appspot.com/23983005/diff/50001/net/disk_cache/sim... File net/disk_cache/simple/simple_synchronous_entry.cc (left): https://chromiumcodereview.appspot.com/23983005/diff/50001/net/disk_cache/sim... net/disk_cache/simple/simple_synchronous_entry.cc:347: out_entry_stat->data_size[index] = offset + buf_len; nit: is there a reason to move this line above the truncation? In case truncation fails it seems more natural to leave the data size unchanged. This would be on a Doomed entry, so not too interesting, but still raises unnecessary questions when reading. https://chromiumcodereview.appspot.com/23983005/diff/50001/net/disk_cache/sim... File net/disk_cache/simple/simple_synchronous_entry.cc (right): https://chromiumcodereview.appspot.com/23983005/diff/50001/net/disk_cache/sim... net/disk_cache/simple/simple_synchronous_entry.cc:314: const int64 file_offset = out_entry_stat->GetOffsetInFile( On 2013/09/17 16:40:52, pasko wrote: > On 2013/09/17 15:07:43, gavinp wrote: > > That doesn't look like an output parameter! Perhaps rename it to "entry_stat" > ? > > > > Is this necessary, though? Is it legal to call ReadData for stream 0? If we > make > > that illegal, we can get away without this parameter being in/out. > > seems like a cool simple strategy to assert against reading from stream 0 .. but on the other hand I also like the uniformity of using the SimpleEntryStat::GetOffsetInFile() here https://chromiumcodereview.appspot.com/23983005/diff/50001/net/disk_cache/sim... net/disk_cache/simple/simple_synchronous_entry.cc:338: int index = in_entry_op.index; please add a DCHECK_NE(0, index), similarly in SSE::ReadData https://chromiumcodereview.appspot.com/23983005/diff/50001/net/disk_cache/sim... net/disk_cache/simple/simple_synchronous_entry.cc:401: DLOG(INFO) << "Eof record had bad crc."; nit: s/Eof/EOF/, same in other places https://chromiumcodereview.appspot.com/23983005/diff/50001/net/disk_cache/sim... net/disk_cache/simple/simple_synchronous_entry.cc:454: CHECK(did_close_file); we don't want to crash the browser here on a file error, so DCHECK https://chromiumcodereview.appspot.com/23983005/diff/50001/net/disk_cache/sim... net/disk_cache/simple/simple_synchronous_entry.cc:576: // initialized correctly. Let's write a more detailed comment here. The 'data' is confusing. Here we can explain that sizes of the data streams are not known before the key length is not read, as well as size of the stream 0 in the final EOF record. The [i + 1] is a puzzle, explained in InitializeForOpen, but given the "elegance" of the trick, please explain it in the comment as well. Frankly, I think this should be refactored to avoid having tricks like these, but that's certainly for later CLs. https://chromiumcodereview.appspot.com/23983005/diff/50001/net/disk_cache/sim... net/disk_cache/simple/simple_synchronous_entry.cc:725: out_entry_stat->data_size[0] = stream_0_size; so the function ReadAndValidateStream0() also updates data_size[*] which is counter-intuitive, but I don't know what to suggest beyond a refactoring to initialize all stream sizes early. I believe this should be postponed for later. The quick hack to keep a file size in data_size has just backfired. https://chromiumcodereview.appspot.com/23983005/diff/50001/net/disk_cache/sim... net/disk_cache/simple/simple_synchronous_entry.cc:729: *stream_0_data = NULL; is there a reason in touching this here? the value would not be checked on the error path https://chromiumcodereview.appspot.com/23983005/diff/50001/net/disk_cache/sim... net/disk_cache/simple/simple_synchronous_entry.cc:769: return net::ERR_CACHE_CHECKSUM_READ_FAILURE; fun fact: cache checksum read failure can happen even if we did not have the checksum in the EOF record
https://chromiumcodereview.appspot.com/23983005/diff/50001/net/disk_cache/sim... File net/disk_cache/simple/simple_synchronous_entry.cc (right): https://chromiumcodereview.appspot.com/23983005/diff/50001/net/disk_cache/sim... net/disk_cache/simple/simple_synchronous_entry.cc:442: if (WritePlatformFile(files_[file_index], With your other CL for testing all streams (crrev.com/23702024 awesome change btw) I found an interesting case. If the file was previously larger, this write would not change the size, hence we'd get wrong stream sizes at the next open. this is fixable with: if (index != 1 && !TruncatePlatformFile(files_[file_index], eof_offset)) { // handle error } The "index != 1" is necessary because we write EOFs in the backwards order. This cute detail needs a comment.
https://codereview.chromium.org/23983005/diff/37001/net/disk_cache/simple/sim... File net/disk_cache/simple/simple_entry_impl.cc (right): https://codereview.chromium.org/23983005/diff/37001/net/disk_cache/simple/sim... net/disk_cache/simple/simple_entry_impl.cc:807: buf_len = std::min(buf_len, GetDataSize(stream_index) - offset); On 2013/09/17 14:17:36, pasko wrote: > we recompute the buf_len here, and in ReadStream0Data(), can we just instead > assume in ReadStream0Data() that buf_len does not overflow? Saves one line and > moves this line above a little. Done. https://codereview.chromium.org/23983005/diff/37001/net/disk_cache/simple/sim... net/disk_cache/simple/simple_entry_impl.cc:872: // PostTask prevents creating a loop when calling the callback directly. On 2013/09/17 14:17:36, pasko wrote: > nit: > > this comment is not necessary, we consistently do this sort of thing all the > time. It is first commented out on PostClientCallback() which should be enough. Done. https://codereview.chromium.org/23983005/diff/37001/net/disk_cache/simple/sim... net/disk_cache/simple/simple_entry_impl.cc:915: have_written_[0] = true; On 2013/09/17 14:17:36, pasko wrote: > this means we would loose the CRC on stream 0 after reading and not changing the > stream Done. https://codereview.chromium.org/23983005/diff/37001/net/disk_cache/simple/sim... net/disk_cache/simple/simple_entry_impl.cc:976: crc_check_state_[0] = CRC_CHECK_DONE; On 2013/09/17 14:17:36, pasko wrote: > this should also update the crc32s_end_offset_[0], otherwise we would loose the > CRC in the EOF of stream 0. > > I would be extremely thankful if you wrote a test that: > opens > writes stream 0 and 1 sequentially > closes > opens > appends stream 1 > closes > verifies that both streams still have CRC on disk Done. https://codereview.chromium.org/23983005/diff/37001/net/disk_cache/simple/sim... net/disk_cache/simple/simple_entry_impl.cc:1286: // and size changes of the headers. Also, supports writes to stream 0 that On 2013/09/17 14:17:36, pasko wrote: > nit: "the size and size changes"? > > I do not see the purpose of the sentence: "Also, supports writes to stream 0 > that violate those assumptions." Probably you mean s/supports/support/? It would > be more clear it it were written to say that generally the contract requires > correctl behavior with other access patterns. > > Recently I've come to absolutely *love* the phrase "access patterns", so deep. Done. https://codereview.chromium.org/23983005/diff/37001/net/disk_cache/simple/sim... net/disk_cache/simple/simple_entry_impl.cc:1292: int data_size = data_size_[0]; On 2013/09/17 14:17:36, pasko wrote: > In this file we tried to remain consistent on always calling GetDataSize(x) when > reading data_size_[x] for all x. It is not much more readable, just more > consistent. Please maintain this consistency. Done. https://codereview.chromium.org/23983005/diff/37001/net/disk_cache/simple/sim... File net/disk_cache/simple/simple_entry_impl.h (right): https://codereview.chromium.org/23983005/diff/37001/net/disk_cache/simple/sim... net/disk_cache/simple/simple_entry_impl.h:261: bool truncate); On 2013/09/17 14:17:36, pasko wrote: > since this function does not change the state of |buf|, it should be a const > reference. Also the comment is confusing, one might think that "the buffer kept > in memory" is |buf|. I would rephrase it like this: "Copies data from |buf| to > the internal in-memory buffer for stream 0." > > nit: it would be useful to add a clarification that |truncate| if set to true > causes the target buffer to be truncated at |offset| before being written. net::IOBuffer::data() is not const, and used in this function. Done for the comment. https://codereview.chromium.org/23983005/diff/37001/net/disk_cache/simple/sim... net/disk_cache/simple/simple_entry_impl.h:323: // Stream 0 is kept in memory because it is stored in the same file than On 2013/09/17 14:17:36, pasko wrote: > s/than/as/ > > s/(to save diskspace)// <-- since we also save on file the amount of open file > descriptors, which is arguably more important Done. https://codereview.chromium.org/23983005/diff/37001/net/disk_cache/simple/sim... File net/disk_cache/simple/simple_index_file.cc (right): https://codereview.chromium.org/23983005/diff/37001/net/disk_cache/simple/sim... net/disk_cache/simple/simple_index_file.cc:120: // Summing up the total size of the entry through all the *_[0-2] files On 2013/09/17 14:17:36, pasko wrote: > this comment needs to be updated, there is no longer [0-2] Done. https://codereview.chromium.org/23983005/diff/37001/net/disk_cache/simple/sim... net/disk_cache/simple/simple_index_file.cc:415: COMPILE_ASSERT(kSimpleEntryFileCount == 2, On 2013/09/17 14:17:36, pasko wrote: > Please remove the assertion and the TODO. The assertion does not make sense > (there is no "format" for file entries any more, we just go through all files > that start with something looking like a key, with a length of filename equal to > exact number). The TODO will never get done, obviously. Also I like how the word > "Fix" is entertaining in the context. Done. https://codereview.chromium.org/23983005/diff/37001/net/disk_cache/simple/sim... File net/disk_cache/simple/simple_synchronous_entry.cc (right): https://codereview.chromium.org/23983005/diff/37001/net/disk_cache/simple/sim... net/disk_cache/simple/simple_synchronous_entry.cc:409: if (!stream_0_data) { On 2013/09/17 15:07:43, gavinp wrote: > On 2013/09/17 14:29:48, clamy wrote: > > On 2013/09/17 14:25:33, gavinp wrote: > > > On 2013/09/17 13:59:42, clamy wrote: > > > > On 2013/09/17 13:27:11, gavinp wrote: > > > > > What's happening in this case? > > > > > > > > If we were supposed to have data and we don't, it's an error. We are not > > > > supposed to though, so I could replace it with a DCHECK. > > > > > > In practice we always have data in stream 0, but isn't it legal to have > > nothing? > > > > If we get non 0 size in data_size[0], then we expect to have something in > stream > > 0, but not actual data is provided. Wouldn't that be non legal ? > > Aha, yup. So you were right the first time, this should be a DCHECK. We > shouldn't check for impossible conditions. With the other modifications that went in, we should always have a stream_0_data buffer, so I added a DCHECK at the beginning of the function. https://codereview.chromium.org/23983005/diff/50001/net/disk_cache/simple/sim... File net/disk_cache/simple/simple_entry_impl.cc (right): https://codereview.chromium.org/23983005/diff/50001/net/disk_cache/simple/sim... net/disk_cache/simple/simple_entry_impl.cc:799: // Since stream 0 data is kept in memory, it is read immediately. On 2013/09/17 15:07:43, gavinp wrote: > Why not catch this case in ReadData, so we can return synchronously there? Saves > a posttask for the callback which seems like a performance win. > > Same for WriteData? Returning with net::OK in ReadData causes some unit tests to fail, since they expect net::ERR_IO_PENDING. Similarly, clients may not expect an immediate return. For WriteData, it was there initially and Egor asked me to take it out in order to make the code more readable. https://codereview.chromium.org/23983005/diff/50001/net/disk_cache/simple/sim... net/disk_cache/simple/simple_entry_impl.cc:870: // Since stream 0 data is kept in memory, it will be written immediatly. On 2013/09/17 16:40:52, pasko wrote: > add extra empty line above please, I just like when a block gets commented and > preceded by an empty line. Moving DCHECK below the comment is also possible Done. https://codereview.chromium.org/23983005/diff/50001/net/disk_cache/simple/sim... net/disk_cache/simple/simple_entry_impl.cc:1273: memcpy(buf->data(), stream_0_data_->data() + offset, read_size); On 2013/09/17 16:40:52, pasko wrote: > Seems like a null deref with stream_0_data_ if a Read is issued immediately > after an optimistic Create. Why should we be having the uninitialized state with > stream_0_data_ at all? It would be safer to just always have a growable buffer > in place. Destroyed when the last refcount goes away. This code is called from ReadInternal, which is called by the operation queue. The operation queue only calls the next operation when creates returns. But added the growable buffer nonetheless. https://codereview.chromium.org/23983005/diff/50001/net/disk_cache/simple/sim... net/disk_cache/simple/simple_entry_impl.cc:1280: int SimpleEntryImpl::CopyStream0Data(net::IOBuffer* buf, On 2013/09/18 08:14:10, pasko wrote: > On 2013/09/17 17:09:04, gavinp wrote: > > On 2013/09/17 16:40:52, pasko wrote: > > > hm, I dunno, it seems it would be a tiny bit less confusing to me to have it > > > named as SetStream0Data() or MutateStream0Data(). Copy is a bit generic and > > > raises questions like "From where and to where?" and "At which circumstance? > > Is > > > it on reading or writing?". > > > > > > I > > > am > > > picky > > > on > > > naming. > > > Sorry. > > > > I'd be OK with these being "ReadDataFromBuffer" and "WriteDataToBuffer", and > > even better if the buffer was an argument, and they were in the anonymous > > namespace? > > I don't think it is a good idea since the method uses a few more object members: > have_written_, cache_type_, data_size_, crc32s_, crc32s_end_offset_. I went with SetStream0Data, and as Egor pointed out, given the number of class members used, I can hardly put it in an anonymous namespace. https://codereview.chromium.org/23983005/diff/50001/net/disk_cache/simple/sim... net/disk_cache/simple/simple_entry_impl.cc:1303: // If |stream_0_data_| was extended. the extension until offset need to be On 2013/09/17 16:40:52, pasko wrote: > s/./,/ > s/need/needs/ > s/zeroed/zero-filled/ > > In other words: "If |stream_0_data_| was extended, the extension until offset > needs to be zero-filled." Done. https://codereview.chromium.org/23983005/diff/50001/net/disk_cache/simple/sim... net/disk_cache/simple/simple_entry_impl.cc:1308: if (buf) On 2013/09/17 16:40:52, pasko wrote: > why this check? does it happen in practice? I don't see this guard at line 1296 > on similar occasion. Some unit tests like 0-length truncating writes with a null buffer as a parameter. https://codereview.chromium.org/23983005/diff/50001/net/disk_cache/simple/sim... net/disk_cache/simple/simple_entry_impl.cc:1318: data_size_[0]); On 2013/09/17 16:40:52, pasko wrote: > this needs updating crc32s_end_offset_, I think it makes sense to reuse the code > that does it: > AdvanceCrc(buffer, length, stream_index) // maybe offset in the buffer as well, > but seems not needed now > > this should update crc32s_[stream_index] and crc32s_end_offset_[stream_index] Done. https://codereview.chromium.org/23983005/diff/50001/net/disk_cache/simple/sim... File net/disk_cache/simple/simple_entry_impl.h (right): https://codereview.chromium.org/23983005/diff/50001/net/disk_cache/simple/sim... net/disk_cache/simple/simple_entry_impl.h:325: // stream 1 on disk (to save diskspace). This strategy allows stream 1 to On 2013/09/17 15:07:43, gavinp wrote: > Nit: disk space is two words. Done. https://codereview.chromium.org/23983005/diff/50001/net/disk_cache/simple/sim... File net/disk_cache/simple/simple_synchronous_entry.cc (left): https://codereview.chromium.org/23983005/diff/50001/net/disk_cache/simple/sim... net/disk_cache/simple/simple_synchronous_entry.cc:347: out_entry_stat->data_size[index] = offset + buf_len; On 2013/09/18 12:03:19, pasko wrote: > nit: > > is there a reason to move this line above the truncation? In case truncation > fails it seems more natural to leave the data size unchanged. This would be on a > Doomed entry, so not too interesting, but still raises unnecessary questions > when reading. Otherwise out_entry_stat will return the wrong offset for truncation. https://codereview.chromium.org/23983005/diff/50001/net/disk_cache/simple/sim... File net/disk_cache/simple/simple_synchronous_entry.cc (right): https://codereview.chromium.org/23983005/diff/50001/net/disk_cache/simple/sim... net/disk_cache/simple/simple_synchronous_entry.cc:144: using simple_util::ConvertEntryHashKeyToHexString; On 2013/09/17 16:40:52, pasko wrote: > this one is not used Done. https://codereview.chromium.org/23983005/diff/50001/net/disk_cache/simple/sim... net/disk_cache/simple/simple_synchronous_entry.cc:151: SimpleEntryStat::SimpleEntryStat() {} On 2013/09/17 16:40:52, pasko wrote: > we do not need this constructor Done. https://codereview.chromium.org/23983005/diff/50001/net/disk_cache/simple/sim... net/disk_cache/simple/simple_synchronous_entry.cc:314: const int64 file_offset = out_entry_stat->GetOffsetInFile( On 2013/09/17 15:07:43, gavinp wrote: > That doesn't look like an output parameter! Perhaps rename it to "entry_stat" ? > > Is this necessary, though? Is it legal to call ReadData for stream 0? If we make > that illegal, we can get away without this parameter being in/out. All offset functions are now used part of EntryStat, so we need one to compute the offset. I don't really want to reintroduce them. I could create a fake entry stat with 0 in data_size, since it is not called on stream 0, wdyt? https://codereview.chromium.org/23983005/diff/50001/net/disk_cache/simple/sim... net/disk_cache/simple/simple_synchronous_entry.cc:338: int index = in_entry_op.index; On 2013/09/18 12:03:19, pasko wrote: > please add a DCHECK_NE(0, index), similarly in SSE::ReadData Done. https://codereview.chromium.org/23983005/diff/50001/net/disk_cache/simple/sim... net/disk_cache/simple/simple_synchronous_entry.cc:401: DLOG(INFO) << "Eof record had bad crc."; On 2013/09/18 12:03:19, pasko wrote: > nit: s/Eof/EOF/, same in other places Done. https://codereview.chromium.org/23983005/diff/50001/net/disk_cache/simple/sim... net/disk_cache/simple/simple_synchronous_entry.cc:442: if (WritePlatformFile(files_[file_index], On 2013/09/18 15:08:16, pasko wrote: > With your other CL for testing all streams (crrev.com/23702024 awesome change > btw) I found an interesting case. If the file was previously larger, this write > would not change the size, hence we'd get wrong stream sizes at the next open. > > this is fixable with: > if (index != 1 && !TruncatePlatformFile(files_[file_index], eof_offset)) { > // handle error > } > > The "index != 1" is necessary because we write EOFs in the backwards order. This > cute detail needs a comment. Done. https://codereview.chromium.org/23983005/diff/50001/net/disk_cache/simple/sim... net/disk_cache/simple/simple_synchronous_entry.cc:454: CHECK(did_close_file); On 2013/09/18 12:03:19, pasko wrote: > we don't want to crash the browser here on a file error, so DCHECK Done. https://codereview.chromium.org/23983005/diff/50001/net/disk_cache/simple/sim... net/disk_cache/simple/simple_synchronous_entry.cc:576: // initialized correctly. On 2013/09/18 12:03:19, pasko wrote: > Let's write a more detailed comment here. The 'data' is confusing. Here we can > explain that sizes of the data streams are not known before the key length is > not read, as well as size of the stream 0 in the final EOF record. > > The [i + 1] is a puzzle, explained in InitializeForOpen, but given the > "elegance" of the trick, please explain it in the comment as well. > > Frankly, I think this should be refactored to avoid having tricks like these, > but that's certainly for later CLs. Done. https://codereview.chromium.org/23983005/diff/50001/net/disk_cache/simple/sim... net/disk_cache/simple/simple_synchronous_entry.cc:729: *stream_0_data = NULL; On 2013/09/18 12:03:19, pasko wrote: > is there a reason in touching this here? the value would not be checked on the > error path Done. https://codereview.chromium.org/23983005/diff/50001/net/disk_cache/simple/sim... File net/disk_cache/simple/simple_synchronous_entry.h (right): https://codereview.chromium.org/23983005/diff/50001/net/disk_cache/simple/sim... net/disk_cache/simple/simple_synchronous_entry.h:30: struct SimpleEntryStat { On 2013/09/17 16:40:52, pasko wrote: > I am sorry to bother you with dumb work, but now it's a class, with new naming > of the fields and such Done. https://codereview.chromium.org/23983005/diff/50001/net/disk_cache/simple/sim... File net/disk_cache/simple/simple_util.h (right): https://codereview.chromium.org/23983005/diff/50001/net/disk_cache/simple/sim... net/disk_cache/simple/simple_util.h:60: // Given the stream index, returns the number of the file the stream is stored On 2013/09/17 16:40:52, pasko wrote: > nit: please add an empty line above Done.
LGTM with comments addressed, congrats! please also wait for Gavin's approval. Gavin, please take a careful look again (asap). note: this change may need to be updated if the upgrade infrastructure lands first, I can help with that. https://codereview.chromium.org/23983005/diff/70001/net/disk_cache/entry_unit... File net/disk_cache/entry_unittest.cc (right): https://codereview.chromium.org/23983005/diff/70001/net/disk_cache/entry_unit... net/disk_cache/entry_unittest.cc:3501: // Write something in stream0. s/in/to/ https://codereview.chromium.org/23983005/diff/70001/net/disk_cache/entry_unit... net/disk_cache/entry_unittest.cc:3545: // Reduce stream1. Shrink? https://codereview.chromium.org/23983005/diff/70001/net/disk_cache/simple/sim... File net/disk_cache/simple/simple_entry_impl.cc (right): https://codereview.chromium.org/23983005/diff/70001/net/disk_cache/simple/sim... net/disk_cache/simple/simple_entry_impl.cc:903: // Writing on stream 1 affects the placement of stream 0 in the file. please add: ", the EOF record will need to be written. https://codereview.chromium.org/23983005/diff/70001/net/disk_cache/simple/sim... File net/disk_cache/simple/simple_entry_impl.h (right): https://codereview.chromium.org/23983005/diff/70001/net/disk_cache/simple/sim... net/disk_cache/simple/simple_entry_impl.h:36: struct SimpleEntryStat; the compiler on Windows is complaining that you should update it to 'class' here. https://codereview.chromium.org/23983005/diff/70001/net/disk_cache/simple/sim... net/disk_cache/simple/simple_entry_impl.h:267: void AdvanceCrc(net::IOBuffer* buffer, we are not updating the buffer here, should be a const reference then? https://codereview.chromium.org/23983005/diff/70001/net/disk_cache/simple/sim... File net/disk_cache/simple/simple_synchronous_entry.cc (right): https://codereview.chromium.org/23983005/diff/70001/net/disk_cache/simple/sim... net/disk_cache/simple/simple_synchronous_entry.cc:447: // resizing of the file is handled in SimpleSynchronousEntry::WriteData. Nice comment. WriteData() (with parentheses) looks more like a function, please add that. https://codereview.chromium.org/23983005/diff/70001/net/disk_cache/simple/sim... File net/disk_cache/simple/simple_synchronous_entry.h (right): https://codereview.chromium.org/23983005/diff/70001/net/disk_cache/simple/sim... net/disk_cache/simple/simple_synchronous_entry.h:30: class SimpleEntryStat { Now please add a one-sentence comment describing what the class does. https://codereview.chromium.org/23983005/diff/70001/net/disk_cache/simple/sim... net/disk_cache/simple/simple_synchronous_entry.h:50: int32 data_size(int i) const { return data_size_[i]; } I would prefer s/i/stream/ or s/i/stream_index/
bad news: DiskCacheEntryTest.SimpleCacheNonSequentialWrite fails when run for stream 0, please look at that
LGTM. Is the test CL up to date now? https://codereview.chromium.org/23983005/diff/50001/net/disk_cache/simple/sim... File net/disk_cache/simple/simple_entry_impl.cc (right): https://codereview.chromium.org/23983005/diff/50001/net/disk_cache/simple/sim... net/disk_cache/simple/simple_entry_impl.cc:799: // Since stream 0 data is kept in memory, it is read immediately. On 2013/09/18 16:17:15, clamy wrote: > On 2013/09/17 15:07:43, gavinp wrote: > > Why not catch this case in ReadData, so we can return synchronously there? > Saves > > a posttask for the callback which seems like a performance win. > > > > Same for WriteData? > > Returning with net::OK in ReadData causes some unit tests to fail, since they > expect net::ERR_IO_PENDING. Similarly, clients may not expect an immediate > return. > > For WriteData, it was there initially and Egor asked me to take it out in order > to make the code more readable. It makes me sad that we're slowing down our code just for the unit tests. But clients should all expect an immediate return is possible; after all, the in memory backend returns immediately on everything. Can we at least move this code to ReadData/WriteData and add a TODO to be synchronous? If you look again at the unit tests, you might see that they already special case the in memory backend, we could certainly use that for stream 0 in simple cache, too? https://codereview.chromium.org/23983005/diff/50001/net/disk_cache/simple/sim... File net/disk_cache/simple/simple_synchronous_entry.cc (right): https://codereview.chromium.org/23983005/diff/50001/net/disk_cache/simple/sim... net/disk_cache/simple/simple_synchronous_entry.cc:314: const int64 file_offset = out_entry_stat->GetOffsetInFile( On 2013/09/18 16:17:15, clamy wrote: > On 2013/09/17 15:07:43, gavinp wrote: > > That doesn't look like an output parameter! Perhaps rename it to "entry_stat" > ? > > > > Is this necessary, though? Is it legal to call ReadData for stream 0? If we > make > > that illegal, we can get away without this parameter being in/out. > > All offset functions are now used part of EntryStat, so we need one to compute > the offset. I don't really want to reintroduce them. I could create a fake entry > stat with 0 in data_size, since it is not called on stream 0, wdyt? Or you could just rename the parameter to remove the "out_" and call that good enough. Probably we should have the asserts though, just because it's too weird if anyone reads from stream 0.
Thanks Egor for the review! The NonSequentialWrite test is passing on all streams taken independently, so I suspect this is a problem with the patch to have it run on all three streams. https://codereview.chromium.org/23983005/diff/70001/net/disk_cache/entry_unit... File net/disk_cache/entry_unittest.cc (right): https://codereview.chromium.org/23983005/diff/70001/net/disk_cache/entry_unit... net/disk_cache/entry_unittest.cc:3501: // Write something in stream0. On 2013/09/18 16:53:56, pasko wrote: > s/in/to/ Done. https://codereview.chromium.org/23983005/diff/70001/net/disk_cache/entry_unit... net/disk_cache/entry_unittest.cc:3545: // Reduce stream1. On 2013/09/18 16:53:56, pasko wrote: > Shrink? Done. https://codereview.chromium.org/23983005/diff/70001/net/disk_cache/simple/sim... File net/disk_cache/simple/simple_entry_impl.cc (right): https://codereview.chromium.org/23983005/diff/70001/net/disk_cache/simple/sim... net/disk_cache/simple/simple_entry_impl.cc:903: // Writing on stream 1 affects the placement of stream 0 in the file. On 2013/09/18 16:53:56, pasko wrote: > please add: ", the EOF record will need to be written. Done. https://codereview.chromium.org/23983005/diff/70001/net/disk_cache/simple/sim... File net/disk_cache/simple/simple_entry_impl.h (right): https://codereview.chromium.org/23983005/diff/70001/net/disk_cache/simple/sim... net/disk_cache/simple/simple_entry_impl.h:36: struct SimpleEntryStat; On 2013/09/18 16:53:56, pasko wrote: > the compiler on Windows is complaining that you should update it to 'class' > here. Done. Actually most bots complain. I don't understand how it as even compiling on my machine. https://codereview.chromium.org/23983005/diff/70001/net/disk_cache/simple/sim... net/disk_cache/simple/simple_entry_impl.h:267: void AdvanceCrc(net::IOBuffer* buffer, On 2013/09/18 16:53:56, pasko wrote: > we are not updating the buffer here, should be a const reference then? No we use net::IOBuffer::data() which is not const. https://codereview.chromium.org/23983005/diff/70001/net/disk_cache/simple/sim... File net/disk_cache/simple/simple_synchronous_entry.cc (right): https://codereview.chromium.org/23983005/diff/70001/net/disk_cache/simple/sim... net/disk_cache/simple/simple_synchronous_entry.cc:447: // resizing of the file is handled in SimpleSynchronousEntry::WriteData. On 2013/09/18 16:53:56, pasko wrote: > Nice comment. WriteData() (with parentheses) looks more like a function, please > add that. Done. https://codereview.chromium.org/23983005/diff/70001/net/disk_cache/simple/sim... File net/disk_cache/simple/simple_synchronous_entry.h (right): https://codereview.chromium.org/23983005/diff/70001/net/disk_cache/simple/sim... net/disk_cache/simple/simple_synchronous_entry.h:30: class SimpleEntryStat { On 2013/09/18 16:53:56, pasko wrote: > Now please add a one-sentence comment describing what the class does. Done. https://codereview.chromium.org/23983005/diff/70001/net/disk_cache/simple/sim... net/disk_cache/simple/simple_synchronous_entry.h:50: int32 data_size(int i) const { return data_size_[i]; } On 2013/09/18 16:53:56, pasko wrote: > I would prefer s/i/stream/ or s/i/stream_index/ Done.
A note on my LGTM: you can do what you will with the synchronous read/write; but please at least TODO that, since it's an easy improvement.
https://codereview.chromium.org/23983005/diff/50001/net/disk_cache/simple/sim... File net/disk_cache/simple/simple_entry_impl.cc (right): https://codereview.chromium.org/23983005/diff/50001/net/disk_cache/simple/sim... net/disk_cache/simple/simple_entry_impl.cc:799: // Since stream 0 data is kept in memory, it is read immediately. On 2013/09/18 17:20:35, gavinp wrote: > On 2013/09/18 16:17:15, clamy wrote: > > On 2013/09/17 15:07:43, gavinp wrote: > > > Why not catch this case in ReadData, so we can return synchronously there? > > Saves > > > a posttask for the callback which seems like a performance win. > > > > > > Same for WriteData? > > > > Returning with net::OK in ReadData causes some unit tests to fail, since they > > expect net::ERR_IO_PENDING. Similarly, clients may not expect an immediate > > return. > > > > For WriteData, it was there initially and Egor asked me to take it out in > order > > to make the code more readable. > > It makes me sad that we're slowing down our code just for the unit tests. But > clients should all expect an immediate return is possible; after all, the in > memory backend returns immediately on everything. > > Can we at least move this code to ReadData/WriteData and add a TODO to be > synchronous? If you look again at the unit tests, you might see that they > already special case the in memory backend, we could certainly use that for > stream 0 in simple cache, too? I'm sorry, I just did not like to have an optimization in WriteData() when it was not strictly necessary. Now I think the optimization itself is not a complicated piece of code and can stay for WriteData(). With ReadData() I am less confident, we may investigate later what we can do for tests that are expecting ERR_IO_PENDING, and whether it is worth it. Put a TODO here?
Great thanks guys! https://codereview.chromium.org/23983005/diff/50001/net/disk_cache/simple/sim... File net/disk_cache/simple/simple_entry_impl.cc (right): https://codereview.chromium.org/23983005/diff/50001/net/disk_cache/simple/sim... net/disk_cache/simple/simple_entry_impl.cc:799: // Since stream 0 data is kept in memory, it is read immediately. On 2013/09/18 17:35:27, pasko wrote: > On 2013/09/18 17:20:35, gavinp wrote: > > On 2013/09/18 16:17:15, clamy wrote: > > > On 2013/09/17 15:07:43, gavinp wrote: > > > > Why not catch this case in ReadData, so we can return synchronously there? > > > Saves > > > > a posttask for the callback which seems like a performance win. > > > > > > > > Same for WriteData? > > > > > > Returning with net::OK in ReadData causes some unit tests to fail, since > they > > > expect net::ERR_IO_PENDING. Similarly, clients may not expect an immediate > > > return. > > > > > > For WriteData, it was there initially and Egor asked me to take it out in > > order > > > to make the code more readable. > > > > It makes me sad that we're slowing down our code just for the unit tests. But > > clients should all expect an immediate return is possible; after all, the in > > memory backend returns immediately on everything. > > > > Can we at least move this code to ReadData/WriteData and add a TODO to be > > synchronous? If you look again at the unit tests, you might see that they > > already special case the in memory backend, we could certainly use that for > > stream 0 in simple cache, too? > > I'm sorry, I just did not like to have an optimization in WriteData() when it > was not strictly necessary. Now I think the optimization itself is not a > complicated piece of code and can stay for WriteData(). > > With ReadData() I am less confident, we may investigate later what we can do for > tests that are expecting ERR_IO_PENDING, and whether it is worth it. > > Put a TODO here? Done. https://codereview.chromium.org/23983005/diff/50001/net/disk_cache/simple/sim... File net/disk_cache/simple/simple_synchronous_entry.cc (right): https://codereview.chromium.org/23983005/diff/50001/net/disk_cache/simple/sim... net/disk_cache/simple/simple_synchronous_entry.cc:314: const int64 file_offset = out_entry_stat->GetOffsetInFile( On 2013/09/18 17:20:35, gavinp wrote: > On 2013/09/18 16:17:15, clamy wrote: > > On 2013/09/17 15:07:43, gavinp wrote: > > > That doesn't look like an output parameter! Perhaps rename it to > "entry_stat" > > ? > > > > > > Is this necessary, though? Is it legal to call ReadData for stream 0? If we > > make > > > that illegal, we can get away without this parameter being in/out. > > > > All offset functions are now used part of EntryStat, so we need one to compute > > the offset. I don't really want to reintroduce them. I could create a fake > entry > > stat with 0 in data_size, since it is not called on stream 0, wdyt? > > Or you could just rename the parameter to remove the "out_" and call that good > enough. Probably we should have the asserts though, just because it's too weird > if anyone reads from stream 0. Done.
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/clamy@chromium.org/23983005/87001
Step "update" is always a major failure. Look at the try server FAQ for more details. http://build.chromium.org/p/tryserver.chromium/buildstatus?builder=android_cl...
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/clamy@chromium.org/23983005/87001
Message was sent while issue was closed.
Change committed as 223989 |