|
|
Created:
8 years, 4 months ago by Nico Modified:
8 years, 4 months ago CC:
chromium-reviews, Scott Hess - ex-Googler, Roger Tawa OOO till Jul 10th Base URL:
svn://svn.chromium.org/chrome/trunk/src Visibility:
Public. |
Descriptionrlz/mac: Make sure a crashing process doesn't leave a stale lockfile behind.
Don't use NSDistributedLock, which uses a lock that isn't cleaned up on
unexpected program termination, and which strongly recommends to not
call -breakLock (which makes this class fairly pointless).
Instead, use a flock(), which is cleaned up by the OS on process exit.
Suggested by shess@.
BUG=141108
Committed: https://src.chromium.org/viewvc/chrome?view=rev&revision=151723
Patch Set 1 #Patch Set 2 : flock #Patch Set 3 : xxx #Patch Set 4 : actually #Patch Set 5 : . #
Total comments: 23
Patch Set 6 : . #
Total comments: 8
Patch Set 7 : more comments #
Total comments: 4
Patch Set 8 : . #
Total comments: 2
Messages
Total messages: 15 (0 generated)
On second thought, the "breaks test" comment I left on the bug is rubbish: The code _does_ discover that particular failure case faster, and the new test I'm adding does test the sleep behavior.
I'm sure the headlines will read "Chromium includes Flock". http://codereview.chromium.org/10823329/diff/9002/rlz/mac/lib/rlz_value_store... File rlz/mac/lib/rlz_value_store_mac.mm (right): http://codereview.chromium.org/10823329/diff/9002/rlz/mac/lib/rlz_value_store... rlz/mac/lib/rlz_value_store_mac.mm:238: int file_lock_; ScopedFD? http://codereview.chromium.org/10823329/diff/9002/rlz/mac/lib/rlz_value_store... rlz/mac/lib/rlz_value_store_mac.mm:272: file_lock_ = 0; 0 is a valid fd. Suggest using -1 for invalid, just to be safe. http://codereview.chromium.org/10823329/diff/9002/rlz/mac/lib/rlz_value_store... rlz/mac/lib/rlz_value_store_mac.mm:278: while ((flock_result = flock(file_lock_, LOCK_EX | LOCK_NB)) == -1 && errno == EWOULDBLOCK && Then you can lift the if() clause and drop the else clause because it's already handled outside the loop. http://codereview.chromium.org/10823329/diff/9002/rlz/mac/lib/rlz_value_store... rlz/mac/lib/rlz_value_store_mac.mm:282: elapsedMS += kSleepPerTryMS; usleep() could be interrupted by a signal, in which case elapsedMS would be bogus. Admittedly at 200ms per shot, not super bogus, but still. Can you use TimeTicks() or something or the sort to drive this? Though I guess it's just as good as before. Also, how much contention could there be? As implemented, someone could get denied the lock while someone else jumps in after and claims it. Though, again, just as good as before.
http://codereview.chromium.org/10823329/diff/9002/rlz/lib/rlz_lib_test.cc File rlz/lib/rlz_lib_test.cc (right): http://codereview.chromium.org/10823329/diff/9002/rlz/lib/rlz_lib_test.cc#new... rlz/lib/rlz_lib_test.cc:800: Blank line nit. http://codereview.chromium.org/10823329/diff/9002/rlz/lib/rlz_lib_test.cc#new... rlz/lib/rlz_lib_test.cc:848: // make the test fail. It does however cause lots of "check failed" You could make the test fail in the parent by signaling a bad condition with a special exit code, and checking the exit statuses down in your waitpid. http://codereview.chromium.org/10823329/diff/9002/rlz/mac/lib/rlz_value_store... File rlz/mac/lib/rlz_value_store_mac.mm (right): http://codereview.chromium.org/10823329/diff/9002/rlz/mac/lib/rlz_value_store... rlz/mac/lib/rlz_value_store_mac.mm:272: file_lock_ = 0; But 0 is a valid FD. You should rewrite things to use -1 as the invalid lock value (your initializer on lines 239-244, your CHECK on line 269, your reset on line 285, your test on line 297, etc.) http://codereview.chromium.org/10823329/diff/9002/rlz/mac/lib/rlz_value_store... rlz/mac/lib/rlz_value_store_mac.mm:277: int elapsedMS = 0; elapsed_ms http://codereview.chromium.org/10823329/diff/9002/rlz/mac/lib/rlz_value_store... rlz/mac/lib/rlz_value_store_mac.mm:278: while ((flock_result = flock(file_lock_, LOCK_EX | LOCK_NB)) == -1 && HANDLE_EINTR http://codereview.chromium.org/10823329/diff/9002/rlz/mac/lib/rlz_value_store... rlz/mac/lib/rlz_value_store_mac.mm:291: close(file_lock_); This duplicates what’s in the loop. I think you only need this copy.
Also added the required third argument to open() 9_9 http://codereview.chromium.org/10823329/diff/9002/rlz/lib/rlz_lib_test.cc File rlz/lib/rlz_lib_test.cc (right): http://codereview.chromium.org/10823329/diff/9002/rlz/lib/rlz_lib_test.cc#new... rlz/lib/rlz_lib_test.cc:848: // make the test fail. It does however cause lots of "check failed" On 2012/08/15 00:50:54, Mark Mentovai wrote: > You could make the test fail in the parent by signaling a bad condition with a > special exit code, and checking the exit statuses down in your waitpid. Done. http://codereview.chromium.org/10823329/diff/9002/rlz/mac/lib/rlz_value_store... File rlz/mac/lib/rlz_value_store_mac.mm (right): http://codereview.chromium.org/10823329/diff/9002/rlz/mac/lib/rlz_value_store... rlz/mac/lib/rlz_value_store_mac.mm:238: int file_lock_; On 2012/08/15 00:44:38, shess wrote: > ScopedFD? It needs to be closed while the mutex is still held. http://codereview.chromium.org/10823329/diff/9002/rlz/mac/lib/rlz_value_store... rlz/mac/lib/rlz_value_store_mac.mm:272: file_lock_ = 0; On 2012/08/15 00:50:54, Mark Mentovai wrote: > But 0 is a valid FD. You should rewrite things to use -1 as the invalid lock > value (your initializer on lines 239-244, your CHECK on line 269, your reset on > line 285, your test on line 297, etc.) Done. (But in practice 0 is stdin, so in practice this should've been fine as is, right?) http://codereview.chromium.org/10823329/diff/9002/rlz/mac/lib/rlz_value_store... rlz/mac/lib/rlz_value_store_mac.mm:277: int elapsedMS = 0; On 2012/08/15 00:50:54, Mark Mentovai wrote: > elapsed_ms Done. http://codereview.chromium.org/10823329/diff/9002/rlz/mac/lib/rlz_value_store... rlz/mac/lib/rlz_value_store_mac.mm:278: while ((flock_result = flock(file_lock_, LOCK_EX | LOCK_NB)) == -1 && On 2012/08/15 00:50:54, Mark Mentovai wrote: > HANDLE_EINTR Done. http://codereview.chromium.org/10823329/diff/9002/rlz/mac/lib/rlz_value_store... rlz/mac/lib/rlz_value_store_mac.mm:282: elapsedMS += kSleepPerTryMS; On 2012/08/15 00:44:38, shess wrote: > usleep() could be interrupted by a signal, in which case elapsedMS would be > bogus. Admittedly at 200ms per shot, not super bogus, but still. Can you use > TimeTicks() or something or the sort to drive this? Though I guess it's just as > good as before. Added HANDLE_EINTR here too. > Also, how much contention could there be? As implemented, someone could get > denied the lock while someone else jumps in after and claims it. Though, again, > just as good as before. In practice, chrome is the only app that uses rlz. (In theory, both chrome and canary could write to it, but canary is always unbranded and shouldn't use rlz), so I think that's fine as is. (Out of interest: What's a better way to do this, should there be contention?) http://codereview.chromium.org/10823329/diff/9002/rlz/mac/lib/rlz_value_store... rlz/mac/lib/rlz_value_store_mac.mm:291: close(file_lock_); On 2012/08/15 00:50:54, Mark Mentovai wrote: > This duplicates what’s in the loop. I think you only need this copy. Done, together with shess's suggestion above.
On 2012/08/15 00:44:38, shess wrote: > I'm sure the headlines will read "Chromium includes Flock". I found this while looking up how flock()s deal with being forked: http://www.linux-mag.com/id/8613/ (before I looked at `man 2 flock`)
http://codereview.chromium.org/10823329/diff/9002/rlz/mac/lib/rlz_value_store... File rlz/mac/lib/rlz_value_store_mac.mm (right): http://codereview.chromium.org/10823329/diff/9002/rlz/mac/lib/rlz_value_store... rlz/mac/lib/rlz_value_store_mac.mm:272: file_lock_ = 0; Nico wrote: > On 2012/08/15 00:50:54, Mark Mentovai wrote: > > But 0 is a valid FD. You should rewrite things to use -1 as the invalid lock > > value (your initializer on lines 239-244, your CHECK on line 269, your reset > on > > line 285, your test on line 297, etc.) > > Done. (But in practice 0 is stdin, so in practice this should've been fine as > is, right?) You can close stdin. http://codereview.chromium.org/10823329/diff/9002/rlz/mac/lib/rlz_value_store... rlz/mac/lib/rlz_value_store_mac.mm:282: elapsedMS += kSleepPerTryMS; On 2012/08/15 03:24:12, Nico wrote: > On 2012/08/15 00:44:38, shess wrote: > > usleep() could be interrupted by a signal, in which case elapsedMS would be > > bogus. Admittedly at 200ms per shot, not super bogus, but still. Can you use > > TimeTicks() or something or the sort to drive this? Though I guess it's just > as > > good as before. > > Added HANDLE_EINTR here too. HANDLE_EINTR doesn’t really help here because the goal is to get a timed sleep. In fact, HANDLE_EINTR may hurt because it will cause any interrupted sleep to be restarted for the full kSleepPerTryMS milliseconds, and in a pathological “signal storm,” you’ll never leave HANDLE_EINTR’s loop around usleep. http://codereview.chromium.org/10823329/diff/4002/rlz/lib/rlz_lib_test.cc File rlz/lib/rlz_lib_test.cc (right): http://codereview.chromium.org/10823329/diff/4002/rlz/lib/rlz_lib_test.cc#new... rlz/lib/rlz_lib_test.cc:865: if (waitpid(pids[i], &status, 0) != -1) HANDLE_EINTR http://codereview.chromium.org/10823329/diff/4002/rlz/lib/rlz_lib_test.cc#new... rlz/lib/rlz_lib_test.cc:866: EXPECT_EQ(0, WEXITSTATUS(status)); WEXITSTATUS is only valid if WIFEXITED is true. You should check WIFEXITED in addition to WEXITSTATUS. If WEXITED is false, the test should fail anyway since you’re expecting it to die by _exit, not by signal. http://codereview.chromium.org/10823329/diff/4002/rlz/mac/lib/rlz_value_store... File rlz/mac/lib/rlz_value_store_mac.mm (right): http://codereview.chromium.org/10823329/diff/4002/rlz/mac/lib/rlz_value_store... rlz/mac/lib/rlz_value_store_mac.mm:273: file_lock_ = open([lock_filename fileSystemRepresentation], O_CREAT, 0755); Use 0666 for the mode argument. There’s no need to set execute bits (0111), but you can let the umask do its job (don’t be afraid of the 022). http://codereview.chromium.org/10823329/diff/4002/rlz/mac/lib/rlz_value_store... rlz/mac/lib/rlz_value_store_mac.mm:283: ignore_result(HANDLE_EINTR(usleep(kSleepPerTryMS * 1000))); Remove HANDLE_EINTR as described. The correct way to implement a timer loop in light of possible signal interruption is to use nanosleep directly. http://codereview.chromium.org/10823329/diff/4002/rlz/mac/lib/rlz_value_store... rlz/mac/lib/rlz_value_store_mac.mm:301: close(file_lock_); You can put your unlock and close in HANDLE_EINTRs too.
http://codereview.chromium.org/10823329/diff/9002/rlz/mac/lib/rlz_value_store... File rlz/mac/lib/rlz_value_store_mac.mm (right): http://codereview.chromium.org/10823329/diff/9002/rlz/mac/lib/rlz_value_store... rlz/mac/lib/rlz_value_store_mac.mm:282: elapsedMS += kSleepPerTryMS; On 2012/08/15 13:36:28, Mark Mentovai wrote: > On 2012/08/15 03:24:12, Nico wrote: > > On 2012/08/15 00:44:38, shess wrote: > > > usleep() could be interrupted by a signal, in which case elapsedMS would be > > > bogus. Admittedly at 200ms per shot, not super bogus, but still. Can you > use > > > TimeTicks() or something or the sort to drive this? Though I guess it's > just > > as > > > good as before. > > > > Added HANDLE_EINTR here too. > > HANDLE_EINTR doesn’t really help here because the goal is to get a timed sleep. > In fact, HANDLE_EINTR may hurt because it will cause any interrupted sleep to be > restarted for the full kSleepPerTryMS milliseconds, and in a pathological > “signal storm,” you’ll never leave HANDLE_EINTR’s loop around usleep. Good (pathological) point! Removed it again. http://codereview.chromium.org/10823329/diff/4002/rlz/mac/lib/rlz_value_store... File rlz/mac/lib/rlz_value_store_mac.mm (right): http://codereview.chromium.org/10823329/diff/4002/rlz/mac/lib/rlz_value_store... rlz/mac/lib/rlz_value_store_mac.mm:301: close(file_lock_); On 2012/08/15 13:36:28, Mark Mentovai wrote: > You can put your unlock and close in HANDLE_EINTRs too. Done for close. man flock says it can't EINTR (it's supposed to be atomic)
http://codereview.chromium.org/10823329/diff/14001/rlz/lib/rlz_lib_test.cc File rlz/lib/rlz_lib_test.cc (right): http://codereview.chromium.org/10823329/diff/14001/rlz/lib/rlz_lib_test.cc#ne... rlz/lib/rlz_lib_test.cc:873: } Should this test be wrapped in a #if defined(OS_MACOSX) or something?
http://codereview.chromium.org/10823329/diff/4002/rlz/mac/lib/rlz_value_store... File rlz/mac/lib/rlz_value_store_mac.mm (right): http://codereview.chromium.org/10823329/diff/4002/rlz/mac/lib/rlz_value_store... rlz/mac/lib/rlz_value_store_mac.mm:301: close(file_lock_); Nico wrote: > On 2012/08/15 13:36:28, Mark Mentovai wrote: > > You can put your unlock and close in HANDLE_EINTRs too. > > Done for close. man flock says it can't EINTR (it's supposed to be atomic) Are you sure about that? The man page isn’t exhaustive. I can show you a flock that results in EINTR. This doesn’t have anything to do with atomicity. And you’ve used HANDLE_EINTR on line 280. http://codereview.chromium.org/10823329/diff/14001/rlz/lib/rlz_lib_test.cc File rlz/lib/rlz_lib_test.cc (right): http://codereview.chromium.org/10823329/diff/14001/rlz/lib/rlz_lib_test.cc#ne... rlz/lib/rlz_lib_test.cc:873: } Roger Tawa wrote: > Should this test be wrapped in a #if defined(OS_MACOSX) or something? It already is. http://codereview.chromium.org/10823329/diff/14001/rlz/mac/lib/rlz_value_stor... File rlz/mac/lib/rlz_value_store_mac.mm (right): http://codereview.chromium.org/10823329/diff/14001/rlz/mac/lib/rlz_value_stor... rlz/mac/lib/rlz_value_store_mac.mm:283: usleep(kSleepPerTryMS * 1000); So you’re just giving up, then? (I think it’s OK. But Scott raised the issue.)
http://codereview.chromium.org/10823329/diff/4002/rlz/mac/lib/rlz_value_store... File rlz/mac/lib/rlz_value_store_mac.mm (right): http://codereview.chromium.org/10823329/diff/4002/rlz/mac/lib/rlz_value_store... rlz/mac/lib/rlz_value_store_mac.mm:301: close(file_lock_); On 2012/08/15 16:10:00, Mark Mentovai wrote: > Nico wrote: > > On 2012/08/15 13:36:28, Mark Mentovai wrote: > > > You can put your unlock and close in HANDLE_EINTRs too. > > > > Done for close. man flock says it can't EINTR (it's supposed to be atomic) > > Are you sure about that? The man page isn’t exhaustive. I can show you a flock > that results in EINTR. This doesn’t have anything to do with atomicity. And > you’ve used HANDLE_EINTR on line 280. flock is even explicitly mentioned on the list in https://groups.google.com/forum/?fromgroups#!msg/chromium-dev/QaNbKkV9c6A/rQl... , so added a HANDLE_EINTR http://codereview.chromium.org/10823329/diff/14001/rlz/mac/lib/rlz_value_stor... File rlz/mac/lib/rlz_value_store_mac.mm (right): http://codereview.chromium.org/10823329/diff/14001/rlz/mac/lib/rlz_value_stor... rlz/mac/lib/rlz_value_store_mac.mm:283: usleep(kSleepPerTryMS * 1000); On 2012/08/15 16:10:00, Mark Mentovai wrote: > So you’re just giving up, then? Yes.
LGTM.
thakis@chromium.org wrote: > flock is even explicitly mentioned on the list in > https://groups.google.com/**forum/?fromgroups#!msg/** > chromium-dev/QaNbKkV9c6A/**rQlumxSIqKcJ%5B1-25%5D<https://groups.google.com/forum/?fromgroups#!msg/chromium-dev/QaNbKkV9c6A/rQlumxSIqKcJ%5B1-25%5D> > , so added a HANDLE_EINTR > These new Groups links don’t work at all. I guess you meant https://groups.google.com/group/chromium-dev/browse_thread/thread/41a35b2a457....
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/thakis@chromium.org/10823329/3005
lame-duck comments not needing any action. http://codereview.chromium.org/10823329/diff/9002/rlz/mac/lib/rlz_value_store... File rlz/mac/lib/rlz_value_store_mac.mm (right): http://codereview.chromium.org/10823329/diff/9002/rlz/mac/lib/rlz_value_store... rlz/mac/lib/rlz_value_store_mac.mm:238: int file_lock_; On 2012/08/15 03:24:12, Nico wrote: > On 2012/08/15 00:44:38, shess wrote: > > ScopedFD? > > It needs to be closed while the mutex is still held. That's what reset() is for. http://codereview.chromium.org/10823329/diff/9002/rlz/mac/lib/rlz_value_store... rlz/mac/lib/rlz_value_store_mac.mm:282: elapsedMS += kSleepPerTryMS; On 2012/08/15 03:24:12, Nico wrote: > On 2012/08/15 00:44:38, shess wrote: > > Also, how much contention could there be? As implemented, someone could get > > denied the lock while someone else jumps in after and claims it. Though, > again, > > just as good as before. > > In practice, chrome is the only app that uses rlz. (In theory, both chrome and > canary could write to it, but canary is always unbranded and shouldn't use rlz), > so I think that's fine as is. > > (Out of interest: What's a better way to do this, should there be contention?) I think the POSIXy way is something like: int end_time = time() + kTimeout; do { alarm(end_time - time()); rc = flock(fd_, LOCK_EX); } while (rc == -1 && errno == EINTR && time() < end_time); Of course, it needs to be rewritten with gettimeofday() and setitimer() to give better resolution, and someone would need to verify whether this works right with threading. Aaaaaand ... I guess it's still open for contention if you have other signals in the system. But if you're lucky enough not to have other signals, and not to need to re-lock while a previous lock's alarm is still pending, then you'll just wait until either you get the lock or the alarm fires. So another process couldn't snipe you. IOW, let's not go there. http://codereview.chromium.org/10823329/diff/3005/rlz/mac/lib/rlz_value_store... File rlz/mac/lib/rlz_value_store_mac.mm (right): http://codereview.chromium.org/10823329/diff/3005/rlz/mac/lib/rlz_value_store... rlz/mac/lib/rlz_value_store_mac.mm:301: ignore_result(HANDLE_EINTR(close(file_lock_))); The close() would suffice, I think. No problem, though.
http://codereview.chromium.org/10823329/diff/9002/rlz/mac/lib/rlz_value_store... File rlz/mac/lib/rlz_value_store_mac.mm (right): http://codereview.chromium.org/10823329/diff/9002/rlz/mac/lib/rlz_value_store... rlz/mac/lib/rlz_value_store_mac.mm:282: elapsedMS += kSleepPerTryMS; shess wrote: > On 2012/08/15 03:24:12, Nico wrote: > > On 2012/08/15 00:44:38, shess wrote: > > > Also, how much contention could there be? As implemented, someone could get > > > denied the lock while someone else jumps in after and claims it. Though, > > again, > > > just as good as before. > > > > In practice, chrome is the only app that uses rlz. (In theory, both chrome and > > canary could write to it, but canary is always unbranded and shouldn't use > rlz), > > so I think that's fine as is. > > > > (Out of interest: What's a better way to do this, should there be contention?) > > I think the POSIXy way is something like: > > int end_time = time() + kTimeout; > do { > alarm(end_time - time()); > rc = flock(fd_, LOCK_EX); > } while (rc == -1 && errno == EINTR && time() < end_time); > > Of course, it needs to be rewritten with gettimeofday() and setitimer() to give > better resolution, and someone would need to verify whether this works right > with threading. Even without other signals, there’s no explicit guarantee that the process that ultimately wins the exclusive lock will be the first process to request it that blocked, so you’d be relying on an implementation detail (that incidentally is shared by the Mac and Linux kernels). http://codereview.chromium.org/10823329/diff/3005/rlz/mac/lib/rlz_value_store... File rlz/mac/lib/rlz_value_store_mac.mm (right): http://codereview.chromium.org/10823329/diff/3005/rlz/mac/lib/rlz_value_store... rlz/mac/lib/rlz_value_store_mac.mm:301: ignore_result(HANDLE_EINTR(close(file_lock_))); shess wrote: > The close() would suffice, I think. No problem, though. It would have, but there’s nothing wrong with the explicit flock(…, LOCK_UN) either. Unless we’re concerned about code size. But obviously we’re not. |