|
|
Created:
7 years, 4 months ago by mdempsky_google Modified:
7 years, 4 months ago CC:
chromium-reviews, erikwright+watch_chromium.org Base URL:
https://chromium.googlesource.com/chromium/src.git@master Visibility:
Public. |
DescriptionBlock signals while forking so they aren't run in the child process.
BUG=178450
Committed: https://src.chromium.org/viewvc/chrome?view=rev&revision=215915
Patch Set 1 #
Total comments: 2
Patch Set 2 : Block all signals while forking, and remember to restore the original sigmask in the child after ca… #Patch Set 3 : Revert no-longer-needed kSignalsToReset refactoring #
Total comments: 6
Patch Set 4 : Workaround Android's broken pthread_sigmask() #Patch Set 5 : Fix branch upstream #Patch Set 6 : Add CHECK for CLONE_THREAD/CLONE_VM suggested by Markus #
Total comments: 12
Patch Set 7 : Use RAW_CHECK as suggested by jln; cleanup SetSignalMask interface #
Total comments: 1
Patch Set 8 : Tweak comment #
Total comments: 2
Patch Set 9 : Check for CLONE_SIGHAND and add explanatory comment #Messages
Total messages: 19 (0 generated)
lgtm Looks great to me. But since jln was the one to file the bug, please hold off until he had a chance to chime in as well (and of course, wait for the try bots)
https://codereview.chromium.org/21415003/diff/1/base/process/launch_posix.cc File base/process/launch_posix.cc (right): https://codereview.chromium.org/21415003/diff/1/base/process/launch_posix.cc#... base/process/launch_posix.cc:87: const int kSignalsToReset[] = { This ignores the other implementation of ResetChildSignalHandlersToDefaults() (where all signals get reset). Maybe the solution is to write a function that iterates through all signal numbers and blocks any signal with a custom signal handler and returns a mask that will be restored in the parent after fork().
https://codereview.chromium.org/21415003/diff/1/base/process/launch_posix.cc File base/process/launch_posix.cc (right): https://codereview.chromium.org/21415003/diff/1/base/process/launch_posix.cc#... base/process/launch_posix.cc:87: const int kSignalsToReset[] = { On 2013/07/31 23:54:55, Julien Tinnes wrote: > Maybe the solution is to write a function that iterates through all signal > numbers and blocks any signal with a custom signal handler and returns a mask > that will be restored in the parent after fork(). Uploaded a new patch that's slightly simpler and just blocks all signals while fork()ing. This seems okay to me since fork()ing isn't that common (I think?), so it shouldn't cause any problems if other signals get delayed until after the fork() completes. (I also realized I need to restore the original sigmask in the child process after calling ResetChildSignalHandlersToDefaults.)
lgtm https://codereview.chromium.org/21415003/diff/11001/base/process/launch_posix.cc File base/process/launch_posix.cc (right): https://codereview.chromium.org/21415003/diff/11001/base/process/launch_posix... base/process/launch_posix.cc:400: if (pthread_sigmask(SIG_BLOCK, &full_sigset, &old_sigset) != 0) { In principle, I believe this works OK. But you are technically skirting real-close to undefined behavior. sigfillset() might set signals that are not actually defined. Linux happens to have exactly 32 regular signals and 32 real-time signals. So, setting the full mask does in fact affect exactly all signals. Not more, not less. But this isn't necessarily true for all platforms. Also, you are trying to block things like SIGKILL and SIGSTOP. This is not actually possible. In general, Linux tends to be nice and ends up doing what you meant rather than what you told it to do (i.e. it ignores SIGKILL and SIGSTOP) in the signal mask. Finally, glibc reserves at least one real-time signal for internal use by pthreads. It will generally wrap all signal-related system calls to filter out attempts to change this signal. Summarizing all of the above, I don't actually require that you change this code, as it is likely to work fine. But I would really like to see a unittest that verifies that you don't get unexpected behavior. https://codereview.chromium.org/21415003/diff/11001/base/process/launch_posix... base/process/launch_posix.cc:400: if (pthread_sigmask(SIG_BLOCK, &full_sigset, &old_sigset) != 0) { And now for a nitpick: It is actually not possible to safely call fork() in a multi-threaded application. So, if we were doing things correctly, you should be single-threaded by the time you get here. And that of course means you should have called sigprocmask() instead of pthread_sigmask(). But we all know this isn't happening :-( https://codereview.chromium.org/21415003/diff/11001/base/process/launch_posix... base/process/launch_posix.cc:408: pid = syscall(__NR_clone, options.clone_flags, 0, 0, 0); Do you by chance happen to know if we use this code? There is only a small subset of clone_flags that can safely be called from syscall(). There is another API for calling sys_clone() that works more generically. It would be nice to have a CHECK() that clone_flags doesn't include CLONE_THREAD or CLONE_VM. Again, I am perfectly happy, if you think this shouldn't go into this CL at this time.
https://codereview.chromium.org/21415003/diff/11001/base/process/launch_posix.cc File base/process/launch_posix.cc (right): https://codereview.chromium.org/21415003/diff/11001/base/process/launch_posix... base/process/launch_posix.cc:400: if (pthread_sigmask(SIG_BLOCK, &full_sigset, &old_sigset) != 0) { On 2013/08/01 01:20:54, Markus (顧孟勤) wrote: > It is actually not possible to safely call fork() in a multi-threaded > application. https://code.google.com/p/chromium/issues/detail?id=36678 seems to suggest Chromium depends on forking in a multi-threaded app. > So, if we were doing things correctly, you should be > single-threaded by the time you get here. And that of course means you should > have called sigprocmask() instead of pthread_sigmask(). Is there a difference between sigprocmask() and pthread_sigmask() on Linux? POSIX says they're the same except sigprocmask() can only be used in a single-threaded process (whereas pthread_sigmask() can be used in single-threaded or multi-threaded processes), and on OpenBSD they're backed by the same system call. (Well, and pthread_sigmask is in -lpthread instead of -lc, but I'm under the possibly mistaken impression that Chromium code will be linking against -lpthread anyway.) https://codereview.chromium.org/21415003/diff/11001/base/process/launch_posix... base/process/launch_posix.cc:400: if (pthread_sigmask(SIG_BLOCK, &full_sigset, &old_sigset) != 0) { On 2013/08/01 01:20:54, Markus (顧孟勤) wrote: > In principle, I believe this works OK. But you are technically skirting > real-close to undefined behavior. I expect sigfillset() + pthread_sigmask() should be safe, but I'll check with the austin group mailing list to be sure. > Also, you are trying to block things like SIGKILL and SIGSTOP. This is not > actually possible. Agreed, but also it's not an error: "It is not possible to block those signals which cannot be ignored. This shall be enforced by the system without causing an error to be indicated." http://pubs.opengroup.org/onlinepubs/009695399/functions/sigprocmask.html > Finally, glibc reserves at least one real-time signal for internal use by > pthreads. It will generally wrap all signal-related system calls to filter out > attempts to change this signal. Yep, OpenBSD does that too. I don't think it should be a problem here though. > Summarizing all of the above, I don't actually require that you change this > code, as it is likely to work fine. But I would really like to see a unittest > that verifies that you don't get unexpected behavior. Sure. What sort of unittest did you have in mind? https://codereview.chromium.org/21415003/diff/11001/base/process/launch_posix... base/process/launch_posix.cc:408: pid = syscall(__NR_clone, options.clone_flags, 0, 0, 0); On 2013/08/01 01:20:54, Markus (顧孟勤) wrote: > Do you by chance happen to know if we use this code? Not really. According to cs.chromium.org, the only use I found was in components/nacl/zygote/nacl_fork_delegate_linux.cc: "options.clone_flags = CLONE_FS | SIGCHLD;" > It would be nice to have a CHECK() that clone_flags doesn't include CLONE_THREAD > or CLONE_VM. I can add that.
On 2013/08/01 16:25:05, mdempsky wrote: > I expect sigfillset() + pthread_sigmask() should be safe, but I'll check with > the austin group mailing list to be sure. Asked on austin-group-l. Eric Blake agreed that it should be safe, but still waiting for some replies from others. (Also waiting for gmane to publish the posts to their archives so I can link to it...) However, theory aside, it seems that Android has issues with it somehow: http://build.chromium.org/p/tryserver.chromium/builders/android_dbg_triggered...: [ RUN ] ProcessUtilTest.LaunchProcess [ERROR:launch_posix.cc(401)] pthread_sigmask: No such file or directory ../../base/process/process_util_unittest.cc:620: Failure Value of: base::LaunchProcess(args, options, NULL) Actual: false Expected: true [FATAL:process_util_unittest.cc(625)] Check failed: n > 0. : No such file or directory So maybe Bionic is being non-compliant with POSIX here? :(
https://chromiumcodereview.appspot.com/21415003/diff/30001/base/process/launc... File base/process/launch_posix.cc (right): https://chromiumcodereview.appspot.com/21415003/diff/30001/base/process/launc... base/process/launch_posix.cc:430: CHECK_EQ(0, options.clone_flags & (CLONE_THREAD | CLONE_VM)); RAW_CHECK https://chromiumcodereview.appspot.com/21415003/diff/30001/base/process/launc... base/process/launch_posix.cc:437: // Always restore the signal mask in the parent. https://chromiumcodereview.appspot.com/21415003/diff/30001/base/process/launc... base/process/launch_posix.cc:440: NOTREACHED(); RAW_CHECK https://chromiumcodereview.appspot.com/21415003/diff/30001/base/process/launc... base/process/launch_posix.cc:488: std::set<int>::const_iterator resource; This could clearly allocate memory :( crbug.com/267342 https://chromiumcodereview.appspot.com/21415003/diff/30001/base/process/launc... base/process/launch_posix.cc:510: NOTREACHED(); RAW_CHECK https://chromiumcodereview.appspot.com/21415003/diff/30001/base/process/launc... base/process/launch_posix.cc:514: // When debugging it can be helpful to check that we really aren't making I guess this was intended right after the fork() but has regressed. https://chromiumcodereview.appspot.com/21415003/diff/30001/base/process/launc... base/process/launch_posix.cc:540: for (FileHandleMappingVector::const_iterator Again, this can clearly allocate.
https://chromiumcodereview.appspot.com/21415003/diff/30001/base/process/launc... File base/process/launch_posix.cc (right): https://chromiumcodereview.appspot.com/21415003/diff/30001/base/process/launc... base/process/launch_posix.cc:440: NOTREACHED(); On 2013/08/02 05:28:47, Julien Tinnes wrote: > RAW_CHECK Is RAW_CHECK(x) guaranteed to always evaluate x? Should I put this as RAW_CHECK(SetSignalMask(&old_sigset, NULL)); or like bool mask_restored = SetSignalMask(&old_sigset, NULL); RAW_CHECK(mask_restored); or something else? https://chromiumcodereview.appspot.com/21415003/diff/30001/base/process/launc... base/process/launch_posix.cc:488: std::set<int>::const_iterator resource; On 2013/08/02 05:28:47, Julien Tinnes wrote: > This could clearly allocate memory :( Iterating over a set can allocate memory?
https://codereview.chromium.org/21415003/diff/30001/base/process/launch_posix.cc File base/process/launch_posix.cc (right): https://codereview.chromium.org/21415003/diff/30001/base/process/launch_posix... base/process/launch_posix.cc:430: CHECK_EQ(0, options.clone_flags & (CLONE_THREAD | CLONE_VM)); On 2013/08/02 05:28:47, Julien Tinnes wrote: > RAW_CHECK Done. https://codereview.chromium.org/21415003/diff/30001/base/process/launch_posix... base/process/launch_posix.cc:437: On 2013/08/02 05:28:47, Julien Tinnes wrote: > // Always restore the signal mask in the parent. Done. https://codereview.chromium.org/21415003/diff/30001/base/process/launch_posix... base/process/launch_posix.cc:440: NOTREACHED(); I moved the RAW_CHECK()s directly into SetSignalMask(), so the API is slightly simpler now. https://codereview.chromium.org/21415003/diff/33002/base/process/launch_posix.cc File base/process/launch_posix.cc (right): https://codereview.chromium.org/21415003/diff/33002/base/process/launch_posix... base/process/launch_posix.cc:415: const sigset_t orig_sigmask = SetSignalMask(full_sigset); Now that SetSignalMask() doesn't need to return success/failure, I changed it to return the old signal mask, which I think is slightly nicer because it lets me mark orig_sigmask as const now. Downside is the kernel has to return the 'old' signal mask even when we restore the original signal mask later on, but I expect the cost of that is negligible. Happy to change it back though.
PTAL Should I add someone from base/OWNERS as a reviewer too?
lgtm I suggest mark@ as a good owner for this change. https://codereview.chromium.org/21415003/diff/42001/base/process/launch_posix.cc File base/process/launch_posix.cc (right): https://codereview.chromium.org/21415003/diff/42001/base/process/launch_posix... base/process/launch_posix.cc:420: RAW_CHECK(!(options.clone_flags & (CLONE_THREAD | CLONE_VM))); Maybe add CLONE_SIGHAND as well here ? Add a comment along the lines of "Signal handling in this function assumes the creation of a new process, check that a thread is not being created by mistake and that signal handling follows the process-creation rules".
+mark for base/OWNERS
https://codereview.chromium.org/21415003/diff/42001/base/process/launch_posix.cc File base/process/launch_posix.cc (right): https://codereview.chromium.org/21415003/diff/42001/base/process/launch_posix... base/process/launch_posix.cc:420: RAW_CHECK(!(options.clone_flags & (CLONE_THREAD | CLONE_VM))); On 2013/08/05 21:55:40, Julien Tinnes wrote: > Maybe add CLONE_SIGHAND as well here ? > > Add a comment along the lines of "Signal handling in this function assumes the > creation of a new process, check that a thread is not being created by mistake > and that signal handling follows the process-creation rules". Done.
LGTM for base OWNERS review.
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/mdempsky@google.com/21415003/52001
Step "update" is always a major failure. Look at the try server FAQ for more details. http://build.chromium.org/p/tryserver.chromium/buildstatus?builder=android_cl...
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/mdempsky@google.com/21415003/52001
Message was sent while issue was closed.
Change committed as 215915 |