|
|
Chromium Code Reviews|
Created:
8 years, 6 months ago by Markus (顧孟勤) Modified:
8 years, 6 months ago CC:
chromium-reviews, agl, jln+watch_chromium.org Base URL:
svn://svn.chromium.org/chrome/trunk/src Visibility:
Public. |
DescriptionInstead of outputting one BPF check per possible system call coalesce
all system calls that are supposed to be treated identically.
This change list depends on https://chromiumcodereview.appspot.com/10546041/
These changes should address the immediate concerns about inefficient
BPF evaluation of system calls. But they are only the first step in
the process of us generating an optimal BPF program. We are still
missing the compilation of the binary search tree. That is going to be
the next change list in this series.
But for the benefit of better reviewability, I split the changes into
two parts.
BUG=130662
TEST=make && demo32 && demo64
Committed: http://src.chromium.org/viewvc/chrome?view=rev&revision=142295
Patch Set 1 #Patch Set 2 : Rebased #Patch Set 3 : Does this result in easier-to-read diffs? #
Total comments: 7
Patch Set 4 : Added more asserts and tweak the existing ones a little bit #
Total comments: 3
Patch Set 5 : Simplified the asserts #
Total comments: 4
Patch Set 6 : Moved checking of policies into a separate method #
Total comments: 6
Patch Set 7 : Rebased #Patch Set 8 : Rebase #Patch Set 9 : Rebased #
Messages
Total messages: 20 (0 generated)
This change list currently includes the changes from https://chromiumcodereview.appspot.com/10546041/ Please review the former change list first and I will then commit it and regenerate the diffs for this changelist. That should make things easier to review.
Do you mind rebasing it on top of the verifier branch? When doing "git cl upload", you can do "git cl upload verifier_branch", so that only this diff gets uploaded.
Thanks. I was looking for something like that :-) Yes, I think, this is a much more manageable changelist -- and it is therefore suitable for review now, rather than waiting for the other review to complete.
Looks ok in general, but please make it easier to review for correctness in the
case we'll most likely care about.
If you feel strongly about keeping it generic, at least have a few asserts in
there for the cases that we'll test with or really care about.
But really, I would much rather support less feature and have it review-able. I
would even say that I would love if we could catch callers trying to set-up
policies that are very unlikely to be what they want and return an error.
I think you could make a case for supporting syscall number larger than
MAX_SYSCALL (but I still would like some kind of assert("this is not tested",
butt not negative syscall numbers.
https://chromiumcodereview.appspot.com/10536048/diff/5001/sandbox/linux/secco...
File sandbox/linux/seccomp-bpf/sandbox_bpf.cc (right):
https://chromiumcodereview.appspot.com/10536048/diff/5001/sandbox/linux/secco...
sandbox/linux/seccomp-bpf/sandbox_bpf.cc:302: if (oldErr !=
evaluateSyscall(std::numeric_limits<int>::max()) ||
This makes it hard to review.
I would either:
- Have a comment somewhere, that we don't support negative system call numbers,
with the proper assert.
- Add asserts in case the system call number is negative or after MAX_SYSCALL
explaining this is untested.
I would prefer the first solution much more. Simplicity! :)
Also it's extremely unlikely that the caller will want to do that, ever, and I
would rather err on the side of catching mistakes.
I can already see bugs pop-up where callers expected "unknown" syscalls to be
denied by default.
https://chromiumcodereview.appspot.com/10536048/diff/5001/sandbox/linux/secco...
sandbox/linux/seccomp-bpf/sandbox_bpf.cc:322: uint32_t last =
static_cast<uint32_t>(-1);
I know this is correct and allowed by standards, but it's more readable with
std::numeric_limits<uint32_t>::max (), no ?
If you think it makes the code below more readable, add a comment: "guaranteed
to be UINT32_MAX"
https://chromiumcodereview.appspot.com/10536048/diff/5001/sandbox/linux/secco...
sandbox/linux/seccomp-bpf/sandbox_bpf.cc:326: // Ranges most be contiguous and
monotonically increasing.
s/most/must
https://chromiumcodereview.appspot.com/10536048/diff/5001/sandbox/linux/secco...
sandbox/linux/seccomp-bpf/sandbox_bpf.cc:328: iter->from != last+1) {
nit: last + 1 (spaces)
also add a comment explaining that last + 1 will be 0 on the first iteration.
I don't like having to rely on (defined but poorly known) behavior such as
unsigned overflow. Agreed it allows us to not unroll the loop so that's good.
PTAL https://chromiumcodereview.appspot.com/10536048/diff/5001/sandbox/linux/secco... File sandbox/linux/seccomp-bpf/sandbox_bpf.cc (right): https://chromiumcodereview.appspot.com/10536048/diff/5001/sandbox/linux/secco... sandbox/linux/seccomp-bpf/sandbox_bpf.cc:302: if (oldErr != evaluateSyscall(std::numeric_limits<int>::max()) || This is just a glorified assert() statement. Unfortunately, it's a little verbose to actually write what we want to write. I added another set of checks to the place where the user first hands us the policy evaluator. And I explicitly deny negative system calls. Hopefully, this addresses your concern. But if you need more, or if you think there is something else that makes this code easier for you to review, please speak up. https://chromiumcodereview.appspot.com/10536048/diff/5001/sandbox/linux/secco... sandbox/linux/seccomp-bpf/sandbox_bpf.cc:322: uint32_t last = static_cast<uint32_t>(-1); On 2012/06/08 22:38:21, Julien Tinnes wrote: > I know this is correct and allowed by standards, but it's more readable with > std::numeric_limits<uint32_t>::max (), no ? > > If you think it makes the code below more readable, add a comment: "guaranteed > to be UINT32_MAX" > I slightly changed the logic, and can thus avoid this issue. https://chromiumcodereview.appspot.com/10536048/diff/5001/sandbox/linux/secco... sandbox/linux/seccomp-bpf/sandbox_bpf.cc:326: // Ranges most be contiguous and monotonically increasing. On 2012/06/08 22:38:21, Julien Tinnes wrote: > s/most/must Done.
https://chromiumcodereview.appspot.com/10536048/diff/9001/sandbox/linux/secco... File sandbox/linux/seccomp-bpf/sandbox_bpf.cc (right): https://chromiumcodereview.appspot.com/10536048/diff/9001/sandbox/linux/secco... sandbox/linux/seccomp-bpf/sandbox_bpf.cc:186: void Sandbox::setSandboxPolicy(EvaluateSyscall syscallEvaluator, Let's return a bool (and error string) here instead of dying? https://chromiumcodereview.appspot.com/10536048/diff/9001/sandbox/linux/secco... sandbox/linux/seccomp-bpf/sandbox_bpf.cc:347: oldErr != evaluateSyscall((unsigned)std::numeric_limits<int>::max()+1) || nit: max + 1 (spaces) Please add comments, explaining that you're trying to get the "default" error code etc. Also add a comment explicitly for the last line: "casting to unsigned to make sure to avoid overflow". https://chromiumcodereview.appspot.com/10536048/diff/9001/sandbox/linux/secco... sandbox/linux/seccomp-bpf/sandbox_bpf.cc:375: from = iter->to+1; nit: to + 1 (spaces)
I decided to simplify the sanity checks. The earlier version of the change list was more generic generic, and it would do the right thing no matter how C++ defines "int". But that apparently is just confusing my reviewer, and that's clearly no good. I think it is safe to assume that "int" is 32bits and thus the same width as what BPF uses internally. With that assumption, a lot of the test cases all collapse into a few very simple lines. Added comments spelling out the exact bit patterns, too.
LGTM
https://chromiumcodereview.appspot.com/10536048/diff/8007/sandbox/linux/secco... File sandbox/linux/seccomp-bpf/sandbox_bpf.cc (right): https://chromiumcodereview.appspot.com/10536048/diff/8007/sandbox/linux/secco... sandbox/linux/seccomp-bpf/sandbox_bpf.cc:225: Since this method is called "setSandboxPolicy", doesn't it make sense to extract the sanity checks to another method? https://chromiumcodereview.appspot.com/10536048/diff/8007/sandbox/linux/secco... sandbox/linux/seccomp-bpf/sandbox_bpf.cc:343: } This code only coalesces contiguous system calls with the same return value, right? We still cannot have a concept of "hot" syscalls. But if we merge the binary search support, it shouldn't be a problem.
On Mon, Jun 11, 2012 at 12:39 PM, <jorgelo@chromium.org> wrote: > > https://chromiumcodereview.**appspot.com/10536048/diff/** > 8007/sandbox/linux/seccomp-**bpf/sandbox_bpf.cc<https://chromiumcodereview.appspot.com/10536048/diff/8007/sandbox/linux/seccomp-bpf/sandbox_bpf.cc> > File sandbox/linux/seccomp-bpf/**sandbox_bpf.cc (right): > > https://chromiumcodereview.**appspot.com/10536048/diff/** > 8007/sandbox/linux/seccomp-**bpf/sandbox_bpf.cc#newcode225<https://chromiumcodereview.appspot.com/10536048/diff/8007/sandbox/linux/seccomp-bpf/sandbox_bpf.cc#newcode225> > sandbox/linux/seccomp-bpf/**sandbox_bpf.cc:225: > Since this method is called "setSandboxPolicy", doesn't it make sense to > extract the sanity checks to another method? > > https://chromiumcodereview.**appspot.com/10536048/diff/** > 8007/sandbox/linux/seccomp-**bpf/sandbox_bpf.cc#newcode343<https://chromiumcodereview.appspot.com/10536048/diff/8007/sandbox/linux/seccomp-bpf/sandbox_bpf.cc#newcode343> > sandbox/linux/seccomp-bpf/**sandbox_bpf.cc:343: } > This code only coalesces contiguous system calls with the same return > value, right? We still cannot have a concept of "hot" syscalls. But if > we merge the binary search support, it shouldn't be a problem. > FWIW, I agree with Jorge here. I lament the loss of "hot" syscall support, but I suggest we just land the binary search support and call it done. I expect the tree will have sufficiently few branches that we're in good shape in terms of number of instructions required to evaluate any "hot" syscall. > https://chromiumcodereview.**appspot.com/10536048/<https://chromiumcodereview... >
https://chromiumcodereview.appspot.com/10536048/diff/8007/sandbox/linux/secco... File sandbox/linux/seccomp-bpf/sandbox_bpf.cc (right): https://chromiumcodereview.appspot.com/10536048/diff/8007/sandbox/linux/secco... sandbox/linux/seccomp-bpf/sandbox_bpf.cc:225: On 2012/06/11 19:39:10, Jorge Lucangeli Obes wrote: > Since this method is called "setSandboxPolicy", doesn't it make sense to extract > the sanity checks to another method? Done. https://chromiumcodereview.appspot.com/10536048/diff/8007/sandbox/linux/secco... sandbox/linux/seccomp-bpf/sandbox_bpf.cc:343: } Have you actually done any benchmarks and does this cost show up anywhere? I am quite curious. My gut feeling is that the number of executed BPF instructions is pretty much negligible on most CPUs. Even if the kernel decides to interpret the BPF filter instead of using a JIT. What I do suspect will matter is the overall size of the BPF filter program and the overall density of the code. These two factors determine how frequently the CPU needs to reach out to memory instead of being able to load the BPF program from cache. Coalescing of ranges already helps somewhat with this goal, and the binary tree helps even more. For the vast majority of applications, cache foot print is the limiting factor, clock-cyles per instructions are not and haven't been for about ten years or so. These numbers might have changed again, but a few years ago, you could literally execute several thousand instructions while waiting for a single memory load. Of course, once memory starts loading, it will then try to aggressively stream some more memory. That's where code density helps us. And that's where BPF's forward-only jumps are quite nice.
On 2012/06/11 19:56:50, Markus (顧孟勤) wrote: > https://chromiumcodereview.appspot.com/10536048/diff/8007/sandbox/linux/secco... > File sandbox/linux/seccomp-bpf/sandbox_bpf.cc (right): > > https://chromiumcodereview.appspot.com/10536048/diff/8007/sandbox/linux/secco... > sandbox/linux/seccomp-bpf/sandbox_bpf.cc:225: > On 2012/06/11 19:39:10, Jorge Lucangeli Obes wrote: > > Since this method is called "setSandboxPolicy", doesn't it make sense to > extract > > the sanity checks to another method? > > Done. > > https://chromiumcodereview.appspot.com/10536048/diff/8007/sandbox/linux/secco... > sandbox/linux/seccomp-bpf/sandbox_bpf.cc:343: } > Have you actually done any benchmarks and does this cost show up anywhere? I am > quite curious. On an Alex Chrome OS device (Atom) (tests by wad@) localhost chronos # ./seccomp_getpid 10000000 Using syscall instr No seccomp Testing scaffold . . . 23 cycles time: 275 cycles localhost chronos # ./seccomp_getpid 10000000 n s Using syscall instr Using SECCOMP_RET_ALLOW Testing scaffold . . . 21 cycles time: 788 cycles Granted, that's very micro-benchmarky, but the simplest filter (just RET_ALLOW) adds 500 cycles. > My gut feeling is that the number of executed BPF instructions is pretty much > negligible on most CPUs. Even if the kernel decides to interpret the BPF filter > instead of using a JIT. There won't be a JIT in 3.5. > What I do suspect will matter is the overall size of the BPF filter program and > the overall density of the code. These two factors determine how frequently the > CPU needs to reach out to memory instead of being able to load the BPF program > from cache. > > Coalescing of ranges already helps somewhat with this goal, and the binary tree > helps even more. > > For the vast majority of applications, cache foot print is the limiting factor, > clock-cyles per instructions are not and haven't been for about ten years or so. > These numbers might have changed again, but a few years ago, you could literally > execute several thousand instructions while waiting for a single memory load. > > Of course, once memory starts loading, it will then try to aggressively stream > some more memory. That's where code density helps us. And that's where BPF's > forward-only jumps are quite nice. I do agree that binary trees mostly make this discussion obsolete =). lgtm
I am not surprised that the mere fact that we are using BPF filters adds some overhead. Although admittedly, 500 clock cycles will be hard to notice in real-life. I was more interested in how much of a penalty we pay for each additional BPF instruction. And whether it makes a difference if we jump over lots of inactive instructions, or whether we just jump a short distance. There is bound to be a penalty; but is it on the same order as the penalty for using BPF in the first place, is it much higher, or is it much less? Markus On Mon, Jun 11, 2012 at 2:23 PM, <jorgelo@chromium.org> wrote: > On 2012/06/11 19:56:50, Markus (顧孟勤) wrote: > > https://chromiumcodereview.**appspot.com/10536048/diff/** > 8007/sandbox/linux/seccomp-**bpf/sandbox_bpf.cc<https://chromiumcodereview.appspot.com/10536048/diff/8007/sandbox/linux/seccomp-bpf/sandbox_bpf.cc> > >> File sandbox/linux/seccomp-bpf/**sandbox_bpf.cc (right): >> > > > https://chromiumcodereview.**appspot.com/10536048/diff/** > 8007/sandbox/linux/seccomp-**bpf/sandbox_bpf.cc#newcode225<https://chromiumcodereview.appspot.com/10536048/diff/8007/sandbox/linux/seccomp-bpf/sandbox_bpf.cc#newcode225> > >> sandbox/linux/seccomp-bpf/**sandbox_bpf.cc:225: >> On 2012/06/11 19:39:10, Jorge Lucangeli Obes wrote: >> > Since this method is called "setSandboxPolicy", doesn't it make sense to >> extract >> > the sanity checks to another method? >> > > Done. >> > > > https://chromiumcodereview.**appspot.com/10536048/diff/** > 8007/sandbox/linux/seccomp-**bpf/sandbox_bpf.cc#newcode343<https://chromiumcodereview.appspot.com/10536048/diff/8007/sandbox/linux/seccomp-bpf/sandbox_bpf.cc#newcode343> > >> sandbox/linux/seccomp-bpf/**sandbox_bpf.cc:343: } >> Have you actually done any benchmarks and does this cost show up >> anywhere? I >> > am > >> quite curious. >> > > On an Alex Chrome OS device (Atom) (tests by wad@) > > localhost chronos # ./seccomp_getpid 10000000 > Using syscall instr > No seccomp > Testing scaffold . . . 23 cycles > time: 275 cycles > > > localhost chronos # ./seccomp_getpid 10000000 n s > Using syscall instr > Using SECCOMP_RET_ALLOW > Testing scaffold . . . 21 cycles > time: 788 cycles > > Granted, that's very micro-benchmarky, but the simplest filter (just > RET_ALLOW) > adds 500 cycles. > > > My gut feeling is that the number of executed BPF instructions is pretty >> much >> negligible on most CPUs. Even if the kernel decides to interpret the BPF >> > filter > >> instead of using a JIT. >> > > There won't be a JIT in 3.5. > > > What I do suspect will matter is the overall size of the BPF filter >> program >> > and > >> the overall density of the code. These two factors determine how >> frequently >> > the > >> CPU needs to reach out to memory instead of being able to load the BPF >> program >> from cache. >> > > Coalescing of ranges already helps somewhat with this goal, and the binary >> > tree > >> helps even more. >> > > For the vast majority of applications, cache foot print is the limiting >> > factor, > >> clock-cyles per instructions are not and haven't been for about ten years >> or >> > so. > >> These numbers might have changed again, but a few years ago, you could >> > literally > >> execute several thousand instructions while waiting for a single memory >> load. >> > > Of course, once memory starts loading, it will then try to aggressively >> stream >> some more memory. That's where code density helps us. And that's where >> BPF's >> forward-only jumps are quite nice. >> > > I do agree that binary trees mostly make this discussion obsolete =). > > lgtm > > https://chromiumcodereview.**appspot.com/10536048/<https://chromiumcodereview... >
LGTM if you move the policy sanity check to debug-only https://chromiumcodereview.appspot.com/10536048/diff/13001/sandbox/linux/secc... File sandbox/linux/seccomp-bpf/sandbox_bpf.cc (right): https://chromiumcodereview.appspot.com/10536048/diff/13001/sandbox/linux/secc... sandbox/linux/seccomp-bpf/sandbox_bpf.cc:187: EvaluateArguments argumentEvaluator) { I think this should be debug only. That will be more than sufficient to catch problems, without performing thousands of wasted iterations in production. https://chromiumcodereview.appspot.com/10536048/diff/13001/sandbox/linux/secc... sandbox/linux/seccomp-bpf/sandbox_bpf.cc:323: prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog)) { Nit (for the future): it'd be nice to differentiate which of those called failed.
https://chromiumcodereview.appspot.com/10536048/diff/13001/sandbox/linux/secc... File sandbox/linux/seccomp-bpf/sandbox_bpf.cc (right): https://chromiumcodereview.appspot.com/10536048/diff/13001/sandbox/linux/secc... sandbox/linux/seccomp-bpf/sandbox_bpf.cc:314: // There is really nothing the caller can do until the bug is fixed. Actually one more thing. I'm not entire groking the order of CLs here but it looks possible that the #ifndef NDEBUG got lost in the move here?
On Tue, Jun 12, 2012 at 11:11 AM, <cevans@chromium.org> wrote: > LGTM if you move the policy sanity check to debug-only It likely that policies will depend on hardware (for instance for the GPU policy) or other factors. Shouldn't we have at least a couple of iterations where we're extra cautious ? This is a small startup-time overhead. In comparison, we're currently sending BPF filters that just kill processes when unknown syscalls or situations arise to our users in dev and beta channel on Linux, I would consider a small startup-time overhead a smaller price to pay. Is it not ? Julien
https://chromiumcodereview.appspot.com/10536048/diff/13001/sandbox/linux/secc... File sandbox/linux/seccomp-bpf/sandbox_bpf.cc (right): https://chromiumcodereview.appspot.com/10536048/diff/13001/sandbox/linux/secc... sandbox/linux/seccomp-bpf/sandbox_bpf.cc:187: EvaluateArguments argumentEvaluator) { On 2012/06/12 18:11:55, Chris Evans wrote: > I think this should be debug only. That will be more than sufficient to catch > problems, without performing thousands of wasted iterations in production. Done. I left in the cheap tests, but disabled the expensive tests for production builds. https://chromiumcodereview.appspot.com/10536048/diff/13001/sandbox/linux/secc... sandbox/linux/seccomp-bpf/sandbox_bpf.cc:314: // There is really nothing the caller can do until the bug is fixed. On 2012/06/12 18:13:47, Chris Evans wrote: > Actually one more thing. I'm not entire groking the order of CLs here but it > looks possible that the #ifndef NDEBUG got lost in the move here? I'll upload a newly rebased version shortly, and it'll show this NDEBUG test https://chromiumcodereview.appspot.com/10536048/diff/13001/sandbox/linux/secc... sandbox/linux/seccomp-bpf/sandbox_bpf.cc:323: prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog)) { On 2012/06/12 18:11:55, Chris Evans wrote: > Nit (for the future): it'd be nice to differentiate which of those called > failed. Already fixed by rebasing.
LGTM On Tue, Jun 12, 2012 at 12:02 PM, <markus@chromium.org> wrote: > > https://chromiumcodereview.**appspot.com/10536048/diff/** > 13001/sandbox/linux/seccomp-**bpf/sandbox_bpf.cc<https://chromiumcodereview.appspot.com/10536048/diff/13001/sandbox/linux/seccomp-bpf/sandbox_bpf.cc> > File sandbox/linux/seccomp-bpf/**sandbox_bpf.cc (right): > > https://chromiumcodereview.**appspot.com/10536048/diff/** > 13001/sandbox/linux/seccomp-**bpf/sandbox_bpf.cc#newcode187<https://chromiumcodereview.appspot.com/10536048/diff/13001/sandbox/linux/seccomp-bpf/sandbox_bpf.cc#newcode187> > sandbox/linux/seccomp-bpf/**sandbox_bpf.cc:187: EvaluateArguments > argumentEvaluator) { > On 2012/06/12 18:11:55, Chris Evans wrote: > >> I think this should be debug only. That will be more than sufficient >> > to catch > >> problems, without performing thousands of wasted iterations in >> > production. > > Done. > > I left in the cheap tests, but disabled the expensive tests for > production builds. > > > https://chromiumcodereview.**appspot.com/10536048/diff/** > 13001/sandbox/linux/seccomp-**bpf/sandbox_bpf.cc#newcode314<https://chromiumcodereview.appspot.com/10536048/diff/13001/sandbox/linux/seccomp-bpf/sandbox_bpf.cc#newcode314> > sandbox/linux/seccomp-bpf/**sandbox_bpf.cc:314: // There is really nothing > the caller can do until the bug is fixed. > On 2012/06/12 18:13:47, Chris Evans wrote: > >> Actually one more thing. I'm not entire groking the order of CLs here >> > but it > >> looks possible that the #ifndef NDEBUG got lost in the move here? >> > > I'll upload a newly rebased version shortly, and it'll show this NDEBUG > test > > > https://chromiumcodereview.**appspot.com/10536048/diff/** > 13001/sandbox/linux/seccomp-**bpf/sandbox_bpf.cc#newcode323<https://chromiumcodereview.appspot.com/10536048/diff/13001/sandbox/linux/seccomp-bpf/sandbox_bpf.cc#newcode323> > sandbox/linux/seccomp-bpf/**sandbox_bpf.cc:323: prctl(PR_SET_SECCOMP, > SECCOMP_MODE_FILTER, &prog)) { > On 2012/06/12 18:11:55, Chris Evans wrote: > >> Nit (for the future): it'd be nice to differentiate which of those >> > called > >> failed. >> > > Already fixed by rebasing. > > https://chromiumcodereview.**appspot.com/10536048/<https://chromiumcodereview... >
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/markus@chromium.org/10536048/5004
Change committed as 142295 |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
