Issue 10536048: Instead of outputting one BPF check per possible system call. Coalesce (Closed)

Created:
8 years, 6 months ago by Markus (顧孟勤)

Modified:
8 years, 6 months ago

Reviewers:
Jorge Lucangeli Obes, cevans, jln (very slow on Chromium), Chris Evans

CC:
chromium-reviews, agl, jln+watch_chromium.org

Base URL:
svn://svn.chromium.org/chrome/trunk/src

Visibility:
Public.

Description

Instead of outputting one BPF check per possible system call, coalesce all system calls that are supposed to be treated identically.

This change list depends on https://chromiumcodereview.appspot.com/10546041/

These changes should address the immediate concerns about inefficient BPF evaluation of system calls, but they are only the first step toward generating an optimal BPF program. We are still missing the compilation of the binary search tree; that is going to be the next change list in this series. For better reviewability, I split the changes into two parts.

BUG=130662
TEST=make && demo32 && demo64

Committed: http://src.chromium.org/viewvc/chrome?view=rev&revision=142295
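The coalescing idea in the description can be illustrated with a minimal sketch. It is not the code from this change list: the `Verdict` enum, the `Range` struct, and `CoalesceRanges` are hypothetical names, and the policy is assumed to be a simple callback that maps a system call number to a verdict.

```cpp
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical types for illustration; not the names used in sandbox_bpf.h.
enum class Verdict { kAllow, kDeny };

struct Range {
  uint32_t from;  // first system call number in the range (inclusive)
  uint32_t to;    // last system call number in the range (inclusive)
  Verdict verdict;
};

// Walks system call numbers 0..max_syscall and merges neighbours that the
// policy treats identically, so one check can cover a whole range.
std::vector<Range> CoalesceRanges(
    uint32_t max_syscall, const std::function<Verdict(uint32_t)>& policy) {
  std::vector<Range> ranges;
  for (uint32_t nr = 0;; ++nr) {
    const Verdict v = policy(nr);
    if (!ranges.empty() && ranges.back().verdict == v) {
      ranges.back().to = nr;  // same verdict: extend the current range
    } else {
      ranges.push_back({nr, nr, v});  // verdict changed: start a new range
    }
    if (nr == max_syscall) break;  // stop before ++nr could wrap around
  }
  return ranges;
}
```

With a policy that allows only system calls 3 to 5 and 9, and `max_syscall` set to 10, this sketch yields five ranges instead of eleven individual checks.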

Patch Set 1 #

Patch Set 2 : Rebased #

Patch Set 3 : Does this result in easier-to-read diffs? #

Total comments: 7

Patch Set 4 : Added more asserts and tweak the existing ones a little bit #

Total comments: 3

Patch Set 5 : Simplified the asserts #

Total comments: 4

Patch Set 6 : Moved checking of policies into a separate method #

Total comments: 6

Patch Set 7 : Rebased #

Patch Set 8 : Rebase #

Patch Set 9 : Rebased #

Created: 8 years, 6 months ago

Stats: +177 lines, -53 lines
M sandbox/linux/seccomp-bpf/sandbox_bpf.h (5 chunks, +22 lines, -5 lines, 0 comments)
M sandbox/linux/seccomp-bpf/sandbox_bpf.cc (3 chunks, +153 lines, -46 lines, 0 comments)
M sandbox/linux/seccomp-bpf/verifier.cc (2 chunks, +2 lines, -2 lines, 0 comments)

Messages

Total messages: 20 (0 generated)

Markus (顧孟勤)

This change list currently includes the changes from https://chromiumcodereview.appspot.com/10546041/ Please review the former change list ...

8 years, 6 months ago (2012-06-07 08:32:40 UTC) #1

jln (very slow on Chromium)

Do you mind rebasing it on top of the verifier branch? When doing "git cl ...

8 years, 6 months ago (2012-06-08 19:34:48 UTC) #2

Markus (顧孟勤)

Thanks. I was looking for something like that :-) Yes, I think, this is a ...

8 years, 6 months ago (2012-06-08 19:52:34 UTC) #3

jln (very slow on Chromium)

8 years, 6 months ago (2012-06-08 22:38:21 UTC) #4

Looks ok in general, but please make it easier to review for correctness in the
case we'll most likely care about.

If you feel strongly about keeping it generic, at least have a few asserts in
there for the cases that we'll test with or really care about.

But really, I would much rather support fewer features and have it reviewable. I
would even say that I would love it if we could catch callers trying to set up
policies that are very unlikely to be what they want, and return an error.

I think you could make a case for supporting syscall numbers larger than
MAX_SYSCALL (but I would still like some kind of assert("this is not tested")),
but not negative syscall numbers.

https://chromiumcodereview.appspot.com/10536048/diff/5001/sandbox/linux/secco...
File sandbox/linux/seccomp-bpf/sandbox_bpf.cc (right):

https://chromiumcodereview.appspot.com/10536048/diff/5001/sandbox/linux/secco...
sandbox/linux/seccomp-bpf/sandbox_bpf.cc:302: if (oldErr !=
evaluateSyscall(std::numeric_limits<int>::max()) ||
This makes it hard to review.

I would either:
- Have a comment somewhere, that we don't support negative system call numbers,
with the proper assert.
- Add asserts in case the system call number is negative or after MAX_SYSCALL
explaining this is untested.

I would prefer the first solution much more. Simplicity! :)

Also it's extremely unlikely that the caller will want to do that, ever, and I
would rather err on the side of catching mistakes.

I can already see bugs pop-up where callers expected "unknown" syscalls to be
denied by default.

https://chromiumcodereview.appspot.com/10536048/diff/5001/sandbox/linux/secco...
sandbox/linux/seccomp-bpf/sandbox_bpf.cc:322: uint32_t last =
static_cast<uint32_t>(-1);
I know this is correct and allowed by standards, but it's more readable with
std::numeric_limits<uint32_t>::max (), no ?

If you think it makes the code below more readable, add a comment: "guaranteed
to be UINT32_MAX"

https://chromiumcodereview.appspot.com/10536048/diff/5001/sandbox/linux/secco...
sandbox/linux/seccomp-bpf/sandbox_bpf.cc:326: // Ranges most be contiguous and
monotonically increasing.
s/most/must

https://chromiumcodereview.appspot.com/10536048/diff/5001/sandbox/linux/secco...
sandbox/linux/seccomp-bpf/sandbox_bpf.cc:328: iter->from != last+1) {
nit: last + 1 (spaces)

also add a comment explaining that last + 1 will be 0 on the first iteration.
I don't like having to rely on (defined but poorly known) behavior such as
unsigned overflow. Agreed it allows us to not unroll the loop so that's good.
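The two comments above (the `UINT32_MAX` sentinel and the `last + 1` wraparound) can be sketched as follows. This is an illustration under assumptions, not the code under review: `SyscallRange` and `RangesAreContiguous` are hypothetical names, and the extra guard against a range following one that ends at `UINT32_MAX` is an addition of this sketch.

```cpp
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical range type for illustration.
struct SyscallRange {
  uint32_t from;  // inclusive
  uint32_t to;    // inclusive
};

// Converting -1 to an unsigned type is well defined and yields the maximum.
static_assert(static_cast<uint32_t>(-1) ==
                  std::numeric_limits<uint32_t>::max(),
              "static_cast<uint32_t>(-1) is guaranteed to be UINT32_MAX");

// Returns true if the ranges start at 0, are contiguous, and are
// monotonically increasing.
bool RangesAreContiguous(const std::vector<SyscallRange>& ranges) {
  // Start at UINT32_MAX so that last + 1 wraps to 0 on the first iteration.
  // Unsigned arithmetic is defined to be modulo 2^32, so this is not UB.
  uint32_t last = std::numeric_limits<uint32_t>::max();
  bool first = true;
  for (const SyscallRange& r : ranges) {
    // Nothing may follow a range that already ends at UINT32_MAX; without
    // this guard the wraparound trick would accept a second range at 0.
    if (!first && last == std::numeric_limits<uint32_t>::max()) return false;
    if (r.from > r.to || r.from != last + 1) return false;
    last = r.to;
    first = false;
  }
  return true;
}
```

The sentinel keeps the loop free of a special case for the first element, which is the trade-off the review comment describes: no unrolled first iteration, at the cost of relying on unsigned wraparound that deserves an explicit comment.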

Markus (顧孟勤)

PTAL https://chromiumcodereview.appspot.com/10536048/diff/5001/sandbox/linux/seccomp-bpf/sandbox_bpf.cc File sandbox/linux/seccomp-bpf/sandbox_bpf.cc (right): https://chromiumcodereview.appspot.com/10536048/diff/5001/sandbox/linux/seccomp-bpf/sandbox_bpf.cc#newcode302 sandbox/linux/seccomp-bpf/sandbox_bpf.cc:302: if (oldErr != evaluateSyscall(std::numeric_limits<int>::max()) || This is just ...

8 years, 6 months ago (2012-06-09 00:30:58 UTC) #5

jln (very slow on Chromium)

https://chromiumcodereview.appspot.com/10536048/diff/9001/sandbox/linux/seccomp-bpf/sandbox_bpf.cc File sandbox/linux/seccomp-bpf/sandbox_bpf.cc (right): https://chromiumcodereview.appspot.com/10536048/diff/9001/sandbox/linux/seccomp-bpf/sandbox_bpf.cc#newcode186 sandbox/linux/seccomp-bpf/sandbox_bpf.cc:186: void Sandbox::setSandboxPolicy(EvaluateSyscall syscallEvaluator, Let's return a bool (and error ...

8 years, 6 months ago (2012-06-09 01:06:13 UTC) #6

Markus (顧孟勤)

I decided to simplify the sanity checks. The earlier version of the change list was ...

8 years, 6 months ago (2012-06-09 01:41:45 UTC) #7

Jorge Lucangeli Obes

https://chromiumcodereview.appspot.com/10536048/diff/8007/sandbox/linux/seccomp-bpf/sandbox_bpf.cc File sandbox/linux/seccomp-bpf/sandbox_bpf.cc (right): https://chromiumcodereview.appspot.com/10536048/diff/8007/sandbox/linux/seccomp-bpf/sandbox_bpf.cc#newcode225 sandbox/linux/seccomp-bpf/sandbox_bpf.cc:225: Since this method is called "setSandboxPolicy", doesn't it make ...

8 years, 6 months ago (2012-06-11 19:39:10 UTC) #9

Chris Evans

On Mon, Jun 11, 2012 at 12:39 PM, <jorgelo@chromium.org> wrote: > > https://chromiumcodereview.**appspot.com/10536048/diff/** > 8007/sandbox/linux/seccomp-**bpf/sandbox_bpf.cc<https://chromiumcodereview.appspot.com/10536048/diff/8007/sandbox/linux/seccomp-bpf/sandbox_bpf.cc> ...

8 years, 6 months ago (2012-06-11 19:48:30 UTC) #10

Markus (顧孟勤)

https://chromiumcodereview.appspot.com/10536048/diff/8007/sandbox/linux/seccomp-bpf/sandbox_bpf.cc File sandbox/linux/seccomp-bpf/sandbox_bpf.cc (right): https://chromiumcodereview.appspot.com/10536048/diff/8007/sandbox/linux/seccomp-bpf/sandbox_bpf.cc#newcode225 sandbox/linux/seccomp-bpf/sandbox_bpf.cc:225: On 2012/06/11 19:39:10, Jorge Lucangeli Obes wrote: > Since ...

8 years, 6 months ago (2012-06-11 19:56:50 UTC) #11

Jorge Lucangeli Obes

8 years, 6 months ago (2012-06-11 21:23:34 UTC) #12

On 2012/06/11 19:56:50, Markus (顧孟勤) wrote:
> https://chromiumcodereview.appspot.com/10536048/diff/8007/sandbox/linux/secco...
> File sandbox/linux/seccomp-bpf/sandbox_bpf.cc (right):
>
> https://chromiumcodereview.appspot.com/10536048/diff/8007/sandbox/linux/secco...
> sandbox/linux/seccomp-bpf/sandbox_bpf.cc:225:
> On 2012/06/11 19:39:10, Jorge Lucangeli Obes wrote:
> > Since this method is called "setSandboxPolicy", doesn't it make sense to extract
> > the sanity checks to another method?
>
> Done.
>
> https://chromiumcodereview.appspot.com/10536048/diff/8007/sandbox/linux/secco...
> sandbox/linux/seccomp-bpf/sandbox_bpf.cc:343: }
> Have you actually done any benchmarks and does this cost show up anywhere? I am
> quite curious.

On an Alex Chrome OS device (Atom) (tests by wad@)

localhost chronos # ./seccomp_getpid 10000000
Using syscall instr
No seccomp
Testing scaffold . . . 23 cycles
time: 275 cycles


localhost chronos # ./seccomp_getpid 10000000 n s
Using syscall instr
Using SECCOMP_RET_ALLOW
Testing scaffold . . . 21 cycles
time: 788 cycles

Granted, that's very micro-benchmarky, but the simplest filter (just RET_ALLOW)
adds 500 cycles.
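The two runs above differ by 788 - 275 = 513 cycles per call. For readers who want to try a similar measurement, here is a rough sketch. It is not the `seccomp_getpid` tool quoted above: `AverageGetpidNanos` is a hypothetical helper, it reports wall-clock nanoseconds rather than cycles, it is Linux-only, and it installs no filter itself, so it would have to be run once before and once after a seccomp filter is in place.

```cpp
#include <chrono>
#include <cstdint>
#include <sys/syscall.h>
#include <unistd.h>

// Returns the average wall-clock time, in nanoseconds, of one getpid()
// system call over the given number of iterations (0.0 for zero iterations).
double AverageGetpidNanos(uint64_t iterations) {
  using Clock = std::chrono::steady_clock;
  const Clock::time_point start = Clock::now();
  for (uint64_t i = 0; i < iterations; ++i) {
    // Call through syscall() so that every iteration enters the kernel,
    // even on C libraries that cache the process ID in user space.
    syscall(SYS_getpid);
  }
  const Clock::time_point end = Clock::now();
  const double total_ns =
      std::chrono::duration<double, std::nano>(end - start).count();
  return iterations ? total_ns / static_cast<double>(iterations) : 0.0;
}
```

As with the numbers quoted above, a loop like this is very much a micro-benchmark: it includes loop overhead and says nothing about cache behaviour in a real workload.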

> My gut feeling is that the number of executed BPF instructions is pretty much
> negligible on most CPUs. Even if the kernel decides to interpret the BPF filter
> instead of using a JIT.

There won't be a JIT in 3.5.

> What I do suspect will matter is the overall size of the BPF filter program and
> the overall density of the code. These two factors determine how frequently the
> CPU needs to reach out to memory instead of being able to load the BPF program
> from cache.
>
> Coalescing of ranges already helps somewhat with this goal, and the binary tree
> helps even more.
>
> For the vast majority of applications, cache footprint is the limiting factor;
> clock cycles per instruction are not, and haven't been for about ten years or so.
> These numbers might have changed again, but a few years ago, you could literally
> execute several thousand instructions while waiting for a single memory load.
>
> Of course, once memory starts loading, it will then try to aggressively stream
> some more memory. That's where code density helps us. And that's where BPF's
> forward-only jumps are quite nice.

I do agree that binary trees mostly make this discussion obsolete =).

lgtm
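To make the point about binary trees concrete, here is a small sketch that counts comparisons for a linear scan versus a binary search over coalesced ranges. It only models the lookup in plain C++; it does not generate BPF, and `VerdictRange`, `LookupLinear`, and `LookupBinary` are hypothetical names. Both functions assume a non-empty vector sorted by `from`, with the first range starting at 0.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical range type: each range runs from `from` up to the start of
// the next range, and `allow` stands in for the policy's verdict.
struct VerdictRange {
  uint32_t from;
  bool allow;
};

// Linear scan: one comparison per range boundary until the match is found.
bool LookupLinear(const std::vector<VerdictRange>& ranges, uint32_t nr,
                  size_t* comparisons) {
  *comparisons = 0;
  size_t i = 0;
  while (i + 1 < ranges.size()) {
    ++*comparisons;
    if (nr < ranges[i + 1].from) break;  // nr falls inside ranges[i]
    ++i;
  }
  return ranges[i].allow;
}

// Binary search: halves the candidate interval with each comparison.
bool LookupBinary(const std::vector<VerdictRange>& ranges, uint32_t nr,
                  size_t* comparisons) {
  *comparisons = 0;
  size_t lo = 0;              // invariant: ranges[lo].from <= nr
  size_t hi = ranges.size();  // invariant: the answer lies in [lo, hi)
  while (hi - lo > 1) {
    const size_t mid = lo + (hi - lo) / 2;
    ++*comparisons;
    if (nr < ranges[mid].from) {
      hi = mid;
    } else {
      lo = mid;
    }
  }
  return ranges[lo].allow;
}
```

The linear version needs up to one comparison per range, while the binary version needs about log2 of the number of ranges. Because the search only ever narrows toward one side of a midpoint, a compiled form can presumably be laid out so that every branch goes forward, which fits the forward-only jumps mentioned above.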

Markus (顧孟勤)

8 years, 6 months ago (2012-06-12 00:30:10 UTC) #13

I am not surprised that the mere fact that we are using BPF filters adds
some overhead. Although admittedly, 500 clock cycles will be hard to notice
in real-life.

I was more interested in how much of a penalty we pay for each additional
BPF instruction. And whether it makes a difference if we jump over lots of
inactive instructions, or whether we just jump a short distance.

There is bound to be a penalty; but is it on the same order as the penalty
for using BPF in the first place, is it much higher, or is it much less?


Markus


Chris Evans

LGTM if you move the policy sanity check to debug-only https://chromiumcodereview.appspot.com/10536048/diff/13001/sandbox/linux/seccomp-bpf/sandbox_bpf.cc File sandbox/linux/seccomp-bpf/sandbox_bpf.cc (right): https://chromiumcodereview.appspot.com/10536048/diff/13001/sandbox/linux/seccomp-bpf/sandbox_bpf.cc#newcode187 ...

8 years, 6 months ago (2012-06-12 18:11:55 UTC) #14

Chris Evans

https://chromiumcodereview.appspot.com/10536048/diff/13001/sandbox/linux/seccomp-bpf/sandbox_bpf.cc File sandbox/linux/seccomp-bpf/sandbox_bpf.cc (right): https://chromiumcodereview.appspot.com/10536048/diff/13001/sandbox/linux/seccomp-bpf/sandbox_bpf.cc#newcode314 sandbox/linux/seccomp-bpf/sandbox_bpf.cc:314: // There is really nothing the caller can do ...

8 years, 6 months ago (2012-06-12 18:13:47 UTC) #15

jln (very slow on Chromium)

On Tue, Jun 12, 2012 at 11:11 AM, <cevans@chromium.org> wrote: > LGTM if you move ...

8 years, 6 months ago (2012-06-12 18:18:59 UTC) #16

Markus (顧孟勤)

https://chromiumcodereview.appspot.com/10536048/diff/13001/sandbox/linux/seccomp-bpf/sandbox_bpf.cc File sandbox/linux/seccomp-bpf/sandbox_bpf.cc (right): https://chromiumcodereview.appspot.com/10536048/diff/13001/sandbox/linux/seccomp-bpf/sandbox_bpf.cc#newcode187 sandbox/linux/seccomp-bpf/sandbox_bpf.cc:187: EvaluateArguments argumentEvaluator) { On 2012/06/12 18:11:55, Chris Evans wrote: ...

8 years, 6 months ago (2012-06-12 19:02:59 UTC) #17

cevans

8 years, 6 months ago (2012-06-12 19:12:10 UTC) #18

LGTM

On Tue, Jun 12, 2012 at 12:02 PM, <markus@chromium.org> wrote:

> https://chromiumcodereview.appspot.com/10536048/diff/13001/sandbox/linux/seccomp-bpf/sandbox_bpf.cc
> File sandbox/linux/seccomp-bpf/sandbox_bpf.cc (right):
>
> https://chromiumcodereview.appspot.com/10536048/diff/13001/sandbox/linux/seccomp-bpf/sandbox_bpf.cc#newcode187
> sandbox/linux/seccomp-bpf/sandbox_bpf.cc:187: EvaluateArguments argumentEvaluator) {
> On 2012/06/12 18:11:55, Chris Evans wrote:
> > I think this should be debug only. That will be more than sufficient to catch
> > problems, without performing thousands of wasted iterations in production.
>
> Done.
>
> I left in the cheap tests, but disabled the expensive tests for
> production builds.
>
> https://chromiumcodereview.appspot.com/10536048/diff/13001/sandbox/linux/seccomp-bpf/sandbox_bpf.cc#newcode314
> sandbox/linux/seccomp-bpf/sandbox_bpf.cc:314: // There is really nothing the caller can do until the bug is fixed.
> On 2012/06/12 18:13:47, Chris Evans wrote:
> > Actually one more thing. I'm not entirely grokking the order of CLs here but it
> > looks possible that the #ifndef NDEBUG got lost in the move here?
>
> I'll upload a newly rebased version shortly, and it'll show this NDEBUG test
>
> https://chromiumcodereview.appspot.com/10536048/diff/13001/sandbox/linux/seccomp-bpf/sandbox_bpf.cc#newcode323
> sandbox/linux/seccomp-bpf/sandbox_bpf.cc:323: prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog)) {
> On 2012/06/12 18:11:55, Chris Evans wrote:
> > Nit (for the future): it'd be nice to differentiate which of those calls
> > failed.
>
> Already fixed by rebasing.
>
> https://chromiumcodereview.appspot.com/10536048/
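The split agreed on in this exchange (cheap checks always on, expensive checks only in debug builds) might look roughly like the sketch below. The specific checks are assumptions for illustration and are not taken from the committed code; `PolicyFn` and `PolicyLooksSane` are hypothetical names. Only the `#ifndef NDEBUG` guard itself is mentioned in the thread.

```cpp
#include <cstdint>
#include <functional>
#include <limits>

// Hypothetical policy callback: returns true if the system call is allowed.
using PolicyFn = std::function<bool(uint32_t)>;

// Returns true if the policy passes the sanity checks enabled in this build.
bool PolicyLooksSane(const PolicyFn& policy, uint32_t max_syscall) {
  // Cheap checks, kept in production builds: a handful of calls at most.
  if (!policy) return false;
  const uint32_t kFarProbe =
      static_cast<uint32_t>(std::numeric_limits<int>::max());
  if (max_syscall >= kFarProbe) return false;
  // Assumed rule: numbers just past the known range must be treated the
  // same way as a number far beyond it.
  if (policy(max_syscall + 1) != policy(kFarProbe)) return false;

#ifndef NDEBUG
  // Expensive check, debug builds only: one pass over every known system
  // call number. Assumed rule: the policy must give a stable answer.
  for (uint32_t nr = 0; nr <= max_syscall; ++nr) {
    if (policy(nr) != policy(nr)) return false;
  }
#endif
  return true;
}
```

Compiling with `-DNDEBUG` removes the loop entirely, which avoids the thousands of wasted iterations in production while still catching policy mistakes during development.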

commit-bot: I haz the power

CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/markus@chromium.org/10536048/5004

8 years, 6 months ago (2012-06-14 23:23:10 UTC) #19

Change committed as 142295
