Issue 10051022: Add an initial Linux GPU sandbox using the seccomp filter framework.

Chris Evans

Julien, if you can concentrate on the large new file and get it to lgtm, ...

8 years, 8 months ago (2012-04-12 00:30:50 UTC) #1

Chris Evans

Ken / Al, can you have a look for the (pretty small!) change to gpu_main.cc ...

8 years, 8 months ago (2012-04-12 00:53:35 UTC) #2

Will Drewry

Awesome! https://chromiumcodereview.appspot.com/10051022/diff/3009/content/common/sandbox_init_linux.cc File content/common/sandbox_init_linux.cc (right): https://chromiumcodereview.appspot.com/10051022/diff/3009/content/common/sandbox_init_linux.cc#newcode54 content/common/sandbox_init_linux.cc:54: unsigned int syscall = context->uc_mcontext.gregs[REG_RAX]; I guess this ...

8 years, 8 months ago (2012-04-12 01:08:56 UTC) #3

Markus (顧孟勤)

I am working on a more generic sandbox that will allow us to protect the ...

8 years, 8 months ago (2012-04-12 17:51:24 UTC) #4

I am working on a more generic sandbox that will allow us to protect the
renderers. But the code isn't quite ready for review yet (most notably, we I
don't have support for 32bit, yet).

My main problem is that I need to support sendmsg() and SysV IPC. Both of them
have interesting security problems. And they are unfortunately not easy to
sandbox.

Having said all of that, the basic code looks similar to what you are doing. I
have two minor comments that might help you:

https://chromiumcodereview.appspot.com/10051022/diff/3009/content/common/sand...
File content/common/sandbox_init_linux.cc (right):

https://chromiumcodereview.appspot.com/10051022/diff/3009/content/common/sand...
content/common/sandbox_init_linux.cc:143: static void
ApplyGPUPolicy(std::vector<struct sock_filter>* program) {
My ultimate plan for efficient syscall filtering is actually to build a tree
that does a binary search.

On the other hand, I am not sure if optimizing the way that we test for system
calls is actually worth the trouble. Has anybody bothered benchmarking this
code? Does the order of the tests show up in benchmarks?

https://chromiumcodereview.appspot.com/10051022/diff/3009/content/common/sand...
content/common/sandbox_init_linux.cc:199: }
If you don't mind the minimal cost of doing an extra fork(), a more thorough
test looks like this:


int Sandbox::supportsSeccompSandbox(int proc_self) {
  pid_t pid = fork();
  if (pid < 0) {
    die("Failed to check for sandbox support");
  }
  if (!pid) {
    static const struct sock_filter filter[] = {
      // If the architecture doesn't match SECCOMP_ARCH, disallow the
      // system call.
      BPF_STMT(BPF_LD+BPF_W+BPF_ABS, offsetof(struct arch_seccomp_data, arch)),
      BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, SECCOMP_ARCH, 0, 3),

      // Check the system call number. The only allowed call is exit_group()
      BPF_STMT(BPF_LD+BPF_W+BPF_ABS, offsetof(struct arch_seccomp_data, nr)),
      BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, __NR_exit_group, 0, 1),
      BPF_STMT(BPF_RET+BPF_K, SECCOMP_RET_ALLOW),
      BPF_STMT(BPF_RET+BPF_K, SECCOMP_RET_KILL),
    };

    // Try to install filter. If we succeed, return success.
    const struct sock_fprog prog = {
      ARRAYSIZE(filter),
      (struct sock_filter *)filter
    };
    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) == 0 &&
        prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog) == 0) {
      syscall(__NR_exit_group, (intptr_t)0);
    }
    _exit(1);
  }
  int status;
  TEMP_FAILURE_RETRY(waitpid(pid, &status, 0));
  return WIFEXITED(status) && !WEXITSTATUS(status);
}

jln (very slow on Chromium)

https://chromiumcodereview.appspot.com/10051022/diff/3009/content/common/sandbox_init_linux.cc File content/common/sandbox_init_linux.cc (right): https://chromiumcodereview.appspot.com/10051022/diff/3009/content/common/sandbox_init_linux.cc#newcode37 content/common/sandbox_init_linux.cc:37: static void CheckSingleThreaded() { This is racy if potentially ...

8 years, 8 months ago (2012-04-12 17:54:14 UTC) #5

apatrick_chromium

LGTM for gpu_main.cc with one suggestion. https://chromiumcodereview.appspot.com/10051022/diff/3009/content/gpu/gpu_main.cc File content/gpu/gpu_main.cc (right): https://chromiumcodereview.appspot.com/10051022/diff/3009/content/gpu/gpu_main.cc#newcode113 content/gpu/gpu_main.cc:113: (void) base::RandInt(0, 1337); ...

8 years, 8 months ago (2012-04-12 18:36:46 UTC) #6

Chris Evans

https://chromiumcodereview.appspot.com/10051022/diff/3009/content/common/sandbox_init_linux.cc File content/common/sandbox_init_linux.cc (right): https://chromiumcodereview.appspot.com/10051022/diff/3009/content/common/sandbox_init_linux.cc#newcode54 content/common/sandbox_init_linux.cc:54: unsigned int syscall = context->uc_mcontext.gregs[REG_RAX]; On 2012/04/12 01:08:56, Will ...

8 years, 8 months ago (2012-04-12 18:42:48 UTC) #7

https://chromiumcodereview.appspot.com/10051022/diff/3009/content/common/sand...
File content/common/sandbox_init_linux.cc (right):

https://chromiumcodereview.appspot.com/10051022/diff/3009/content/common/sand...
content/common/sandbox_init_linux.cc:54: unsigned int syscall =
context->uc_mcontext.gregs[REG_RAX];
On 2012/04/12 01:08:56, Will Drewry wrote:
> I guess this is easier than wrangling asm/siginfo?

Yeah, and it's also the way the Will / Kees example code grabs it.

> What other info will end up in the dump? Do you want to prime the crash dump
> with other data or are you hoping unwinding will cover all that?

Initial plan is that stack trace + syscall_nr should be sufficient to understand
and add omitted syscalls. If we get into difficulty, we can actually encode tons
more stuff in the deref ptr before we use another channel :)

https://chromiumcodereview.appspot.com/10051022/diff/3009/content/common/sand...
content/common/sandbox_init_linux.cc:133: static void
EmitKill(std::vector<struct sock_filter>* program) {
On 2012/04/12 01:08:56, Will Drewry wrote:
> nit: Any reason not to call it EmitLoggedKill?  Then if you add a
> SECCOMP_RET_KILL call, it'll just be EmitKill :)

Ah nice catch, I did previously used SECCOMP_RET_KILL but then changed to trap.
Renamed to just EmitTrap().

https://chromiumcodereview.appspot.com/10051022/diff/3009/content/common/sand...
content/common/sandbox_init_linux.cc:143: static void
ApplyGPUPolicy(std::vector<struct sock_filter>* program) {
On 2012/04/12 01:08:56, Will Drewry wrote:
> Is the plan to tighten this up further later?

Definitely. Once we've flushed out all the corner-case syscalls, we'll start
locking down arguments.

https://chromiumcodereview.appspot.com/10051022/diff/3009/content/common/sand...
content/common/sandbox_init_linux.cc:199: }
On 2012/04/12 01:08:56, Will Drewry wrote:
> An authoritative check would be:
>   ret = prctl(PR_SET_SECCOMP, 2, NULL);
>   if (ret != 0 && errno == EFAULT)
>     return true;
>   return false;
> 
> but no_new_privs is a good guard too since it is required.  The only ugly bit
is
> that now your CanUse function has side effects :)

Thanks! Much cleaner. I moved PR_SET_NEW_PRIVS to InstallFilter to complete the
cleanup.

https://chromiumcodereview.appspot.com/10051022/diff/3009/content/gpu/gpu_mai...
File content/gpu/gpu_main.cc (right):

https://chromiumcodereview.appspot.com/10051022/diff/3009/content/gpu/gpu_mai...
content/gpu/gpu_main.cc:111: if (gpu_info.vendor_id == 0x1002) {  // ATI
On 2012/04/12 01:08:56, Will Drewry wrote:
> Any chance we can get intel in here?

Yes, and imminently, but not for this first patch. I'm going to test each card
before throwing it in :)
Ideally, someone with a CrOS device + patched kernel could test this. Happy to
add a command line flag if it would help.

Kees Cook

Very exciting! :) https://chromiumcodereview.appspot.com/10051022/diff/3009/content/common/sandbox_init_linux.cc File content/common/sandbox_init_linux.cc (right): https://chromiumcodereview.appspot.com/10051022/diff/3009/content/common/sandbox_init_linux.cc#newcode54 content/common/sandbox_init_linux.cc:54: unsigned int syscall = context->uc_mcontext.gregs[REG_RAX]; On ...

8 years, 8 months ago (2012-04-12 18:56:39 UTC) #8

Chris Evans

On 2012/04/12 17:51:24, Markus (顧孟勤) wrote: > I am working on a more generic sandbox ...

8 years, 8 months ago (2012-04-12 19:01:15 UTC) #9

On 2012/04/12 17:51:24, Markus (顧孟勤) wrote:
> I am working on a more generic sandbox that will allow us to protect the
> renderers.

Awesome! The hardest work represented by this patch series is
1) Determining and then locking down a good policy (aka. syscall list).
2) "Warming up" various little subsystems etc. before entering sandbox lockdown,
so that we can keep the policy tight.

We can re-use all this when we switch over to your generic API.

> But the code isn't quite ready for review yet (most notably, we I
> don't have support for 32bit, yet).
> 
> My main problem is that I need to support sendmsg() and SysV IPC. Both of them
> have interesting security problems. And they are unfortunately not easy to
> sandbox.

Yeah, jln@, jorje@ and I were chatting about this the other day. Depending on
how the renderer <-> browser share stuff, it may need to be rearranged. And
we're hoping X won't through a wrench in :-/

> 
> Having said all of that, the basic code looks similar to what you are doing. I
> have two minor comments that might help you:
> 
>
https://chromiumcodereview.appspot.com/10051022/diff/3009/content/common/sand...
> File content/common/sandbox_init_linux.cc (right):
> 
>
https://chromiumcodereview.appspot.com/10051022/diff/3009/content/common/sand...
> content/common/sandbox_init_linux.cc:143: static void
> ApplyGPUPolicy(std::vector<struct sock_filter>* program) {
> My ultimate plan for efficient syscall filtering is actually to build a tree
> that does a binary search.

That would be nice!
wad@ and I have also chatted about a simple new BPF instruction that permits
either a jump table or a data table solution. I think that's preferable as it
would save many instructions over even a binary tree approach.

> 
> On the other hand, I am not sure if optimizing the way that we test for system
> calls is actually worth the trouble. Has anybody bothered benchmarking this
> code? Does the order of the tests show up in benchmarks?
> 
>
https://chromiumcodereview.appspot.com/10051022/diff/3009/content/common/sand...
> content/common/sandbox_init_linux.cc:199: }
> If you don't mind the minimal cost of doing an extra fork(), a more thorough
> test looks like this:

Thanks. I've gone with Will's test for now as I have confidence in it, plus it's
simple.

> 
> 
> int Sandbox::supportsSeccompSandbox(int proc_self) {
>   pid_t pid = fork();
>   if (pid < 0) {
>     die("Failed to check for sandbox support");
>   }
>   if (!pid) {
>     static const struct sock_filter filter[] = {
>       // If the architecture doesn't match SECCOMP_ARCH, disallow the
>       // system call.
>       BPF_STMT(BPF_LD+BPF_W+BPF_ABS, offsetof(struct arch_seccomp_data,
arch)),
>       BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, SECCOMP_ARCH, 0, 3),
> 
>       // Check the system call number. The only allowed call is exit_group()
>       BPF_STMT(BPF_LD+BPF_W+BPF_ABS, offsetof(struct arch_seccomp_data, nr)),
>       BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, __NR_exit_group, 0, 1),
>       BPF_STMT(BPF_RET+BPF_K, SECCOMP_RET_ALLOW),
>       BPF_STMT(BPF_RET+BPF_K, SECCOMP_RET_KILL),
>     };
> 
>     // Try to install filter. If we succeed, return success.
>     const struct sock_fprog prog = {
>       ARRAYSIZE(filter),
>       (struct sock_filter *)filter
>     };
>     if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) == 0 &&
>         prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog) == 0) {
>       syscall(__NR_exit_group, (intptr_t)0);
>     }
>     _exit(1);
>   }
>   int status;
>   TEMP_FAILURE_RETRY(waitpid(pid, &status, 0));
>   return WIFEXITED(status) && !WEXITSTATUS(status);
> }

Markus (顧孟勤)

On Thu, Apr 12, 2012 at 12:01, <cevans@chromium.org> wrote: > Awesome! It'll be a drop-in ...

8 years, 8 months ago (2012-04-12 19:42:44 UTC) #10

On Thu, Apr 12, 2012 at 12:01,  <cevans@chromium.org> wrote:
> Awesome!

It'll be a drop-in replacement for the existing Mode-1 seccomp
sandbox. I still can't make up my mind whether I'll keep the old
sandbox around, or whether I'll delete it and instead install the new
one.

> The hardest work represented by this patch series is
> 1) Determining and then locking down a good policy (aka. syscall list).

I obviously don't know about the GPU task, but for the renderer this
work has already been done:
http://code.google.com/p/seccompsandbox/source/browse/trunk/system_call_table.cc
That list has been surprisingly stable.

> 2) "Warming up" various little subsystems etc. before entering sandbox
> lockdown, so that we can keep the policy tight.

Fortunately, for the renderers this work has already been done a long
time ago. I think, Windows and Mac had the same problem.

> We can re-use all this when we switch over to your generic API.

Definitely. One of the things that I need to make easier is a way for
different users of the sandbox to select different policies. I have a
couple of options already, but this probably needs to be refined a
little more, as we get more users.

> Yeah, jln@, jorje@ and I were chatting about this the other day. Depending
> on
> how the renderer <-> browser share stuff, it may need to be rearranged. And
> we're hoping X won't through a wrench in :-/

This is tricky. I can change the renderer so that it greatly reduces
the number of sendmsg() calls. That's pretty easy to do by falling
back on recvmsg() instead. The preference for sendmsg() was only
there, because that works a lot more efficiently with the old seccomp
sandbox (ping me, if you want the gory details).

But it is a single flag that can be tuned one way or the other.

Unfortunately, besides our main IPC mechanism, we have a few minor
ad-hoc IPC mechanisms that are hard to find/fix. I think, we have an
IPC mechanism somewhere, that is a callback into fontconfig. I don't
recall the details. AGL probably knows more.

And we also had some issues at some point, where NaCl was using
sendmsg() for each video frame that they were producing. No idea,
whether this is still the case.

The upshot is, the renderer is a complex piece of software, and it'll
be hard to completely eradicate a system call; and to make sure it
doesn't get new users.

The good news is that I have solution for sendmsg() that I am
reasonably happy with. It's not as good as having kernel support, and
it does require a helper process. But other than that, it shouldn't
hurt much.

SysV IPC is a whole different issue. It is very complex to handle this
API safely. It is just as horribly insecure and broken as it looks.
Unfortunately, it is needed for MIT Shared Memory, if we want to get
fast access to the X server. I should have a working solution in the
next day or two. But I am not happy with it in general; very
complicated code.

If I had one wish for a new kernel API, then it would be the ability
to send memory mappings as ancillary data in a call to sendmsg(). That
would allow us to make SysV shared memory calls from a helper process,
without adding much complexity.

> That would be nice!
> wad@ and I have also chatted about a simple new BPF instruction that permits
> either a jump table or a data table solution. I think that's preferable as
> it
> would save many instructions over even a binary tree approach.

It probably would produce about the same code in the JIT. So, I don't
feel it is absolutely required to have this instruction. But it
certainly would make user space much cleaner -- at the cost of some
complexity in the kernel. I leave it up to Will to decide if he wants
to fight this battle.

Again, I'd love to see benchmark numbers in order to judge whether any
of this effort is actually worth it.

Markus

Chris Evans

On 2012/04/12 18:56:39, Kees Cook wrote: > Very exciting! :) > > https://chromiumcodereview.appspot.com/10051022/diff/3009/content/common/sandbox_init_linux.cc > File ...

8 years, 8 months ago (2012-04-12 19:45:18 UTC) #11

On 2012/04/12 18:56:39, Kees Cook wrote:
> Very exciting! :)
> 
>
https://chromiumcodereview.appspot.com/10051022/diff/3009/content/common/sand...
> File content/common/sandbox_init_linux.cc (right):
> 
>
https://chromiumcodereview.appspot.com/10051022/diff/3009/content/common/sand...
> content/common/sandbox_init_linux.cc:54: unsigned int syscall =
> context->uc_mcontext.gregs[REG_RAX];
> On 2012/04/12 18:42:49, Chris Evans wrote:
> > On 2012/04/12 01:08:56, Will Drewry wrote:
> > > I guess this is easier than wrangling asm/siginfo?
> > 
> > Yeah, and it's also the way the Will / Kees example code grabs it.
> 
> Will found a way to include the kernel's siginfo.h sanely, but that'll make
> things depend on that specific siginfo.h file for builds. I think instead of
> using the arch-specific REG_RAX, it'd be better to include a copy of the
> kernel's siginfo structure instead, and just recast the glibc siginfo_t to the
> new structure.

Chatted with Will and for now we propose to leave it as-is, in case access to
other registers becomes desirable (e.g. to work out _which_ fcntl() someone is
doing).

> 
>
https://chromiumcodereview.appspot.com/10051022/diff/3009/content/common/sand...
> content/common/sandbox_init_linux.cc:92: filter.k = 0;
> I would use the actual SECCOMP_* defines here and later, instead of literals,

Yeah, nice idea, done.
I also started using SECCOMP_MODE_FILTER instead of the magic value "2". I
included linux/audit.h for AUDIT_ARCH_X86_64 because that header file has been
around for a while. I also added a define for eventfd2, so older machines can
still build.

> with a section at the top with the values defined. That way, once #include
> <linux/seccomp.h> is possible, the defines can go away easily without needing
to
> check all the code.
> 
>
https://chromiumcodereview.appspot.com/10051022/diff/3009/content/common/sand...
> content/common/sandbox_init_linux.cc:199: }
> I wonder if the filter should test for failure instead of passing exit 0
> cleanly. i.e. verify that the filter actually blocks things rather than just
> installs. So, reduce it to a single statement (RET_KILL), and check
> WIFSIGNALED(status) && WTERMSIG(status) == 31

Jorge Lucangeli Obes

On 2012/04/12 19:42:44, Markus (顧孟勤) wrote: > On Thu, Apr 12, 2012 at 12:01, <mailto:cevans@chromium.org> ...

8 years, 8 months ago (2012-04-12 19:52:10 UTC) #12

On 2012/04/12 19:42:44, Markus (顧孟勤) wrote:
> On Thu, Apr 12, 2012 at 12:01,  <mailto:cevans@chromium.org> wrote:
> > Awesome!
> 
> It'll be a drop-in replacement for the existing Mode-1 seccomp
> sandbox. I still can't make up my mind whether I'll keep the old
> sandbox around, or whether I'll delete it and instead install the new
> one.
> 
> > The hardest work represented by this patch series is
> > 1) Determining and then locking down a good policy (aka. syscall list).
> 
> I obviously don't know about the GPU task, but for the renderer this
> work has already been done:
>
http://code.google.com/p/seccompsandbox/source/browse/trunk/system_call_table.cc
> That list has been surprisingly stable.
> 
> > 2) "Warming up" various little subsystems etc. before entering sandbox
> > lockdown, so that we can keep the policy tight.
> 
> Fortunately, for the renderers this work has already been done a long
> time ago. I think, Windows and Mac had the same problem.
> 
> > We can re-use all this when we switch over to your generic API.
> 
> Definitely. One of the things that I need to make easier is a way for
> different users of the sandbox to select different policies. I have a
> couple of options already, but this probably needs to be refined a
> little more, as we get more users.
> 
> > Yeah, jln@, jorje@ and I were chatting about this the other day. Depending
> > on
> > how the renderer <-> browser share stuff, it may need to be rearranged. And
> > we're hoping X won't through a wrench in :-/
> 
> This is tricky. I can change the renderer so that it greatly reduces
> the number of sendmsg() calls. That's pretty easy to do by falling
> back on recvmsg() instead. The preference for sendmsg() was only
> there, because that works a lot more efficiently with the old seccomp
> sandbox (ping me, if you want the gory details).
> 
> But it is a single flag that can be tuned one way or the other.
> 
> Unfortunately, besides our main IPC mechanism, we have a few minor
> ad-hoc IPC mechanisms that are hard to find/fix. I think, we have an
> IPC mechanism somewhere, that is a callback into fontconfig. I don't
> recall the details. AGL probably knows more.

Yeah, that's the "Sandbox IPC", which is used to get the renderers to talk to
fontconfig. It uses sendmsg/recvmsg so it shouldn't be a problem.

http://code.google.com/searchframe#OAMlx_jo-ck/src/content/browser/zygote_mai...

> And we also had some issues at some point, where NaCl was using
> sendmsg() for each video frame that they were producing. No idea,
> whether this is still the case.
> 
> The upshot is, the renderer is a complex piece of software, and it'll
> be hard to completely eradicate a system call; and to make sure it
> doesn't get new users.
> 
> The good news is that I have solution for sendmsg() that I am
> reasonably happy with. It's not as good as having kernel support, and
> it does require a helper process. But other than that, it shouldn't
> hurt much.
> 
> SysV IPC is a whole different issue. It is very complex to handle this
> API safely. It is just as horribly insecure and broken as it looks.
> Unfortunately, it is needed for MIT Shared Memory, if we want to get
> fast access to the X server. I should have a working solution in the
> next day or two. But I am not happy with it in general; very
> complicated code.
> 
> If I had one wish for a new kernel API, then it would be the ability
> to send memory mappings as ancillary data in a call to sendmsg(). That
> would allow us to make SysV shared memory calls from a helper process,
> without adding much complexity.
> 
> > That would be nice!
> > wad@ and I have also chatted about a simple new BPF instruction that permits
> > either a jump table or a data table solution. I think that's preferable as
> > it
> > would save many instructions over even a binary tree approach.
> 
> It probably would produce about the same code in the JIT. So, I don't
> feel it is absolutely required to have this instruction. But it
> certainly would make user space much cleaner -- at the cost of some
> complexity in the kernel. I leave it up to Will to decide if he wants
> to fight this battle.
> 
> Again, I'd love to see benchmark numbers in order to judge whether any
> of this effort is actually worth it.
> 
> 
> Markus

Chris Evans

On 2012/04/12 18:36:46, apatrick_chromium wrote: > LGTM for gpu_main.cc with one suggestion. > > https://chromiumcodereview.appspot.com/10051022/diff/3009/content/gpu/gpu_main.cc ...

8 years, 8 months ago (2012-04-12 20:31:15 UTC) #13

Chris Evans

On 2012/04/12 17:54:14, jln wrote: > https://chromiumcodereview.appspot.com/10051022/diff/3009/content/common/sandbox_init_linux.cc > File content/common/sandbox_init_linux.cc (right): > > https://chromiumcodereview.appspot.com/10051022/diff/3009/content/common/sandbox_init_linux.cc#newcode37 > ...

8 years, 8 months ago (2012-04-12 20:33:40 UTC) #14

Markus (顧孟勤)

On Thu, Apr 12, 2012 at 12:52, <jorgelo@chromium.org> wrote: > Yeah, that's the "Sandbox IPC", ...

8 years, 8 months ago (2012-04-12 20:54:28 UTC) #15

Jorge Lucangeli Obes

On 2012/04/12 20:54:28, Markus (顧孟勤) wrote: > On Thu, Apr 12, 2012 at 12:52, <mailto:jorgelo@chromium.org> ...

8 years, 8 months ago (2012-04-12 21:52:57 UTC) #16

On 2012/04/12 20:54:28, Markus (顧孟勤) wrote:
> On Thu, Apr 12, 2012 at 12:52,  <mailto:jorgelo@chromium.org> wrote:
> > Yeah, that's the "Sandbox IPC", which is used to get the renderers to talk
> > to fontconfig. It uses sendmsg/recvmsg so it shouldn't be a problem.
> 
> sendmsg() is a bit of a problem. On x86_32, it is difficult, because
> it uses socketcall(), which introduces an extra indirection. On
> x86_64, we don't have this problem, but we still do have the problem
> of "struct msghdr" doing way too many different things at once.
> 
> We want to make sure that we cannot send on an unconnected socket, as
> that could have all sorts of interesting security implications (e.g.
> it could probably be used to map your LAN). We also, if at all
> possible, want to filter the type of ancillary messages. Sending file
> descriptors is probably safe. But this API just looks like the type of
> thing that will get richer over time; so, we don't want to allow
> sending of arbitrary ancillary data which could have unknown security
> properties.
> 
> All of this requires checking data that lives in memory rather than in
> registers. It is thus not accessible from BPF filters.
> 
> If, on the other hand, we could change the protocol and use recvmsg()
> inside of the sandbox instead of using sendmsg(), things would be much
> easier. recvmsg() doesn't have these security problems and can be
> allowed (almost) unconditionally.

I keep forgetting that we still need to support 32 bits for regular Linux. These
are all very good points.

I've been going over the code for the sandbox IPC and it does need to pass fd's
around. Moreover, since all the renderers share the same socket to communicate
with the Sandbox IPC host, each request includes the fd where the reply should
be written to.

Anyways, I did not mean to completely hijack the thread. I agree that this is a
great way to start using seccomp filter + BPF.

cevans

On Thu, Apr 12, 2012 at 2:52 PM, <jorgelo@chromium.org> wrote: > On 2012/04/12 20:54:28, Markus ...

8 years, 8 months ago (2012-04-12 22:00:29 UTC) #17

On Thu, Apr 12, 2012 at 2:52 PM, <jorgelo@chromium.org> wrote:

> On 2012/04/12 20:54:28, Markus (顧孟勤) wrote:
>
>  On Thu, Apr 12, 2012 at 12:52,  <mailto:jorgelo@chromium.org> wrote:
>> > Yeah, that's the "Sandbox IPC", which is used to get the renderers to
>> talk
>> > to fontconfig. It uses sendmsg/recvmsg so it shouldn't be a problem.
>>
>
>  sendmsg() is a bit of a problem. On x86_32, it is difficult, because
>> it uses socketcall(), which introduces an extra indirection. On
>> x86_64, we don't have this problem, but we still do have the problem
>> of "struct msghdr" doing way too many different things at once.
>>
>
>  We want to make sure that we cannot send on an unconnected socket, as
>> that could have all sorts of interesting security implications (e.g.
>> it could probably be used to map your LAN). We also, if at all
>> possible, want to filter the type of ancillary messages. Sending file
>> descriptors is probably safe. But this API just looks like the type of
>> thing that will get richer over time; so, we don't want to allow
>> sending of arbitrary ancillary data which could have unknown security
>> properties.
>>
>
>  All of this requires checking data that lives in memory rather than in
>> registers. It is thus not accessible from BPF filters.
>>
>
>  If, on the other hand, we could change the protocol and use recvmsg()
>> inside of the sandbox instead of using sendmsg(), things would be much
>> easier. recvmsg() doesn't have these security problems and can be
>> allowed (almost) unconditionally.
>>
>
> I keep forgetting that we still need to support 32 bits for regular Linux.
> These
> are all very good points.
>
> I've been going over the code for the sandbox IPC and it does need to pass
> fd's
> around. Moreover, since all the renderers share the same socket to
> communicate
> with the Sandbox IPC host, each request includes the fd where the reply
> should
> be written to.
>
> Anyways, I did not mean to completely hijack the thread. I agree that this
> is a
> great way to start using seccomp filter + BPF.
>

One of you Linux-y guys want to lgtm so jam@ isn't scared when I approach
him for OWNERS lgtm?


>
https://chromiumcodereview.**appspot.com/10051022/<https://chromiumcodereview...
>

Markus (顧孟勤)

Hold on for a minute or two. I am about to give you a little ...

8 years, 8 months ago (2012-04-12 22:12:58 UTC) #21

Hold on for a minute or two. I am about to give you a little more feedback.

On Thu, Apr 12, 2012 at 15:00, Chris Evans <cevans@google.com> wrote:
> On Thu, Apr 12, 2012 at 2:52 PM, <jorgelo@chromium.org> wrote:
>>
>> On 2012/04/12 20:54:28, Markus (顧孟勤) wrote:
>>
>>> On Thu, Apr 12, 2012 at 12:52,  <mailto:jorgelo@chromium.org> wrote:
>>> > Yeah, that's the "Sandbox IPC", which is used to get the renderers to
>>> > talk
>>> > to fontconfig. It uses sendmsg/recvmsg so it shouldn't be a problem.
>>
>>
>>> sendmsg() is a bit of a problem. On x86_32, it is difficult, because
>>> it uses socketcall(), which introduces an extra indirection. On
>>> x86_64, we don't have this problem, but we still do have the problem
>>> of "struct msghdr" doing way too many different things at once.
>>
>>
>>> We want to make sure that we cannot send on an unconnected socket, as
>>> that could have all sorts of interesting security implications (e.g.
>>> it could probably be used to map your LAN). We also, if at all
>>> possible, want to filter the type of ancillary messages. Sending file
>>> descriptors is probably safe. But this API just looks like the type of
>>> thing that will get richer over time; so, we don't want to allow
>>> sending of arbitrary ancillary data which could have unknown security
>>> properties.
>>
>>
>>> All of this requires checking data that lives in memory rather than in
>>> registers. It is thus not accessible from BPF filters.
>>
>>
>>> If, on the other hand, we could change the protocol and use recvmsg()
>>> inside of the sandbox instead of using sendmsg(), things would be much
>>> easier. recvmsg() doesn't have these security problems and can be
>>> allowed (almost) unconditionally.
>>
>>
>> I keep forgetting that we still need to support 32 bits for regular Linux.
>> These
>> are all very good points.
>>
>> I've been going over the code for the sandbox IPC and it does need to pass
>> fd's
>> around. Moreover, since all the renderers share the same socket to
>> communicate
>> with the Sandbox IPC host, each request includes the fd where the reply
>> should
>> be written to.
>>
>> Anyways, I did not mean to completely hijack the thread. I agree that this
>> is a
>> great way to start using seccomp filter + BPF.
>
>
> One of you Linux-y guys want to lgtm so jam@ isn't scared when I approach
> him for OWNERS lgtm?
>
>>
>> https://chromiumcodereview.appspot.com/10051022/
>
>

Markus (顧孟勤)

lgtm Overall, this looks good. There are still a couple of noticeable gaps. But we ...

8 years, 8 months ago (2012-04-12 22:22:33 UTC) #22

lgtm

Overall, this looks good. There are still a couple of noticeable gaps. But we
can close those later.

I am wondering whether the location for this code is correct though. Shouldn't
it go into src/sandbox/linux/...? If so, you should probably pick some unique
directory name, so that we don't run into problems when I am ready to check in
the more generic code. That one will likely go into src/sandbox/linux/seccomp2
or possibly src/sandbox/linux/seccomp-bpf

https://chromiumcodereview.appspot.com/10051022/diff/12003/content/common/san...
File content/common/sandbox_init_linux.cc (right):

https://chromiumcodereview.appspot.com/10051022/diff/12003/content/common/san...
content/common/sandbox_init_linux.cc:93: filter.k = 4;
Instead of hardcoding the offset, you might want to compute it with offsetof()

https://chromiumcodereview.appspot.com/10051022/diff/12003/content/common/san...
content/common/sandbox_init_linux.cc:112: filter.k = 0;
Same here

https://chromiumcodereview.appspot.com/10051022/diff/12003/content/common/san...
content/common/sandbox_init_linux.cc:127: filter.k = SECCOMP_RET_ALLOW;
This generates twice as many instructions as necessary.

Not sure if you really worry about this. But you could have one comparison per
system call and then one single SECCOMP_RET_ALLOW for the whole group of system
calls.

You just have to make sure you don't exceed 256 calls.

https://chromiumcodereview.appspot.com/10051022/diff/12003/content/common/san...
content/common/sandbox_init_linux.cc:132: std::vector<struct sock_filter>*
program) {
EmitAllowSyscall() is really the same as EmitFailSyscall() with the exception of
the exit code. You might want to combine these two functions.

https://chromiumcodereview.appspot.com/10051022/diff/12003/content/common/san...
content/common/sandbox_init_linux.cc:159: EmitAllowSyscall(__NR_ioctl, program);
That sound dangerous. There are way too many things that ioctl() can do. Can you
look at the arguments and limit things to something a little saner.

https://chromiumcodereview.appspot.com/10051022/diff/12003/content/common/san...
content/common/sandbox_init_linux.cc:169: EmitAllowSyscall(__NR_madvise,
program);
The memory functions (madvise, mmap, munmap, mprotect, mremap, ...) are
potentially tricky, if your code relies on the integrity of some of its data. I
don't think this is a problem you are currently having. But just something to
keep in mind, if you ever plan on extending the sandbox.

https://chromiumcodereview.appspot.com/10051022/diff/12003/content/common/san...
content/common/sandbox_init_linux.cc:170: EmitAllowSyscall(__NR_sendmsg,
program);
sendmsg() is potentially dangerous. At the very least, if could be used to map
your LAN. But other types of attack are also conceivable.

https://chromiumcodereview.appspot.com/10051022/diff/12003/content/common/san...
content/common/sandbox_init_linux.cc:176: EmitAllowSyscall(__NR_clone, program);
clone() can have surprising semantics. I'd suggest you only allow the specific
subset of flags that you actually need to support.

https://chromiumcodereview.appspot.com/10051022/diff/12003/content/common/san...
content/common/sandbox_init_linux.cc:186: EmitAllowSyscall(__NR_prctl, program);
I would strongly encourage you to limit prctl() to just those arguments that you
care about. It's a grab-bag of different features that probably shouldn't all be
allowed.

https://chromiumcodereview.appspot.com/10051022/diff/12003/content/common/san...
content/common/sandbox_init_linux.cc:189: EmitAllowSyscall(__NR_restart_syscall,
program);
Is this actually visible to seccomp/BPF? I thought it was an internal function,
but I might very well be mistaken.

jln (very slow on Chromium)

https://chromiumcodereview.appspot.com/10051022/diff/12003/content/common/sandbox_init_linux.cc File content/common/sandbox_init_linux.cc (right): https://chromiumcodereview.appspot.com/10051022/diff/12003/content/common/sandbox_init_linux.cc#newcode189 content/common/sandbox_init_linux.cc:189: EmitAllowSyscall(__NR_restart_syscall, program); On 2012/04/12 22:22:33, Markus (顧孟勤) wrote: > ...

8 years, 8 months ago (2012-04-12 22:36:46 UTC) #23

Markus (顧孟勤)

I wrote a simple little test program. And no matter what I throw at it, ...

8 years, 8 months ago (2012-04-12 23:25:27 UTC) #24

cevans

On Thu, Apr 12, 2012 at 4:25 PM, Markus Gutschke <markus@chromium.org>wrote: > I wrote a ...

8 years, 8 months ago (2012-04-12 23:28:19 UTC) #25

Markus (顧孟勤)

strace() output is not necessarily to be trusted in this situation. For instance, strace can ...

8 years, 8 months ago (2012-04-12 23:35:24 UTC) #26

cevans

On Thu, Apr 12, 2012 at 4:34 PM, Markus Gutschke <markus@chromium.org>wrote: > strace() output is ...

8 years, 8 months ago (2012-04-12 23:36:05 UTC) #27

Chris Evans

On 2012/04/12 22:22:33, Markus (顧孟勤) wrote: > lgtm > > Overall, this looks good. There ...

8 years, 8 months ago (2012-04-12 23:52:42 UTC) #28

On 2012/04/12 22:22:33, Markus (顧孟勤) wrote:
> lgtm
> 
> Overall, this looks good. There are still a couple of noticeable gaps. But we
> can close those later.
> 
> I am wondering whether the location for this code is correct though. Shouldn't
> it go into src/sandbox/linux/...? If so, you should probably pick some unique
> directory name, so that we don't run into problems when I am ready to check in
> the more generic code. That one will likely go into src/sandbox/linux/seccomp2
> or possibly src/sandbox/linux/seccomp-bpf

I think longer term, the policy should live with the code it applies to (e.g.
content/ ) and the mechanism should live in sandbox. We'll get there when you
flip over to using your generic API in the future.

> 
>
https://chromiumcodereview.appspot.com/10051022/diff/12003/content/common/san...
> File content/common/sandbox_init_linux.cc (right):
> 
>
https://chromiumcodereview.appspot.com/10051022/diff/12003/content/common/san...
> content/common/sandbox_init_linux.cc:93: filter.k = 4;
> Instead of hardcoding the offset, you might want to compute it with offsetof()

Do you mind if we defer for now? I'm trying to keep copies of structures out.
I'll certainly 

> 
>
https://chromiumcodereview.appspot.com/10051022/diff/12003/content/common/san...
> content/common/sandbox_init_linux.cc:112: filter.k = 0;
> Same here
> 
>
https://chromiumcodereview.appspot.com/10051022/diff/12003/content/common/san...
> content/common/sandbox_init_linux.cc:127: filter.k = SECCOMP_RET_ALLOW;
> This generates twice as many instructions as necessary.
> 
> Not sure if you really worry about this. But you could have one comparison per
> system call and then one single SECCOMP_RET_ALLOW for the whole group of
system
> calls.

I prefer to leave it as-is now, for the sake of simplicity. Otherwise, I might
need to know the length of the program when generating jumps, etc.

> 
> You just have to make sure you don't exceed 256 calls.
> 
>
https://chromiumcodereview.appspot.com/10051022/diff/12003/content/common/san...
> content/common/sandbox_init_linux.cc:132: std::vector<struct sock_filter>*
> program) {
> EmitAllowSyscall() is really the same as EmitFailSyscall() with the exception
of
> the exit code. You might want to combine these two functions.

Yeah, I'll factor out this commonality (and some other commonality I see) and
make it cleaner. Good idea. I'll do it before I land it.

> 
>
https://chromiumcodereview.appspot.com/10051022/diff/12003/content/common/san...
> content/common/sandbox_init_linux.cc:159: EmitAllowSyscall(__NR_ioctl,
program);
> That sound dangerous. There are way too many things that ioctl() can do. Can
you
> look at the arguments and limit things to something a little saner.

Julien and Will were raising questions about argument checking earlier. It's
intended that we revisit this once we've got a stable policy that works. We will
then lock it down to the extent possible.

(Will also raised MAP_EXEC which unfortunately we need for the case LLVM kicks
it when a card lacks a certain shader support :)

I'm not too worried about the specific ioctl() case. There are very few generic
file ioctls, and the policy doesn't permit opening of other special files. And
fd 0/1/2 are never terminal fds so all the horrible terminal ioctl()s are in
play.

That said, we can try and lock it down in the future.

> 
>
https://chromiumcodereview.appspot.com/10051022/diff/12003/content/common/san...
> content/common/sandbox_init_linux.cc:169: EmitAllowSyscall(__NR_madvise,
> program);
> The memory functions (madvise, mmap, munmap, mprotect, mremap, ...) are
> potentially tricky, if your code relies on the integrity of some of its data.
I
> don't think this is a problem you are currently having. But just something to
> keep in mind, if you ever plan on extending the sandbox.
> 
>
https://chromiumcodereview.appspot.com/10051022/diff/12003/content/common/san...
> content/common/sandbox_init_linux.cc:170: EmitAllowSyscall(__NR_sendmsg,
> program);
> sendmsg() is potentially dangerous. At the very least, if could be used to map
> your LAN. But other types of attack are also conceivable.

I don't think the process has any open network fd's (other than local and
connected socks from socketpairs). Neither does the policy permit socket(),
thank goodness. So I doubt an attack can be mounted.

> 
>
https://chromiumcodereview.appspot.com/10051022/diff/12003/content/common/san...
> content/common/sandbox_init_linux.cc:176: EmitAllowSyscall(__NR_clone,
program);
> clone() can have surprising semantics. I'd suggest you only allow the specific
> subset of flags that you actually need to support.
> 
>
https://chromiumcodereview.appspot.com/10051022/diff/12003/content/common/san...
> content/common/sandbox_init_linux.cc:186: EmitAllowSyscall(__NR_prctl,
program);
> I would strongly encourage you to limit prctl() to just those arguments that
you
> care about. It's a grab-bag of different features that probably shouldn't all
be
> allowed.

Yeah, definitely on the TODO list.

> 
>
https://chromiumcodereview.appspot.com/10051022/diff/12003/content/common/san...
> content/common/sandbox_init_linux.cc:189:
EmitAllowSyscall(__NR_restart_syscall,
> program);
> Is this actually visible to seccomp/BPF? I thought it was an internal
function,
> but I might very well be mistaken.

Markus (顧孟勤)

On Thu, Apr 12, 2012 at 16:52, <cevans@chromium.org> wrote: > Do you mind if we ...

8 years, 8 months ago (2012-04-13 00:06:16 UTC) #29

On Thu, Apr 12, 2012 at 16:52,  <cevans@chromium.org> wrote:
> Do you mind if we defer for now? I'm trying to keep copies of structures
> out.

If you think this code is eventually going to go away anyway, I don't
mind too much. I think, having hard-coded constants makes the code a
little less readable. But that's your call. In my case, I opted to
define "struct arch_seccomp_data", which looks exactly like the
official "struct seccomp_data". But it doesn't cause compilation
failures, independent of whether the system headers define the struct.

> I prefer to leave it as-is now, for the sake of simplicity. Otherwise, I
> might
> need to know the length of the program when generating jumps, etc.

It's not particularly complicated to automatically compute these
jumps. But again, if you ultimately expect this code to go away in
favor of the more generic sandbox, then removing these instructions
sounds like pre-mature optimization. It probably is more important to
have some code that is easy to understand and that allows you to start
experimenting with.

Maybe, just add a comment that this is known to be sub-optimal, but it
doesn't matter at this time.

> Julien and Will were raising questions about argument checking earlier. It's
> intended that we revisit this once we've got a stable policy that works. We
> will then lock it down to the extent possible.

ioctl() has been the source of numerous kernel bugs, and I am worried
about crazy super-permissive ioctl()'s in graphics drivers. But there
is probably not a lot we can do there. And you are correct that there
are not that many generic ioctl()s.

> I don't think the process has any open network fd's (other than local and
> connected socks from socketpairs). Neither does the policy permit socket(),
> thank goodness. So I doubt an attack can be mounted.

Socketpair is probably OK, as it creates a UNIX domain socket, which
should be safe. But any INET socket would be dangerous.

Markus (顧孟勤)

On Thu, Apr 12, 2012 at 16:36, Chris Evans <cevans@google.com> wrote: > On Thu, Apr ...

8 years, 8 months ago (2012-04-13 00:09:07 UTC) #30

cevans

On Thu, Apr 12, 2012 at 5:08 PM, Markus Gutschke <markus@chromium.org>wrote: > On Thu, Apr ...

8 years, 8 months ago (2012-04-13 00:11:46 UTC) #31

Markus (顧孟勤)

That would help. I am trying to reproduce, but I don't have any luck yet. ...

8 years, 8 months ago (2012-04-13 00:12:52 UTC) #32

cevans

On Thu, Apr 12, 2012 at 5:12 PM, Markus Gutschke <markus@chromium.org>wrote: > That would help. ...

8 years, 8 months ago (2012-04-13 00:24:53 UTC) #33

On Thu, Apr 12, 2012 at 5:12 PM, Markus Gutschke <markus@chromium.org>wrote:

> That would help. I am trying to reproduce, but I don't have any luck yet.
>

[ 3899.008325] type=1701 audit(1334276389.054:3392): auid=4294967295
uid=115 gid=125 ses=4294967295 pid=5411 comm="vsftpd" reason="seccomp"
sig=31 syscall=219 compat=0 ip=0x7f1d9bff7020 code=0x0
#define __NR_restart_syscall                    219

- vsftpd-3.0.0
- comment out "  allow_nr(__NR_restart_syscall);"
- run a simple anonymous setup, but with:
trans_chunk_size=4096
anon_max_rate=1000
... in /etc/vsftpd.conf
- start the transfer of a large (some MB) file
- strace the process running as "ftp" (this will probably interrupt
sendfile)
- bang

I also tried with "use_sendfile=NO" and still bang so it seems to work for
many forms of blocking read.


>
> Markus
>
> On Thu, Apr 12, 2012 at 17:11, Chris Evans <cevans@google.com> wrote:
> > On Thu, Apr 12, 2012 at 5:08 PM, Markus Gutschke <markus@chromium.org>
> > wrote:
> >>
> >> On Thu, Apr 12, 2012 at 16:36, Chris Evans <cevans@google.com> wrote:
> >> > On Thu, Apr 12, 2012 at 4:34 PM, Markus Gutschke <markus@chromium.org
> >
> >> > wrote:
> >> >>
> >> >> strace() output is not necessarily to be trusted in this situation.
> >> >> For instance, strace can occasionally see ERESTART, which I believe
> is
> >> >> otherwise invisible to userspace.
> >> >
> >> >
> >> > My program died and the kernel logged a violation.
> >>
> >> Are you sure you didn't see the result of a call to sigreturn() or
> >> rt_sigreturn(). I don't think you are currently white-listing both of
> >> them.
> >
> >
> > vsftpd-3.0.0, seccompsandbox.c
> >   /* Misc simple low-risk calls. */
> >   allow_nr(__NR_rt_sigreturn); /* Used to handle SIGPIPE. */
> >   allow_nr(__NR_restart_syscall);
> >   allow_nr(__NR_close);
> >
> > Do you want me to retry it?
> >
> >>
> >> And on x86-32 you could definitely see sigreturn() instead of
> >> rt_sigreturn() (this depends on the sa_flags that you gave in
> >> sigaction()).
> >>
> >>
> >> Markus
> >
> >
>

cevans

On Thu, Apr 12, 2012 at 5:24 PM, Chris Evans <cevans@google.com> wrote: > On Thu, ...

8 years, 8 months ago (2012-04-13 00:25:54 UTC) #34

On Thu, Apr 12, 2012 at 5:24 PM, Chris Evans <cevans@google.com> wrote:

> On Thu, Apr 12, 2012 at 5:12 PM, Markus Gutschke <markus@chromium.org>wrote:
>
>> That would help. I am trying to reproduce, but I don't have any luck yet.
>>
>
> [ 3899.008325] type=1701 audit(1334276389.054:3392): auid=4294967295
> uid=115 gid=125 ses=4294967295 pid=5411 comm="vsftpd" reason="seccomp"
> sig=31 syscall=219 compat=0 ip=0x7f1d9bff7020 code=0x0
> #define __NR_restart_syscall                    219
>
> - vsftpd-3.0.0
> - comment out "  allow_nr(__NR_restart_syscall);"
> - run a simple anonymous setup, but with:
> trans_chunk_size=4096
> anon_max_rate=1000
> ... in /etc/vsftpd.conf
> - start the transfer of a large (some MB) file
> - strace the process running as "ftp" (this will probably interrupt
> sendfile)
> - bang
>
> I also tried with "use_sendfile=NO" and still bang so it seems to work for
> many forms of blocking read.
>

Ah, Julien points out (over chat) that it could be nanosleep() that is
interrupted. anon_max_rate causes vsftpd to sleep a lot :)


>
>
>>
>> Markus
>>
>> On Thu, Apr 12, 2012 at 17:11, Chris Evans <cevans@google.com> wrote:
>> > On Thu, Apr 12, 2012 at 5:08 PM, Markus Gutschke <markus@chromium.org>
>> > wrote:
>> >>
>> >> On Thu, Apr 12, 2012 at 16:36, Chris Evans <cevans@google.com> wrote:
>> >> > On Thu, Apr 12, 2012 at 4:34 PM, Markus Gutschke <
>> markus@chromium.org>
>> >> > wrote:
>> >> >>
>> >> >> strace() output is not necessarily to be trusted in this situation.
>> >> >> For instance, strace can occasionally see ERESTART, which I believe
>> is
>> >> >> otherwise invisible to userspace.
>> >> >
>> >> >
>> >> > My program died and the kernel logged a violation.
>> >>
>> >> Are you sure you didn't see the result of a call to sigreturn() or
>> >> rt_sigreturn(). I don't think you are currently white-listing both of
>> >> them.
>> >
>> >
>> > vsftpd-3.0.0, seccompsandbox.c
>> >   /* Misc simple low-risk calls. */
>> >   allow_nr(__NR_rt_sigreturn); /* Used to handle SIGPIPE. */
>> >   allow_nr(__NR_restart_syscall);
>> >   allow_nr(__NR_close);
>> >
>> > Do you want me to retry it?
>> >
>> >>
>> >> And on x86-32 you could definitely see sigreturn() instead of
>> >> rt_sigreturn() (this depends on the sa_flags that you gave in
>> >> sigaction()).
>> >>
>> >>
>> >> Markus
>> >
>> >
>>
>
>

jln (very slow on Chromium)

On Thu, Apr 12, 2012 at 5:25 PM, Chris Evans <cevans@google.com> wrote: > On Thu, ...

8 years, 8 months ago (2012-04-13 00:28:41 UTC) #35

Markus (顧孟勤)

Yes, Julien figured it out. If you are stuck in a call to nanosleep() and ...

8 years, 8 months ago (2012-04-13 00:35:33 UTC) #36

Will Drewry

On 2012/04/13 00:35:33, Markus (顧孟勤) wrote: > Yes, Julien figured it out. If you are ...

8 years, 8 months ago (2012-04-13 02:53:15 UTC) #37

jam

http://codereview.chromium.org/10051022/diff/14004/content/public/common/sandbox_init.h File content/public/common/sandbox_init.h (right): http://codereview.chromium.org/10051022/diff/14004/content/public/common/sandbox_init.h#newcode50 content/public/common/sandbox_init.h:50: CONTENT_EXPORT void InitializeSandbox(); since this isn't called by outside ...

8 years, 8 months ago (2012-04-13 05:57:58 UTC) #38

cevans

On Thu, Apr 12, 2012 at 10:57 PM, <jam@chromium.org> wrote: > > http://codereview.chromium.**org/10051022/diff/14004/** > content/public/common/sandbox_**init.h<http://codereview.chromium.org/10051022/diff/14004/content/public/common/sandbox_init.h> ...

8 years, 8 months ago (2012-04-13 06:34:18 UTC) #39

jam

On 2012/04/13 06:34:18, cevans wrote: > On Thu, Apr 12, 2012 at 10:57 PM, <mailto:jam@chromium.org> ...

8 years, 8 months ago (2012-04-13 06:37:56 UTC) #40

cevans

On Thu, Apr 12, 2012 at 11:37 PM, <jam@chromium.org> wrote: > On 2012/04/13 06:34:18, cevans ...

8 years, 8 months ago (2012-04-13 06:47:12 UTC) #41

On Thu, Apr 12, 2012 at 11:37 PM, <jam@chromium.org> wrote:

> On 2012/04/13 06:34:18, cevans wrote:
>
>> On Thu, Apr 12, 2012 at 10:57 PM, <mailto:jam@chromium.org> wrote:
>>
>
>  >
>> > http://codereview.chromium.****org/10051022/diff/14004/**
>> >
>>
>
> content/public/common/sandbox_****init.h<http://codereview.**
>
chromium.org/10051022/diff/**14004/content/public/common/**sandbox_init.h<http://codereview.chromium.org/10051022/diff/14004/content/public/common/sandbox_init.h>
> >
>
>> > File content/public/common/sandbox_****init.h (right):
>> >
>> > http://codereview.chromium.****org/10051022/diff/14004/**
>> >
>>
>
> content/public/common/sandbox_****init.h#newcode50<http://**
> codereview.chromium.org/**10051022/diff/14004/content/**
>
public/common/sandbox_init.h#**newcode50<http://codereview.chromium.org/10051022/diff/14004/content/public/common/sandbox_init.h#newcode50>
> >
>
>> > content/public/common/sandbox_****init.h:50: CONTENT_EXPORT void
>>
>> > InitializeSandbox();
>> > since this isn't called by outside content, it doesn't need to be
>> > exposed in the public api. can you just put this in a
>> > sandbox_init_linux.h beside the new cc file?
>> >
>>
>
>  Ok. Any idea why it's CONTENT_EXPORT for all the other platforms?
>>
>
> the other ones are called from chrome now
> http://code.google.com/p/**chromium/source/search?q=%**
> 22InitializeSandbox%28%22+**file%3Achrome&origq=%**
>
22InitializeSandbox%28%22+**file%3Achrome&btnG=Search+**Trunk<http://code.google.com/p/chromium/source/search?q=%22InitializeSandbox%28%22+file%3Achrome&origq=%22InitializeSandbox%28%22+file%3Achrome&btnG=Search+Trunk>
>
>
http://codereview.chromium.**org/10051022/<http://codereview.chromium.org/100...
>

Yeah, I just found the reference from chrome/nacl -- for the Windows port.
I think we might want to have a similar setup for Linux in the future? No
reason to not lock down nacl further on all platforms :)

jam

On 2012/04/13 06:47:12, cevans wrote: > On Thu, Apr 12, 2012 at 11:37 PM, <mailto:jam@chromium.org> ...

8 years, 8 months ago (2012-04-13 18:06:50 UTC) #42

On 2012/04/13 06:47:12, cevans wrote:
> On Thu, Apr 12, 2012 at 11:37 PM, <mailto:jam@chromium.org> wrote:
> 
> > On 2012/04/13 06:34:18, cevans wrote:
> >
> >> On Thu, Apr 12, 2012 at 10:57 PM, <mailto:jam@chromium.org> wrote:
> >>
> >
> >  >
> >> > http://codereview.chromium.****org/10051022/diff/14004/**
> >> >
> >>
> >
> > content/public/common/sandbox_****init.h<http://codereview.**
> >
>
chromium.org/10051022/diff/**14004/content/public/common/**sandbox_init.h<http://codereview.chromium.org/10051022/diff/14004/content/public/common/sandbox_init.h>
> > >
> >
> >> > File content/public/common/sandbox_****init.h (right):
> >> >
> >> > http://codereview.chromium.****org/10051022/diff/14004/**
> >> >
> >>
> >
> > content/public/common/sandbox_****init.h#newcode50<http://**
> > codereview.chromium.org/**10051022/diff/14004/content/**
> >
>
public/common/sandbox_init.h#**newcode50<http://codereview.chromium.org/10051022/diff/14004/content/public/common/sandbox_init.h#newcode50>
> > >
> >
> >> > content/public/common/sandbox_****init.h:50: CONTENT_EXPORT void
> >>
> >> > InitializeSandbox();
> >> > since this isn't called by outside content, it doesn't need to be
> >> > exposed in the public api. can you just put this in a
> >> > sandbox_init_linux.h beside the new cc file?
> >> >
> >>
> >
> >  Ok. Any idea why it's CONTENT_EXPORT for all the other platforms?
> >>
> >
> > the other ones are called from chrome now
> > http://code.google.com/p/**chromium/source/search?q=%25**
> > 22InitializeSandbox%28%22+**file%3Achrome&origq=%**
> >
>
22InitializeSandbox%28%22+**file%3Achrome&btnG=Search+**Trunk<http://code.google.com/p/chromium/source/search?q=%22InitializeSandbox%28%22+file%3Achrome&origq=%22InitializeSandbox%28%22+file%3Achrome&btnG=Search+Trunk>
> >
> >
>
http://codereview.chromium.**org/10051022/%3Chttp://codereview.chromium.org/1...>
> >
> 
> Yeah, I just found the reference from chrome/nacl -- for the Windows port.
> I think we might want to have a similar setup for Linux in the future? No
> reason to not lock down nacl further on all platforms :)

ok, if you're planning on using this for nacl, lgtm

Chris Evans

8 years, 8 months ago (2012-04-13 20:37:00 UTC) #43

On 2012/04/12 01:08:56, Will Drewry wrote:
> Awesome!
> 
>
https://chromiumcodereview.appspot.com/10051022/diff/3009/content/common/sand...
> File content/common/sandbox_init_linux.cc (right):
> 
>
https://chromiumcodereview.appspot.com/10051022/diff/3009/content/common/sand...
> content/common/sandbox_init_linux.cc:54: unsigned int syscall =
> context->uc_mcontext.gregs[REG_RAX];
> I guess this is easier than wrangling asm/siginfo?
> What other info will end up in the dump? Do you want to prime the crash dump
> with other data or are you hoping unwinding will cover all that?
> 
>
https://chromiumcodereview.appspot.com/10051022/diff/3009/content/common/sand...
> content/common/sandbox_init_linux.cc:133: static void
> EmitKill(std::vector<struct sock_filter>* program) {
> nit: Any reason not to call it EmitLoggedKill?  Then if you add a
> SECCOMP_RET_KILL call, it'll just be EmitKill :)
> 
>
https://chromiumcodereview.appspot.com/10051022/diff/3009/content/common/sand...
> content/common/sandbox_init_linux.cc:143: static void
> ApplyGPUPolicy(std::vector<struct sock_filter>* program) {
> Is the plan to tighten this up further later?
> 
>
https://chromiumcodereview.appspot.com/10051022/diff/3009/content/common/sand...
> content/common/sandbox_init_linux.cc:199: }
> An authoritative check would be:
>   ret = prctl(PR_SET_SECCOMP, 2, NULL);
>   if (ret != 0 && errno == EFAULT)
>     return true;
>   return false;
> 
> but no_new_privs is a good guard too since it is required.  The only ugly bit
is
> that now your CanUse function has side effects :)
> 
>
https://chromiumcodereview.appspot.com/10051022/diff/3009/content/gpu/gpu_mai...
> File content/gpu/gpu_main.cc (right):
> 
>
https://chromiumcodereview.appspot.com/10051022/diff/3009/content/gpu/gpu_mai...
> content/gpu/gpu_main.cc:111: if (gpu_info.vendor_id == 0x1002) {  // ATI
> Any chance we can get intel in here?

Just an update for completeness: I'm going to land for all graphics cards. I got
my hand on an nvidia card and all was fine. Intel is blacklisted for desktop
Linux anyway, so we're all set for ChromeOS once the ChromeOS kernel patch lands
:D

Issue 10051022: Add an initial Linux GPU sandbox using the seccomp filter framework. (Closed)

Description

Patch Set 1 #

Patch Set 2 : #

Patch Set 3 : #

Patch Set 4 : #

Patch Set 5 : #

Patch Set 6 : #

Patch Set 7 : #

Patch Set 8 : #

Patch Set 9 : #

Patch Set 10 : #

Messages