|
|
Created:
8 years, 8 months ago by Chris Evans Modified:
7 years, 11 months ago Reviewers:
Will Drewry, Kees Cook, Markus (顧孟勤), cevans, Jorge Lucangeli Obes, jln (very slow on Chromium), rvargas (doing something else) CC:
chromium-reviews, joi+watch-content_chromium.org, darin-cc_chromium.org, jam Base URL:
svn://svn.chromium.org/chrome/trunk/src/ Visibility:
Public. |
DescriptionApply a policy to the renderer.
Patch Set 1 #
Messages
Total messages: 31 (0 generated)
This is working reasonably locally. I'm soliciting suggestions on whether we want to land it right now. Pros: 1) Good to start as early in the M20 dev cycle as possible and 2) Good to start ASAP because Ubuntu 12.04 is about to migrate from Beta to Stable. Cons: 1) I'm off on vacation for a bit.
I told cevans that if it were up to me I'd let GPU and Flash bake for a bit after 12.04 goes stable before doing renderers.
On 2012/04/20 22:27:45, Jorge Lucangeli Obes wrote: > I told cevans that if it were up to me I'd let GPU and Flash bake for a bit > after 12.04 goes stable before doing renderers. Although it is true that as jschuh says, if something trips the filter, we might very well want to take a look at it.
Not specific to this new patch, but I don't like the "undefined behavior" of the SIG_SYS handler. Any chance we could limit this to Debug ? This is a good start and a nice POC, but I believe we should re-think a little how we want to engineer this. See the separate thread.
On Fri, Apr 20, 2012 at 7:19 PM, <jln@chromium.org> wrote: > Not specific to this new patch, but I don't like the "undefined behavior" > of the > SIG_SYS handler. Any chance we could limit this to Debug ? > Could you be a little more specific? > > This is a good start and a nice POC, but I believe we should re-think a > little > how we want to engineer this. See the separate thread. > > https://chromiumcodereview.**appspot.com/10165018/<https://chromiumcodereview... >
On 2012/04/21 05:36:57, cevans wrote: > On Fri, Apr 20, 2012 at 7:19 PM, <mailto:jln@chromium.org> wrote: > > > Not specific to this new patch, but I don't like the "undefined behavior" > > of the > > SIG_SYS handler. Any chance we could limit this to Debug ? > > > > Could you be a little more specific? I realize that the current SIGSYS_Handler is handy, but I would prefer the current behavior to be limited to DEBUG builds. The current behavior is technically undefined, we write '0' to some computed address, there is no saying what this will do. This could very well tell another thread to immediately start Rick-rolling the user (even before the last part of the function might signal the process). Even the last part of the function is technically undefined. There is no saying what is at address zero (mmap_min_addr is not mandatory). Something a little better for now, perhaps, could be to use a non canonical 64 bits address to lead to a certain GPF from the processor. But even that is not ideal because the behavior could change in newer processors. Or, instead of limiting to DEBUG, I wonder if it wouldn't be cleaner to bite the bullet and write some simple assembly here. Set SP to a value that leaks useful information and then cause a GPF.
On 2012/04/23 18:27:11, Julien Tinnes wrote: > On 2012/04/21 05:36:57, cevans wrote: > > On Fri, Apr 20, 2012 at 7:19 PM, <mailto:jln@chromium.org> wrote: > > > > > Not specific to this new patch, but I don't like the "undefined behavior" > > > of the > > > SIG_SYS handler. Any chance we could limit this to Debug ? > > > > > > > Could you be a little more specific? > > I realize that the current SIGSYS_Handler is handy, but I would prefer the > current behavior to be limited to DEBUG builds. > > The current behavior is technically undefined, we write '0' to some computed > address, there is no saying what this will do. This could very well tell another > thread to immediately start Rick-rolling the user (even before the last part of > the function might signal the process). > > Even the last part of the function is technically undefined. There is no saying > what is at address zero (mmap_min_addr is not mandatory). > Something a little better for now, perhaps, could be to use a non canonical 64 > bits address to lead to a certain GPF from the processor. But even that is not > ideal because the behavior could change in newer processors. > > Or, instead of limiting to DEBUG, I wonder if it wouldn't be cleaner to bite the > bullet and write some simple assembly here. Set SP to a value that leaks useful > information and then cause a GPF. Or, at install time, map a single page PROT_NONE, annotate its address, and add a filter for mprotect() and munmap() to not be allowed to touch that page. Then use an access to that region to trigger the fault. That should result in defined behavior and not introduce any assembly into this top level sandbox.
On Mon, Apr 23, 2012 at 11:27 AM, <jln@chromium.org> wrote: > On 2012/04/21 05:36:57, cevans wrote: > > On Fri, Apr 20, 2012 at 7:19 PM, <mailto:jln@chromium.org> wrote: >> > > > Not specific to this new patch, but I don't like the "undefined >> behavior" >> > of the >> > SIG_SYS handler. Any chance we could limit this to Debug ? >> > >> > > Could you be a little more specific? >> > > I realize that the current SIGSYS_Handler is handy, but I would prefer the > current behavior to be limited to DEBUG builds. > > The current behavior is technically undefined, we write '0' to some > computed > address, there is no saying what this will do. This could very well tell > another > thread to immediately start Rick-rolling the user (even before the last > part of > the function might signal the process). > > Even the last part of the function is technically undefined. There is no > saying > what is at address zero (mmap_min_addr is not mandatory). > Something a little better for now, perhaps, could be to use a non > canonical 64 > bits address to lead to a certain GPF from the processor. But even that is > not > ideal because the behavior could change in newer processors. > > Or, instead of limiting to DEBUG, I wonder if it wouldn't be cleaner to > bite the > bullet and write some simple assembly here. Set SP to a value that leaks > useful > information and then cause a GPF. > Right, I think my longer term plan (once we've got near-zero SIGSYS events in the dev channel population) is to simply use asm to set rax, rbx, rcx etc. to values of interest. And then raise SIGABRT. I think the short-term value is worth the ugliness of the dereference. I know it's undefined but practically, I have faith that will do what I want. > https://chromiumcodereview.**appspot.com/10165018/<https://chromiumcodereview... >
Anyway -- today is the deadline if we want to land this for the next dev channel (the final pre-Precise dev channel ;-) To give it a shot or not, that is the question. On Mon, Apr 23, 2012 at 1:14 PM, Chris Evans <cevans@google.com> wrote: > On Mon, Apr 23, 2012 at 11:27 AM, <jln@chromium.org> wrote: > >> On 2012/04/21 05:36:57, cevans wrote: >> >> On Fri, Apr 20, 2012 at 7:19 PM, <mailto:jln@chromium.org> wrote: >>> >> >> > Not specific to this new patch, but I don't like the "undefined >>> behavior" >>> > of the >>> > SIG_SYS handler. Any chance we could limit this to Debug ? >>> > >>> >> >> Could you be a little more specific? >>> >> >> I realize that the current SIGSYS_Handler is handy, but I would prefer the >> current behavior to be limited to DEBUG builds. >> >> The current behavior is technically undefined, we write '0' to some >> computed >> address, there is no saying what this will do. This could very well tell >> another >> thread to immediately start Rick-rolling the user (even before the last >> part of >> the function might signal the process). >> >> Even the last part of the function is technically undefined. There is no >> saying >> what is at address zero (mmap_min_addr is not mandatory). >> Something a little better for now, perhaps, could be to use a non >> canonical 64 >> bits address to lead to a certain GPF from the processor. But even that >> is not >> ideal because the behavior could change in newer processors. >> >> Or, instead of limiting to DEBUG, I wonder if it wouldn't be cleaner to >> bite the >> bullet and write some simple assembly here. Set SP to a value that leaks >> useful >> information and then cause a GPF. >> > > Right, I think my longer term plan (once we've got near-zero SIGSYS events > in the dev channel population) is to simply use asm to set rax, rbx, rcx > etc. to values of interest. And then raise SIGABRT. > > I think the short-term value is worth the ugliness of the dereference. I > know it's undefined but practically, I have faith that will do what I want. > > >> https://chromiumcodereview.**appspot.com/10165018/<https://chromiumcodereview... >> > >
On Mon, Apr 23, 2012 at 1:15 PM, Chris Evans <cevans@google.com> wrote: > Anyway -- today is the deadline if we want to land this for the next dev > channel (the final pre-Precise dev channel ;-) > > To give it a shot or not, that is the question. Is there a good way to get this into the Dev channel to get all the interesting and useful data, but not merge it to beta / stable ? I think it would be ideal. We're "lucky" in a way that for now this patch will be limited to one distribution, so we'll get a fairly consistent set of users and shouldn't expect too many crazy failures. But fore reasons I outlined elsewhere, I think we should fail more gracefully on violations (with errno) and let the program have a chance to handle this gracefully (probably log something, cleanup and exit()). But perhaps this can be an incremental change.
On Mon, Apr 23, 2012 at 2:12 PM, Julien Tinnes <jln@chromium.org> wrote: > On Mon, Apr 23, 2012 at 1:15 PM, Chris Evans <cevans@google.com> wrote: > > Anyway -- today is the deadline if we want to land this for the next dev > > channel (the final pre-Precise dev channel ;-) > > > > To give it a shot or not, that is the question. > > Is there a good way to get this into the Dev channel to get all the > interesting and useful data, but not merge it to beta / stable ? Not sure what you mean by "merge", so I'll reply generally: If we land it now, it's very early in the M20 dev cycle. We'd be looking at over 2 months before it hit stable. If we didn't have confidence in stability by then, we could stick the feature behind a new flag --enable-seccomp-filter-sandbox. I > think it would be ideal. We're "lucky" in a way that for now this > patch will be limited to one distribution, so we'll get a fairly > consistent set of users and shouldn't expect too many crazy failures. > FWIW, I believe our Linux userbase is _dominated_ by Ubuntu users. Isn't it something like 90%? We would be limiting ourselves to just the most recent version -- 12.04, but isn't Ubuntu upgrade / update rate pretty high? The other bucket is 32-bit vs. 64-bit. We'd only be hitting 64-bit users, which I believe is about a third of Ubuntu users? > But fore reasons I outlined elsewhere, I think we should fail more > gracefully on violations (with errno) and let the program have a > chance to handle this gracefully (probably log something, cleanup and > exit()). But perhaps this can be an incremental change. > I don't think this is a good idea. If we start errno-failing weird syscalls in corner-case code, I expect things might limp along but sometimes with degraded or broken functionality. We'd then be relying on users to notice and file the broken behaviour. On the contrary, a sad tab is a clear problem signal and also a signal that we can look for in crash dumps.
On Mon, Apr 23, 2012 at 2:40 PM, Chris Evans <cevans@google.com> wrote: > On Mon, Apr 23, 2012 at 2:12 PM, Julien Tinnes <jln@chromium.org> wrote: >> >> On Mon, Apr 23, 2012 at 1:15 PM, Chris Evans <cevans@google.com> wrote: >> > Anyway -- today is the deadline if we want to land this for the next dev >> > channel (the final pre-Precise dev channel ;-) >> > >> > To give it a shot or not, that is the question. >> >> Is there a good way to get this into the Dev channel to get all the >> interesting and useful data, but not merge it to beta / stable ? > > > Not sure what you mean by "merge", so I'll reply generally: > > If we land it now, it's very early in the M20 dev cycle. We'd be looking at > over 2 months before it hit stable. If we didn't have confidence in > stability by then, we could stick the feature behind a new flag > --enable-seccomp-filter-sandbox. > >> I >> think it would be ideal. We're "lucky" in a way that for now this >> patch will be limited to one distribution, so we'll get a fairly >> consistent set of users and shouldn't expect too many crazy failures. > > > FWIW, I believe our Linux userbase is _dominated_ by Ubuntu users. Isn't it > something like 90%? > > We would be limiting ourselves to just the most recent version -- 12.04, but > isn't Ubuntu upgrade / update rate pretty high? > > The other bucket is 32-bit vs. 64-bit. We'd only be hitting 64-bit users, > which I believe is about a third of Ubuntu users? > >> >> But fore reasons I outlined elsewhere, I think we should fail more >> gracefully on violations (with errno) and let the program have a >> chance to handle this gracefully (probably log something, cleanup and >> exit()). But perhaps this can be an incremental change. > > > I don't think this is a good idea. If we start errno-failing weird syscalls > in corner-case code, I expect things might limp along but sometimes with > degraded or broken functionality. We'd then be relying on users to notice > and file the broken behaviour. On the contrary, a sad tab is a clear problem > signal and also a signal that we can look for in crash dumps. Errno is the end result, we can always log in a SIGSYS handler and then errno in it. Maybe we can think of a way to let users notice and report those logs to us. In DEBUG mode we always should log. But for release Chrome on generic Linux, it's dangerous to just make a process disappear. We should let well written code downgrade gracefully on denied syscalls. Otherwise things will start crashing randomly after various libraries upgrades. There are even people out there who use Debian unstable. We'll have breakage from glibc upgrades and whatnot. Think about what we're doing here. We're breaking / changing a well known API (kernel interface) and instead of returning an error ("nope, sorry you can't do that"), we're just killing stuff in the hope that we'll play catch-up and fix everything in time. We should ship a policy that we can expect to work everywhere, including when codepaths that weren't exercised before are exercised. It's an illusion to believe that our testing will be comprehensive enough that we'll catch every possible sandbox violation fast enough to write a fix. Graceful downgrade is a necessity if we don't want to alienate our users. Julien
On Mon, Apr 23, 2012 at 3:01 PM, Julien Tinnes <jln@chromium.org> wrote: > On Mon, Apr 23, 2012 at 2:40 PM, Chris Evans <cevans@google.com> wrote: > > On Mon, Apr 23, 2012 at 2:12 PM, Julien Tinnes <jln@chromium.org> wrote: > >> > >> On Mon, Apr 23, 2012 at 1:15 PM, Chris Evans <cevans@google.com> wrote: > >> > Anyway -- today is the deadline if we want to land this for the next > dev > >> > channel (the final pre-Precise dev channel ;-) > >> > > >> > To give it a shot or not, that is the question. > >> > >> Is there a good way to get this into the Dev channel to get all the > >> interesting and useful data, but not merge it to beta / stable ? > > > > > > Not sure what you mean by "merge", so I'll reply generally: > > > > If we land it now, it's very early in the M20 dev cycle. We'd be looking > at > > over 2 months before it hit stable. If we didn't have confidence in > > stability by then, we could stick the feature behind a new flag > > --enable-seccomp-filter-sandbox. > > > >> I > >> think it would be ideal. We're "lucky" in a way that for now this > >> patch will be limited to one distribution, so we'll get a fairly > >> consistent set of users and shouldn't expect too many crazy failures. > > > > > > FWIW, I believe our Linux userbase is _dominated_ by Ubuntu users. Isn't > it > > something like 90%? > > > > We would be limiting ourselves to just the most recent version -- 12.04, > but > > isn't Ubuntu upgrade / update rate pretty high? > > > > The other bucket is 32-bit vs. 64-bit. We'd only be hitting 64-bit users, > > which I believe is about a third of Ubuntu users? > > > >> > >> But fore reasons I outlined elsewhere, I think we should fail more > >> gracefully on violations (with errno) and let the program have a > >> chance to handle this gracefully (probably log something, cleanup and > >> exit()). But perhaps this can be an incremental change. > > > > > > I don't think this is a good idea. If we start errno-failing weird > syscalls > > in corner-case code, I expect things might limp along but sometimes with > > degraded or broken functionality. We'd then be relying on users to notice > > and file the broken behaviour. On the contrary, a sad tab is a clear > problem > > signal and also a signal that we can look for in crash dumps. > > Errno is the end result, we can always log in a SIGSYS handler and > then errno in it. It's easy to make off-hand suggestions starting "we can always" but the complexities are awful: 1) The thought of trying to "log" safely in the async sig handler context gives me nightmares. 2) We need the stack trace for a useful report. I don't know of a useful framework to collect / report that outside of the crash handling. Maybe we can think of a way to let users notice and > report those logs to us. > What did you have in mind? Users will notice a sad tab. > In DEBUG mode we always should log. But for release Chrome on generic > Linux, it's dangerous to just make a process disappear. I believe quite the opposite: it's dangerous to permit the process to continue. We'd be continuing it in a highly unusual state, including a failure / errno that the surrounding code might never have encountered before. We should let > well written code downgrade gracefully on denied syscalls. In an ideal world yes. We live far from an ideal world, including for example no control over glibc and other system libraries. Otherwise > things will start crashing randomly after various libraries upgrades. > There are even people out there who use Debian unstable. We'll have > breakage from glibc upgrades and whatnot. > > Think about what we're doing here. We're breaking / changing a well > known API (kernel interface) and instead of returning an error ("nope, > sorry you can't do that"), we're just killing stuff in the hope that > we'll play catch-up and fix everything in time. > > We should ship a policy that we can expect to work everywhere, > including when codepaths that weren't exercised before are exercised. > It's an illusion to believe that our testing will be comprehensive > enough that we'll catch every possible sandbox violation fast enough > to write a fix. Graceful downgrade is a necessity if we don't want to > alienate our users. > You make a compelling argument. Perhaps a debug vs. opt build difference would be worth trying for M21. I believe we want at least one stable release with the SIGSYS -> SIGSEGV behaviour, to have confidence in broad testing of corner-case functionality. > Julien >
On Mon, Apr 23, 2012 at 3:22 PM, Chris Evans <cevans@google.com> wrote: >> We should let >> well written code downgrade gracefully on denied syscalls. > In an ideal world yes. We live far from an ideal world, including for > example no control over glibc and other system libraries. Indeed. Which is why I think expecting them to use a specific subset of system calls that will never change is dangerous. I think we both agree things will break on libraries changes and updates, don't we? The discussion in on how to best react to those cases: errno or crash. Crash makes it easier to get some of the information we want and notify the user. Errno makes it more likely that things will continue to work and that we won't annoy users, and with some non trivial engineering effort could still let us to notify and log. >> Otherwise >> things will start crashing randomly after various libraries upgrades. >> There are even people out there who use Debian unstable. We'll have >> breakage from glibc upgrades and whatnot. >> >> Think about what we're doing here. We're breaking / changing a well >> known API (kernel interface) and instead of returning an error ("nope, >> sorry you can't do that"), we're just killing stuff in the hope that >> we'll play catch-up and fix everything in time. >> >> We should ship a policy that we can expect to work everywhere, >> including when codepaths that weren't exercised before are exercised. >> It's an illusion to believe that our testing will be comprehensive >> enough that we'll catch every possible sandbox violation fast enough >> to write a fix. Graceful downgrade is a necessity if we don't want to >> alienate our users. > > > You make a compelling argument. Perhaps a debug vs. opt build difference > would be worth trying for M21. I believe we want at least one stable release > with the SIGSYS -> SIGSEGV behaviour, to have confidence in broad testing of > corner-case functionality. Given that this will only have an effect on Ubuntu 12.04 users for now (i.e. one well defined Linux distribution that we can test on) and that we desperately need more data, I agree that lading this today is the right choice. But we should work on have a better story in the upcoming months, either by white listing only a specific Linux distribution for this feature to be enabled by default, or by allowing better recovery on (unexpected) denied system calls. Julien
On Mon, Apr 23, 2012 at 4:11 PM, Julien Tinnes <jln@chromium.org> wrote: > On Mon, Apr 23, 2012 at 3:22 PM, Chris Evans <cevans@google.com> wrote: > > >> We should let > >> well written code downgrade gracefully on denied syscalls. > > > In an ideal world yes. We live far from an ideal world, including for > > example no control over glibc and other system libraries. > > Indeed. Which is why I think expecting them to use a specific subset > of system calls that will never change is dangerous. > > I think we both agree things will break on libraries changes and > updates, don't we? > Yep. > The discussion in on how to best react to those cases: errno or crash. > Crash makes it easier to get some of the information we want and > notify the user. Errno makes it more likely that things will continue > to work and that we won't annoy users, and with some non trivial > engineering effort could still let us to notify and log. > I am not a fan of non-trivial engineering effort :) Luckily, I think you've suggested a couple of different trivial suggestions now. > >> Otherwise > >> things will start crashing randomly after various libraries upgrades. > >> There are even people out there who use Debian unstable. We'll have > >> breakage from glibc upgrades and whatnot. > >> > >> Think about what we're doing here. We're breaking / changing a well > >> known API (kernel interface) and instead of returning an error ("nope, > >> sorry you can't do that"), we're just killing stuff in the hope that > >> we'll play catch-up and fix everything in time. > >> > >> We should ship a policy that we can expect to work everywhere, > >> including when codepaths that weren't exercised before are exercised. > >> It's an illusion to believe that our testing will be comprehensive > >> enough that we'll catch every possible sandbox violation fast enough > >> to write a fix. Graceful downgrade is a necessity if we don't want to > >> alienate our users. > > > > > > You make a compelling argument. Perhaps a debug vs. opt build difference > > would be worth trying for M21. I believe we want at least one stable > release > > with the SIGSYS -> SIGSEGV behaviour, to have confidence in broad > testing of > > corner-case functionality. > > Given that this will only have an effect on Ubuntu 12.04 users for now > (i.e. one well defined Linux distribution that we can test on) and > that we desperately need more data, I agree that lading this today is > the right choice. > Ok. Will / Kees / Jorge? > But we should work on have a better story in the upcoming months, > either by white listing only a specific Linux distribution for this > feature to be enabled by default, That's an interesting idea. I wonder what distributions we claim to support? Ubuntu / Fedora? Anything else? We'd probably be looking at specific versions of specific distributions. or by allowing better recovery on > (unexpected) denied system calls. > > Julien >
On Mon, Apr 23, 2012 at 4:16 PM, Chris Evans <cevans@google.com> wrote: > On Mon, Apr 23, 2012 at 4:11 PM, Julien Tinnes <jln@chromium.org> wrote: >> >> On Mon, Apr 23, 2012 at 3:22 PM, Chris Evans <cevans@google.com> wrote: >> >> >> We should let >> >> well written code downgrade gracefully on denied syscalls. >> >> > In an ideal world yes. We live far from an ideal world, including for >> > example no control over glibc and other system libraries. >> >> Indeed. Which is why I think expecting them to use a specific subset >> of system calls that will never change is dangerous. >> >> I think we both agree things will break on libraries changes and >> updates, don't we? > > > Yep. > >> >> The discussion in on how to best react to those cases: errno or crash. >> Crash makes it easier to get some of the information we want and >> notify the user. Errno makes it more likely that things will continue >> to work and that we won't annoy users, and with some non trivial >> engineering effort could still let us to notify and log. > > > I am not a fan of non-trivial engineering effort :) Luckily, I think you've > suggested a couple of different trivial suggestions now. > >> >> >> Otherwise >> >> things will start crashing randomly after various libraries upgrades. >> >> There are even people out there who use Debian unstable. We'll have >> >> breakage from glibc upgrades and whatnot. >> >> >> >> Think about what we're doing here. We're breaking / changing a well >> >> known API (kernel interface) and instead of returning an error ("nope, >> >> sorry you can't do that"), we're just killing stuff in the hope that >> >> we'll play catch-up and fix everything in time. >> >> >> >> We should ship a policy that we can expect to work everywhere, >> >> including when codepaths that weren't exercised before are exercised. >> >> It's an illusion to believe that our testing will be comprehensive >> >> enough that we'll catch every possible sandbox violation fast enough >> >> to write a fix. Graceful downgrade is a necessity if we don't want to >> >> alienate our users. >> > >> > >> > You make a compelling argument. Perhaps a debug vs. opt build difference >> > would be worth trying for M21. I believe we want at least one stable >> > release >> > with the SIGSYS -> SIGSEGV behaviour, to have confidence in broad >> > testing of >> > corner-case functionality. >> >> Given that this will only have an effect on Ubuntu 12.04 users for now >> (i.e. one well defined Linux distribution that we can test on) and >> that we desperately need more data, I agree that lading this today is >> the right choice. > > Ok. Will / Kees / Jorge? I would agree; let's get it landed. Ubuntu 12.04's libraries aren't going to change much, so I think the risk of not using the errno method is minimal. That said, I still prefer the kill method on platforms we directly control (Chrome OS). >> But we should work on have a better story in the upcoming months, >> either by white listing only a specific Linux distribution for this >> feature to be enabled by default, > > That's an interesting idea. I wonder what distributions we claim to support? > Ubuntu / Fedora? Anything else? We'd probably be looking at specific > versions of specific distributions. In theory, any distro with a 3.5 kernel. I'm only aware of Ubuntu carrying the backport of seccomp_bpf. -Kees -- Kees Cook Chrome OS Security
On Mon, Apr 23, 2012 at 4:16 PM, Chris Evans <cevans@google.com> wrote: > That's an interesting idea. I wonder what distributions we claim to support? > Ubuntu / Fedora? Anything else? We'd probably be looking at specific > versions of specific distributions. Yes, hopefully we can go the errno route instead and stay generic. There are lots of cases where I think this will allow for a graceful downgrade. In fact you're already using this for a bunch of system calls. Perhaps we could at the very least agree that for anything where ENOENT or EPERM makes sense we can expect a sensible downgrade? Julien (also apologies for all the awful typos in the previous mail, I'm mortified)
On Mon, Apr 23, 2012 at 4:21 PM, Kees Cook <keescook@chromium.org> wrote: > On Mon, Apr 23, 2012 at 4:16 PM, Chris Evans <cevans@google.com> wrote: > > On Mon, Apr 23, 2012 at 4:11 PM, Julien Tinnes <jln@chromium.org> wrote: > >> > >> On Mon, Apr 23, 2012 at 3:22 PM, Chris Evans <cevans@google.com> wrote: > >> > >> >> We should let > >> >> well written code downgrade gracefully on denied syscalls. > >> > >> > In an ideal world yes. We live far from an ideal world, including for > >> > example no control over glibc and other system libraries. > >> > >> Indeed. Which is why I think expecting them to use a specific subset > >> of system calls that will never change is dangerous. > >> > >> I think we both agree things will break on libraries changes and > >> updates, don't we? > > > > > > Yep. > > > >> > >> The discussion in on how to best react to those cases: errno or crash. > >> Crash makes it easier to get some of the information we want and > >> notify the user. Errno makes it more likely that things will continue > >> to work and that we won't annoy users, and with some non trivial > >> engineering effort could still let us to notify and log. > > > > > > I am not a fan of non-trivial engineering effort :) Luckily, I think > you've > > suggested a couple of different trivial suggestions now. > > > >> > >> >> Otherwise > >> >> things will start crashing randomly after various libraries upgrades. > >> >> There are even people out there who use Debian unstable. We'll have > >> >> breakage from glibc upgrades and whatnot. > >> >> > >> >> Think about what we're doing here. We're breaking / changing a well > >> >> known API (kernel interface) and instead of returning an error > ("nope, > >> >> sorry you can't do that"), we're just killing stuff in the hope that > >> >> we'll play catch-up and fix everything in time. > >> >> > >> >> We should ship a policy that we can expect to work everywhere, > >> >> including when codepaths that weren't exercised before are exercised. > >> >> It's an illusion to believe that our testing will be comprehensive > >> >> enough that we'll catch every possible sandbox violation fast enough > >> >> to write a fix. Graceful downgrade is a necessity if we don't want to > >> >> alienate our users. > >> > > >> > > >> > You make a compelling argument. Perhaps a debug vs. opt build > difference > >> > would be worth trying for M21. I believe we want at least one stable > >> > release > >> > with the SIGSYS -> SIGSEGV behaviour, to have confidence in broad > >> > testing of > >> > corner-case functionality. > >> > >> Given that this will only have an effect on Ubuntu 12.04 users for now > >> (i.e. one well defined Linux distribution that we can test on) and > >> that we desperately need more data, I agree that lading this today is > >> the right choice. > > > > Ok. Will / Kees / Jorge? > > I would agree; let's get it landed. Ubuntu 12.04's libraries aren't > going to change much, so I think the risk of not using the errno > method is minimal. > Ok. Not sure I have an actual LGTM from anyone; I'd like to collect at least two :P > That said, I still prefer the kill method on platforms we directly > control (Chrome OS). > Good point. We can ifdef that cleanly in a future CL. > >> But we should work on have a better story in the upcoming months, > >> either by white listing only a specific Linux distribution for this > >> feature to be enabled by default, > > > > That's an interesting idea. I wonder what distributions we claim to > support? > > Ubuntu / Fedora? Anything else? We'd probably be looking at specific > > versions of specific distributions. > > In theory, any distro with a 3.5 kernel. I'm only aware of Ubuntu > carrying the backport of seccomp_bpf. > > -Kees > > -- > Kees Cook > Chrome OS Security >
lgtm
lgtm
On Mon, Apr 23, 2012 at 4:35 PM, <keescook@chromium.org> wrote: > lgtm > And now that I have these lgtm's I'm somehow nervous to land it ;-) > > > https://chromiumcodereview.**appspot.com/10165018/<https://chromiumcodereview... >
Kees: I like the KILL way, and I agree we should do that in Chrome OS since we control the platform. Julien: I agree with the points you make, but I'm slightly wary of returning errno everywhere. This is a discussion we should continue, especially when we get the data that this patch will provide. lgtm
On Mon, Apr 23, 2012 at 4:38 PM, Chris Evans <cevans@google.com> wrote: > On Mon, Apr 23, 2012 at 4:35 PM, <keescook@chromium.org> wrote: >> lgtm > > And now that I have these lgtm's I'm somehow nervous to land it ;-) No need for fear; all the bug reports will go to Ubuntu! ("I upgraded to 12.04 and now Chrome doesn't work!") :) -Kees -- Kees Cook Chrome OS Security
Please fix the description before landing (and ideally link to a bug)
On Mon, Apr 23, 2012 at 5:38 PM, <rvargas@chromium.org> wrote: > Please fix the description before landing (and ideally link to a bug) > Could you elaborate on "fix"? My guess is that you want a little more verbosity ;-) Certainly it is missing some key notes, such as the fact it's a seccomp filter sandbox policy for Linux, etc. > > https://chromiumcodereview.**appspot.com/10165018/<https://chromiumcodereview... >
On 2012/04/24 00:41:52, cevans wrote: > On Mon, Apr 23, 2012 at 5:38 PM, <mailto:rvargas@chromium.org> wrote: > > > Please fix the description before landing (and ideally link to a bug) > > > > Could you elaborate on "fix"? My guess is that you want a little more > verbosity ;-) Certainly it is missing some key notes, such as the fact it's > a seccomp filter sandbox policy for Linux, etc. Correct. Something that tells what's being changed without having to look at the CL. > > > > > > > https://chromiumcodereview.**appspot.com/10165018/%3Chttps://chromiumcoderevi...> > >
This might be acceptable for CrOS, and it is certainly SOP for Google3, but please do not enable this behavior in Chrome in general. This is just not a good idea for client-software. We have already had way too many code-yellow for excessive crashes, I don't want you to contribute to another code-yellow. Crashing on error is never a good idea in release builds, and it is only rarely a good idea in development builds -- and we do have macros that help with this, if you decide you absolutely have to crash on error; they do the right thing and allow you to attach a debugger. That's a lot more likely to result in somebody fixing the problem than KILLing the process without any hope of ever debugging. In general, you have to assume that anything that can go strange and funny will do so on some users machine. People will have non-standard libraries, non-standard system-wide LD_PRELOAD environments, non-standard graphics drivers, non-standard kernels, non-standard distributions, ... Recover from the error and continue as best you can. Reporting a valid "errno" value is often a great way to do so. This also means you don't want to make the list of allowed system calls too narrow. The exact system call will differ from machine to machine. Markus P.S.: In general, I am somewhat saddened by the fact that in an effort to rush out a new sandbox, we are ignoring all the experience that we gained in the last two years with carefully designing and rolling out a sandbox that is a) secure, and b) minimizes how much it gets into people's way. On Mon, Apr 23, 2012 at 16:39, <jorgelo@chromium.org> wrote: > Kees: I like the KILL way, and I agree we should do that in Chrome OS > since we > control the platform. > > Julien: I agree with the points you make, but I'm slightly wary of > returning > errno everywhere. This is a discussion we should continue, especially when > we > get the data that this patch will provide. > > lgtm > > https://chromiumcodereview.**appspot.com/10165018/<https://chromiumcodereview... >
On Mon, Apr 23, 2012 at 4:38 PM, Chris Evans <cevans@google.com> wrote: > On Mon, Apr 23, 2012 at 4:35 PM, <keescook@chromium.org> wrote: >> >> lgtm > > > And now that I have these lgtm's I'm somehow nervous to land it ;-) Also, somewhat of a detail but I've just noticed that the policy doesn't allow sigreturn. sigreturn is part of the main kernel syscall interface and should be allowed. It's not good to feel "in a rush". Perhaps missing the Ubuntu 10.04 window wouldn't be so bad after all ? We have the dev channel precisely for the purpose of being able to roll out things slowly later, don't we? jln
On Mon, Apr 23, 2012 at 9:36 PM, Julien Tinnes <jln@chromium.org> wrote: > On Mon, Apr 23, 2012 at 4:38 PM, Chris Evans <cevans@google.com> wrote: > > On Mon, Apr 23, 2012 at 4:35 PM, <keescook@chromium.org> wrote: > >> > >> lgtm > > > > > > And now that I have these lgtm's I'm somehow nervous to land it ;-) > > Also, somewhat of a detail but I've just noticed that the policy > doesn't allow sigreturn. sigreturn is part of the main kernel syscall > interface and should be allowed. > I didn't land the change in the end. I'll add this before I do. Seems like this syscall should be in a "baseline" "PSS"? What else would be in the baseline PSS? Would you be interested in kicking off the PSS in earnest with a patch for baseline PSS? > It's not good to feel "in a rush". Perhaps missing the Ubuntu 10.04 > window wouldn't be so bad after all ? You mean 12.04 ? We have the dev channel > precisely for the purpose of being able to roll out things slowly > later, don't we? > We're using the M20 dev channel for exactly this :) It's pretty early in the M20 dev cycle so now is in fact an ideal time to be landing these things. Cheers Chris > jln >
On Mon, Apr 23, 2012 at 11:16 PM, Chris Evans <cevans@google.com> wrote: > On Mon, Apr 23, 2012 at 9:36 PM, Julien Tinnes <jln@chromium.org> wrote: > >> On Mon, Apr 23, 2012 at 4:38 PM, Chris Evans <cevans@google.com> wrote: >> > On Mon, Apr 23, 2012 at 4:35 PM, <keescook@chromium.org> wrote: >> >> >> >> lgtm >> > >> > >> > And now that I have these lgtm's I'm somehow nervous to land it ;-) >> >> Also, somewhat of a detail but I've just noticed that the policy >> doesn't allow sigreturn. sigreturn is part of the main kernel syscall >> interface and should be allowed. >> > > I didn't land the change in the end. I'll add this before I do. Seems like > this syscall should be in a "baseline" "PSS"? What else would be in the > baseline PSS? Would you be interested in kicking off the PSS in earnest > with a patch for baseline PSS? > I've decided to retire this CL for now. For M20, it seems we should focus on: - Making the GPU and Flash sandboxes stable. - Resolving the runtime behaviour of a failed syscall. - Refactoring to move the policies towards Julien's PSS idea. > >> It's not good to feel "in a rush". Perhaps missing the Ubuntu 10.04 >> window wouldn't be so bad after all ? > > > You mean 12.04 ? > > We have the dev channel >> precisely for the purpose of being able to roll out things slowly >> later, don't we? >> > > We're using the M20 dev channel for exactly this :) It's pretty early in > the M20 dev cycle so now is in fact an ideal time to be landing these > things. > > > Cheers > Chris > > >> jln >> > >
On Thu, Apr 26, 2012 at 10:40 AM, Chris Evans <cevans@google.com> wrote: > On Mon, Apr 23, 2012 at 11:16 PM, Chris Evans <cevans@google.com> wrote: >> >> On Mon, Apr 23, 2012 at 9:36 PM, Julien Tinnes <jln@chromium.org> wrote: >>> >>> On Mon, Apr 23, 2012 at 4:38 PM, Chris Evans <cevans@google.com> wrote: >>> > On Mon, Apr 23, 2012 at 4:35 PM, <keescook@chromium.org> wrote: >>> >> >>> >> lgtm >>> > >>> > >>> > And now that I have these lgtm's I'm somehow nervous to land it ;-) >>> >>> Also, somewhat of a detail but I've just noticed that the policy >>> doesn't allow sigreturn. sigreturn is part of the main kernel syscall >>> interface and should be allowed. >> >> >> I didn't land the change in the end. I'll add this before I do. Seems like >> this syscall should be in a "baseline" "PSS"? What else would be in the >> baseline PSS? Would you be interested in kicking off the PSS in earnest with >> a patch for baseline PSS? > > > I've decided to retire this CL for now. For M20, it seems we should focus > on: > - Making the GPU and Flash sandboxes stable. > - Resolving the runtime behaviour of a failed syscall. > - Refactoring to move the policies towards Julien's PSS idea. Ok, I'll start working on that. I also plan to update about:sandbox with some more information about the current state. jln |