|
|
Created:
8 years, 2 months ago by khim Modified:
7 years, 9 months ago CC:
native-client-reviews_googlegroups.com Visibility:
Public. |
DescriptionMove validator_x86_XX.rl out of unreviewed.
validator_internal.h and validator_x86_XX.rl together implement
the core validator logic. They all [relatively] short (about 1000 lines)
but these are the core of validator so we need to review them carefully
and perhaps add some more information about them.
Please read the validator_internals.html before reviewing them: this
file explains the logic which liese under all that machinery.
BUG=http://code.google.com/p/nativeclient/issues/detail?id=2597
TEST=trybots (no changes in logic)
Committed: https://src.chromium.org/viewvc/native_client?view=rev&revision=11063
Patch Set 1 #
Total comments: 44
Patch Set 2 : #
Total comments: 14
Patch Set 3 : #
Total comments: 84
Patch Set 4 : #
Total comments: 45
Patch Set 5 : #
Total comments: 18
Patch Set 6 : #
Total comments: 16
Patch Set 7 : #
Total comments: 86
Patch Set 8 : #
Total comments: 6
Patch Set 9 : #
Total comments: 34
Patch Set 10 : #
Total comments: 18
Patch Set 11 : #
Total comments: 10
Patch Set 12 : #
Total comments: 4
Patch Set 13 : #
Created: 7 years, 9 months ago
(Patch set is too large to download)
Messages
Total messages: 50 (0 generated)
ACK. Today will be mostly perf and rereading you doc. Should be able to get you some material comments tomorrow.
Here's some comments on validator_x86_64.rl that we just talked about. It would be worth agreeing on how we want to comment these blocks before you do all of them. Let's try and do that today. If you'd like to sit down and comment one together that might be a good idea. http://codereview.chromium.org/11000033/diff/1/src/trusted/validator_ragel/de... File src/trusted/validator_ragel/decoder.h (right): http://codereview.chromium.org/11000033/diff/1/src/trusted/validator_ragel/de... src/trusted/validator_ragel/decoder.h:21: OPERAND_SIZE_2_BIT, /* See VPERMIL2Px instruction for description. */ Could you make it clear in the comment that it only applies to only the 2_BIT enum? I initially thought it applies to all of the OPERAND_SIZE things and only figured out otherwise looking at the internals doc. http://codereview.chromium.org/11000033/diff/1/src/trusted/validator_ragel/de... src/trusted/validator_ragel/decoder.h:25: OPERAND_SIZE_64_BIT, Are there other 64-bit operand_types below? In that case, does OPERAND_SIZE_64_BIT apply or not? If an operand should always have exactly one type from this list, then it would make sense I think to http://codereview.chromium.org/11000033/diff/1/src/trusted/validator_ragel/de... src/trusted/validator_ragel/decoder.h:54: OPERAND_YMM For my gratification, are there other operand_sizes that need to be added for AVX? http://codereview.chromium.org/11000033/diff/1/src/trusted/validator_ragel/va... File src/trusted/validator_ragel/validator_x86_64.rl (right): http://codereview.chromium.org/11000033/diff/1/src/trusted/validator_ragel/va... src/trusted/validator_ragel/validator_x86_64.rl:221: It would be helpful to have a comment before every block of stuff saying what the block of stuff is. For example, something like # This block encodes call and jump instructions of the form: # 41 83 e3 e0 and $0xffffffe0,%r11d # 4d 01 fb add %r15,%r11 # 41 ff e3 jmpq *%r11 http://codereview.chromium.org/11000033/diff/1/src/trusted/validator_ragel/va... src/trusted/validator_ragel/validator_x86_64.rl:224: # add %r15,%rax/%rcx/%rdx/%rbx/%rsp/%rbp/%rsi/%rdi A comment stating explicitly this is computing the effective address of the call/jump would be nice. I don't find the lists of registers useful at all; I know what the x86 registers are, so I'm not going to read through a list like this when I see it. Is there a way to only write about the interesting parts? For example, are there registers omitted from this list that I might not be expecting? http://codereview.chromium.org/11000033/diff/1/src/trusted/validator_ragel/va... src/trusted/validator_ragel/validator_x86_64.rl:238: (b_0100_xxx1 0xff (0xe0|0xe1|0xe2|0xe3|0xe4|0xe5|0xe6))))) The more repetitive the comments are, the less useful they will tend to be in the context of this review. Part of what I need to understand is the interesting differences in a case, so repetition in comments works against that. http://codereview.chromium.org/11000033/diff/1/src/trusted/validator_ragel/va... src/trusted/validator_ragel/validator_x86_64.rl:245: BitmapClearBit(valid_targets, (instruction_start - data) + 4); Could you add a level of procedural indirection so that it is obvious that this BitmapClearBit is actually removing a possible target from an instruction list?
http://codereview.chromium.org/11000033/diff/1/src/trusted/validator_ragel/de... File src/trusted/validator_ragel/decoder.h (right): http://codereview.chromium.org/11000033/diff/1/src/trusted/validator_ragel/de... src/trusted/validator_ragel/decoder.h:25: OPERAND_SIZE_64_BIT, On 2012/09/28 19:38:34, Brad Chen wrote: > Are there other 64-bit operand_types below? In that case, does > OPERAND_SIZE_64_BIT apply or not? If an operand should always have exactly one > type from this list, then it would make sense I think to Operand can only use one items from a list. But GENERAL_PUSPOSE_REGISTER_OR_MEMORY_OPERAND_OF_64_BIT_SIZE looked too long. http://codereview.chromium.org/11000033/diff/1/src/trusted/validator_ragel/de... src/trusted/validator_ragel/decoder.h:54: OPERAND_YMM On 2012/09/28 19:38:34, Brad Chen wrote: > For my gratification, are there other operand_sizes that need to be added for > AVX? There are some for AVX2. AVX is currently supported at the DFA level and disabled at runtime via "hard CPUID ceiling". This is done this way because AVX instructions support adds a lot of complexity to the DFA and we DON'T want to suddenly find themselves with a brand-spanking-new validator design which we want to throw away and redo from scratch because we can not, e.g. compile it with MSVC 2010. http://codereview.chromium.org/11000033/diff/1/src/trusted/validator_ragel/va... File src/trusted/validator_ragel/validator_x86_64.rl (right): http://codereview.chromium.org/11000033/diff/1/src/trusted/validator_ragel/va... src/trusted/validator_ragel/validator_x86_64.rl:224: # add %r15,%rax/%rcx/%rdx/%rbx/%rsp/%rbp/%rsi/%rdi On 2012/09/28 19:38:34, Brad Chen wrote: > A comment stating explicitly this is computing the effective address of the > call/jump would be nice. > > I don't find the lists of registers useful at all; I know what the x86 registers > are, so I'm not going to read through a list like this when I see it. Is there a > way to only write about the interesting parts? For example, are there registers > omitted from this list that I might not be expecting? Well, there two groups in two different places: %rax/%rcx/%rdx/%rbx/%rsp/%rbp/%rsi/%rdi (first eight registers) %r8/%r9/%r10/%r11/%r12/%r13/%r14 (that is: %r15 is omitted). Surprisingly enough you CAN use %rbp and %rsp with naclcall/nacljmp and we even have such files in our tests (but not on CWS). %r15 is absolutely off-limits. http://codereview.chromium.org/11000033/diff/1/src/trusted/validator_ragel/va... src/trusted/validator_ragel/validator_x86_64.rl:238: (b_0100_xxx1 0xff (0xe0|0xe1|0xe2|0xe3|0xe4|0xe5|0xe6))))) On 2012/09/28 19:38:34, Brad Chen wrote: > The more repetitive the comments are, the less useful they will tend to be in > the context of this review. Part of what I need to understand is the interesting > differences in a case, so repetition in comments works against that. Well, I need to somehow show that we are handling the same groups here (%r8 and %rax are not the same even if they are encoded identically in the ModR/M). http://codereview.chromium.org/11000033/diff/1/src/trusted/validator_ragel/va... src/trusted/validator_ragel/validator_x86_64.rl:245: BitmapClearBit(valid_targets, (instruction_start - data) + 4); On 2012/09/28 19:38:34, Brad Chen wrote: > Could you add a level of procedural indirection so that it is obvious that this > BitmapClearBit is actually removing a possible target from an instruction list? Done.
http://codereview.chromium.org/11000033/diff/1/src/trusted/validator_ragel/de... File src/trusted/validator_ragel/decoder.h (right): http://codereview.chromium.org/11000033/diff/1/src/trusted/validator_ragel/de... src/trusted/validator_ragel/decoder.h:25: OPERAND_SIZE_64_BIT, ... then it would make sense to maybe use more of a common prefix, OPERAND_TYPE_... just a nit. On 2012/09/28 19:38:34, Brad Chen wrote: > Are there other 64-bit operand_types below? In that case, does > OPERAND_SIZE_64_BIT apply or not? If an operand should always have exactly one > type from this list, then it would make sense I think to http://codereview.chromium.org/11000033/diff/1/src/trusted/validator_ragel/de... src/trusted/validator_ragel/decoder.h:75: /* These are pseudo-registers used in special cases. */ Q: Why are these better here rather than as a separate enum? Life would be simpler for the reader if REG_ always meant 'register'. http://codereview.chromium.org/11000033/diff/1/src/trusted/validator_ragel/de... src/trusted/validator_ragel/decoder.h:91: * This enum extends NaClCPUFeatureID to cover instructions not recognized in Something missing here? http://codereview.chromium.org/11000033/diff/1/src/trusted/validator_ragel/de... src/trusted/validator_ragel/decoder.h:98: enum disp_mode { Comment this enum please. http://codereview.chromium.org/11000033/diff/1/src/trusted/validator_ragel/de... src/trusted/validator_ragel/decoder.h:106: struct instruction { Every non-trival type really merits a comment. While it's obvious in this case that this struct has fields representing an instruction, it's not obvious where it will be used, and if not everywhere where it won't be used. http://codereview.chromium.org/11000033/diff/1/src/trusted/validator_ragel/de... src/trusted/validator_ragel/decoder.h:111: #ifdef _MSC_VER Yuck. I'm not liking the nearly-identical code here. Is there not a way to do something like: #ifndef _MSC_VER #define Bool _Bool #endif That would get rid of the duplication. Also, please comment any compiler-specific stuff. Don't expect people to know these differences by heart. http://codereview.chromium.org/11000033/diff/1/src/trusted/validator_ragel/de... src/trusted/validator_ragel/decoder.h:149: /* All possible CPUID features enabled. */ This comment is too terse. Please mention full_cpuid_features in the comment so it's more explicit to what it applies. A couple other style rules to observe here: - Please use a prefix like "g_" for global variables. - For global constants, use kCamelCase, e.g. kFullCpuidFeatures. http://codereview.chromium.org/11000033/diff/1/src/trusted/validator_ragel/de... File src/trusted/validator_ragel/decoder_x86_32.rl (right): http://codereview.chromium.org/11000033/diff/1/src/trusted/validator_ragel/de... src/trusted/validator_ragel/decoder_x86_32.rl:2: * Copyright (c) 2011 The Native Client Authors. All rights reserved. 2012? Can't remember what the lawyers latest guidance is here. http://codereview.chromium.org/11000033/diff/1/src/trusted/validator_ragel/de... src/trusted/validator_ragel/decoder_x86_32.rl:7: #include <assert.h> What is this file for? Does it get used when building the validator, or only for testing the stand-alone decoder? Anything you want to me to anticipate in its internal structure? Please add an appropriate header comment. Since most Googlers don't speak Ragel it might also be a good idea to add links to the ragel documentation and the design doc for the ragel validator. http://codereview.chromium.org/11000033/diff/1/src/trusted/validator_ragel/de... src/trusted/validator_ragel/decoder_x86_32.rl:59: main := (one_instruction This procedure would be more readable if you made each switch below into a function, e.g. instruction.rm.offset = GetInstOffset(instruction, disp, ...). Performance impact should be zero if it is static inline. Any reason not to make this change? http://codereview.chromium.org/11000033/diff/1/src/trusted/validator_ragel/de... src/trusted/validator_ragel/decoder_x86_32.rl:62: case DISPNONE: instruction.rm.offset = 0; break; One statement per line? Nothing after the colon please except maybe '{'. Any reason not to follow C++ coding style conventions in an action like this? BTW long switch statements are one instance where I expect functions to sometimes be longer than a page. http://codereview.chromium.org/11000033/diff/1/src/trusted/validator_ragel/de... src/trusted/validator_ragel/decoder_x86_32.rl:64: case DISP16: instruction.rm.offset = This is a horrible line break. http://codereview.chromium.org/11000033/diff/1/src/trusted/validator_ragel/de... src/trusted/validator_ragel/decoder_x86_32.rl:70: case DISP64: assert(FALSE); The style guide requires a default case. I might let it slide if you were committed to building with the compiler flag that makes it a fatal error to not omit cases from switch statements. http://codereview.chromium.org/11000033/diff/1/src/trusted/validator_ragel/de... src/trusted/validator_ragel/decoder_x86_32.rl:116: #define GET_VEX_PREFIX3() vex_prefix3 It looks like this macro is defined identically in multiple files. Please use a .h file and a single defn. http://codereview.chromium.org/11000033/diff/1/src/trusted/validator_ragel/de... src/trusted/validator_ragel/decoder_x86_32.rl:158: const uint8_t *current_position = data; Looks like the latest style guide prescribes the 'type* varname' rather than 'type *varname'. This is also what I'm used to seeing in Chromium code. Any reason not to conform here? http://codereview.chromium.org/11000033/diff/1/src/trusted/validator_ragel/de... src/trusted/validator_ragel/decoder_x86_32.rl:185: %% write init; Can you add comments to make it obvious what these will expand to? I see from the gen files that the stuff above in main := (one_instruction ...) ends up in here someplace, but it would be nice not to have to reverse engineer such details. Comments before each %% would also make it easier to distinguish between the replacement for %% write init; and %% write exec; Am I misunderstanding these? Even more of a reason for comments. http://codereview.chromium.org/11000033/diff/1/src/trusted/validator_ragel/de... File src/trusted/validator_ragel/decoder_x86_64.rl (right): http://codereview.chromium.org/11000033/diff/1/src/trusted/validator_ragel/de... src/trusted/validator_ragel/decoder_x86_64.rl:70: (uint16_t) (disp[0] + 256U * disp[1]); Is it deliberate or accidental that you use uint16_t here but int16_t for the 32-bit version? Regardless, avoiding the code duplication would make these differences much easier to see and understand. At this point these maintainability issues are much more important than performance. Diffing the two files shows the files are about 80% identical with a variety of subtle differences. I would prefer if you would: - eliminate duplicated code - eliminate accidental differences - document deliberate differences Make sense? It's wonderful the 32- and 64-bit decoders are so similar; a great opportunity to simplify things and get rid of code! I'm not going to read the rest of this file until you respond to this initial feedback. At the risk of stating he obvious please consider the style comments on the 32-bit file to apply throughout.
http://codereview.chromium.org/11000033/diff/1/src/trusted/validator_ragel/va... File src/trusted/validator_ragel/validator_x86_64.rl (right): http://codereview.chromium.org/11000033/diff/1/src/trusted/validator_ragel/va... src/trusted/validator_ragel/validator_x86_64.rl:238: (b_0100_xxx1 0xff (0xe0|0xe1|0xe2|0xe3|0xe4|0xe5|0xe6))))) Thanks for the feedback. Just in case it's not clear, in general for every comment I make I am looking for at least one change. (I will try to note exceptions clearly.) These comments are not acceptable in their current form. If it's not clear to you how to make things better then we should work through the details together. On 2012/09/28 20:26:26, khim wrote: > On 2012/09/28 19:38:34, Brad Chen wrote: > > The more repetitive the comments are, the less useful they will tend to be in > > the context of this review. Part of what I need to understand is the > interesting > > differences in a case, so repetition in comments works against that. > > Well, I need to somehow show that we are handling the same groups here (%r8 and > %rax are not the same even if they are encoded identically in the ModR/M).
http://codereview.chromium.org/11000033/diff/1/src/trusted/validator_ragel/de... File src/trusted/validator_ragel/decoder.h (right): http://codereview.chromium.org/11000033/diff/1/src/trusted/validator_ragel/de... src/trusted/validator_ragel/decoder.h:21: OPERAND_SIZE_2_BIT, /* See VPERMIL2Px instruction for description. */ On 2012/09/28 19:38:34, Brad Chen wrote: > Could you make it clear in the comment that it only applies to only the 2_BIT > enum? I initially thought it applies to all of the OPERAND_SIZE things and only > figured out otherwise looking at the internals doc. Done. http://codereview.chromium.org/11000033/diff/1/src/trusted/validator_ragel/de... src/trusted/validator_ragel/decoder.h:25: OPERAND_SIZE_64_BIT, On 2012/09/28 20:41:01, Brad Chen wrote: > ... then it would make sense to maybe use more of a common prefix, > OPERAND_TYPE_... just a nit. > > On 2012/09/28 19:38:34, Brad Chen wrote: > > Are there other 64-bit operand_types below? In that case, does > > OPERAND_SIZE_64_BIT apply or not? If an operand should always have exactly one > > type from this list, then it would make sense I think to > Done. http://codereview.chromium.org/11000033/diff/1/src/trusted/validator_ragel/de... src/trusted/validator_ragel/decoder.h:75: /* These are pseudo-registers used in special cases. */ On 2012/09/28 20:41:01, Brad Chen wrote: > Q: Why are these better here rather than as a separate enum? Life would be > simpler for the reader if REG_ always meant 'register'. They are here to simplify the operations with operands. Operand can be one of the few registers, it can be memory access operand or immediate, and it may be some "special" operand like "%ds:(%rbx)" (in xlat). I agree that register_name is misleading (and it's name does not adhere to the style guide). Have any better suggestions? http://codereview.chromium.org/11000033/diff/1/src/trusted/validator_ragel/de... src/trusted/validator_ragel/decoder.h:91: * This enum extends NaClCPUFeatureID to cover instructions not recognized in On 2012/09/28 20:41:01, Brad Chen wrote: > Something missing here? Yeah. CPUID support for decoder. Looks like we are not adding it any time soon. Better to remove it for now. http://codereview.chromium.org/11000033/diff/1/src/trusted/validator_ragel/de... src/trusted/validator_ragel/decoder.h:98: enum disp_mode { On 2012/09/28 20:41:01, Brad Chen wrote: > Comment this enum please. Done. http://codereview.chromium.org/11000033/diff/1/src/trusted/validator_ragel/de... src/trusted/validator_ragel/decoder.h:106: struct instruction { On 2012/09/28 20:41:01, Brad Chen wrote: > Every non-trival type really merits a comment. While it's obvious in this case > that this struct has fields representing an instruction, it's not obvious where > it will be used, and if not everywhere where it won't be used. Done. http://codereview.chromium.org/11000033/diff/1/src/trusted/validator_ragel/de... src/trusted/validator_ragel/decoder.h:111: #ifdef _MSC_VER On 2012/09/28 20:41:01, Brad Chen wrote: > Yuck. I'm not liking the nearly-identical code here. Is there not a way to do > something like: > #ifndef _MSC_VER > #define Bool _Bool > #endif > That would get rid of the duplication. Unfortunately we already have Bool in NaCl codebase and it can not be used in place of _Bool because it triggers warnings when used with GCC. > Also, please comment any compiler-specific stuff. Don't expect people to know > these differences by heart. This one defenitely merits the comment. http://codereview.chromium.org/11000033/diff/1/src/trusted/validator_ragel/de... src/trusted/validator_ragel/decoder.h:149: /* All possible CPUID features enabled. */ On 2012/09/28 20:41:01, Brad Chen wrote: > This comment is too terse. Please mention full_cpuid_features in the comment so > it's more explicit to what it applies. > > A couple other style rules to observe here: > - Please use a prefix like "g_" for global variables. > - For global constants, use kCamelCase, e.g. kFullCpuidFeatures. Done. http://codereview.chromium.org/11000033/diff/1/src/trusted/validator_ragel/de... File src/trusted/validator_ragel/decoder_x86_32.rl (right): http://codereview.chromium.org/11000033/diff/1/src/trusted/validator_ragel/de... src/trusted/validator_ragel/decoder_x86_32.rl:2: * Copyright (c) 2011 The Native Client Authors. All rights reserved. On 2012/09/28 20:41:01, Brad Chen wrote: > 2012? Can't remember what the lawyers latest guidance is here. Script accepts both 2011 and 2012. Fixed. http://codereview.chromium.org/11000033/diff/1/src/trusted/validator_ragel/de... src/trusted/validator_ragel/decoder_x86_32.rl:7: #include <assert.h> On 2012/09/28 20:41:01, Brad Chen wrote: > What is this file for? Does it get used when building the validator, or only for > testing the stand-alone decoder? Anything you want to me to anticipate in its > internal structure? Please add an appropriate header comment. > > Since most Googlers don't speak Ragel it might also be a good idea to add links > to the ragel documentation and the design doc for the ragel validator. This file is only used to test stand-alone decoder, but it's pretty simple and short and I don't want to complicate already complex scons magic just to keep it in unreviewed directory. http://codereview.chromium.org/11000033/diff/1/src/trusted/validator_ragel/de... src/trusted/validator_ragel/decoder_x86_32.rl:70: case DISP64: assert(FALSE); This is how our code is built. http://codereview.chromium.org/11000033/diff/1/src/trusted/validator_ragel/de... File src/trusted/validator_ragel/decoder_x86_64.rl (right): http://codereview.chromium.org/11000033/diff/1/src/trusted/validator_ragel/de... src/trusted/validator_ragel/decoder_x86_64.rl:70: (uint16_t) (disp[0] + 256U * disp[1]); On 2012/09/28 20:41:01, Brad Chen wrote: > Is it deliberate or accidental that you use uint16_t here but int16_t for the > 32-bit version? It's just a typo - and an unimportant one at that since currently we don't decode instructions which use 16-bit offsets. > Regardless, avoiding the code duplication would make these > differences much easier to see and understand. I agree. Initially these files had more differences, but now most of them are eliminated. I'll see if I can just unify them. > At this point these maintainability issues are much more important than performance. Decoders are only used in tests anyway. For them speed is of no issue. > > Diffing the two files shows the files are about 80% identical with a variety of > subtle differences. I would prefer if you would: > - eliminate duplicated code > - eliminate accidental differences > - document deliberate differences > Make sense? It's wonderful the 32- and 64-bit decoders are so similar; a great > opportunity to simplify things and get rid of code! > > I'm not going to read the rest of this file until you respond to this initial > feedback. At the risk of stating he obvious please consider the style comments > on the 32-bit file to apply throughout. Sure.
http://codereview.chromium.org/11000033/diff/15001/src/trusted/validator_rage... File src/trusted/validator_ragel/unreviewed/decoding.h (right): http://codereview.chromium.org/11000033/diff/15001/src/trusted/validator_rage... src/trusted/validator_ragel/unreviewed/decoding.h:87: /* Suggestion: use a macro "SIGN_EXTEND" for example, and comment it clearly. Then you can use it as necessary below. http://codereview.chromium.org/11000033/diff/15001/src/trusted/validator_rage... src/trusted/validator_ragel/unreviewed/decoding.h:90: * finally they convert it back to unsgined uint64_t. unsigned http://codereview.chromium.org/11000033/diff/15001/src/trusted/validator_rage... src/trusted/validator_ragel/unreviewed/decoding.h:92: * This operation looks pointless and dangerous but it's actually safe Some of the wording is a bit strong for such a comment; please lighten up! http://codereview.chromium.org/11000033/diff/15001/src/trusted/validator_rage... src/trusted/validator_ragel/unreviewed/decoding.h:95: * unsigned does sing-extension - and this is what we need. sing => sign
More comments (from our f2f review). http://codereview.chromium.org/11000033/diff/15001/src/trusted/validator_rage... File src/trusted/validator_ragel/validator_x86_64.rl (right): http://codereview.chromium.org/11000033/diff/15001/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_64.rl:154: # and as such will detect case where %rbp/%rsp is illegally modified. Suggestion for comment: ## REGISTER USAGE ABBREVIATIONS: # By convention: rax == r0x, rcx == r1x, ... # R32: R version of legacy 386 registers a.k.a r1x .. r7x # R64: all # E32: # E64: # RBASE: %r15 - NaCl base register (read-only) # This block encodes call and jump instructions of the form: # 41 83 e3 e0 and $0xffffffe0,%r11d # 4d 01 fb add %r15,%r11 # 41 ff e3 jmpq *%r11 naclcall_or_nacljmp = # 83 xx e0 and $0xffffffe0,%r11d # 4x 01 xx add %r15,%rxx # 4x ff xx jmpq *%r11 # or call #### INSTRUCTION ONE (three bytes) # and $~0x1f, E32 (0x83 (0xe0|0xe1|0xe2|0xe3|0xe4|0xe5|0xe6|0xe7) 0xe0 #### INSTRUCTION TWO (three bytes) # REX add RBASE, R32 b_0100_11x0 0x01 (0xf8|0xf9|0xfa|0xfb|0xfc|0xfd|0xfe|0xff) #### INSTRUCTION THREE.a (three bytes plus optional REXP) # callq R32 ((REX_WRX? 0xff (0xd0|0xd1|0xd2|0xd3|0xd4|0xd5|0xd6|0xd7)) | #### INSTRUCTION THREE.b (three bytes plus optional REXP) # jmpq R32 (REX_WRX? 0xff (0xe0|0xe1|0xe2|0xe3|0xe4|0xe5|0xe6|0xe7)))) # This action checks the consistency of registers referenced # in the individual instructions: @{ instruction_start -= 6; if (RMFromModRM(instruction_start[1]) != RMFromModRM(instruction_start[5]) || RMFromModRM(instruction_start[1]) != RMFromModRM(*current_position)) instruction_info_collected |= UNRECOGNIZED_INSTRUCTION; MakeJumpTargetInvalid((instruction_start - data) + 3, valid_targets); MakeJumpTargetInvalid((instruction_start - data) + 6, valid_targets); restricted_register = NO_REG; } | http://codereview.chromium.org/11000033/diff/15001/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_64.rl:296: @{ Please comment every action. # This action redefines the range of the super-instruction # to include the preceding sandboxing sequence, then # invalidates jump targets on the interior of the # super-instructions. In this case it would also help if you made a procedure for this action, maybe "RSandboxStringInst"? http://codereview.chromium.org/11000033/diff/15001/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_64.rl:337: (0x89 | 0x8b) 0xf6 . # mov %esi,%esi # Note that 0x89 0xf6 and 0x8b 0xf6 both encode "mov %esi, %esi"
PTAL http://codereview.chromium.org/11000033/diff/15001/src/trusted/validator_rage... File src/trusted/validator_ragel/unreviewed/decoding.h (right): http://codereview.chromium.org/11000033/diff/15001/src/trusted/validator_rage... src/trusted/validator_ragel/unreviewed/decoding.h:87: /* On 2012/09/28 23:31:54, Brad Chen wrote: > Suggestion: use a macro "SIGN_EXTEND" for example, and comment it clearly. Then > you can use it as necessary below. Done. http://codereview.chromium.org/11000033/diff/15001/src/trusted/validator_rage... src/trusted/validator_ragel/unreviewed/decoding.h:90: * finally they convert it back to unsgined uint64_t. On 2012/09/28 23:31:54, Brad Chen wrote: > unsigned Done. http://codereview.chromium.org/11000033/diff/15001/src/trusted/validator_rage... src/trusted/validator_ragel/unreviewed/decoding.h:92: * This operation looks pointless and dangerous but it's actually safe On 2012/09/28 23:31:54, Brad Chen wrote: > Some of the wording is a bit strong for such a comment; please lighten up! Done. http://codereview.chromium.org/11000033/diff/15001/src/trusted/validator_rage... src/trusted/validator_ragel/unreviewed/decoding.h:95: * unsigned does sing-extension - and this is what we need. On 2012/09/28 23:31:54, Brad Chen wrote: > sing => sign Done. http://codereview.chromium.org/11000033/diff/15001/src/trusted/validator_rage... File src/trusted/validator_ragel/validator_x86_64.rl (right): http://codereview.chromium.org/11000033/diff/15001/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_64.rl:154: # and as such will detect case where %rbp/%rsp is illegally modified. On 2012/09/29 00:06:00, Brad Chen wrote: > Suggestion for comment: > ## REGISTER USAGE ABBREVIATIONS: > # By convention: rax == r0x, rcx == r1x, ... > # R32: R version of legacy 386 registers a.k.a r1x .. r7x > # R64: all > # E32: > # E64: > # RBASE: %r15 - NaCl base register (read-only) > > # This block encodes call and jump instructions of the form: > # 41 83 e3 e0 and $0xffffffe0,%r11d > # 4d 01 fb add %r15,%r11 > # 41 ff e3 jmpq *%r11 > naclcall_or_nacljmp = > # 83 xx e0 and $0xffffffe0,%r11d > # 4x 01 xx add %r15,%rxx > # 4x ff xx jmpq *%r11 # or call > #### INSTRUCTION ONE (three bytes) > # and $~0x1f, E32 > (0x83 (0xe0|0xe1|0xe2|0xe3|0xe4|0xe5|0xe6|0xe7) 0xe0 > #### INSTRUCTION TWO (three bytes) > # REX add RBASE, R32 > b_0100_11x0 0x01 (0xf8|0xf9|0xfa|0xfb|0xfc|0xfd|0xfe|0xff) > #### INSTRUCTION THREE.a (three bytes plus optional REXP) > # callq R32 > ((REX_WRX? 0xff (0xd0|0xd1|0xd2|0xd3|0xd4|0xd5|0xd6|0xd7)) | > #### INSTRUCTION THREE.b (three bytes plus optional REXP) > # jmpq R32 > (REX_WRX? 0xff (0xe0|0xe1|0xe2|0xe3|0xe4|0xe5|0xe6|0xe7)))) > # This action checks the consistency of registers referenced > # in the individual instructions: > @{ > instruction_start -= 6; > if (RMFromModRM(instruction_start[1]) != > RMFromModRM(instruction_start[5]) || > RMFromModRM(instruction_start[1]) != RMFromModRM(*current_position)) > instruction_info_collected |= UNRECOGNIZED_INSTRUCTION; > MakeJumpTargetInvalid((instruction_start - data) + 3, valid_targets); > MakeJumpTargetInvalid((instruction_start - data) + 6, valid_targets); > restricted_register = NO_REG; > } | Done. http://codereview.chromium.org/11000033/diff/15001/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_64.rl:296: @{ On 2012/09/29 00:06:00, Brad Chen wrote: > Please comment every action. > # This action redefines the range of the super-instruction > # to include the preceding sandboxing sequence, then > # invalidates jump targets on the interior of the > # super-instructions. > In this case it would also help if you made a procedure for this action, maybe > "RSandboxStringInst"? Done. Functions make absolutely no sense here: we have quite a few magic numbers here. They tie together action which is triggered and the DFA this action attached to (magic numbers are offsets in said DFA). We MAY define bunch of constants to satisfy the style guide but this IMNSHO will just make code less readable: each particular define will only be used once and instead of having two entities which are placed close to each other in a file we will have them in two different files. http://codereview.chromium.org/11000033/diff/15001/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_64.rl:337: (0x89 | 0x8b) 0xf6 . # mov %esi,%esi On 2012/09/29 00:06:00, Brad Chen wrote: > # Note that 0x89 0xf6 and 0x8b 0xf6 both encode "mov %esi, %esi" Done.
I haven't looked at everything. There are still too many general problems for that to be a good use of our time. OVERALL: Way too much repetition, both in code and in comments. Don't expect an 'LGTM' until these problems are fixed. I'd prefer if you fixed them all before asking for another review. Way too much repetition. Way too much repetition. Way too much repetition. Way too much repetition. Way too much repetition. Way too much repetition. Way too much repetition. Way too much repetition. Way too much repetition. Way too much repetition. Way too much repetition. Way too much repetition. Way too much repetition. Way too much repetition. http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... File src/trusted/validator_ragel/decoder.h (right): http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/decoder.h:15: enum OperandType { Thanks; these enum decls look better now. http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/decoder.h:121: #ifdef _MSC_VER Can you try to find a way to get this ifdef out of the defn of "struct Instruction"? Is there no way to use a typedef above, and then here using "foo_Bool" instead of "Bool"? I don't like how every one of these fields is declared twice. http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/decoder.h:159: /* All possible CPUID features enabled. */ This comment is too terse. Please mention full_cpuid_features in the comment so it's more explicit to what it applies. http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... File src/trusted/validator_ragel/decoder_internal.h (right): http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/decoder_internal.h:18: #define GET_VEX_PREFIX3() vex_prefix3 Please add a comment above this set of defines that explains where and how they are used. http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/decoder_internal.h:45: enum { Please add a comment above this enum defn explaining where and how it is used. Also below for "ImmediateMode" http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/decoder_internal.h:65: enum ImmediateMode { Please group enums together and then static function defns, unless there is a really compelling reason to interleave them. http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... File src/trusted/validator_ragel/decoder_x86_32.rl (right): http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/decoder_x86_32.rl:18: #define GET_REX_PREFIX() These are weird-looking defines. A comment would be appropriate. http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... File src/trusted/validator_ragel/decoder_x86_64.rl (right): http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/decoder_x86_64.rl:19: #define GET_REX_PREFIX() instruction.prefix.rex I'm guessing these need to be defined differently for 32-bit vs. 64-bit builds? Please make this explicit by adding appropriate comments. http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... File src/trusted/validator_ragel/decoding.h (right): http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/decoding.h:18: # define FORCEINLINE __forceinline This seems a bit heavy-handed. Why not let the compiler decide? http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/decoding.h:98: * Note that these operations are safe but slightly unusual: they come very This comment is too long. Can you please stick to just the facts and eliminate extra details and opinions? Sign extension is not that big a deal is it? http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... File src/trusted/validator_ragel/unreviewed/decoder_test.c (right): http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/unreviewed/decoder_test.c:23: * unreviewed/Makefile is eliminated. Will this change be in this CL or in a subsequent CL? http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/unreviewed/decoder_test.c:34: static void CheckBounds(unsigned char *data, size_t data_size, I don't understand what this function is for. Please comment or delete. When is it to be used? Looks like it is never used. http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/unreviewed/decoder_test.c:40: void ReadImage(const char *filename, uint8_t **result, size_t *result_size) { As a reader it's a bit frustrating to have a procedure with a very generic name like this but no comment. http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/unreviewed/decoder_test.c:80: void ProcessInstruction(const uint8_t *begin, const uint8_t *end, This procedure is way too long. Please factor it into multiple functions; ideally when you are done this will read as a sequence of procedure calls, eg. void ProcessInstruction(...) { DoFirstImportantStep() DoSecondImportantStep() ... } Please apply this notion throughout this CL. A CL is not ready for review if it includes procedures like this. http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/unreviewed/decoder_test.c:88: Bool data16_prefix = instruction->prefix.data16; Again, it is really awful that you have to declare everything twice. Please propose an alternative fix. http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/unreviewed/decoder_test.c:412: if (strcmp(instruction_name, "callq") && Please break this if statement out into its own function. Give it a name so it is obvious to a casual reader what it does. Think about whether there is a better option than this if statement for implementing this functionality. http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... File src/trusted/validator_ragel/validator_x86_64.rl (right): http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_64.rl:6: What is this file? Please add a comment. http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_64.rl:108: # Special %rbp modifications without required sandboxing This comment is unclear. Why don't these special modifications require sandboxing? Or maybe there should be a TODO(khim) comment that says you are going to come back and add sandboxing someday? Please clarify. http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_64.rl:117: #(b_0100_1xx0 0x83 0xe5 (0x80 .. 0xff)) # and $XXX,%rbp The strange positioning of '|' and '#' and the commented-out lines of ragel input make this quite confusing. Why are these lines commented out? Why do the comments look like code? Please make this clearer. Maybe start by aligning the comments? http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_64.rl:131: MakeJumpTargetInvalid((instruction_start - data), valid_targets); Are you going to make the instruction at the target of the jump an invalid instruction? I don't think so. Maybe "MakeInvalidJumpTarget" would be a slightly better name? http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_64.rl:134: # Special %rbp modifications without required sandboxing This line begins a section that is extremely similar to the preceding but different. It needs a comment to make it clear why there is repetition and how this version is different. http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_64.rl:153: @{ if (restricted_register == REG_RSP) Why isn't this action commented? Please comment all actions. http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_64.rl:161: # naclcall or nacljmp. These are indirect-jump sequences. They include three Before getting into any specific instructions there ought to be a comment 'note to reader' that explains abbrevitions, Ragel format peculiarities etc. Move the defn. of abbreviations here. Also explain what '?' means. Also explain the general structure of things: sequence of byte patterns joined with '|' action http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_64.rl:162: # commands: How about "These are three-instruction indirection-jump sequences." Note: instructions are not commands. http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_64.rl:163: # and “and $~0x1f, %eXX” Why repeat the opcode "and" helpful in these comments? Seems like extra clutter. The quotes are also unhelpful. http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_64.rl:167: # detect case where %rbp/%rsp is illegally modified. Who will detect this case? I don't understand the comment. http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_64.rl:169: # There are number of variants present which differ by the REX prefix usage: Where are the variants? http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_64.rl:175: # instruction: with “0x01” and “0x03” opcode. Both are in use in the wild Please avoid colorful language such as "in use in the wild thus we can not support just one form". You could instead say "Both are allowed." Throughout please work on making your comments more concise. You make extra work for everybody when you write too much (or too little). http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_64.rl:184: # R32: 32-bit counterparts for new amd64 registers (%r8d to %r14d) You have defined R32 twice. I think you mean "E32" above. I'm still not entirely thrilled with these names so if you have a better idea please suggest it. Maybe "E86/R86" instead of "E32/R32"? http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_64.rl:197: #### INSTRUCTION THREE.c (three bytes plus optional REX prefix) three or two bytes? http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_64.rl:200: #### INSTRUCTION THREE.j (three bytes plus optional REX prefix) Shouldn't 'three' be 'two' here too? Looks like 'j' is a typo. http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_64.rl:203: # This action first compares register numbers in three atomic instructions What do you mean by 'atomic instructions'? That means something very specific; better not to use it to refer to instructions that are part of a super-instruction. http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_64.rl:210: # the superinstruction. Please be consistent; use either 'super-instruction' or 'superinstruction' but not both. http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_64.rl:232: #### INSTRUCTION THREE.c (three bytes plus optional REX prefix) Here again I think you mean 'two' instead of 'three' http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_64.rl:245: # the superinstruction. This action is identical to the previous action. Please make it a procedure and the action a procedure call. http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_64.rl:297: @{ This is much too repetitive. You are increasing my work by making me read this multiple times. As we discussed F2F, please make it a procedure call. Please remember 'premature optimization is the root of all evil.' http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_64.rl:308: # This block encodes call and jump instructions of the form: Here I think you mean "super-instruction" not "instruction". Can you be careful to use 'super-instruction' or 'instruction' throughout as appropriate? http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_64.rl:318: #### INSTRUCTION THREE.c (three bytes plus optional REX prefix) 'two'? http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_64.rl:322: (REX_WRX? 0xff (0xe0|0xe1|0xe2|0xe3|0xe4|0xe5|0xe6|0xe7)))) | It's really hard to figure out what parenthetical is being closed here, and what the components of the or statement are for the '|' at the end of the line. You need go figure out a way to format or comment this stuff to make it readable. It looks to me like you are being sloppy with indentation. Remember code reviewers don't get to use emacs to check for paren matching. http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_64.rl:340: # This action first compares register numbers in three atomic instructions Every place you find yourself repeating a comment word-for-word like this it is almost surely a situation where you should be using a procedure, commenting it exactly once, and calling it throughout. http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_64.rl:403: # There are two variants which handle spurious REX prefixes. In what sense are the REX prefixes spurious? This comment is unclear. I think you maybe mean "There are two variants, one with the REX prefix, and one without." Why are you not using the '?' for optional bytes here? Seems inconsistent. It would be better for many reasons to avoid this repetition. http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_64.rl:410: (0x89 | 0x8b) 0xf6 . # mov %esi,%esi What does '.' mean here? How is it different from '|'? This would be good to cover in the lead comment above (where you explain abbreviations etc. http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_64.rl:413: # This action redefines the range of the super-instruction to include the This same identical comment is repeated at least four times. Not a good use of your time or my time. Please use a procedure, comment it once, and give it a useful name. When things are similar but repeated due to necessary differences, use comments that focus on the interesting differences, e.g. "This is like case 'foo' above except for blah blah blah."
I moved the code to separate functions. IMO it made it more cryptic, but if like it better, than I'm Ok with this. http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... File src/trusted/validator_ragel/decoder.h (right): http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/decoder.h:121: #ifdef _MSC_VER On 2012/10/04 17:26:04, Brad Chen wrote: > Can you try to find a way to get this ifdef out of the defn of "struct > Instruction"? Is there no way to use a typedef above, and then here using > "foo_Bool" instead of "Bool"? I don't like how every one of these fields is > declared twice. Done: this makes decoder slightly slower, but we don't care about speed of the decoder since we only use it for tests. http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/decoder.h:159: /* All possible CPUID features enabled. */ On 2012/10/04 17:26:04, Brad Chen wrote: > This comment is too terse. Please mention full_cpuid_features in the comment so > it's more explicit to what it applies. Done. http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... File src/trusted/validator_ragel/decoder_internal.h (right): http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/decoder_internal.h:18: #define GET_VEX_PREFIX3() vex_prefix3 On 2012/10/04 17:26:04, Brad Chen wrote: > Please add a comment above this set of defines that explains where and how they > are used. Done. http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/decoder_internal.h:45: enum { On 2012/10/04 17:26:04, Brad Chen wrote: > Please add a comment above this enum defn explaining where and how it is used. > Also below for "ImmediateMode" Actually this one is not used at all: it was used before but now we have set of functions {Base,Index,Register}ExtentionFromREX for that. Removed. http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/decoder_internal.h:65: enum ImmediateMode { On 2012/10/04 17:26:04, Brad Chen wrote: > Please group enums together and then static function defns, unless there is a > really compelling reason to interleave them. Done. http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... File src/trusted/validator_ragel/decoder_x86_32.rl (right): http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/decoder_x86_32.rl:18: #define GET_REX_PREFIX() On 2012/10/04 17:26:04, Brad Chen wrote: > These are weird-looking defines. A comment would be appropriate. Removed: they are no longer used in 32-bit mode at all. http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... File src/trusted/validator_ragel/decoder_x86_64.rl (right): http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/decoder_x86_64.rl:19: #define GET_REX_PREFIX() instruction.prefix.rex On 2012/10/04 17:26:04, Brad Chen wrote: > I'm guessing these need to be defined differently for 32-bit vs. 64-bit builds? > Please make this explicit by adding appropriate comments. This used to be a case, but now we just don't use them at all in 32-bit mode. I've used "define-to-null" as a debug aid (when 32-bit version tried to reference them by mistake this triggered a compile-time error), but I don't think this small bit of protection warrants extensive discussion, comments, etc. http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... File src/trusted/validator_ragel/decoding.h (right): http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/decoding.h:18: # define FORCEINLINE __forceinline On 2012/10/04 17:26:04, Brad Chen wrote: > This seems a bit heavy-handed. Why not let the compiler decide? Mostly because MSVC consistently makes wrong decision: apparently when function is large it stops inlining other functions into it even when said functions are so tiny inlining actually reduces the final size of code! Our *SAN (ASAN/MSAN/TSAN/etc) guys are constantly fighting with it, too - and so far have found that __forceinline is the only tool that works. GCC is better in that regard, but also sometimes decides not to inline these functions. FORCEINLINE is only used for tiny "accessor"-type functions, larger functions in validator_internal.h are marked with just plain INLINE decoration. http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/decoding.h:98: * Note that these operations are safe but slightly unusual: they come very On 2012/10/04 17:26:04, Brad Chen wrote: > This comment is too long. Can you please stick to just the facts and eliminate > extra details and opinions? Sign extension is not that big a deal is it? Yes it is. If someone will replace "int32_t" with "int" here then s/he'll turn safe program into unsafe one: is it immediately obvious to you? Sure, this code works today and noone have seen any problems with it, but is it actually guaranteed? Once upon time handling of singed integer overflow was also easy and simple - till compiler writers found that this triggers undefined behaviour. We *don't* want such suprises in our TCB. I've spent about a hour reading the documentation and even asked Dmitry about this one line of code before we've found it that this code is supposed to work *if* it's compileable at all. As I've already said: the very same function may become incorrect in some hyphothetoical compiler if I will replace "intN_t" with plain "int", for example - because compiler may use not two's complement arythmetic, but ones'-complement one! Standard makes it very clear that C compiler may very well handle sign-extension incorrectly - but such compilers are not supposed to have "intN_t" types (they should use "int_leastN_t" and "int_fastN_t" types instead) which means that *if* compiler can compile this code *then* it works. I think it's entirely appropriate to leave the trail for the future readers with pointers to relevant pieces of the standard when such subtletities are involved. http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... File src/trusted/validator_ragel/unreviewed/decoder_test.c (right): http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/unreviewed/decoder_test.c:23: * unreviewed/Makefile is eliminated. On 2012/10/04 17:26:04, Brad Chen wrote: > Will this change be in this CL or in a subsequent CL? It'll be in subsequent CL. Or maybe (most probably) even in a few different ones. decoder_test.c refactoring is way too big of a task to handle in this CL. This file is only here to make sure code still compiles after all these changes. The fact that this file is *NOT* moved out of "unreviewed" should give you a clue. http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/unreviewed/decoder_test.c:34: static void CheckBounds(unsigned char *data, size_t data_size, On 2012/10/04 17:26:04, Brad Chen wrote: > I don't understand what this function is for. Please comment or delete. When is > it to be used? Looks like it is never used. It's used when we read the ELF file. But as I've said: refactoring of tests most definitely should not happen in this CL. http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/unreviewed/decoder_test.c:88: Bool data16_prefix = instruction->prefix.data16; On 2012/10/04 17:26:04, Brad Chen wrote: > Again, it is really awful that you have to declare everything twice. Please > propose an alternative fix. Done. http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... File src/trusted/validator_ragel/validator_x86_64.rl (right): http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_64.rl:6: On 2012/10/04 17:26:04, Brad Chen wrote: > What is this file? Please add a comment. Done. http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_64.rl:108: # Special %rbp modifications without required sandboxing On 2012/10/04 17:26:04, Brad Chen wrote: > This comment is unclear. Why don't these special modifications require > sandboxing? Or maybe there should be a TODO(khim) comment that says you are > going to come back and add sandboxing someday? Please clarify. These special instructions don't require the sandboxing because they don't affect the SFI invariants even if they should if you'll only look on the arguments of these operations. http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_64.rl:117: #(b_0100_1xx0 0x83 0xe5 (0x80 .. 0xff)) # and $XXX,%rbp On 2012/10/04 17:26:04, Brad Chen wrote: > The strange positioning of '|' and '#' and the commented-out lines of ragel > input make this quite confusing. Why are these lines commented out? Why do the > comments look like code? Please make this clearer. Maybe start by aligning the > comments? These instructions are permitted in the similar block dedicated to %rsp handling, but not here. This is surprising. I left a commented out description as a visual clue, but apparently it does not clarify anything and just causes confusion. Deleted. http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_64.rl:131: MakeJumpTargetInvalid((instruction_start - data), valid_targets); On 2012/10/04 17:26:04, Brad Chen wrote: > Are you going to make the instruction at the target of the jump an invalid > instruction? I don't think so. Maybe "MakeInvalidJumpTarget" would be a slightly > better name? I have no opinion WRT to the best name. We clear the mark "this is a valid target for jump here", I'm not sure how this function should be called. Replaced to MakeInvalidJumpTarget. http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_64.rl:134: # Special %rbp modifications without required sandboxing On 2012/10/04 17:26:04, Brad Chen wrote: > This line begins a section that is extremely similar to the preceding but > different. It needs a comment to make it clear why there is repetition and how > this version is different. Actually this is just bad comment. Previous one described rbp_modifications and described instruction which change %r*B*p, this one is called rsp_modifications and as the name implies it handles %r*S*p. They are similar yet different because existing validator handled %rbp and %rsp in similar yet different manner. http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_64.rl:153: @{ if (restricted_register == REG_RSP) On 2012/10/04 17:26:04, Brad Chen wrote: > Why isn't this action commented? Please comment all actions. Done. http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_64.rl:161: # naclcall or nacljmp. These are indirect-jump sequences. They include three On 2012/10/04 17:26:04, Brad Chen wrote: > Before getting into any specific instructions there ought to be a comment 'note > to reader' that explains abbrevitions, Ragel format peculiarities etc. This is validator_internals.html. > Move the defn. of abbreviations here. Are you sure? They are only used in the definition of this one machine (naclcall_or_nacljmp), they are not used for anything else. > Also explain what '?' means. This is explained in validator_internals.html, too. > Also explain the general structure of things: > sequence of byte patterns joined with '|' > action Again: this is in validator_internals.html http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_64.rl:162: # commands: On 2012/10/04 17:26:04, Brad Chen wrote: > How about "These are three-instruction indirection-jump sequences." Note: > instructions are not commands. Done. http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_64.rl:163: # and “and $~0x1f, %eXX” On 2012/10/04 17:26:04, Brad Chen wrote: > Why repeat the opcode "and" helpful in these comments? Seems like extra clutter. > The quotes are also unhelpful. Tried to make easier to see that third command may be call or jump and messed up with the formatting. Fixed. http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_64.rl:167: # detect case where %rbp/%rsp is illegally modified. On 2012/10/04 17:26:04, Brad Chen wrote: > Who will detect this case? I don't understand the comment. When this ragel machine will be combined with normal_instruction* machine additional action will be triggered after first instruction (not superinstruction!). This action will check if we can actually use this superinstruction in this particular case. Here is bad sequence which we want to detect: mov %0,%esp and $~0x1f,%ebp add %r15,%rbp jmpq *%rbp In this case code which is pointed to via %rbp will be executed with %rsp set to zero. Thankfully this situation will be detected after first two instructions: mov %0,%esp and $~0x1f,%ebp Second instruction will already detect the error and other two instructions can not clear it (once error is detected it's set for good and ValidateChunkAMD64 fails). http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_64.rl:169: # There are number of variants present which differ by the REX prefix usage: On 2012/10/04 17:26:04, Brad Chen wrote: > Where are the variants? Below. Should I write that? http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_64.rl:175: # instruction: with “0x01” and “0x03” opcode. Both are in use in the wild On 2012/10/04 17:26:04, Brad Chen wrote: > Please avoid colorful language such as "in use in the wild thus we can not > support just one form". You could instead say "Both are allowed." Throughout > please work on making your comments more concise. You make extra work for > everybody when you write too much (or too little). The only question is: how to determine if I wrote too much or too little. This particular comment is directed to future readers who'll want to simplify our ABI. They may decide to remove "0x03" version "because noone uses it" only to find out that Bastion now "mysteriously" fail (it uses "0x01" version in statically-supplied code, but uses "0x03" version in dynamically-generated code). http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_64.rl:184: # R32: 32-bit counterparts for new amd64 registers (%r8d to %r14d) On 2012/10/04 17:26:04, Brad Chen wrote: > You have defined R32 twice. I think you mean "E32" above. I'm still not entirely > thrilled with these names so if you have a better idea please suggest it. Maybe > "E86/R86" instead of "E32/R32"? Yes, and done. http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_64.rl:197: #### INSTRUCTION THREE.c (three bytes plus optional REX prefix) On 2012/10/04 17:26:04, Brad Chen wrote: > three or two bytes? Done. http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_64.rl:200: #### INSTRUCTION THREE.j (three bytes plus optional REX prefix) On 2012/10/04 17:26:04, Brad Chen wrote: > Shouldn't 'three' be 'two' here too? Looks like 'j' is a typo. Replaced "j" with "jmp". Now it's "INSTRUCTION THREE: call" and "INSTRUCTION THREE: jmp". http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_64.rl:203: # This action first compares register numbers in three atomic instructions On 2012/10/04 17:26:04, Brad Chen wrote: > What do you mean by 'atomic instructions'? That means something very specific; > better not to use it to refer to instructions that are part of a > super-instruction. atomic -> component. http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_64.rl:210: # the superinstruction. On 2012/10/04 17:26:04, Brad Chen wrote: > Please be consistent; use either 'super-instruction' or 'superinstruction' but > not both. Done. http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_64.rl:232: #### INSTRUCTION THREE.c (three bytes plus optional REX prefix) On 2012/10/04 17:26:04, Brad Chen wrote: > Here again I think you mean 'two' instead of 'three' Done. http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_64.rl:245: # the superinstruction. On 2012/10/04 17:26:04, Brad Chen wrote: > This action is identical to the previous action. Please make it a procedure and > the action a procedure call. All actions are different: where actions are identical or substantially similar they are already merged (or call the procedure). In particular: this one is not an exception. Previous one used "RMFromModRM(instruction_start[5])", this one uses "RegFromModRM(instruction_start[5])", next one uses RMFromModRM(instruction_start[6]), but offsets are not 3 and 6, but 4 and 7, etc. I'm not sure if function call will clarify anything here: we'll need to pass along: - Couple of boolean flags - Three pointers - Pointer to restricted register - Pointer to instruction_info_collected Ok, let's see if it's simpler or not. http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_64.rl:297: @{ On 2012/10/04 17:26:04, Brad Chen wrote: > This is much too repetitive. You are increasing my work by making me read this > multiple times. As we discussed F2F, please make it a procedure call. Please > remember 'premature optimization is the root of all evil.' You can not make the description of the instruction a procedure call, you can only turn action into one. Will this actually help readability? http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_64.rl:308: # This block encodes call and jump instructions of the form: On 2012/10/04 17:26:04, Brad Chen wrote: > Here I think you mean "super-instruction" not "instruction". Can you be careful > to use 'super-instruction' or 'instruction' throughout as appropriate? Done. http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_64.rl:318: #### INSTRUCTION THREE.c (three bytes plus optional REX prefix) On 2012/10/04 17:26:04, Brad Chen wrote: > 'two'? Done. http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_64.rl:322: (REX_WRX? 0xff (0xe0|0xe1|0xe2|0xe3|0xe4|0xe5|0xe6|0xe7)))) | On 2012/10/04 17:26:04, Brad Chen wrote: > It's really hard to figure out what parenthetical is being closed here, and > what the components of the or statement are for the '|' at the end of the line. > You need go figure out a way to format or comment this stuff to make it > readable. It looks to me like you are being sloppy with indentation. Remember > code reviewers don't get to use emacs to check for paren matching. Indeed: indentation was mixed up, sorry. Fixed. http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_64.rl:340: # This action first compares register numbers in three atomic instructions On 2012/10/04 17:26:04, Brad Chen wrote: > Every place you find yourself repeating a comment word-for-word like this it is > almost surely a situation where you should be using a procedure, commenting it > exactly once, and calling it throughout. Done. http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_64.rl:403: # There are two variants which handle spurious REX prefixes. On 2012/10/04 17:26:04, Brad Chen wrote: > In what sense are the REX prefixes spurious? They don't change the meaning of the instruction but change it's length. And we need to know it's length to correctly handle instruction boundaries in the following action. This comment is unclear. I think > you maybe mean "There are two variants, one with the REX prefix, and one > without." Why are you not using the '?' for optional bytes here? Seems > inconsistent. It would be better for many reasons to avoid this repetition. http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_64.rl:410: (0x89 | 0x8b) 0xf6 . # mov %esi,%esi On 2012/10/04 17:26:04, Brad Chen wrote: > What does '.' mean here? How is it different from '|'? This would be good to > cover in the lead comment above (where you explain abbreviations etc. "." means concatenation which is default action, too. I've used it to just somehow separate the instructions, but if that's not clear then it's better to remove it. Done. http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_64.rl:413: # This action redefines the range of the super-instruction to include the On 2012/10/04 17:26:04, Brad Chen wrote: > This same identical comment is repeated at least four times. Not a good use of > your time or my time. Please use a procedure, comment it once, and give it a > useful name. When things are similar but repeated due to necessary differences, > use comments that focus on the interesting differences, e.g. "This is like case > 'foo' above except for blah blah blah." Done.
Many big improvements in the latest patch. Please keep it up! http://codereview.chromium.org/11000033/diff/32010/src/trusted/validator_rage... File src/trusted/validator_ragel/unreviewed/decoder_test.c (right): http://codereview.chromium.org/11000033/diff/32010/src/trusted/validator_rage... src/trusted/validator_ragel/unreviewed/decoder_test.c:80: void ProcessInstruction(const uint8_t *begin, const uint8_t *end, I am still waiting for you to break up this large function. Mark would be complaining now about how not addressing all comments makes reviews longer... http://codereview.chromium.org/11000033/diff/32010/src/trusted/validator_rage... File src/trusted/validator_ragel/unreviewed/parse_instruction.rl (right): http://codereview.chromium.org/11000033/diff/32010/src/trusted/validator_rage... src/trusted/validator_ragel/unreviewed/parse_instruction.rl:5: */ What is this file for? It's particularly important to know if this file is or is not used to build the production validator. At this point I think it probably makes sense for every file to have a brief comment similar to what you added for validator_x86_32.rl http://codereview.chromium.org/11000033/diff/32010/src/trusted/validator_rage... File src/trusted/validator_ragel/unreviewed/validator.h (right): http://codereview.chromium.org/11000033/diff/32010/src/trusted/validator_rage... src/trusted/validator_ragel/unreviewed/validator.h:14: enum validation_callback_info { Looks like the style observations from other files haven't been applied here yet. CamelCase for enum type names? http://codereview.chromium.org/11000033/diff/32010/src/trusted/validator_rage... File src/trusted/validator_ragel/unreviewed/validator_test.c (right): http://codereview.chromium.org/11000033/diff/32010/src/trusted/validator_rage... src/trusted/validator_ragel/unreviewed/validator_test.c:235: else if (raw_bitness == 32) { This looks like non-standard formatting for if statement. Please fix. I expected to see {} for all multi-line blocks (even if they are only one statement) and no newline for "} else {" http://codereview.chromium.org/11000033/diff/32010/src/trusted/validator_rage... File src/trusted/validator_ragel/validator_internal.h (right): http://codereview.chromium.org/11000033/diff/32010/src/trusted/validator_rage... src/trusted/validator_ragel/validator_internal.h:327: * Mark the gived address as valid jump target address. "given"? http://codereview.chromium.org/11000033/diff/32010/src/trusted/validator_rage... src/trusted/validator_ragel/validator_internal.h:335: * Mark the gived address as invalid jump target address. "given"? http://codereview.chromium.org/11000033/diff/32010/src/trusted/validator_rage... src/trusted/validator_ragel/validator_internal.h:536: static INLINE void process_2_operands_zero_extends( All these non-standard procedure names need to be fixed. http://codereview.chromium.org/11000033/diff/32010/src/trusted/validator_rage... src/trusted/validator_ragel/validator_internal.h:587: static INLINE void ProcessNaclCallOrNaclJmp( This is a big improvement. Now it's much easier to see at the call site what is happening and where the same thing is happening in two places. Just as with prose, please eliminate repetitive text that doesn't enhance meaning, therefore: SandboxNaClCallOrJmp - drop second redundant nacl. "Sandbox" seems like a better description of what this routine will do than "Process". Do you have a better idea? SandboxRxiSuperInst - Rxi easily understood as Rsi or Rdi. Also understood is that sandboxing a superinst implies eliminating interior targets. The boolean parameters in this and the next function are readability issues especially at the call site. Please eliminate them and instead provide different versions of these routines, i.e. ProcessNaClCallOrJmpNoRex(...) ProcessNaClCallOrJmpWithREX(...) These new routines should call the existing version of the routine that takes the boolean parameter. http://codereview.chromium.org/11000033/diff/32010/src/trusted/validator_rage... File src/trusted/validator_ragel/validator_x86_32.rl (right): http://codereview.chromium.org/11000033/diff/32010/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_32.rl:62: action rel16_operand { Can you add comments to remind readers what kind of instructions rel8/rel16/rel32 correspond to? In general this file needs more comments... http://codereview.chromium.org/11000033/diff/32010/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_32.rl:88: 0x83 0xe7 0xe0 0xff (0xd7|0xe7)) # naclcall/jmp %edi Note: This is a case where repetition seems okay to me. Note how the pattern and structure makes it easy to see what's different from line to line, and the comment makes it clear why one line is different from the next. http://codereview.chromium.org/11000033/diff/32010/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_32.rl:102: ((data16 0xe8 rel16) | This looks like this file would benefit from the style of commenting in the other file. Fortunately there isn't as much to do in here. http://codereview.chromium.org/11000033/diff/32010/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_32.rl:119: (VALIDATION_ERRORS_MASK | BAD_CALL_ALIGNMENT)) || This doesn't look like standard indentation. Boolean expressions should be indented to make there structure more obvious. http://codereview.chromium.org/11000033/diff/32010/src/trusted/validator_rage... File src/trusted/validator_ragel/validator_x86_64.rl (right): http://codereview.chromium.org/11000033/diff/32010/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_64.rl:12: * “@{}” or “REX_WRX?” mean, etc. Thanks, this is helpful. http://codereview.chromium.org/11000033/diff/32010/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_64.rl:123: (b_0100_10x0 0x8b 0xec) # mov %rsp,%rbp This is much improved with the comments and '|' lined up. Please do this throughout this CL if you haven't already. http://codereview.chromium.org/11000033/diff/32010/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_64.rl:176: # “Normal” instructions detect an error when confronted with restricted I don't think you mean "detect an error when confronted". The validator, not the instruction, would detect the error, correct? Maybe "For most instructions, %rsp cannot be used." Is that what you mean? I'm not sure what "require this state" means. http://codereview.chromium.org/11000033/diff/32010/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_64.rl:194: # detect case where %rbp/%rsp is illegally modified when this machine will be Instead of "detect" do you mean "match"? http://codereview.chromium.org/11000033/diff/32010/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_64.rl:212: # E86: 32-bit counterparts for new amd64 registers (%r8d to %r14d) Please list these in this order: E86, R86, E64, R64. You listed E86 twice; please double check your defns. http://codereview.chromium.org/11000033/diff/32010/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_64.rl:254: ProcessNaclCallOrNaclJmp(&instruction_info_collected, &instruction_start, As I mentioned in the file where this procedure is defined, it's a bother that you can't recognize at the call site the important behavior the boolean parameters are causing. As an alternative to commenting every call site I suggest changing the procedure names.
http://codereview.chromium.org/11000033/diff/32010/src/trusted/validator_rage... File src/trusted/validator_ragel/unreviewed/decoder_test.c (right): http://codereview.chromium.org/11000033/diff/32010/src/trusted/validator_rage... src/trusted/validator_ragel/unreviewed/decoder_test.c:80: void ProcessInstruction(const uint8_t *begin, const uint8_t *end, On 2012/10/05 16:47:21, Brad Chen wrote: > I am still waiting for you to break up this large function. It made no sense to break it up in the current form: major refactoring was needed. > Mark would be complaining now about how not addressing all comments makes reviews longer... Right. If the assumption is that "reviewer is always right" and "any and ALL agreements (verbal, written or otherwise) can be ignored by reviewer" then yes. I was under impression that you are not Mark and we *can* have some meaningful discussion about some subsets of the code. Now it looks like we can not. Oh, well. I've tried to refactor decoder_test.c. PTAL and suggest which parts should be moved to separate functions. http://codereview.chromium.org/11000033/diff/32010/src/trusted/validator_rage... File src/trusted/validator_ragel/unreviewed/parse_instruction.rl (right): http://codereview.chromium.org/11000033/diff/32010/src/trusted/validator_rage... src/trusted/validator_ragel/unreviewed/parse_instruction.rl:5: */ On 2012/10/05 16:47:21, Brad Chen wrote: > What is this file for? It's particularly important to know if this file is or is > not used to build the production validator. At this point I think it probably > makes sense for every file to have a brief comment similar to what you added for > validator_x86_32.rl Done. http://codereview.chromium.org/11000033/diff/32010/src/trusted/validator_rage... File src/trusted/validator_ragel/unreviewed/validator.h (right): http://codereview.chromium.org/11000033/diff/32010/src/trusted/validator_rage... src/trusted/validator_ragel/unreviewed/validator.h:14: enum validation_callback_info { On 2012/10/05 16:47:21, Brad Chen wrote: > Looks like the style observations from other files haven't been applied here > yet. CamelCase for enum type names? http://dev.chromium.org/developers/coding-style Though the Google C++ Style Guide now says to use kConstantNaming for enums, Chromium was written using MACRO_STYLE naming. Continue to use this style for consistency. http://codereview.chromium.org/11000033/diff/32010/src/trusted/validator_rage... File src/trusted/validator_ragel/unreviewed/validator_test.c (right): http://codereview.chromium.org/11000033/diff/32010/src/trusted/validator_rage... src/trusted/validator_ragel/unreviewed/validator_test.c:235: else if (raw_bitness == 32) { On 2012/10/05 16:47:21, Brad Chen wrote: > This looks like non-standard formatting for if statement. Please fix. I expected > to see {} for all multi-line blocks (even if they are only one statement) and no > newline for "} else {" Actually the style guide contains few recommendations (not requirements), and one firm rule (if one part of an if-else statement uses curly braces, the other part must too) and it's violated here. Fixed http://codereview.chromium.org/11000033/diff/32010/src/trusted/validator_rage... File src/trusted/validator_ragel/validator_internal.h (right): http://codereview.chromium.org/11000033/diff/32010/src/trusted/validator_rage... src/trusted/validator_ragel/validator_internal.h:327: * Mark the gived address as valid jump target address. On 2012/10/05 16:47:21, Brad Chen wrote: > "given"? Done. http://codereview.chromium.org/11000033/diff/32010/src/trusted/validator_rage... src/trusted/validator_ragel/validator_internal.h:335: * Mark the gived address as invalid jump target address. On 2012/10/05 16:47:21, Brad Chen wrote: > "given"? Done. http://codereview.chromium.org/11000033/diff/32010/src/trusted/validator_rage... src/trusted/validator_ragel/validator_internal.h:536: static INLINE void process_2_operands_zero_extends( On 2012/10/05 16:47:21, Brad Chen wrote: > All these non-standard procedure names need to be fixed. Done. http://codereview.chromium.org/11000033/diff/32010/src/trusted/validator_rage... src/trusted/validator_ragel/validator_internal.h:587: static INLINE void ProcessNaclCallOrNaclJmp( On 2012/10/05 16:47:21, Brad Chen wrote: > This is a big improvement. Now it's much easier to see at the call site what is > happening and where the same thing is happening in two places. > > Just as with prose, please eliminate repetitive text that doesn't enhance > meaning, therefore: > SandboxNaClCallOrJmp > - drop second redundant nacl. "Sandbox" seems like a better description of > what this routine will do than "Process". Do you have a better idea? > SandboxRxiSuperInst > - Rxi easily understood as Rsi or Rdi. Also understood is that sandboxing a > superinst implies eliminating interior targets. > > The boolean parameters in this and the next function are readability issues > especially at the call site. Please eliminate them and instead provide different > versions of these routines, i.e. > ProcessNaClCallOrJmpNoRex(...) > ProcessNaClCallOrJmpWithREX(...) > These new routines should call the existing version of the routine that takes > the boolean parameter. Done. http://codereview.chromium.org/11000033/diff/32010/src/trusted/validator_rage... File src/trusted/validator_ragel/validator_x86_32.rl (right): http://codereview.chromium.org/11000033/diff/32010/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_32.rl:62: action rel16_operand { On 2012/10/05 16:47:21, Brad Chen wrote: > Can you add comments to remind readers what kind of instructions > rel8/rel16/rel32 correspond to? > > In general this file needs more comments... I'm not so sure. I doubt you can ever understand these file if you'll try to just read through it without reading validator_internals.html (or, alternatively, Ragel documentation and AMD/Intel manuals). I agree that we may need to add some specific comments which explain minutiae details not mentioned in validator_internals.html, but I'm not sure it's possible to make it understandable for someone who does not want to read documentation. This task is similar to the task of commenting out C++ program enough to make it understandable for someone who only knows PHP and Visual Basic - I'm not sure if it's possible even in principle and I'm pretty sure I can not do that. http://codereview.chromium.org/11000033/diff/32010/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_32.rl:88: 0x83 0xe7 0xe0 0xff (0xd7|0xe7)) # naclcall/jmp %edi On 2012/10/05 16:47:21, Brad Chen wrote: > Note: This is a case where repetition seems okay to me. Note how the pattern and > structure makes it easy to see what's different from line to line, and the > comment makes it clear why one line is different from the next. Even so: I've added one minor change to separate and and jmp/call. http://codereview.chromium.org/11000033/diff/32010/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_32.rl:102: ((data16 0xe8 rel16) | On 2012/10/05 16:47:21, Brad Chen wrote: > This looks like this file would benefit from the style of commenting in the > other file. Fortunately there isn't as much to do in here. All the pieces in this file have analogues in validator_x86_64.rl, but validator_x86_64.rl contains many more corner cases. http://codereview.chromium.org/11000033/diff/32010/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_32.rl:119: (VALIDATION_ERRORS_MASK | BAD_CALL_ALIGNMENT)) || On 2012/10/05 16:47:21, Brad Chen wrote: > This doesn't look like standard indentation. Boolean expressions should be > indented to make there structure more obvious. This was the goal :-( But this is moot point since BAD_CALL_ALIGNMENT is no longer checked here (similarly to the production validator it's not an error, but warning now), so this line fits in 80 characters limit. http://codereview.chromium.org/11000033/diff/32010/src/trusted/validator_rage... File src/trusted/validator_ragel/validator_x86_64.rl (right): http://codereview.chromium.org/11000033/diff/32010/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_64.rl:123: (b_0100_10x0 0x8b 0xec) # mov %rsp,%rbp On 2012/10/05 16:47:21, Brad Chen wrote: > This is much improved with the comments and '|' lined up. Please do this > throughout this CL if you haven't already. This was the only case where they were not lined up to catch the attention of reader. I've lined up all of these comments. Not sure if it helps readability or not, though. http://codereview.chromium.org/11000033/diff/32010/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_64.rl:176: # “Normal” instructions detect an error when confronted with restricted On 2012/10/05 16:47:21, Brad Chen wrote: > I don't think you mean "detect an error when confronted". The validator, not the > instruction, would detect the error, correct? I mean normal_instruction ragel machine here, of course. But that machine described "normal" instructions, so... This is explained in details in my document here: https://src.chromium.org/viewvc/native_client/trunk/src/native_client/src/tru... > > Maybe "For most instructions, %rsp cannot be used." Is that what you mean? I'm > not sure what "require this state" means. After normal instruction %rsp, surprisingly enough, can be altered. E.g.: "mov $1, %esp" is perfectly valid fragment of NaCl program. But. If that happens then it must be immediately followed with the instruction which returns the status quo: mov $1, %esp add %r15,%rsp Is valid, but mov $1, %esp mov $2, %ebp is not. To achieve that we split instructions in three classes (there are more, but these three are involved here): • Normal instructions. • %rbp sandboxing instructions. • %rsp sandboxing instructions. Normal instructions can zero-extend %rbp or %rsp, but if they are encountered when validator is in "need %rsp sandboxing" (aka restricted_register == REG_RSP) or "need %rbp sandboxing" (aka restricted_register == REG_RBP) it's validation error. %rsp sandboxing instructions, on the other hand, *require* this "need %rsp sandboxing" (aka restricted_register == REG_RSP) state: it's an error to find in place where %rsp is not zero-extended. Similarly with %rbp sandboxing instructions. It's explained validator_internals.html, but here I just want to explain what I'm checking for and why. http://codereview.chromium.org/11000033/diff/32010/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_64.rl:194: # detect case where %rbp/%rsp is illegally modified when this machine will be On 2012/10/05 16:47:21, Brad Chen wrote: > Instead of "detect" do you mean "match"? No, detect. I mean the following: "0x83 b_11_100_xxx 0xe0" is first three bytes for naclcall_or_nacljmp, sure. But. They are ALSO a normal instruction (as comments says it's "and $~0x1f, E86"). normal_instruction (ragel machine) will check if %rbp and %rsp are in acceptable state (and it'll detect error if they are zero-extended). This is explained on toy examples in my document: https://src.chromium.org/viewvc/native_client/trunk/src/native_client/src/tru... http://codereview.chromium.org/11000033/diff/32010/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_64.rl:212: # E86: 32-bit counterparts for new amd64 registers (%r8d to %r14d) On 2012/10/05 16:47:21, Brad Chen wrote: > Please list these in this order: E86, R86, E64, R64. You listed E86 twice; > please double check your defns. Done. http://codereview.chromium.org/11000033/diff/32010/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_64.rl:254: ProcessNaclCallOrNaclJmp(&instruction_info_collected, &instruction_start, On 2012/10/05 16:47:21, Brad Chen wrote: > As I mentioned in the file where this procedure is defined, it's a bother that > you can't recognize at the call site the important behavior the boolean > parameters are causing. As an alternative to commenting every call site I > suggest changing the procedure names. Done.
This is getting pretty close. In the interest of fast turn-around I've tried to push us to closure on my last round of comments. Once you have dispatched all of that feedback I will make one final pass through everything, which should be easy if you bring the whole CL up to the level of the parts we have worked on in more detail. http://codereview.chromium.org/11000033/diff/32010/src/trusted/validator_rage... File src/trusted/validator_ragel/unreviewed/decoder_test.c (right): http://codereview.chromium.org/11000033/diff/32010/src/trusted/validator_rage... src/trusted/validator_ragel/unreviewed/decoder_test.c:80: void ProcessInstruction(const uint8_t *begin, const uint8_t *end, Here are rules that I generally assume apply during code review: - By default reviewer is right. - For questions of implementation details, the author tends to know the code better than the reviewer and should use the review as an opportunity to teach the reviewer what's going on. As the reviewer learns it may impact his review. - For questions of style, the reviewer is generally not going to learn anything from the review. In this specific case, appropriate use of procedural abstraction is a broadly accepted style rule. I don't see much wiggle-room here. If you think this is unreasonable, I'd suggest talking F2F with Dmitry or Kostya about this, as that will make better use of your time. On 2012/10/15 16:38:57, khim wrote: > On 2012/10/05 16:47:21, Brad Chen wrote: > > I am still waiting for you to break up this large function. > > It made no sense to break it up in the current form: major refactoring was > needed. > > > Mark would be complaining now about how not addressing all comments makes > reviews longer... > > Right. If the assumption is that "reviewer is always right" and "any and ALL > agreements (verbal, written or otherwise) can be ignored by reviewer" then yes. > > I was under impression that you are not Mark and we *can* have some meaningful > discussion about some subsets of the code. Now it looks like we can not. > > Oh, well. I've tried to refactor decoder_test.c. PTAL and suggest which parts > should be moved to separate functions. http://codereview.chromium.org/11000033/diff/32010/src/trusted/validator_rage... File src/trusted/validator_ragel/unreviewed/validator.h (right): http://codereview.chromium.org/11000033/diff/32010/src/trusted/validator_rage... src/trusted/validator_ragel/unreviewed/validator.h:14: enum validation_callback_info { NaCl source uses CamelCase; please be consistent with NaCl source. On 2012/10/15 16:38:57, khim wrote: > On 2012/10/05 16:47:21, Brad Chen wrote: > > Looks like the style observations from other files haven't been applied here > > yet. CamelCase for enum type names? > > http://dev.chromium.org/developers/coding-style > Though the Google C++ Style Guide now says to use kConstantNaming for enums, > Chromium was written using MACRO_STYLE naming. Continue to use this style for > consistency. http://codereview.chromium.org/11000033/diff/32010/src/trusted/validator_rage... File src/trusted/validator_ragel/validator_x86_32.rl (right): http://codereview.chromium.org/11000033/diff/32010/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_32.rl:62: action rel16_operand { I'm not sure either but I'd like you to try. Even for somebody who has read the documentation file, it is not reasonable to expect them to have memorized every detail. Comments in this file will help with that. Commenting is another style rule, not a technical question, hence not an area I think you should be anticipating a lot of flexibility from me. If done thoughtfully it will make the code better. That's something we should all be able to be happy about. On 2012/10/15 16:38:57, khim wrote: > On 2012/10/05 16:47:21, Brad Chen wrote: > > Can you add comments to remind readers what kind of instructions > > rel8/rel16/rel32 correspond to? > > > > In general this file needs more comments... > > I'm not so sure. I doubt you can ever understand these file if you'll try to > just read through it without reading validator_internals.html (or, > alternatively, Ragel documentation and AMD/Intel manuals). > > I agree that we may need to add some specific comments which explain minutiae > details not mentioned in validator_internals.html, but I'm not sure it's > possible to make it understandable for someone who does not want to read > documentation. This task is similar to the task of commenting out C++ program > enough to make it understandable for someone who only knows PHP and Visual Basic > - I'm not sure if it's possible even in principle and I'm pretty sure I can not > do that. http://codereview.chromium.org/11000033/diff/32010/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_32.rl:102: ((data16 0xe8 rel16) | While what you state is true, it doesn't change the fact that this file needs comments. On 2012/10/15 16:38:57, khim wrote: > On 2012/10/05 16:47:21, Brad Chen wrote: > > This looks like this file would benefit from the style of commenting in the > > other file. Fortunately there isn't as much to do in here. > > All the pieces in this file have analogues in validator_x86_64.rl, but > validator_x86_64.rl contains many more corner cases. http://codereview.chromium.org/11000033/diff/32010/src/trusted/validator_rage... File src/trusted/validator_ragel/validator_x86_64.rl (right): http://codereview.chromium.org/11000033/diff/32010/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_64.rl:123: (b_0100_10x0 0x8b 0xec) # mov %rsp,%rbp The way a code review works is I am a proxy for future readers. By giving me that responsibility, and applying style rules consistently, it saves us the trouble of having to ask many more people. Make sense? On 2012/10/15 16:38:57, khim wrote: > On 2012/10/05 16:47:21, Brad Chen wrote: > > This is much improved with the comments and '|' lined up. Please do this > > throughout this CL if you haven't already. > > This was the only case where they were not lined up to catch the attention of > reader. > > I've lined up all of these comments. Not sure if it helps readability or not, > though. http://codereview.chromium.org/11000033/diff/32010/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_64.rl:176: # “Normal” instructions detect an error when confronted with restricted As written the comment is unclear. Please clarify. On 2012/10/15 16:38:57, khim wrote: > On 2012/10/05 16:47:21, Brad Chen wrote: > > I don't think you mean "detect an error when confronted". The validator, not > the > > instruction, would detect the error, correct? > > I mean normal_instruction ragel machine here, of course. But that machine > described "normal" instructions, so... > > This is explained in details in my document here: > https://src.chromium.org/viewvc/native_client/trunk/src/native_client/src/tru... > > > > > Maybe "For most instructions, %rsp cannot be used." Is that what you mean? I'm > > not sure what "require this state" means. > > After normal instruction %rsp, surprisingly enough, can be altered. > > E.g.: "mov $1, %esp" is perfectly valid fragment of NaCl program. > > But. If that happens then it must be immediately followed with the instruction > which returns the status quo: > mov $1, %esp > add %r15,%rsp > Is valid, but > mov $1, %esp > mov $2, %ebp > is not. > > To achieve that we split instructions in three classes (there are more, but > these three are involved here): > • Normal instructions. > • %rbp sandboxing instructions. > • %rsp sandboxing instructions. > > Normal instructions can zero-extend %rbp or %rsp, but if they are encountered > when validator is in "need %rsp sandboxing" (aka restricted_register == REG_RSP) > or "need %rbp sandboxing" (aka restricted_register == REG_RBP) it's validation > error. > > %rsp sandboxing instructions, on the other hand, *require* this "need %rsp > sandboxing" (aka restricted_register == REG_RSP) state: it's an error to find in > place where %rsp is not zero-extended. Similarly with %rbp sandboxing > instructions. > > It's explained validator_internals.html, but here I just want to explain what > I'm checking for and why. http://codereview.chromium.org/11000033/diff/32010/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_64.rl:194: # detect case where %rbp/%rsp is illegally modified when this machine will be Let me try to reword my comment to make it clearer. I'm reading this as "match", not "detect" because match would be normal English usage here. 'Detect' is unusual usage, being more active than 'match'. If you really mean "detect" and not "match" please explain in the comment what additional activity you mean to express by using "detect" instead of "match". This is a minor issue. Your usage of "machine" here is also peculiar and demands explanation. By "machine" maybe you mean "state machine". In that case, simply use "state machine" as that is clear and unambiguous. On 2012/10/15 16:38:57, khim wrote: > On 2012/10/05 16:47:21, Brad Chen wrote: > > Instead of "detect" do you mean "match"? > > No, detect. I mean the following: "0x83 b_11_100_xxx 0xe0" is first three bytes > for naclcall_or_nacljmp, sure. > > But. They are ALSO a normal instruction (as comments says it's "and $~0x1f, > E86"). normal_instruction (ragel machine) will check if %rbp and %rsp are in > acceptable state (and it'll detect error if they are zero-extended). > > This is explained on toy examples in my document: > https://src.chromium.org/viewvc/native_client/trunk/src/native_client/src/tru...
https://chromiumcodereview.appspot.com/11000033/diff/32010/src/trusted/valida... File src/trusted/validator_ragel/unreviewed/validator.h (right): https://chromiumcodereview.appspot.com/11000033/diff/32010/src/trusted/valida... src/trusted/validator_ragel/unreviewed/validator.h:14: enum validation_callback_info { On 2012/10/16 00:16:55, Brad Chen wrote: > NaCl source uses CamelCase; please be consistent with NaCl source. > > On 2012/10/15 16:38:57, khim wrote: > > On 2012/10/05 16:47:21, Brad Chen wrote: > > > Looks like the style observations from other files haven't been applied here > > > yet. CamelCase for enum type names? > > > > http://dev.chromium.org/developers/coding-style > > Though the Google C++ Style Guide now says to use kConstantNaming for enums, > > Chromium was written using MACRO_STYLE naming. Continue to use this style for > > consistency. > Chromium code search results for query "enum file:native_client/src/trusted" give an impression that CamelCase for enumeration constants is only used in old x86 validator. C++-style-guide-recommended kConstantName occurs slighly more often, and CONSTANT_NAME style is prevaling. So consistency argument can possibly be in favor of CONSTANT_NAME or kConstantName, I don't know, but it's certainly against CamelCase.
Have a look here: src/trusted/service_runtime/sel_ldr.h src/trusted/service_runtime/sel_mem.h src/trusted/service_runtime/nacl_app_thread.h src/trusted/service_runtime/nacl_signal.h src/trusted/debug_stub/abi.h src/trusted/debug_stub/target.h src/trusted/desc/nacl_desc_wrapper.h src/trusted/desc/nacl_desc_base.h As far as I can remember we have always intended to follow Google3 style which uses CamelCase. On 2012/10/19 12:55:52, Vlad Shcherbina wrote: > https://chromiumcodereview.appspot.com/11000033/diff/32010/src/trusted/valida... > File src/trusted/validator_ragel/unreviewed/validator.h (right): > > https://chromiumcodereview.appspot.com/11000033/diff/32010/src/trusted/valida... > src/trusted/validator_ragel/unreviewed/validator.h:14: enum > validation_callback_info { > On 2012/10/16 00:16:55, Brad Chen wrote: > > NaCl source uses CamelCase; please be consistent with NaCl source. > > > > On 2012/10/15 16:38:57, khim wrote: > > > On 2012/10/05 16:47:21, Brad Chen wrote: > > > > Looks like the style observations from other files haven't been applied > here > > > > yet. CamelCase for enum type names? > > > > > > http://dev.chromium.org/developers/coding-style > > > Though the Google C++ Style Guide now says to use kConstantNaming for enums, > > > Chromium was written using MACRO_STYLE naming. Continue to use this style > for > > > consistency. > > > > Chromium code search results for query "enum file:native_client/src/trusted" > give an impression that CamelCase for enumeration constants is only used in old > x86 validator. C++-style-guide-recommended kConstantName occurs slighly more > often, and CONSTANT_NAME style is prevaling. > > So consistency argument can possibly be in favor of CONSTANT_NAME or > kConstantName, I don't know, but it's certainly against CamelCase.
On Sat, Oct 20, 2012 at 3:29 AM, <bradchen@google.com> wrote: > Have a look here: > src/trusted/service_runtime/**sel_ldr.h > enum NaClResourcePhase { NACL_RESOURCE_PHASE_START, NACL_RESOURCE_PHASE_REV_CHAN }; > src/trusted/service_runtime/**sel_mem.h > enum NaClVmmapEntryType { NACL_VMMAP_ENTRY_ANONYMOUS = 1, NACL_VMMAP_ENTRY_MAPPED }; > src/trusted/service_runtime/**nacl_app_thread.h > enum NaClSuspendState { NACL_APP_THREAD_UNTRUSTED = 1, NACL_APP_THREAD_TRUSTED = 2 #if NACL_LINUX , NACL_APP_THREAD_SUSPENDING = 4, NACL_APP_THREAD_SUSPENDED = 8 #endif }; > src/trusted/service_runtime/**nacl_signal.h > enum NaClSignalResult { NACL_SIGNAL_SEARCH, /* Try our handler or OS */ NACL_SIGNAL_SKIP, /* Skip our handlers and try OS */ NACL_SIGNAL_RETURN /* Skip all other handlers and return */ }; > src/trusted/debug_stub/abi.h > enum RegType { GENERAL, READ_ONLY, X86_64_TRUSTED_PTR, REG_TYPE_CNT }; > src/trusted/debug_stub/target.**h > enum ErrDef { NONE = 0, BAD_FORMAT = 1, BAD_ARGS = 2, FAILED = 3 }; > src/trusted/desc/nacl_desc_**wrapper.h > no enums > src/trusted/desc/nacl_desc_**base.h > > enum NaClDescTypeTag { NACL_DESC_INVALID, NACL_DESC_DIR, NACL_DESC_HOST_IO, NACL_DESC_CONN_CAP, NACL_DESC_CONN_CAP_FD, NACL_DESC_BOUND_SOCKET, NACL_DESC_CONNECTED_SOCKET, NACL_DESC_SHM, NACL_DESC_SYSV_SHM, NACL_DESC_MUTEX, NACL_DESC_CONDVAR, NACL_DESC_SEMAPHORE, NACL_DESC_SYNC_SOCKET, NACL_DESC_TRANSFERABLE_DATA_SOCKET, NACL_DESC_IMC_SOCKET, NACL_DESC_QUOTA, NACL_DESC_DEVICE_RNG, NACL_DESC_DEVICE_POSTMESSAGE, NACL_DESC_CUSTOM, NACL_DESC_NULL /* * Add new NaClDesc subclasses here. * * NB: when we add new tag types, NaClDescInternalize[] **MUST** * also be updated to add new internalization functions. */ }; > As far as I can remember we have always intended to follow Google3 style > which > uses CamelCase. > > Yet somehow all the files you cite use Chrome style with uppercase letters and underscores. Only validator uses CamelCase without "k" prefix. > > On 2012/10/19 12:55:52, Vlad Shcherbina wrote: > > https://chromiumcodereview.**appspot.com/11000033/diff/** > 32010/src/trusted/validator_**ragel/unreviewed/validator.h<https://chromiumcodereview.appspot.com/11000033/diff/32010/src/trusted/validator_ragel/unreviewed/validator.h> > >> File src/trusted/validator_ragel/**unreviewed/validator.h (right): >> > > > https://chromiumcodereview.**appspot.com/11000033/diff/** > 32010/src/trusted/validator_**ragel/unreviewed/validator.h#**newcode14<https://chromiumcodereview.appspot.com/11000033/diff/32010/src/trusted/validator_ragel/unreviewed/validator.h#newcode14> > >> src/trusted/validator_ragel/**unreviewed/validator.h:14: enum >> validation_callback_info { >> On 2012/10/16 00:16:55, Brad Chen wrote: >> > NaCl source uses CamelCase; please be consistent with NaCl source. >> > >> > On 2012/10/15 16:38:57, khim wrote: >> > > On 2012/10/05 16:47:21, Brad Chen wrote: >> > > > Looks like the style observations from other files haven't been >> applied >> here >> > > > yet. CamelCase for enum type names? >> > > >> > > http://dev.chromium.org/**developers/coding-style<http://dev.chromium.org/dev... >> > > Though the Google C++ Style Guide now says to use kConstantNaming for >> > enums, > >> > > Chromium was written using MACRO_STYLE naming. Continue to use this >> style >> for >> > > consistency. >> > >> > > Chromium code search results for query "enum file:native_client/src/** >> trusted" >> give an impression that CamelCase for enumeration constants is only used >> in >> > old > >> x86 validator. C++-style-guide-recommended kConstantName occurs slighly >> more >> often, and CONSTANT_NAME style is prevaling. >> > > So consistency argument can possibly be in favor of CONSTANT_NAME or >> kConstantName, I don't know, but it's certainly against CamelCase. >> > > > https://chromiumcodereview.**appspot.com/11000033/<https://chromiumcodereview... > -- You received this message because you are subscribed to the Google Groups "Native-Client-Reviews" group. To post to this group, send email to native-client-reviews@googlegroups.com. To unsubscribe from this group, send email to native-client-reviews+unsubscribe@googlegroups.com. For more options, visit this group at http://groups.google.com/group/native-client-reviews?hl=en.
Maybe there is a simple misunderstanding here, I'm asking for the name of the enum, i.e. 'NaClResourcePhase', to be CamelCase, not NACL_RESOURCE_PHASE_START. I'm sorry; I was unclear. On Fri, Oct 19, 2012 at 5:01 PM, Victor Khimenko <khim@chromium.org> wrote: > > > On Sat, Oct 20, 2012 at 3:29 AM, <bradchen@google.com> wrote: > >> Have a look here: >> src/trusted/service_runtime/**sel_ldr.h >> > > enum NaClResourcePhase { > NACL_RESOURCE_PHASE_START, > NACL_RESOURCE_PHASE_REV_CHAN > }; > > >> src/trusted/service_runtime/**sel_mem.h >> > > enum NaClVmmapEntryType { > NACL_VMMAP_ENTRY_ANONYMOUS = 1, > NACL_VMMAP_ENTRY_MAPPED > }; > > >> src/trusted/service_runtime/**nacl_app_thread.h >> > > enum NaClSuspendState { > NACL_APP_THREAD_UNTRUSTED = 1, > NACL_APP_THREAD_TRUSTED = 2 > #if NACL_LINUX > , NACL_APP_THREAD_SUSPENDING = 4, > NACL_APP_THREAD_SUSPENDED = 8 > #endif > }; > > >> src/trusted/service_runtime/**nacl_signal.h >> > > enum NaClSignalResult { > NACL_SIGNAL_SEARCH, /* Try our handler or OS */ > NACL_SIGNAL_SKIP, /* Skip our handlers and try OS */ > NACL_SIGNAL_RETURN /* Skip all other handlers and return */ > }; > > >> src/trusted/debug_stub/abi.h >> > > enum RegType { > GENERAL, > READ_ONLY, > X86_64_TRUSTED_PTR, > REG_TYPE_CNT > }; > > >> src/trusted/debug_stub/target.**h >> > > enum ErrDef { > NONE = 0, > BAD_FORMAT = 1, > BAD_ARGS = 2, > FAILED = 3 > }; > > >> src/trusted/desc/nacl_desc_**wrapper.h >> > > no enums > > >> src/trusted/desc/nacl_desc_**base.h >> >> enum NaClDescTypeTag { > NACL_DESC_INVALID, > NACL_DESC_DIR, > NACL_DESC_HOST_IO, > NACL_DESC_CONN_CAP, > NACL_DESC_CONN_CAP_FD, > NACL_DESC_BOUND_SOCKET, > NACL_DESC_CONNECTED_SOCKET, > NACL_DESC_SHM, > NACL_DESC_SYSV_SHM, > NACL_DESC_MUTEX, > NACL_DESC_CONDVAR, > NACL_DESC_SEMAPHORE, > NACL_DESC_SYNC_SOCKET, > NACL_DESC_TRANSFERABLE_DATA_SOCKET, > NACL_DESC_IMC_SOCKET, > NACL_DESC_QUOTA, > NACL_DESC_DEVICE_RNG, > NACL_DESC_DEVICE_POSTMESSAGE, > NACL_DESC_CUSTOM, > NACL_DESC_NULL > /* > * Add new NaClDesc subclasses here. > * > * NB: when we add new tag types, NaClDescInternalize[] **MUST** > * also be updated to add new internalization functions. > */ > }; > > >> As far as I can remember we have always intended to follow Google3 style >> which >> uses CamelCase. >> >> Yet somehow all the files you cite use Chrome style with uppercase > letters and underscores. Only validator uses CamelCase without "k" prefix. > > >> >> On 2012/10/19 12:55:52, Vlad Shcherbina wrote: >> >> https://chromiumcodereview.**appspot.com/11000033/diff/** >> 32010/src/trusted/validator_**ragel/unreviewed/validator.h<https://chromiumcodereview.appspot.com/11000033/diff/32010/src/trusted/validator_ragel/unreviewed/validator.h> >> >>> File src/trusted/validator_ragel/**unreviewed/validator.h (right): >>> >> >> >> https://chromiumcodereview.**appspot.com/11000033/diff/** >> 32010/src/trusted/validator_**ragel/unreviewed/validator.h#**newcode14<https://chromiumcodereview.appspot.com/11000033/diff/32010/src/trusted/validator_ragel/unreviewed/validator.h#newcode14> >> >>> src/trusted/validator_ragel/**unreviewed/validator.h:14: enum >>> validation_callback_info { >>> On 2012/10/16 00:16:55, Brad Chen wrote: >>> > NaCl source uses CamelCase; please be consistent with NaCl source. >>> > >>> > On 2012/10/15 16:38:57, khim wrote: >>> > > On 2012/10/05 16:47:21, Brad Chen wrote: >>> > > > Looks like the style observations from other files haven't been >>> applied >>> here >>> > > > yet. CamelCase for enum type names? >>> > > >>> > > http://dev.chromium.org/**developers/coding-style<http://dev.chromium.org/dev... >>> > > Though the Google C++ Style Guide now says to use kConstantNaming for >>> >> enums, >> >>> > > Chromium was written using MACRO_STYLE naming. Continue to use this >>> style >>> for >>> > > consistency. >>> > >>> >> >> Chromium code search results for query "enum file:native_client/src/** >>> trusted" >>> give an impression that CamelCase for enumeration constants is only used >>> in >>> >> old >> >>> x86 validator. C++-style-guide-recommended kConstantName occurs slighly >>> more >>> often, and CONSTANT_NAME style is prevaling. >>> >> >> So consistency argument can possibly be in favor of CONSTANT_NAME or >>> kConstantName, I don't know, but it's certainly against CamelCase. >>> >> >> >> https://chromiumcodereview.**appspot.com/11000033/<https://chromiumcodereview... >> > > -- You received this message because you are subscribed to the Google Groups "Native-Client-Reviews" group. To post to this group, send email to native-client-reviews@googlegroups.com. To unsubscribe from this group, send email to native-client-reviews+unsubscribe@googlegroups.com. For more options, visit this group at http://groups.google.com/group/native-client-reviews?hl=en.
On Sat, Oct 20, 2012 at 4:26 AM, Brad Chen <bradchen@google.com> wrote: > Maybe there is a simple misunderstanding here, I'm asking for the name of > the enum, i.e. 'NaClResourcePhase', to be CamelCase, not > NACL_RESOURCE_PHASE_START. I'm sorry; I was unclear. > > Well, this changes things, obviously. I was under impression that you talk about enum names because: 1. Old validator indeed uses this style. 2. Last message on subject was sent after the point when I've uploaded this patchset four days ago: https://chromiumcodereview.appspot.com/11000033/diff/62001/src/trusted/valida... Since the change in question already uses CamelCase for the name of enum (ValidationCallbackInfo) in said patchset the last remaining thing to do was to rename enum names. > > On Fri, Oct 19, 2012 at 5:01 PM, Victor Khimenko <khim@chromium.org>wrote: > >> >> >> On Sat, Oct 20, 2012 at 3:29 AM, <bradchen@google.com> wrote: >> >>> Have a look here: >>> src/trusted/service_runtime/**sel_ldr.h >>> >> >> enum NaClResourcePhase { >> NACL_RESOURCE_PHASE_START, >> NACL_RESOURCE_PHASE_REV_CHAN >> }; >> >> >>> src/trusted/service_runtime/**sel_mem.h >>> >> >> enum NaClVmmapEntryType { >> NACL_VMMAP_ENTRY_ANONYMOUS = 1, >> NACL_VMMAP_ENTRY_MAPPED >> }; >> >> >>> src/trusted/service_runtime/**nacl_app_thread.h >>> >> >> enum NaClSuspendState { >> NACL_APP_THREAD_UNTRUSTED = 1, >> NACL_APP_THREAD_TRUSTED = 2 >> #if NACL_LINUX >> , NACL_APP_THREAD_SUSPENDING = 4, >> NACL_APP_THREAD_SUSPENDED = 8 >> #endif >> }; >> >> >>> src/trusted/service_runtime/**nacl_signal.h >>> >> >> enum NaClSignalResult { >> NACL_SIGNAL_SEARCH, /* Try our handler or OS */ >> NACL_SIGNAL_SKIP, /* Skip our handlers and try OS */ >> NACL_SIGNAL_RETURN /* Skip all other handlers and return */ >> }; >> >> >>> src/trusted/debug_stub/abi.h >>> >> >> enum RegType { >> GENERAL, >> READ_ONLY, >> X86_64_TRUSTED_PTR, >> REG_TYPE_CNT >> }; >> >> >>> src/trusted/debug_stub/target.**h >>> >> >> enum ErrDef { >> NONE = 0, >> BAD_FORMAT = 1, >> BAD_ARGS = 2, >> FAILED = 3 >> }; >> >> >>> src/trusted/desc/nacl_desc_**wrapper.h >>> >> >> no enums >> >> >>> src/trusted/desc/nacl_desc_**base.h >>> >>> enum NaClDescTypeTag { >> NACL_DESC_INVALID, >> NACL_DESC_DIR, >> NACL_DESC_HOST_IO, >> NACL_DESC_CONN_CAP, >> NACL_DESC_CONN_CAP_FD, >> NACL_DESC_BOUND_SOCKET, >> NACL_DESC_CONNECTED_SOCKET, >> NACL_DESC_SHM, >> NACL_DESC_SYSV_SHM, >> NACL_DESC_MUTEX, >> NACL_DESC_CONDVAR, >> NACL_DESC_SEMAPHORE, >> NACL_DESC_SYNC_SOCKET, >> NACL_DESC_TRANSFERABLE_DATA_SOCKET, >> NACL_DESC_IMC_SOCKET, >> NACL_DESC_QUOTA, >> NACL_DESC_DEVICE_RNG, >> NACL_DESC_DEVICE_POSTMESSAGE, >> NACL_DESC_CUSTOM, >> NACL_DESC_NULL >> /* >> * Add new NaClDesc subclasses here. >> * >> * NB: when we add new tag types, NaClDescInternalize[] **MUST** >> * also be updated to add new internalization functions. >> */ >> }; >> >> >>> As far as I can remember we have always intended to follow Google3 style >>> which >>> uses CamelCase. >>> >>> Yet somehow all the files you cite use Chrome style with uppercase >> letters and underscores. Only validator uses CamelCase without "k" prefix. >> >> >>> >>> On 2012/10/19 12:55:52, Vlad Shcherbina wrote: >>> >>> https://chromiumcodereview.**appspot.com/11000033/diff/** >>> 32010/src/trusted/validator_**ragel/unreviewed/validator.h<https://chromiumcodereview.appspot.com/11000033/diff/32010/src/trusted/validator_ragel/unreviewed/validator.h> >>> >>>> File src/trusted/validator_ragel/**unreviewed/validator.h (right): >>>> >>> >>> >>> https://chromiumcodereview.**appspot.com/11000033/diff/** >>> 32010/src/trusted/validator_**ragel/unreviewed/validator.h#**newcode14<https://chromiumcodereview.appspot.com/11000033/diff/32010/src/trusted/validator_ragel/unreviewed/validator.h#newcode14> >>> >>>> src/trusted/validator_ragel/**unreviewed/validator.h:14: enum >>>> validation_callback_info { >>>> On 2012/10/16 00:16:55, Brad Chen wrote: >>>> > NaCl source uses CamelCase; please be consistent with NaCl source. >>>> > >>>> > On 2012/10/15 16:38:57, khim wrote: >>>> > > On 2012/10/05 16:47:21, Brad Chen wrote: >>>> > > > Looks like the style observations from other files haven't been >>>> applied >>>> here >>>> > > > yet. CamelCase for enum type names? >>>> > > >>>> > > http://dev.chromium.org/**developers/coding-style<http://dev.chromium.org/dev... >>>> > > Though the Google C++ Style Guide now says to use kConstantNaming >>>> for >>>> >>> enums, >>> >>>> > > Chromium was written using MACRO_STYLE naming. Continue to use >>>> this style >>>> for >>>> > > consistency. >>>> > >>>> >>> >>> Chromium code search results for query "enum file:native_client/src/** >>>> trusted" >>>> give an impression that CamelCase for enumeration constants is only >>>> used in >>>> >>> old >>> >>>> x86 validator. C++-style-guide-recommended kConstantName occurs slighly >>>> more >>>> often, and CONSTANT_NAME style is prevaling. >>>> >>> >>> So consistency argument can possibly be in favor of CONSTANT_NAME or >>>> kConstantName, I don't know, but it's certainly against CamelCase. >>>> >>> >>> >>> https://chromiumcodereview.**appspot.com/11000033/<https://chromiumcodereview... >>> >> >> > -- You received this message because you are subscribed to the Google Groups "Native-Client-Reviews" group. To post to this group, send email to native-client-reviews@googlegroups.com. To unsubscribe from this group, send email to native-client-reviews+unsubscribe@googlegroups.com. For more options, visit this group at http://groups.google.com/group/native-client-reviews?hl=en.
I think this may be my last set of comments. http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... File src/trusted/validator_ragel/unreviewed/decoder_test.c (right): http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/unreviewed/decoder_test.c:80: void ProcessInstruction(const uint8_t *begin, const uint8_t *end, Looks to me like you didn't take care of this requested change. I noticed you pulled out fwait processing, which is a move in the right direction, but mostly this procedure is still too hard to understand and to hard to maintain. Your thoughts? On 2012/10/04 17:26:04, Brad Chen wrote: > This procedure is way too long. Please factor it into multiple functions; > ideally when you are done this will read as a sequence of procedure calls, eg. > void ProcessInstruction(...) { > DoFirstImportantStep() > DoSecondImportantStep() > ... > } > > Please apply this notion throughout this CL. A CL is not ready for review if it > includes procedures like this. http://codereview.chromium.org/11000033/diff/20001/src/trusted/validator_rage... src/trusted/validator_ragel/unreviewed/decoder_test.c:410: if ((instruction->operands[i].type == OPERAND_TYPE_64_BIT) && The disappearance of this next large block of code is an example of why procedures would make it much easier for other people to help maintain this code. I note that references to the string "callq" have disappeared in your last patch, but I have not idea where they went. If this code were in a procedure with a useful name like "spurious_rex_prefix = IdentifySpuriousRexPrefixCases(...) reading and understanding might be easier. Does this make sense? Does it seem reasonable? http://codereview.chromium.org/11000033/diff/62001/src/trusted/validator_rage... File src/trusted/validator_ragel/decoder_x86_64.rl (right): http://codereview.chromium.org/11000033/diff/62001/src/trusted/validator_rage... src/trusted/validator_ragel/decoder_x86_64.rl:8: * Full-blown decoder for amd64 case. Can be used to decode instruction sequence nit: line length? http://codereview.chromium.org/11000033/diff/62001/src/trusted/validator_rage... File src/trusted/validator_ragel/unreviewed/decoder_test.c (right): http://codereview.chromium.org/11000033/diff/62001/src/trusted/validator_rage... src/trusted/validator_ragel/unreviewed/decoder_test.c:134: Bool IsNameInList(const char *name, ...) { Could you use an array rather than varargs here? I'm worried about fragility of varargs. Can you make this function static, assuming it is not used outside this file? http://codereview.chromium.org/11000033/diff/62001/src/trusted/validator_rage... src/trusted/validator_ragel/unreviewed/decoder_test.c:279: #define print_name(x) (printf((x)), shown_name += strlen((x))) What is your goal in making this a macro rather than a function? I notice you #undef it below as well. Macros generally make code harder to understand and harder to maintain, better to avoid. If you use one you should have a really good reason and document it clearly. http://codereview.chromium.org/11000033/diff/62001/src/trusted/validator_rage... src/trusted/validator_ragel/unreviewed/decoder_test.c:492: * Non-empty REX prefix is shown if and only if if it's spurious. if if http://codereview.chromium.org/11000033/diff/62001/src/trusted/validator_rage... src/trusted/validator_ragel/unreviewed/decoder_test.c:515: /* In some cases AT&T instruction uses suffix to show the size of operand. */ It looks to me like it may be the case that the structure of this procedure is a sequence of handlers for cases of work-arounds for unexpected behavior from objdump. I would be very happy if I could look at a one page procedure and that this structure would be obvious. In the end, this is just one test, and probably not worth stopping the entire train. So if it truly is more than you can deal with right now then you could propose something like filing a bug and adding a TODO() comment. More important to me than the actual changes is that you have understood and internalized the how such changes make the system better. http://codereview.chromium.org/11000033/diff/62001/src/trusted/validator_rage... src/trusted/validator_ragel/unreviewed/decoder_test.c:783: int DecodeFile(const char *filename, int repeat_count) { The two if statements that follow the calls to CheckBounds() strike me as opportunities for procedural abstraction. http://codereview.chromium.org/11000033/diff/62001/src/trusted/validator_rage... File src/trusted/validator_ragel/validator_internal.h (right): http://codereview.chromium.org/11000033/diff/62001/src/trusted/validator_rage... src/trusted/validator_ragel/validator_internal.h:600: static INLINE void ProcessNaclCallOrJmpAddToRMNoRex( Nit: empty line between function decls. http://codereview.chromium.org/11000033/diff/62001/src/trusted/validator_rage... src/trusted/validator_ragel/validator_internal.h:605: current_position, data, valid_targets, FALSE, FALSE); As noted elsewhere I'm not happy with these booleans used like this as I can't tell from the call-site what effect they have. Can you please find a way to make things more readable? Options: use a enum instead of a bool, incorporate bool into procedure name, or use comments. http://codereview.chromium.org/11000033/diff/62001/src/trusted/validator_rage... File src/trusted/validator_ragel/validator_x86_32.rl (right): http://codereview.chromium.org/11000033/diff/62001/src/trusted/validator_rage... src/trusted/validator_ragel/validator_x86_32.rl:93: # ↓↓↓↓↓↓↓↓↓↓ What character are you using to generate the up and down arrows? Can you please stick to ASCII? vvv^^^
My comments are just suggestions how to make code better when we reach agreement on major design questions. http://codereview.chromium.org/11000033/diff/62001/src/trusted/validator_rage... File src/trusted/validator_ragel/unreviewed/decoder_test.c (right): http://codereview.chromium.org/11000033/diff/62001/src/trusted/validator_rage... src/trusted/validator_ragel/unreviewed/decoder_test.c:1: /* I think this test need to be converted to C++ and simplified. See some hints below. http://codereview.chromium.org/11000033/diff/62001/src/trusted/validator_rage... src/trusted/validator_ragel/unreviewed/decoder_test.c:134: Bool IsNameInList(const char *name, ...) { This function can be converted to function with two arguments and va_args can be passed as single string with some separator like ';' or '|'. http://codereview.chromium.org/11000033/diff/62001/src/trusted/validator_rage... src/trusted/validator_ragel/unreviewed/decoder_test.c:160: static Bool ProcessFWait(const uint8_t **begin, const uint8_t *end, Too many arguments, class with state should solve this issue (see below). http://codereview.chromium.org/11000033/diff/62001/src/trusted/validator_rage... src/trusted/validator_ragel/unreviewed/decoder_test.c:265: void ProcessInstruction(const uint8_t *begin, const uint8_t *end, I think it should be converted to so class InstructionParser that will have state of parsing with methods to parse prefixes, suffixes, etc. so you could avoid passing tons of arguments and will pass only 'this' + really needed params. http://codereview.chromium.org/11000033/diff/62001/src/trusted/validator_rage... src/trusted/validator_ragel/unreviewed/decoder_test.c:894: repeat_count = atoi(argv[2]), Please don't use comma to separate statements. This economy doesn't give us anything.
https://codereview.chromium.org/11000033/diff/1/src/trusted/validator_ragel/d... File src/trusted/validator_ragel/decoder_x86_32.rl (right): https://codereview.chromium.org/11000033/diff/1/src/trusted/validator_ragel/d... src/trusted/validator_ragel/decoder_x86_32.rl:116: #define GET_VEX_PREFIX3() vex_prefix3 On 2012/09/28 20:41:01, Brad Chen wrote: > It looks like this macro is defined identically in multiple files. Please use a > .h file and a single defn. Done. https://codereview.chromium.org/11000033/diff/1/src/trusted/validator_ragel/d... src/trusted/validator_ragel/decoder_x86_32.rl:185: %% write init; On 2012/09/28 20:41:01, Brad Chen wrote: > Can you add comments to make it obvious what these will expand to? I see from > the gen files that the stuff above in main := (one_instruction ...) ends up in > here someplace, but it would be nice not to have to reverse engineer such > details. > > Comments before each %% would also make it easier to distinguish between the > replacement for %% write init; and %% write exec; > > Am I misunderstanding these? Even more of a reason for comments. Done. https://codereview.chromium.org/11000033/diff/32010/src/trusted/validator_rag... File src/trusted/validator_ragel/validator_x86_64.rl (right): https://codereview.chromium.org/11000033/diff/32010/src/trusted/validator_rag... src/trusted/validator_ragel/validator_x86_64.rl:176: # “Normal” instructions detect an error when confronted with restricted On 2012/10/16 00:16:55, Brad Chen wrote: > As written the comment is unclear. Please clarify. > > On 2012/10/15 16:38:57, khim wrote: > > On 2012/10/05 16:47:21, Brad Chen wrote: > > > I don't think you mean "detect an error when confronted". The validator, not > > the > > > instruction, would detect the error, correct? > > > > I mean normal_instruction ragel machine here, of course. But that machine > > described "normal" instructions, so... > > > > This is explained in details in my document here: > > > https://src.chromium.org/viewvc/native_client/trunk/src/native_client/src/tru... > > > > > > > > Maybe "For most instructions, %rsp cannot be used." Is that what you mean? > I'm > > > not sure what "require this state" means. > > > > After normal instruction %rsp, surprisingly enough, can be altered. > > > > E.g.: "mov $1, %esp" is perfectly valid fragment of NaCl program. > > > > But. If that happens then it must be immediately followed with the instruction > > which returns the status quo: > > mov $1, %esp > > add %r15,%rsp > > Is valid, but > > mov $1, %esp > > mov $2, %ebp > > is not. > > > > To achieve that we split instructions in three classes (there are more, but > > these three are involved here): > > • Normal instructions. > > • %rbp sandboxing instructions. > > • %rsp sandboxing instructions. > > > > Normal instructions can zero-extend %rbp or %rsp, but if they are encountered > > when validator is in "need %rsp sandboxing" (aka restricted_register == > REG_RSP) > > or "need %rbp sandboxing" (aka restricted_register == REG_RBP) it's validation > > error. > > > > %rsp sandboxing instructions, on the other hand, *require* this "need %rsp > > sandboxing" (aka restricted_register == REG_RSP) state: it's an error to find > in > > place where %rsp is not zero-extended. Similarly with %rbp sandboxing > > instructions. > > > > It's explained validator_internals.html, but here I just want to explain what > > I'm checking for and why. > Done. https://codereview.chromium.org/11000033/diff/32010/src/trusted/validator_rag... src/trusted/validator_ragel/validator_x86_64.rl:194: # detect case where %rbp/%rsp is illegally modified when this machine will be On 2012/10/16 00:16:55, Brad Chen wrote: > Let me try to reword my comment to make it clearer. > > I'm reading this as "match", not "detect" because match would be normal English > usage here. 'Detect' is unusual usage, being more active than 'match'. If you > really mean "detect" and not "match" please explain in the comment what > additional activity you mean to express by using "detect" instead of "match". > This is a minor issue. > > Your usage of "machine" here is also peculiar and demands explanation. By > "machine" maybe you mean "state machine". In that case, simply use "state > machine" as that is clear and unambiguous. > > On 2012/10/15 16:38:57, khim wrote: > > On 2012/10/05 16:47:21, Brad Chen wrote: > > > Instead of "detect" do you mean "match"? > > > > No, detect. I mean the following: "0x83 b_11_100_xxx 0xe0" is first three > bytes > > for naclcall_or_nacljmp, sure. > > > > But. They are ALSO a normal instruction (as comments says it's "and $~0x1f, > > E86"). normal_instruction (ragel machine) will check if %rbp and %rsp are in > > acceptable state (and it'll detect error if they are zero-extended). > > > > This is explained on toy examples in my document: > > > https://src.chromium.org/viewvc/native_client/trunk/src/native_client/src/tru... > Tried to reword the meaning to avoid ambiguous words. https://codereview.chromium.org/11000033/diff/62001/src/trusted/validator_rag... File src/trusted/validator_ragel/decoder_x86_64.rl (right): https://codereview.chromium.org/11000033/diff/62001/src/trusted/validator_rag... src/trusted/validator_ragel/decoder_x86_64.rl:8: * Full-blown decoder for amd64 case. Can be used to decode instruction sequence On 2012/10/22 21:29:05, Brad Chen wrote: > nit: line length? Done. https://codereview.chromium.org/11000033/diff/62001/src/trusted/validator_rag... File src/trusted/validator_ragel/validator_internal.h (right): https://codereview.chromium.org/11000033/diff/62001/src/trusted/validator_rag... src/trusted/validator_ragel/validator_internal.h:600: static INLINE void ProcessNaclCallOrJmpAddToRMNoRex( On 2012/10/22 21:29:05, Brad Chen wrote: > Nit: empty line between function decls. Done. https://codereview.chromium.org/11000033/diff/62001/src/trusted/validator_rag... src/trusted/validator_ragel/validator_internal.h:605: current_position, data, valid_targets, FALSE, FALSE); On 2012/10/22 21:29:05, Brad Chen wrote: > As noted elsewhere I'm not happy with these booleans used like this as I can't > tell from the call-site what effect they have. Can you please find a way to make > things more readable? Options: use a enum instead of a bool, incorporate bool > into procedure name, or use comments. Reworked the procedures to exclude bools and added comments to clarify the work of the procedures. I think I'll declare this the final attempt to simplify that code. If it's still not clear then I'm ready do declare defeat and scrap it along with the rest of the validator. https://codereview.chromium.org/11000033/diff/62001/src/trusted/validator_rag... File src/trusted/validator_ragel/validator_x86_32.rl (right): https://codereview.chromium.org/11000033/diff/62001/src/trusted/validator_rag... src/trusted/validator_ragel/validator_x86_32.rl:93: # ↓↓↓↓↓↓↓↓↓↓ On 2012/10/22 21:29:05, Brad Chen wrote: > What character are you using to generate the up and down arrows? Can you please > stick to ASCII? vvv^^^ Done.
https://codereview.chromium.org/11000033/diff/125001/src/trusted/validator_ra... File src/trusted/validator_ragel/decoder.h (right): https://codereview.chromium.org/11000033/diff/125001/src/trusted/validator_ra... src/trusted/validator_ragel/decoder.h:113: REG_DS_RBX, /* Fox xlat: %ds:(%rbx). */ Fox -> For https://codereview.chromium.org/11000033/diff/125001/src/trusted/validator_ra... File src/trusted/validator_ragel/decoder_internal.h (right): https://codereview.chromium.org/11000033/diff/125001/src/trusted/validator_ra... src/trusted/validator_ragel/decoder_internal.h:21: #define GET_REX_PREFIX() instruction.prefix.rex What is the difference between S, N, P and T?
https://chromiumcodereview.appspot.com/11000033/diff/151007/src/trusted/valid... File src/trusted/validator_ragel/decoding.h (right): https://chromiumcodereview.appspot.com/11000033/diff/151007/src/trusted/valid... src/trusted/validator_ragel/decoding.h:9: * functions and defines). Copy-paste! Please be more explicit about what is in decoder and what's in decodING.
https://codereview.chromium.org/11000033/diff/125001/src/trusted/validator_ra... File src/trusted/validator_ragel/decoder_x86_32.rl (right): https://codereview.chromium.org/11000033/diff/125001/src/trusted/validator_ra... src/trusted/validator_ragel/decoder_x86_32.rl:28: */ Remove this defines by duplicating end_of_instruction_cleanup.
https://codereview.chromium.org/11000033/diff/125001/src/trusted/validator_ra... File src/trusted/validator_ragel/decoder.h (right): https://codereview.chromium.org/11000033/diff/125001/src/trusted/validator_ra... src/trusted/validator_ragel/decoder.h:8: * Data structures for decoding instructions. Includes definitions which are which are what? https://codereview.chromium.org/11000033/diff/125001/src/trusted/validator_ra... File src/trusted/validator_ragel/decoder_x86_32.rl (right): https://codereview.chromium.org/11000033/diff/125001/src/trusted/validator_ra... src/trusted/validator_ragel/decoder_x86_32.rl:112: instruction.prefix.rex = 0; use memset to clear instruction.
https://codereview.chromium.org/11000033/diff/125001/src/trusted/validator_ra... File src/trusted/validator_ragel/decoder.h (right): https://codereview.chromium.org/11000033/diff/125001/src/trusted/validator_ra... src/trusted/validator_ragel/decoder.h:199: int DecodeChunkAMD64(const uint8_t *data, size_t size, Comment what this function returns. https://codereview.chromium.org/11000033/diff/125001/src/trusted/validator_ra... File src/trusted/validator_ragel/decoder_x86_64.rl (right): https://codereview.chromium.org/11000033/diff/125001/src/trusted/validator_ra... src/trusted/validator_ragel/decoder_x86_64.rl:27: * These prefixes are only useful in AMD64 mode, but they will "cleaned up" by Change this comment. https://codereview.chromium.org/11000033/diff/125001/src/trusted/validator_ra... src/trusted/validator_ragel/decoder_x86_64.rl:118: uint8_t vex_prefix2 = 0xe0; 0xe0 -> VEX_R | VEX_X | VEX_B
https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... File src/trusted/validator_ragel/decoding.h (right): https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/decoding.h:11: * We only include simple schematic diagrams here. For full description see See full description in AMD/Intel manuals. https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/decoding.h:31: * ┌───────┬───────┬───────┬───────┬───────┬───────┬───────┬───────┒ 0-2 bits: register number 3-7 bits: opcode https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/decoding.h:41: * 0-2 bits: r/m 3-5 bits: reg 6-7 bits: mod https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/decoding.h:52: return (modrm & 0x38) >> 3; (modrm >> 3) & 0x07 is easier to understand. https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/decoding.h:61: * 0-2 bits: base 3-5 bits: index 6-7 bits: scale https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/decoding.h:72: return (sib & 0x38) >> 3; (sib >> 3) & 0x07 is easier to understand. https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/decoding.h:81: * 0 bit: B 1 bit: X 2 bit: R 3 bit: W 4-7 bits: 0x4 https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/decoding.h:95: static FORCEINLINE uint8_t BaseExtentionFromREX(uint8_t rex) { /* How much to add to base register number. */ the same below. https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/decoding.h:109: * 0-4 bits: opcode selector 5 bit: inverted B 6 bit: inverted X 7 bit: inverted R https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/decoding.h:143: * 0-1 bits: pp 2 bit: L 3-6 bits: inverted register number 7 bit: W https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/decoding.h:151: return ((~vex3) & 0x38) >> 3; ((~vex3) >> 3) & 0x07 is easier to understand https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/decoding.h:155: return ((~vex3) & 0x78) >> 3; ((~vex3) >> 3) & 0x0f is easier to understand https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/decoding.h:160: * 0-1 bits: imm2 or zero 2-3 bits: 0 4-7 bits: register number https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/decoding.h:177: * Smaller values can be obtained by restricting this value further (which is Return values can be restricted to smaller unsigned type https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/decoding.h:183: * but they stay on the "safe" side of this boundary. Specifically: this this (conversion to intXX_t) https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/decoding.h:196: * compilers for all known platforms are actually doing. Mention that conversion from intXX_t to uint64_t is always safe.
https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... File src/trusted/validator_ragel/validator_internal.h (right): https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/validator_internal.h:23: /* Macroses to suppport CPUID handling. */ misprint: support. https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/validator_internal.h:24: #define SET_CPU_FEATURE(F) \ Comment what is F and F_Allowed (variable? constant? macro? enum?). https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/validator_internal.h:185: #define SET_REX_PREFIX(P) rex_prefix = (P) P->PREFIX https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/validator_internal.h:190: #define SET_MODRM_BASE(N) base = (N) N->REG? https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/validator_internal.h:194: #define SET_DATA16_PREFIX(S) What is S and P below? https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/validator_internal.h:207: #define SET_DISP_TYPE(T) SET_DISP_TYPE_##T T->TYPE https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/validator_internal.h:212: /* imm2 field is a flag, not accumulator, like with other immediates */ like other immediates https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/validator_internal.h:218: #define SET_IMM2_TYPE(T) SET_IMM2_TYPE_##T SET_SECOND_IMM_TYPE, T->TYPE https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/validator_internal.h:225: * to jump outside given code region, the target address must be aligned. jump target outside of given code region must be aligned.
https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... File src/trusted/validator_ragel/validator_internal.h (right): https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/validator_internal.h:268: static INLINE Bool ProcessInvalidJumpTargets( /* Compare valid_targets and jump_dests and call callback for any address in jump_dests which is not present in valid_targets. */ https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/validator_internal.h:269: const uint8_t *data, data -> code https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/validator_internal.h:279: for (i = 0; i < elements ; i++) { extra space https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/validator_internal.h:308: int8_t offset = (uint8_t) (rip[-1]); remove uint8_t https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/validator_internal.h:328: rip[-2] + 256U * ((uint32_t) rip[-1])))); remove uint32_t
https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... File src/trusted/validator_ragel/validator_internal.h (right): https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/validator_internal.h:39: #define CPUFeature_AESAVX CPUFeature_AES && CPUFeature_AVX Currently these 'composite' features only refer to atomic features, but I suggest putting parenthesis just in case. If for example CPUFeature_AVX were defined as 'CPUFeature_foo || CPUFeature_bar', this line would be incorrect. https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/validator_internal.h:87: #define CPUFeature_3DNOW_Allowed \ Currently macros CPUFeature_<whatever>_Allowed completely duplicate macros CPUFeature_<whatever>. It can be avoided if these macros take cpu_features (or maybe even cpu_features->data as argument).
https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... File src/trusted/validator_ragel/validator_x86_32.rl (right): https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/validator_x86_32.rl:29: #define SET_VEX_PREFIX3(P) Check if these lines are necessary. https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/validator_x86_32.rl:69: action last_byte_is_not_immediate { Think on removing code duplication here. https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/validator_x86_32.rl:97: # Ragel machine which checks if call is properly aligned. Ragel machine that accepts one call instruction or call superinstruction and checks if call is properly aligned. https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/validator_x86_32.rl:106: (0xe8 rel32))) | direct_call = (data16 ...) https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/validator_x86_32.rl:109: (any* data16? 0xff ((opcode_2 | opcode_3) any* & indirect_call_by_register = data16? 0xff (opcode_2 & modrm_registers) https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/validator_x86_32.rl:118: # This action calls user's callback (if needed) and cleans up validator's user's -> user, validator's -> internal validator state. https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/validator_x86_32.rl:121: # We call the user callback if there are validation errors or if the We call the user callback either on validation errors or on every instruction, depending on CALL_USER_CALLBACK_ON_EACH_INSTRUTION option. https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/validator_x86_32.rl:125: # only used in the processing of a single instruction (prefixes, operand which are not used in superinstruction and sandboxing processing. https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/validator_x86_32.rl:141: * causing error. */ We may set instruction_begin at the first byte of the instruction instead of here but in the case of incorrect one byte instructions user callback may be called before instruction_begin is set.
https://codereview.chromium.org/11000033/diff/125001/src/trusted/validator_ra... File src/trusted/validator_ragel/decoder.h (right): https://codereview.chromium.org/11000033/diff/125001/src/trusted/validator_ra... src/trusted/validator_ragel/decoder.h:8: * Data structures for decoding instructions. Includes definitions which are On 2013/03/13 15:41:36, halyavin wrote: > which are what? Done. https://codereview.chromium.org/11000033/diff/125001/src/trusted/validator_ra... src/trusted/validator_ragel/decoder.h:113: REG_DS_RBX, /* Fox xlat: %ds:(%rbx). */ On 2013/03/13 14:01:06, halyavin wrote: > Fox -> For Done. https://codereview.chromium.org/11000033/diff/125001/src/trusted/validator_ra... src/trusted/validator_ragel/decoder.h:199: int DecodeChunkAMD64(const uint8_t *data, size_t size, On 2013/03/13 15:48:32, halyavin wrote: > Comment what this function returns. Done. https://codereview.chromium.org/11000033/diff/125001/src/trusted/validator_ra... File src/trusted/validator_ragel/decoder_internal.h (right): https://codereview.chromium.org/11000033/diff/125001/src/trusted/validator_ra... src/trusted/validator_ragel/decoder_internal.h:21: #define GET_REX_PREFIX() instruction.prefix.rex On 2013/03/13 14:01:06, halyavin wrote: > What is the difference between S, N, P and T? Just a first name of the argument. Replaced with X (and N where there are numeric argument). https://codereview.chromium.org/11000033/diff/125001/src/trusted/validator_ra... File src/trusted/validator_ragel/decoder_x86_32.rl (right): https://codereview.chromium.org/11000033/diff/125001/src/trusted/validator_ra... src/trusted/validator_ragel/decoder_x86_32.rl:28: */ On 2013/03/13 15:25:33, halyavin wrote: > Remove this defines by duplicating end_of_instruction_cleanup. Done: https://codereview.chromium.org/12716018 https://codereview.chromium.org/11000033/diff/125001/src/trusted/validator_ra... src/trusted/validator_ragel/decoder_x86_32.rl:112: instruction.prefix.rex = 0; On 2013/03/13 15:41:36, halyavin wrote: > use memset to clear instruction. Done: https://codereview.chromium.org/12716018 https://codereview.chromium.org/11000033/diff/125001/src/trusted/validator_ra... File src/trusted/validator_ragel/decoder_x86_64.rl (right): https://codereview.chromium.org/11000033/diff/125001/src/trusted/validator_ra... src/trusted/validator_ragel/decoder_x86_64.rl:27: * These prefixes are only useful in AMD64 mode, but they will "cleaned up" by On 2013/03/13 15:48:32, halyavin wrote: > Change this comment. Done. https://codereview.chromium.org/11000033/diff/125001/src/trusted/validator_ra... src/trusted/validator_ragel/decoder_x86_64.rl:118: uint8_t vex_prefix2 = 0xe0; On 2013/03/13 15:48:32, halyavin wrote: > 0xe0 -> VEX_R | VEX_X | VEX_B Done: https://codereview.chromium.org/12716018 https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... File src/trusted/validator_ragel/decoding.h (right): https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/decoding.h:9: * functions and defines). On 2013/03/13 15:17:15, Vlad Shcherbina wrote: > Copy-paste! Please be more explicit about what is in decoder and what's in > decodING. Done. https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/decoding.h:11: * We only include simple schematic diagrams here. For full description see On 2013/03/14 13:48:44, halyavin wrote: > See full description in AMD/Intel manuals. Done. https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/decoding.h:31: * ┌───────┬───────┬───────┬───────┬───────┬───────┬───────┬───────┒ On 2013/03/14 13:48:44, halyavin wrote: > 0-2 bits: register number > 3-7 bits: opcode Done. https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/decoding.h:41: * On 2013/03/14 13:48:44, halyavin wrote: > 0-2 bits: r/m > 3-5 bits: reg > 6-7 bits: mod Done. https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/decoding.h:52: return (modrm & 0x38) >> 3; On 2013/03/14 13:48:44, halyavin wrote: > (modrm >> 3) & 0x07 is easier to understand. Done. https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/decoding.h:61: * On 2013/03/14 13:48:44, halyavin wrote: > 0-2 bits: base > 3-5 bits: index > 6-7 bits: scale Done. https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/decoding.h:72: return (sib & 0x38) >> 3; On 2013/03/14 13:48:44, halyavin wrote: > (sib >> 3) & 0x07 is easier to understand. Done. https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/decoding.h:81: * On 2013/03/14 13:48:44, halyavin wrote: > 0 bit: B > 1 bit: X > 2 bit: R > 3 bit: W > 4-7 bits: 0x4 Done. https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/decoding.h:95: static FORCEINLINE uint8_t BaseExtentionFromREX(uint8_t rex) { On 2013/03/14 13:48:44, halyavin wrote: > /* How much to add to base register number. */ > the same below. Done. https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/decoding.h:109: * On 2013/03/14 13:48:44, halyavin wrote: > 0-4 bits: opcode selector > 5 bit: inverted B > 6 bit: inverted X > 7 bit: inverted R Done. https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/decoding.h:143: * On 2013/03/14 13:48:44, halyavin wrote: > 0-1 bits: pp > 2 bit: L > 3-6 bits: inverted register number > 7 bit: W Done. https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/decoding.h:151: return ((~vex3) & 0x38) >> 3; On 2013/03/14 13:48:44, halyavin wrote: > ((~vex3) >> 3) & 0x07 is easier to understand Done. https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/decoding.h:155: return ((~vex3) & 0x78) >> 3; On 2013/03/14 13:48:44, halyavin wrote: > ((~vex3) >> 3) & 0x0f is easier to understand Done. https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/decoding.h:160: * On 2013/03/14 13:48:44, halyavin wrote: > 0-1 bits: imm2 or zero > 2-3 bits: 0 > 4-7 bits: register number Done. https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/decoding.h:177: * Smaller values can be obtained by restricting this value further (which is On 2013/03/14 13:48:44, halyavin wrote: > Return values can be restricted to smaller unsigned type Done. https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/decoding.h:183: * but they stay on the "safe" side of this boundary. Specifically: this On 2013/03/14 13:48:44, halyavin wrote: > this (conversion to intXX_t) Done. https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/decoding.h:196: * compilers for all known platforms are actually doing. On 2013/03/14 13:48:44, halyavin wrote: > Mention that conversion from intXX_t to uint64_t is always safe. Done. https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... File src/trusted/validator_ragel/validator_internal.h (right): https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/validator_internal.h:23: /* Macroses to suppport CPUID handling. */ On 2013/03/18 11:58:44, halyavin wrote: > misprint: support. Done. https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/validator_internal.h:24: #define SET_CPU_FEATURE(F) \ On 2013/03/18 11:58:44, halyavin wrote: > Comment what is F and F_Allowed (variable? constant? macro? enum?). If it'll be anything but macro then "macroses" in the previous line will be a lie, isn't it? https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/validator_internal.h:39: #define CPUFeature_AESAVX CPUFeature_AES && CPUFeature_AVX On 2013/03/18 15:15:18, Vlad Shcherbina wrote: > Currently these 'composite' features only refer to atomic features, but I > suggest putting parenthesis just in case. If for example CPUFeature_AVX were > defined as 'CPUFeature_foo || CPUFeature_bar', this line would be incorrect. Done. https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/validator_internal.h:87: #define CPUFeature_3DNOW_Allowed \ On 2013/03/18 15:15:18, Vlad Shcherbina wrote: > Currently macros CPUFeature_<whatever>_Allowed completely duplicate macros > CPUFeature_<whatever>. It can be avoided if these macros take cpu_features (or > maybe even cpu_features->data as argument). Done. https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/validator_internal.h:185: #define SET_REX_PREFIX(P) rex_prefix = (P) On 2013/03/18 11:58:44, halyavin wrote: > P->PREFIX Long live cryptic error messages! But as I've said before: I don't care anymore. You ask me to mangle the code - I do that. https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/validator_internal.h:190: #define SET_MODRM_BASE(N) base = (N) On 2013/03/18 11:58:44, halyavin wrote: > N->REG? Done. https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/validator_internal.h:194: #define SET_DATA16_PREFIX(S) On 2013/03/18 11:58:44, halyavin wrote: > What is S and P below? Expanded (expect cryptic error messages in the future here as well). https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/validator_internal.h:207: #define SET_DISP_TYPE(T) SET_DISP_TYPE_##T On 2013/03/18 11:58:44, halyavin wrote: > T->TYPE Ditto. https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/validator_internal.h:212: /* imm2 field is a flag, not accumulator, like with other immediates */ On 2013/03/18 11:58:44, halyavin wrote: > like other immediates Done. https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/validator_internal.h:218: #define SET_IMM2_TYPE(T) SET_IMM2_TYPE_##T On 2013/03/18 11:58:44, halyavin wrote: > SET_SECOND_IMM_TYPE, T->TYPE Done. https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/validator_internal.h:225: * to jump outside given code region, the target address must be aligned. On 2013/03/18 11:58:44, halyavin wrote: > jump target outside of given code region must be aligned. Done. https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/validator_internal.h:268: static INLINE Bool ProcessInvalidJumpTargets( On 2013/03/18 14:23:12, halyavin wrote: > /* Compare valid_targets and jump_dests and call callback for any address in > jump_dests which is not present in valid_targets. */ Done. https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/validator_internal.h:269: const uint8_t *data, On 2013/03/18 14:23:12, halyavin wrote: > data -> code Done. https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/validator_internal.h:279: for (i = 0; i < elements ; i++) { On 2013/03/18 14:23:12, halyavin wrote: > extra space Done. https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/validator_internal.h:308: int8_t offset = (uint8_t) (rip[-1]); On 2013/03/18 14:23:12, halyavin wrote: > remove uint8_t Done. https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/validator_internal.h:328: rip[-2] + 256U * ((uint32_t) rip[-1])))); On 2013/03/18 14:23:12, halyavin wrote: > remove uint32_t Done. https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... File src/trusted/validator_ragel/validator_x86_32.rl (right): https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/validator_x86_32.rl:29: #define SET_VEX_PREFIX3(P) On 2013/03/19 13:44:44, halyavin wrote: > Check if these lines are necessary. Yes, they are. Added TODO. https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/validator_x86_32.rl:69: action last_byte_is_not_immediate { On 2013/03/19 13:44:44, halyavin wrote: > Think on removing code duplication here. Done. https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/validator_x86_32.rl:97: # Ragel machine which checks if call is properly aligned. On 2013/03/19 13:44:44, halyavin wrote: > Ragel machine that accepts one call instruction or call superinstruction and > checks if call is properly aligned. Done. https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/validator_x86_32.rl:106: (0xe8 rel32))) | On 2013/03/19 13:44:44, halyavin wrote: > direct_call = (data16 ...) Done. https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/validator_x86_32.rl:109: (any* data16? 0xff ((opcode_2 | opcode_3) any* & On 2013/03/19 13:44:44, halyavin wrote: > indirect_call_by_register = data16? 0xff (opcode_2 & modrm_registers) Done. https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/validator_x86_32.rl:118: # This action calls user's callback (if needed) and cleans up validator's On 2013/03/19 13:44:44, halyavin wrote: > user's -> user, validator's -> internal validator state. Done. https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/validator_x86_32.rl:121: # We call the user callback if there are validation errors or if the On 2013/03/19 13:44:44, halyavin wrote: > We call the user callback either on validation errors or on every instruction, > depending on CALL_USER_CALLBACK_ON_EACH_INSTRUTION option. Done. https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/validator_x86_32.rl:125: # only used in the processing of a single instruction (prefixes, operand On 2013/03/19 13:44:44, halyavin wrote: > which are not used in superinstruction and sandboxing processing. I'm not sure that's clarification: I say that I clean up all the variables which are used for one instruction only, you say that we clear all the variables except for variables used in superinstruction and sandboxing processing which plethora of questions. https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/validator_x86_32.rl:141: * causing error. */ On 2013/03/19 13:44:44, halyavin wrote: > We may set instruction_begin at the first byte of the instruction instead of > here but in the case of incorrect one byte instructions user callback may be > called before instruction_begin is set. Done.
https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... File src/trusted/validator_ragel/validator_internal.h (right): https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/validator_internal.h:24: #define SET_CPU_FEATURE(F) \ On 2013/03/19 14:54:46, khim wrote: > On 2013/03/18 11:58:44, halyavin wrote: > > Comment what is F and F_Allowed (variable? constant? macro? enum?). > > If it'll be anything but macro then "macroses" in the previous line will be a > lie, isn't it? No. You can create variables with names F and F_Allowed somewhere later and then use SET_CPU_FEATURE(F).
https://codereview.chromium.org/11000033/diff/159001/src/trusted/validator_ra... File src/trusted/validator_ragel/validator_x86_32.rl (right): https://codereview.chromium.org/11000033/diff/159001/src/trusted/validator_ra... src/trusted/validator_ragel/validator_x86_32.rl:109: # This action calls users callback (if needed) and cleans up validators user, not users
https://codereview.chromium.org/11000033/diff/159001/src/trusted/validator_ra... File src/trusted/validator_ragel/validator_internal.h (right): https://codereview.chromium.org/11000033/diff/159001/src/trusted/validator_ra... src/trusted/validator_ragel/validator_internal.h:100: #define SET_MODRM_INDEX(REG_NUMBER) index = (REG) REG->REG_NUMBER
https://chromiumcodereview.appspot.com/11000033/diff/93013/src/trusted/valida... File src/trusted/validator_ragel/validator_x86_64.rl (right): https://chromiumcodereview.appspot.com/11000033/diff/93013/src/trusted/valida... src/trusted/validator_ragel/validator_x86_64.rl:67: CheckAccess(instruction_begin - codeblock, CheckMemoryAccess? https://chromiumcodereview.appspot.com/11000033/diff/93013/src/trusted/valida... src/trusted/validator_ragel/validator_x86_64.rl:220: # just as part of the naclcall/nacljmp, but also as a standalene instruction). standalone https://chromiumcodereview.appspot.com/11000033/diff/93013/src/trusted/valida... src/trusted/validator_ragel/validator_x86_64.rl:240: # See AMD/Intel manual for clarification about “add” instruction encoding. why are you adding unicode quotes? https://chromiumcodereview.appspot.com/11000033/diff/93013/src/trusted/valida... src/trusted/validator_ragel/validator_x86_64.rl:265: @{ where instruction_begin points to in this action? https://chromiumcodereview.appspot.com/11000033/diff/93013/src/trusted/valida... src/trusted/validator_ragel/validator_x86_64.rl:481: REX_X (0x89 | 0x8b) 0xff . # mov %edi,%edi what this dot means? https://chromiumcodereview.appspot.com/11000033/diff/93013/src/trusted/valida... src/trusted/validator_ragel/validator_x86_64.rl:552: # they must be in the range %r15...%r15+4294967295 except momentarily they they=instructions or registers? https://chromiumcodereview.appspot.com/11000033/diff/93013/src/trusted/valida... src/trusted/validator_ragel/validator_x86_64.rl:568: # support. This comment is incorrect. We do use it for dynamic code modification support. https://chromiumcodereview.appspot.com/11000033/diff/93013/src/trusted/valida... src/trusted/validator_ragel/validator_x86_64.rl:606: # states and instruction_info_collected). Comment duplication. https://chromiumcodereview.appspot.com/11000033/diff/93013/src/trusted/valida... src/trusted/validator_ragel/validator_x86_64.rl:629: * to be able to report the new offset as the start of instruction Change as in 32-bit validator. https://chromiumcodereview.appspot.com/11000033/diff/93013/src/trusted/valida... src/trusted/validator_ragel/validator_x86_64.rl:690: * Currently we do not distinguish 8bit and 16bit modifications from modifications of what? https://chromiumcodereview.appspot.com/11000033/diff/93013/src/trusted/valida... src/trusted/validator_ragel/validator_x86_64.rl:701: #define SET_OPERAND_NAME(N, S) operand_states |= ((S) << ((N) << 3)) Replace N, S and T with more descriptive names or add comments that explain them. https://chromiumcodereview.appspot.com/11000033/diff/93013/src/trusted/valida... src/trusted/validator_ragel/validator_x86_64.rl:737: /* Restricted %rsp or %rbp must be processed by appropriate nacl-special Maybe extract to separate function. https://chromiumcodereview.appspot.com/11000033/diff/93013/src/trusted/valida... src/trusted/validator_ragel/validator_x86_64.rl:744: *restricted_register = NO_REG; /* Every instruction clears restricted register even if it is not modified. */
https://codereview.chromium.org/11000033/diff/159001/src/trusted/validator_ra... File src/trusted/validator_ragel/validator_x86_64.rl (right): https://codereview.chromium.org/11000033/diff/159001/src/trusted/validator_ra... src/trusted/validator_ragel/validator_x86_64.rl:1247: uint32_t operand_states = 0; Move this comment to SET_OPERAND macros and write here: /* Contains register number and type of register modification (see OperandKind enum) for each operand that is changed in the instruction. Information about read-only and memory operands is not saved in 64-bit mode. */ https://codereview.chromium.org/11000033/diff/93013/src/trusted/validator_rag... File src/trusted/validator_ragel/validator_x86_64.rl (right): https://codereview.chromium.org/11000033/diff/93013/src/trusted/validator_rag... src/trusted/validator_ragel/validator_x86_64.rl:690: * Currently we do not distinguish 8bit and 16bit modifications from 8bit->64bit https://codereview.chromium.org/11000033/diff/93013/src/trusted/validator_rag... src/trusted/validator_ragel/validator_x86_64.rl:696: OPERAND_SANDBOX_8BIT, 8bit register that is modified by instruction. https://codereview.chromium.org/11000033/diff/93013/src/trusted/validator_rag... src/trusted/validator_ragel/validator_x86_64.rl:697: OPERAND_SANDBOX_RESTRICTED, 32-bit register that is modified by instruction. The high 32-bit bits of the corresponding 64-bit register are set to zero. https://codereview.chromium.org/11000033/diff/93013/src/trusted/validator_rag... src/trusted/validator_ragel/validator_x86_64.rl:698: OPERAND_SANDBOX_UNRESTRICTED 64-bit or 16-bit register that is modified by instruction. https://codereview.chromium.org/11000033/diff/93013/src/trusted/validator_rag... src/trusted/validator_ragel/validator_x86_64.rl:766: *instruction_info_collected |= BPL_MODIFIED; BPL_MODIFIED->BP_MODIFIED
https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... File src/trusted/validator_ragel/validator_internal.h (right): https://codereview.chromium.org/11000033/diff/151007/src/trusted/validator_ra... src/trusted/validator_ragel/validator_internal.h:24: #define SET_CPU_FEATURE(F) \ On 2013/03/19 15:02:30, halyavin wrote: > On 2013/03/19 14:54:46, khim wrote: > > On 2013/03/18 11:58:44, halyavin wrote: > > > Comment what is F and F_Allowed (variable? constant? macro? enum?). > > > > If it'll be anything but macro then "macroses" in the previous line will be a > > lie, isn't it? > > No. You can create variables with names F and F_Allowed somewhere later and then > use SET_CPU_FEATURE(F). This will be a lie since it'll be macro then, not macroSES. https://codereview.chromium.org/11000033/diff/159001/src/trusted/validator_ra... File src/trusted/validator_ragel/validator_internal.h (right): https://codereview.chromium.org/11000033/diff/159001/src/trusted/validator_ra... src/trusted/validator_ragel/validator_internal.h:100: #define SET_MODRM_INDEX(REG_NUMBER) index = (REG) On 2013/03/19 15:08:09, halyavin wrote: > REG->REG_NUMBER Done. https://codereview.chromium.org/11000033/diff/159001/src/trusted/validator_ra... File src/trusted/validator_ragel/validator_x86_32.rl (right): https://codereview.chromium.org/11000033/diff/159001/src/trusted/validator_ra... src/trusted/validator_ragel/validator_x86_32.rl:109: # This action calls users callback (if needed) and cleans up validators On 2013/03/19 15:03:54, halyavin wrote: > user, not users Done. https://codereview.chromium.org/11000033/diff/159001/src/trusted/validator_ra... File src/trusted/validator_ragel/validator_x86_64.rl (right): https://codereview.chromium.org/11000033/diff/159001/src/trusted/validator_ra... src/trusted/validator_ragel/validator_x86_64.rl:1247: uint32_t operand_states = 0; On 2013/03/20 15:19:19, halyavin wrote: > Move this comment to SET_OPERAND macros and write here: > /* Contains register number and type of register modification (see OperandKind > enum) for each operand that is changed in the instruction. Information about > read-only and memory operands is not saved in 64-bit mode. */ Done. https://codereview.chromium.org/11000033/diff/93013/src/trusted/validator_rag... File src/trusted/validator_ragel/validator_x86_64.rl (right): https://codereview.chromium.org/11000033/diff/93013/src/trusted/validator_rag... src/trusted/validator_ragel/validator_x86_64.rl:67: CheckAccess(instruction_begin - codeblock, On 2013/03/20 09:07:22, halyavin wrote: > CheckMemoryAccess? Done. https://codereview.chromium.org/11000033/diff/93013/src/trusted/validator_rag... src/trusted/validator_ragel/validator_x86_64.rl:220: # just as part of the naclcall/nacljmp, but also as a standalene instruction). On 2013/03/20 09:07:22, halyavin wrote: > standalone Done. https://codereview.chromium.org/11000033/diff/93013/src/trusted/validator_rag... src/trusted/validator_ragel/validator_x86_64.rl:240: # See AMD/Intel manual for clarification about “add” instruction encoding. On 2013/03/20 09:07:22, halyavin wrote: > why are you adding unicode quotes? Fixed. https://codereview.chromium.org/11000033/diff/93013/src/trusted/validator_rag... src/trusted/validator_ragel/validator_x86_64.rl:265: @{ On 2013/03/20 09:07:22, halyavin wrote: > where instruction_begin points to in this action? It points to the start of the last instruction, obviously. Added a comment. https://codereview.chromium.org/11000033/diff/93013/src/trusted/validator_rag... src/trusted/validator_ragel/validator_x86_64.rl:481: REX_X (0x89 | 0x8b) 0xff . # mov %edi,%edi On 2013/03/20 09:07:22, halyavin wrote: > what this dot means? Concatenation. Removed since apparently it makes code less clear, not more clear. https://codereview.chromium.org/11000033/diff/93013/src/trusted/validator_rag... src/trusted/validator_ragel/validator_x86_64.rl:552: # they must be in the range %r15...%r15+4294967295 except momentarily they On 2013/03/20 09:07:22, halyavin wrote: > they=instructions or registers? Reformulated more clearly. https://codereview.chromium.org/11000033/diff/93013/src/trusted/validator_rag... src/trusted/validator_ragel/validator_x86_64.rl:568: # support. On 2013/03/20 09:07:22, halyavin wrote: > This comment is incorrect. We do use it for dynamic code modification support. This comment is correct. We don't use this flag in x86-64 case, we only use it in ia32 case. x86-64 case only uses it in dfacheckdecoder. https://codereview.chromium.org/11000033/diff/93013/src/trusted/validator_rag... src/trusted/validator_ragel/validator_x86_64.rl:606: # states and instruction_info_collected). On 2013/03/20 09:07:22, halyavin wrote: > Comment duplication. Oops. Fixed. https://codereview.chromium.org/11000033/diff/93013/src/trusted/validator_rag... src/trusted/validator_ragel/validator_x86_64.rl:629: * to be able to report the new offset as the start of instruction On 2013/03/20 09:07:22, halyavin wrote: > Change as in 32-bit validator. Done. https://codereview.chromium.org/11000033/diff/93013/src/trusted/validator_rag... src/trusted/validator_ragel/validator_x86_64.rl:696: OPERAND_SANDBOX_8BIT, On 2013/03/20 15:19:19, halyavin wrote: > 8bit register that is modified by instruction. Done. https://codereview.chromium.org/11000033/diff/93013/src/trusted/validator_rag... src/trusted/validator_ragel/validator_x86_64.rl:697: OPERAND_SANDBOX_RESTRICTED, On 2013/03/20 15:19:19, halyavin wrote: > 32-bit register that is modified by instruction. The high 32-bit bits of the > corresponding 64-bit register are set to zero. Done. https://codereview.chromium.org/11000033/diff/93013/src/trusted/validator_rag... src/trusted/validator_ragel/validator_x86_64.rl:698: OPERAND_SANDBOX_UNRESTRICTED On 2013/03/20 15:19:19, halyavin wrote: > 64-bit or 16-bit register that is modified by instruction. Done. https://codereview.chromium.org/11000033/diff/93013/src/trusted/validator_rag... src/trusted/validator_ragel/validator_x86_64.rl:701: #define SET_OPERAND_NAME(N, S) operand_states |= ((S) << ((N) << 3)) On 2013/03/20 09:07:22, halyavin wrote: > Replace N, S and T with more descriptive names or add comments that explain > them. Done. https://codereview.chromium.org/11000033/diff/93013/src/trusted/validator_rag... src/trusted/validator_ragel/validator_x86_64.rl:737: /* Restricted %rsp or %rbp must be processed by appropriate nacl-special On 2013/03/20 09:07:22, halyavin wrote: > Maybe extract to separate function. Done. https://codereview.chromium.org/11000033/diff/93013/src/trusted/validator_rag... src/trusted/validator_ragel/validator_x86_64.rl:744: *restricted_register = NO_REG; On 2013/03/20 09:07:22, halyavin wrote: > /* Every instruction clears restricted register even if it is not modified. */ Done. https://codereview.chromium.org/11000033/diff/93013/src/trusted/validator_rag... src/trusted/validator_ragel/validator_x86_64.rl:766: *instruction_info_collected |= BPL_MODIFIED; On 2013/03/20 15:19:19, halyavin wrote: > BPL_MODIFIED->BP_MODIFIED Done.
https://chromiumcodereview.appspot.com/11000033/diff/310001/src/trusted/valid... File src/trusted/validator_ragel/decoder.h (right): https://chromiumcodereview.appspot.com/11000033/diff/310001/src/trusted/valid... src/trusted/validator_ragel/decoder.h:176: * Returns False If ragel machine does not accept piece of code. Returns true Returns false, if https://chromiumcodereview.appspot.com/11000033/diff/310001/src/trusted/valid... File src/trusted/validator_ragel/decoder_internal.h (right): https://chromiumcodereview.appspot.com/11000033/diff/310001/src/trusted/valid... src/trusted/validator_ragel/decoder_internal.h:32: #define SET_OPERAND_NAME(INDEX, NAME) instruction.operands[(INDEX)].name = (NAME) more than 80 symbols, use continuation. https://chromiumcodereview.appspot.com/11000033/diff/310001/src/trusted/valid... File src/trusted/validator_ragel/decoder_x86_32.rl (right): https://chromiumcodereview.appspot.com/11000033/diff/310001/src/trusted/valid... src/trusted/validator_ragel/decoder_x86_32.rl:70: SET_DISPLACEMNT_FORMAT(DISPNONE); DISPLACEMENT https://chromiumcodereview.appspot.com/11000033/diff/310001/src/trusted/valid... File src/trusted/validator_ragel/decoding.h (right): https://chromiumcodereview.appspot.com/11000033/diff/310001/src/trusted/valid... src/trusted/validator_ragel/decoding.h:127: VEX_W = 0x80 Add a comment explaining that VEX_W is in the third VEX byte. https://chromiumcodereview.appspot.com/11000033/diff/310001/src/trusted/valid... src/trusted/validator_ragel/decoding.h:130: static FORCEINLINE uint8_t BaseExtentionFromVEX(uint8_t vex2) { Add comments similar to comments for REX prefixes. https://chromiumcodereview.appspot.com/11000033/diff/310001/src/trusted/valid... File src/trusted/validator_ragel/validator_x86_64.rl (right): https://chromiumcodereview.appspot.com/11000033/diff/310001/src/trusted/valid... src/trusted/validator_ragel/validator_x86_64.rl:248: # "call" or "jmp" instruction and instruction_begin points to it's end. current_position points to its end. https://chromiumcodereview.appspot.com/11000033/diff/310001/src/trusted/valid... src/trusted/validator_ragel/validator_x86_64.rl:553: # operatein on them are special because registers must be in the range operatein->operate https://chromiumcodereview.appspot.com/11000033/diff/310001/src/trusted/valid... src/trusted/validator_ragel/validator_x86_64.rl:710: * 2 bits for register kinds, the first 5 bits are used for register numbers (16 regs plus RIZ), the next 2 bits are used for register kinds. https://chromiumcodereview.appspot.com/11000033/diff/310001/src/trusted/valid... src/trusted/validator_ragel/validator_x86_64.rl:728: ((REGISTER_NAME | (KIND << 5)) << ((INDEX) << 3))) It is more natural to reverse order: (((KIND) << 5) | (REGISTER_NAME)) (Also note that you omitted a couple of brackets)
https://chromiumcodereview.appspot.com/11000033/diff/310001/src/trusted/valid... File src/trusted/validator_ragel/decoder.h (right): https://chromiumcodereview.appspot.com/11000033/diff/310001/src/trusted/valid... src/trusted/validator_ragel/decoder.h:176: * Returns False If ragel machine does not accept piece of code. On 2013/03/25 09:43:41, halyavin wrote: > Returns true > Returns false, if Oops. That's C, not python. TRUE/FALSE, then. https://chromiumcodereview.appspot.com/11000033/diff/310001/src/trusted/valid... File src/trusted/validator_ragel/decoder_internal.h (right): https://chromiumcodereview.appspot.com/11000033/diff/310001/src/trusted/valid... src/trusted/validator_ragel/decoder_internal.h:32: #define SET_OPERAND_NAME(INDEX, NAME) instruction.operands[(INDEX)].name = (NAME) On 2013/03/25 09:43:41, halyavin wrote: > more than 80 symbols, use continuation. Indeed. Strange that presubmit script is silent. Done. https://chromiumcodereview.appspot.com/11000033/diff/310001/src/trusted/valid... File src/trusted/validator_ragel/decoder_x86_32.rl (right): https://chromiumcodereview.appspot.com/11000033/diff/310001/src/trusted/valid... src/trusted/validator_ragel/decoder_x86_32.rl:70: SET_DISPLACEMNT_FORMAT(DISPNONE); On 2013/03/25 09:43:41, halyavin wrote: > DISPLACEMENT Done. https://chromiumcodereview.appspot.com/11000033/diff/310001/src/trusted/valid... File src/trusted/validator_ragel/decoding.h (right): https://chromiumcodereview.appspot.com/11000033/diff/310001/src/trusted/valid... src/trusted/validator_ragel/decoding.h:127: VEX_W = 0x80 On 2013/03/25 09:43:41, halyavin wrote: > Add a comment explaining that VEX_W is in the third VEX byte. Added separate enum for 3rd byte instead. https://chromiumcodereview.appspot.com/11000033/diff/310001/src/trusted/valid... src/trusted/validator_ragel/decoding.h:130: static FORCEINLINE uint8_t BaseExtentionFromVEX(uint8_t vex2) { On 2013/03/25 09:43:41, halyavin wrote: > Add comments similar to comments for REX prefixes. Done. https://chromiumcodereview.appspot.com/11000033/diff/310001/src/trusted/valid... File src/trusted/validator_ragel/validator_x86_64.rl (right): https://chromiumcodereview.appspot.com/11000033/diff/310001/src/trusted/valid... src/trusted/validator_ragel/validator_x86_64.rl:248: # "call" or "jmp" instruction and instruction_begin points to it's end. On 2013/03/25 09:43:41, halyavin wrote: > current_position points to its end. Done. https://chromiumcodereview.appspot.com/11000033/diff/310001/src/trusted/valid... src/trusted/validator_ragel/validator_x86_64.rl:553: # operatein on them are special because registers must be in the range On 2013/03/25 09:43:41, halyavin wrote: > operatein->operate Done. https://chromiumcodereview.appspot.com/11000033/diff/310001/src/trusted/valid... src/trusted/validator_ragel/validator_x86_64.rl:710: * 2 bits for register kinds, On 2013/03/25 09:43:41, halyavin wrote: > the first 5 bits are used for register numbers (16 regs plus RIZ), > the next 2 bits are used for register kinds. Done. https://chromiumcodereview.appspot.com/11000033/diff/310001/src/trusted/valid... src/trusted/validator_ragel/validator_x86_64.rl:728: ((REGISTER_NAME | (KIND << 5)) << ((INDEX) << 3))) On 2013/03/25 09:43:41, halyavin wrote: > It is more natural to reverse order: (((KIND) << 5) | (REGISTER_NAME)) > (Also note that you omitted a couple of brackets) Done.
https://chromiumcodereview.appspot.com/11000033/diff/321007/src/trusted/valid... File src/trusted/validator_ragel/validator_x86_64.rl (right): https://chromiumcodereview.appspot.com/11000033/diff/321007/src/trusted/valid... src/trusted/validator_ragel/validator_x86_64.rl:753: /* Restricted %rsp or %rbp must be processed by appropriate nacl-special %rsp or %rbp must be restored from zero-extension https://chromiumcodereview.appspot.com/11000033/diff/321007/src/trusted/valid... src/trusted/validator_ragel/validator_x86_64.rl:776: if (CHECK_OPERAND(0, REG_R15, OPERAND_SANDBOX_8BIT) || Suggested macro names to remove duplications: CHECK_OPERAND_R15_MODIFIED(OPERAND), CHECK_OPERAND_RBP_MODIFIED(OPERAND), CHECK_OPERAND_RSP_MODIFIED(OPERAND), CHECK_OPERAND_RBP_INVALID_MODIFICATION(OPERAND), CHECK_OPERAND_RSP_INVALID_MODIFICATION(OPERAND). Place this macroses just above Process0Operands and mention that they use operand_states and rex_prefix params.
https://chromiumcodereview.appspot.com/11000033/diff/321007/src/trusted/valid... File src/trusted/validator_ragel/validator_x86_64.rl (right): https://chromiumcodereview.appspot.com/11000033/diff/321007/src/trusted/valid... src/trusted/validator_ragel/validator_x86_64.rl:814: } else if ((operand_states & 0x70) == (OPERAND_SANDBOX_RESTRICTED << 5)) { CHECK_OPERAND_RESTRICTED(OPERAND) https://chromiumcodereview.appspot.com/11000033/diff/321007/src/trusted/valid... src/trusted/validator_ragel/validator_x86_64.rl:815: *restricted_register = operand_states & 0x0f; *restricted_register = GET_OPERAND_NAME(INDEX) & 0x0f; GET_OPERAND_NAME should do shift first to help compiler optimizations. Alternatively, you can omit & 0x0f at all. https://chromiumcodereview.appspot.com/11000033/diff/321007/src/trusted/valid... src/trusted/validator_ragel/validator_x86_64.rl:880: *restricted_register = operand_states & 0x0f; /* If both operands are sandboxed, the second one doesn't count. We can't ignore it completely though, since it can modify rsp or rbp which follow special rules. In this case NaCl forbids the instruction. */
https://chromiumcodereview.appspot.com/11000033/diff/321007/src/trusted/valid... File src/trusted/validator_ragel/validator_x86_64.rl (right): https://chromiumcodereview.appspot.com/11000033/diff/321007/src/trusted/valid... src/trusted/validator_ragel/validator_x86_64.rl:753: /* Restricted %rsp or %rbp must be processed by appropriate nacl-special On 2013/03/25 12:00:46, halyavin wrote: > %rsp or %rbp must be restored from zero-extension Done. https://chromiumcodereview.appspot.com/11000033/diff/321007/src/trusted/valid... src/trusted/validator_ragel/validator_x86_64.rl:776: if (CHECK_OPERAND(0, REG_R15, OPERAND_SANDBOX_8BIT) || On 2013/03/25 12:00:46, halyavin wrote: > Suggested macro names to remove duplications: > CHECK_OPERAND_R15_MODIFIED(OPERAND), > CHECK_OPERAND_RBP_MODIFIED(OPERAND), > CHECK_OPERAND_RSP_MODIFIED(OPERAND), > CHECK_OPERAND_RBP_INVALID_MODIFICATION(OPERAND), > CHECK_OPERAND_RSP_INVALID_MODIFICATION(OPERAND). > Place this macroses just above Process0Operands and mention that they use > operand_states and rex_prefix params. Done. https://chromiumcodereview.appspot.com/11000033/diff/321007/src/trusted/valid... src/trusted/validator_ragel/validator_x86_64.rl:814: } else if ((operand_states & 0x70) == (OPERAND_SANDBOX_RESTRICTED << 5)) { On 2013/03/25 12:24:58, halyavin wrote: > CHECK_OPERAND_RESTRICTED(OPERAND) Done. https://chromiumcodereview.appspot.com/11000033/diff/321007/src/trusted/valid... src/trusted/validator_ragel/validator_x86_64.rl:815: *restricted_register = operand_states & 0x0f; On 2013/03/25 12:24:58, halyavin wrote: > *restricted_register = GET_OPERAND_NAME(INDEX) & 0x0f; > GET_OPERAND_NAME should do shift first to help compiler optimizations. > Alternatively, you can omit & 0x0f at all. Done. https://chromiumcodereview.appspot.com/11000033/diff/321007/src/trusted/valid... src/trusted/validator_ragel/validator_x86_64.rl:880: *restricted_register = operand_states & 0x0f; On 2013/03/25 12:24:58, halyavin wrote: > /* If both operands are sandboxed, the second one doesn't count. We can't > ignore it completely though, since it can modify rsp or rbp which follow special > rules. In this case NaCl forbids the instruction. */ Done.
https://chromiumcodereview.appspot.com/11000033/diff/323033/src/trusted/valid... File src/trusted/validator_ragel/validator_x86_64.rl (right): https://chromiumcodereview.appspot.com/11000033/diff/323033/src/trusted/valid... src/trusted/validator_ragel/validator_x86_64.rl:871: *instruction_info_collected |= R15_MODIFIED; BP_MODIFIED https://chromiumcodereview.appspot.com/11000033/diff/323033/src/trusted/valid... src/trusted/validator_ragel/validator_x86_64.rl:874: *instruction_info_collected |= R15_MODIFIED; SP_MODIFIED
https://chromiumcodereview.appspot.com/11000033/diff/323033/src/trusted/valid... File src/trusted/validator_ragel/validator_x86_64.rl (right): https://chromiumcodereview.appspot.com/11000033/diff/323033/src/trusted/valid... src/trusted/validator_ragel/validator_x86_64.rl:871: *instruction_info_collected |= R15_MODIFIED; On 2013/03/25 14:18:36, halyavin wrote: > BP_MODIFIED Done. https://chromiumcodereview.appspot.com/11000033/diff/323033/src/trusted/valid... src/trusted/validator_ragel/validator_x86_64.rl:874: *instruction_info_collected |= R15_MODIFIED; On 2013/03/25 14:18:36, halyavin wrote: > SP_MODIFIED Done.
lgtm
Message was sent while issue was closed.
Committed patchset #13 manually as r11063 (presubmit successful). |