src/trusted/validator_ragel/unreviewed/validator_internals.html - Issue 10883051: Add documentation for the dynamic code modifications.

Side by Side Diff: src/trusted/validator_ragel/unreviewed/validator_internals.html

Issue 10883051: Add documentation for the dynamic code modifications. (Closed) Base URL: svn://svn.chromium.org/native_client/trunk/src/native_client/

Patch Set: Created 8 years, 3 months ago


OLD	NEW
1 <head>	1 <head>

2 <title>Validator structure</title>	2 <title>Validator structure</title>

3 <meta http-equiv="content-type" content="text/html; charset=utf-8" />	3 <meta http-equiv="content-type" content="text/html; charset=utf-8" />

4 </head>	4 </head>

5 <body>	5 <body>

	6 <div>

	7 <div style="width:20%; float:left; padding-right:5%;"><a href="http://en.wikipedia.org/wiki/File:Duesenberg.jpg"><img border="0" src="http://upload.wikimedia.org/wikipedia/commons/thumb/3/3a/Duesenberg.jpg/800px-Duesenberg.jpg" width="100%" /></a><br /><center><span style="font-size:50%">Source: <a href="http://en.wikipedia.org/wiki/File:Duesenberg.jpg">http://en.wikipedia.org/wiki/File:Duesenberg.jpg</a></span></center></div><div style="width:33%; float:right; padding-left:5%;"><a href="http://en.wikipedia.org/wiki/File:Felipe_Massa_2011_Malaysia_FP1.jpg"><img border="0" src="http://upload.wikimedia.org/wikipedia/commons/thumb/5/58/Felipe_Massa_2011_Malaysia_FP1.jpg/800px-Felipe_Massa_2011_Malaysia_FP1.jpg" width="100%" /></a><center><span style="font-size:50%">Source: <a href="http://upload.wikimedia.org/wikipedia/commons/thumb/5/58/Felipe_Massa_2011_Malaysia_FP1.jpg/800px-Felipe_Massa_2011_Malaysia_FP1.jpg">http://upload.wikimedia.org/wikipedia/commons/thumb/5/58/Felipe_Massa_2011_Malaysia_FP1.jpg/800px-Felipe_Massa_2011_Malaysia_FP1.jpg</a></span></center></div>

	8 <h1>New, DFA-based validator with 5-10x speed of the original one, or…<br />

	9 <div style="text-align:right;">Luxury car to F1 car.</div></h1>

	10 <div style="position:relative; width:55%; left:10%;">Trust me: every problem in computer science may be solved by an indirection, but those indirections are <b>expensive</b>. Pointer chasing is just about the most expensive thing you can do on modern CPU's.<br /><a href="http://lwn.net/Articles/509416/"><i>&mdash;Linus Torvalds</i></a></div>

	11 <div>

6 <a name="TOC"></a>	12 <a name="TOC"></a>

	13 <ol style="clear:both;">

	14 <li><a href="#1">DFA, Ragel, macro and inline functions, oh my…</a></li>

	15 <li><a href="#2">What is ragel and how it works.</a></li>

7 <ol>	16 <ol>

8 <li><a href="#1">DFA, Ragel, macroses and inline functions, oh my…</a></li>	17 <li><a href="#2-1">Ragel actions.</a></li>

9 <li><a href="#2">“Special” instructions.</a></li>	18 </ol>

10 <li><a href="#3">“Not so special” instructions.</a></li>	19 <li><a href="#3">“Special” instructions.</a></li>

11 <li><a href="#4">Features beyond minimal validation.</a></li>	20 <li><a href="#4">“Not so special” instructions.</a></li>

	21 <li><a href="#5">Features beyond minimal validation.</a></li>

12 <ol>	22 <ol>

13 <li><a href="#4-1"><code>CPUID</code> support.</a></li>	23 <li><a href="#5-1"><code>CPUID</code> support.</a></li>

14 <li><a href="#4-2">Dynamic code creation support.</a></li>	24 <li><a href="#5-2">Dynamic code modification support.</a></li>

15 <li><a href="#4-3">Dynamic code modification support.</a></li>	25 <ol>

	26 <li><a href="#5-2-1">Replacement validation.</a></li>

	27 <li><a href="#5-2-2">Replacement copying.</a></li>

16 </ol>	28 </ol>

17 <li><a href="#5">Validation for x86-64 mode.</a></li>	29 </ol>

	30 <li><a href="#6">Validation for x86-64 mode.</a></li>

18 <ol>	31 <ol>

19 <li><a href="#5-1">“Secondary” states.</a></li>	32 <li><a href="#6-1">“Secondary” states.</a></li>

20 <li><a href="#5-2">“Normal” instructions.</a></li>	33 <li><a href="#6-2">“Normal” instructions.</a></li>

21 <li><a href="#5-3">Operands handling.</a></li>	34 <li><a href="#6-3">Operands handling.</a></li>

	35 <li><a href="#6-4">Dynamic code modification support.</a></li>

	36 <ol>

	37 <li><a href="#6-4-1">Replacement validation.</a></li>

	38 <li><a href="#6-4-2">Replacement copying.</a></li>

22 </ol>	39 </ol>

23 <li><a href="#6">Decoders.</a></li>

24 </ol>	40 </ol>

25 <h2><div style="float:right"><a href="#TOC">▲</a></div><a name="1">1. DFA, Ragel, macroses and inline functions, oh my…</a></h2>	41 <li><a href="#7">Decoders.</a></li>

26 <p>To understand how DFA-based validators work it's best to start from the function <code>ValidateChunkIA32</code> in <code>validator_x86_32.rl</code>. Said function is very short and “simple”: it allocates a couple of arrays (<code>valid_targets</code> and <code>jump_dests</code>), then cycles over the code passed to it (processing it in bundle-sized chunks) and at the end it compares valid jump targets and collected jump destinations… that's it. Oh, and it also includes a couple of cryptic lines right in the middle of the innermost cycle:<hr />	42 </ol>

	43 <h2><div style="float:right"><a href="#TOC">▲</a></div><a name="1">1. DFA, Ragel, macro and inline functions, oh my…</a></h2>

	44

	45 <p>Contemporary computer systems are extremely powerful, and most complex components and libraries are built like a <a href="http://en.wikipedia.org/wiki/Luxury_vehicle">luxury car</a>: they include a lot of comfort and safety technologies which are designed to improve the life of the user of said components. This also facilitates <a href="http://en.wikipedia.org/wiki/Code_reuse">code reuse</a> via <a href="http://en.wikipedia.org/wiki/Modular_programming">modular programming</a> and generally improves <a href="http://en.wikipedia.org/wiki/Maintainability">maintainability</a>.</p>

	46

	47 <p>Unfortunately these complex structures, improved comfort for the library user and commendable flexibility have a flip side: they lead to a lot of additional work at run time! You first fill and then parse complex data structures, and this takes time. You often produce a lot of information on the low levels which is just not used on higher levels, and this work is also not free.</p>

	48

	49 <p>The new validator is built differently. It only keeps around the indispensable minimum of the information needed to prove (or disprove) that code is safe. Similarly to how an <a href="http://en.wikipedia.org/wiki/Formula_One_car">F1 car</a> uses <a href="http://www.youtube.com/watch?v=NsvWnGgT7Ok">custom-designed car seats</a>, we use custom-designed data structures to push the data from one point of the validator to another. <span title="Actually we collect slightly more than the bare minimum to make testing possible.">We only collect the bare minimum of the information</span>, and if the requirements change we often change all the pieces: from the <code>gen_dfa</code> input data format to the highest-level <code>dfa_validate_32.c</code>/<code>dfa_validate_64.c</code> external API adapters.</p>

	50

	51 <p>This streamlining was one of the most important design goals of the new validator. And indeed the code which reaches the CPU is very simple: it does not contain complex data structures and multilayered functions, while all the previous validators had many layers and quite a few complex data structures. How can it be? Were all these structures superfluous and unnecessary? Well… not really. The new validator throws away all that complexity and trades it for a few comparisons and jumps. <b>Tens of thousands of comparisons and a similar number of jumps</b>, to be exact. In a <b>single flat function</b>. Basically we trade runtime complexity for build-time complexity. As you can guess it's practically impossible to write such a function by hand, and even if someone were able to write tens of thousands of lines of code by hand it would be impossible to understand and review. People are not CPUs! They can keep track of millions of lines of code in complex projects if these are organized in modules and are nicely separated, but give them fifty thousand lines of homogeneous code and they'll be totally lost. But this is basically what we have here in the end product, because the CPU loves such code. To solve this dilemma we employ three levels of filters to create the final code.</p>

	52

	53 <center><img src="files32.svg" height="90%"/><br />Gray elements are hand-written, white elements are generated and dark-gray are aforementioned mixers.</center><br />

	54

	55 <p>To understand how the validator works it's best to start from the function <code>ValidateChunkIA32</code> in <code>validator_x86_32.rl</code>. Said function is very short and “simple”: it allocates a couple of arrays (<code>valid_targets</code> and <code>jump_dests</code>), then cycles over the code passed to it (processing it in bundle-sized chunks) and at the end it compares valid jump targets and collected jump destinations… that's it. Oh, and it also includes a couple of cryptic lines right in the middle of the innermost cycle:<hr />

27     <code>%% write init;</code><br />	56     <code>%% write init;</code><br />

28     <code>%% write exec;</code><hr />	57     <code>%% write exec;</code><hr />

29 Apparently collection of valid jump targets and actual target destinations happens here. How?</p>	58 Apparently collection of valid jump targets and actual target destinations happens here. How?</p>
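The final comparison step described above can be pictured with a small C sketch. This is only an illustration under stated assumptions: it assumes both arrays are bitmaps with one bit per code byte, and the helper names (`BitmapSet`, `BitmapGet`, `AllJumpsLandOnValidTargets`) are hypothetical and do not come from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers: one bit per code byte, packed into uint8_t words. */
static void BitmapSet(uint8_t *bitmap, size_t index) {
  bitmap[index / 8] |= (uint8_t)(1u << (index % 8));
}

static int BitmapGet(const uint8_t *bitmap, size_t index) {
  return (bitmap[index / 8] >> (index % 8)) & 1;
}

/* Returns 1 when every collected jump destination is also a valid target,
 * i.e. no bit is set in jump_dests without being set in valid_targets. */
static int AllJumpsLandOnValidTargets(const uint8_t *valid_targets,
                                      const uint8_t *jump_dests,
                                      size_t size) {
  size_t i;
  for (i = 0; i < (size + 7) / 8; ++i) {
    if (jump_dests[i] & ~valid_targets[i]) return 0;
  }
  return 1;
}
```

Comparing whole bytes of the two bitmaps keeps the check itself free of per-instruction bookkeeping, which fits the "bare minimum of information" idea of the text.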

30 <a name="ragel"></a><blockquote style="background:lightgray; font-size:90%;">	59 <h2><div style="float:right"><a href="#TOC">▲</a></div><a name="2">2. What is ragel and how it works.</a></h2>

	60

	61 <blockquote style="background:lightgray; font-size:90%;">

	62

31 <p>To understand that you need to know a little about DFA and Ragel. I'll not explain what the DFA is (it's explained in the CS course you've heard years back… or you can refresh your knowledge on <a href="http://en.wikipedia.org/wiki/Deterministic_finite_automaton">Wikipedia</a>). But I'll explain a little about Ragel. Extensive documentation with all the gory details is <a href="http://www.complang.org/ragel/">on Ragel's site</a>, but while it explains <b>how</b> to use Ragel it does not explain <b>what</b> it is and <b>why</b> you may want to use it.</p>	63 <p>To understand that you need to know a little about DFA and Ragel. I'll not explain what the DFA is (it's explained in the CS course you've heard years back… or you can refresh your knowledge on <a href="http://en.wikipedia.org/wiki/Deterministic_finite_automaton">Wikipedia</a>). But I'll explain a little about Ragel. Extensive documentation with all the gory details is <a href="http://www.complang.org/ragel/">on Ragel's site</a>, but while it explains <b>how</b> to use Ragel it does not explain <b>what</b> it is and <b>why</b> you may want to use it.</p>

32	64

33 <p>Let's start with the first question: <b>what</b> it is. Ragel is a compiler of DFA machines… but with a twist. You describe the DFA structure using a simple <a href="http://en.wikipedia.org/wiki/Regular_expression">RE</a>-style format and Ragel generates the corresponding code in C (D/Go/Java/Ruby/etc: Ragel supports a lot of languages, but we are interested in C here). When you describe the DFA you just write acceptable bytes and then use the following operations: concatenation (“1 . 2” will accept “1” followed by “2”), union (“1 | 2” will accept either “1” or “2”), intersection (“('a'..'n') & ('m'..'z')” will accept either “m” or “n”), difference (“('a'..'n') - ('m'..'z')” will accept everything between “a” and “l”, but will not accept either “m” or “n”) and Kleene star (“(1 | 2)*” will accept any number of “1” or “2”).</p>	65 <p>Let's start with the first question: <b>what</b> it is. Ragel is a compiler of DFA machines… but with a twist. You describe the DFA structure using a simple <a href="http://en.wikipedia.org/wiki/Regular_expression">RE</a>-style format and Ragel generates the corresponding code in C (D/Go/Java/Ruby/etc: Ragel supports a lot of languages, but we are interested in C here). When you describe the DFA you just write acceptable bytes and then use the following operations: concatenation (“1 . 2” will accept “1” followed by “2”), union (“1 | 2” will accept either “1” or “2”), intersection (“('a'..'n') & ('m'..'z')” will accept either “m” or “n”), difference (“('a'..'n') - ('m'..'z')” will accept everything between “a” and “l”, but will not accept either “m” or “n”) and Kleene star (“(1 | 2)*” will accept any number of “1” or “2”).</p>

34	66

35 <p>These operations can produce quite non-trivial results: e.g. “("b" . ("aa"+ | "aaa"+))*” will produce the following DFA:</p>	67 <p>These operations can produce quite non-trivial results: e.g. “("b" . ("aa"+ | "aaa"+))*” will produce the following DFA:</p>

36 <center><img src="sample1.svg" width="100%"/></center><br />	68 <center><img src="sample1.svg" width="100%"/></center><br />

37 <p>If, instead of “("aa"+ | "aaa"+)” in the example above, you use something like “("a"{5}+ | "a"{7}+ | "a"{11}+)” then the resulting DFA will include almost four hundred nodes and over five hundred transitions! This limits applicability of DFA technology: e.g. it's possible to describe "valid code sequence" (including bundles, "restricted registers" and everything else) as a DFA, but… said DFA will include millions of nodes and billions of transitions!</p>	69 <p>If, instead of “("aa"+ | "aaa"+)” in the example above, you use something like “("a"{5}+ | "a"{7}+ | "a"{11}+)” then the resulting DFA will include almost four hundred nodes and over five hundred transitions! This limits applicability of DFA technology: e.g. it's possible to describe "valid code sequence" (including bundles, "restricted registers" and everything else) as a DFA, but… said DFA will include millions of nodes and billions of transitions!</p>

38	70

39 <p><a name="actions">To overcome this problem Ragel offers so-called "actions": pieces of code which are called when certain pieces in DFA are reached. E.g. we can mark begin and end of “aa” (or “aaa”) in the example above: “("b" . (("aa" >begin @end)+ | ("aaa" >begin @end)+ ))*” produces the following DFA:</a></p>	71 <h3><div style="float:right"><a href="#TOC">▲</a></div><a name="2-1">2.1. Ragel actions.</a></h3>

	72

	73 <p>To overcome this problem Ragel offers so-called "actions": pieces of code which are called when certain pieces in DFA are reached. E.g. we can mark begin and end of “aa” (or “aaa”) in the example above: “("b" . (("aa" >begin @end)+ | ("aaa" >begin @end)+ ))*” produces the following DFA:</p>

40 <center><img src="sample2.svg" width="100%"/></center>	74 <center><img src="sample2.svg" width="100%"/></center>

41 <p style="margin-bottom:0px;">Let's see what happens if we feed it the “baaaaaaaaa” sequence:</p>	75 <p style="margin-bottom:0px;">Let's see what happens if we feed it the “baaaaaaaaa” sequence:</p>

42 <ul style="margin-top:0px;">	76 <ul style="margin-top:0px;">

43 <li><i>offset 0</i>: <i>nothing</i></li>	77 <li><i>offset 0</i>: <i>nothing</i></li>

44 <li><i>offset 1</i>: <code>begin</code></li>	78 <li><i>offset 1</i>: <code>begin</code></li>

45 <li><i>offset 2</i>: <code>end</code></li>	79 <li><i>offset 2</i>: <code>end</code></li>

46 <li><i>offset 3</i>: <code>begin</code> then <code>end</code></li>	80 <li><i>offset 3</i>: <code>begin</code> then <code>end</code></li>

47 <li><i>offset 4</i>: <code>end</code> then <code>begin</code></li>	81 <li><i>offset 4</i>: <code>end</code> then <code>begin</code></li>

48 <li><i>offset 5</i>: <code>begin</code></li>	82 <li><i>offset 5</i>: <code>begin</code></li>

49 <li><i>offset 6</i>: <code>end</code></li>	83 <li><i>offset 6</i>: <code>end</code></li>

(...skipping 15 matching lines...)
65 <li><i>offset 7</i>: <code>begin2</code> then <code>begin3</code></li>	99 <li><i>offset 7</i>: <code>begin2</code> then <code>begin3</code></li>

66 <li><i>offset 8</i>: <code>end2</code></li>	100 <li><i>offset 8</i>: <code>end2</code></li>

67 <li><i>offset 9</i>: <code>begin2</code> then <code>end3</code></li>	101 <li><i>offset 9</i>: <code>begin2</code> then <code>end3</code></li>

68 </ul>	102 </ul>

69 <p style="margin-bottom:0px;">Ah-ha. Now everything is clear. DFA is DFA: it does not support memory and it does not support rollbacks. This means that our DFA is processing two branches simultaneously: both “"aa"+” and “"aaa"+”. We'll need to keep this in mind. A couple of other observations: </p>	103 <p style="margin-bottom:0px;">Ah-ha. Now everything is clear. DFA is DFA: it does not support memory and it does not support rollbacks. This means that our DFA is processing two branches simultaneously: both “"aa"+” and “"aaa"+”. We'll need to keep this in mind. A couple of other observations: </p>

70 <ol style="margin-top:0px;">	104 <ol style="margin-top:0px;">

71 <li>When we used just the <code>begin</code> action, <code>begin</code> was called once, but when we split it in two (<code>begin2</code> and <code>begin3</code>) both are called! By default Ragel merges actions.</li>	105 <li>When we used just the <code>begin</code> action, <code>begin</code> was called once, but when we split it in two (<code>begin2</code> and <code>begin3</code>) both are called! By default Ragel merges actions.</li>

72 <li>Actions are called in non-random order: take a look at <i>offset 4</i>, where <code>end2</code> is called before <code>begin3</code>. That's because <code>begin3</code> has lower priority than <code>end2</code>! Note that in the previous example this same effect was observed, but it was quite mysterious there. The closer the action is to the beginning of the source file, the higher its priority is.</li>	106 <li>Actions are called in non-random order: take a look at <i>offset 4</i>, where <code>end2</code> is called before <code>begin3</code>. That's because <code>begin3</code> has lower priority than <code>end2</code>! Note that in the previous example this same effect was observed, but it was quite mysterious there. The closer the action is to the beginning of the source file, the higher its priority is.</li>

73 </ol>	107 </ol>

74 </blockquote>	108 </blockquote>

75 <h2><div style="float:right"><a href="#TOC">▲</a></div><a name="2">2. “Special” instructions.</a></h2>	109 <h2><div style="float:right"><a href="#TOC">▲</a></div><a name="3">3. “Special” instructions.</a></h2>

76 <p>Now we can go back to the machine description. Our main DFA is the same in all cases, it's “<code>(one_instruction | special_instruction)*</code>”, i.e. it accepts a sequence of “normal” instructions and “special” instructions.</p>	110 <p>Now we can go back to the machine description. Our main DFA is the same in all cases, it's “<code>(one_instruction | special_instruction)*</code>”, i.e. it accepts a sequence of “normal” instructions and “special” instructions.</p>

77	111

78 <p>Also, just like in the example above there are two actions: the first one is triggered at the beginning of the <code>instruction</code> (“normal” or “special”) and is used to remember the beginning of the instruction, to clear the list of <code>errors_detected</code>, and to mark the first byte of the instruction as a valid target for the direct jump; the second one is triggered at the final byte of the <code>instruction</code> (“normal” or “special”) and is used to report errors. And there is also one additional action which is declared as “<code>$err</code>”. This is the <i>error fallback action</i>: it's triggered whenever our machine rejects some byte (which means we've hit either a forbidden instruction like <code>lgdt</code> or some undefined byte sequence… in both cases the <code>UNRECOGNIZED_INSTRUCTION</code> error is reported and processing is stopped).</p>	112 <p>Also, just like in the example above there are two actions: the first one is triggered at the beginning of the <code>instruction</code> (“normal” or “special”) and is used to remember the beginning of the instruction, to clear the <code>instruction_info_collected</code>, and to mark the first byte of the instruction as a valid target for the direct jump; the second one is triggered at the final byte of the <code>instruction</code> (“normal” or “special”) and is used to report errors. And there is also one additional action which is declared as “<code>$err</code>”. This is the <i>error fallback action</i>: it's triggered whenever our machine rejects some byte (which means we've hit either a forbidden instruction like <code>lgdt</code> or some undefined byte sequence… in both cases the <code>UNRECOGNIZED_INSTRUCTION</code> error is reported and processing is stopped).</p>

79	113

80 <p>There are three “special” instructions in the IA32 case: <code>naclcall</code>, <code>nacljmp</code> and <code title="mov %gs:0x0,%reg is part of public ABI, mov %gs:0x4,%reg is used in IRT">mov %gs:0x0/0x4,%reg</code>. The last one is declared as a “special” instruction to simplify the validation logic (and the DFA, too): instead of accepting all versions of the <code>mov %gs:<i>something</i>,%reg</code> instruction followed by additional logic which rejects most possibilities (only plain vanilla “zero” is allowed here as per ABI) we only describe this one version of the instruction and Ragel does the rest. <code>naclcall</code> and <code>nacljmp</code> include a special action which clears the “valid destination address” bit (remember the story with <code>begin</code> and <code>end</code> actions above? When the first byte of the second half of <code>naclcall</code>/<code>nacljmp</code> is processed it's processed <b>both</b> as part of the <code>naclcall</code>/<code>nacljmp</code> <b>and</b> as the start of a regular instruction, too).</p>	114 <p>There are three “special” instructions in the IA32 case: <code>naclcall</code>, <code>nacljmp</code> and <code title="mov %gs:0x0,%reg is part of public ABI, mov %gs:0x4,%reg is used in IRT">mov %gs:0x0/0x4,%reg</code>. The last one is declared as a “special” instruction to simplify the validation logic (and the DFA, too): instead of accepting all versions of the <code>mov %gs:<i>something</i>,%reg</code> instruction followed by additional logic which rejects most possibilities (only plain vanilla “zero” is allowed here as per ABI) we only describe this one version of the instruction and Ragel does the rest. <code>naclcall</code> and <code>nacljmp</code> include a special action which clears the “valid destination address” bit (remember the story with <code>begin</code> and <code>end</code> actions above? When the first byte of the second half of <code>naclcall</code>/<code>nacljmp</code> is processed it's processed <b>both</b> as part of the <code>naclcall</code>/<code>nacljmp</code> <b>and</b> as the start of a regular instruction, too).</p>

81	115

82 <p>This explains how the <code>valid_targets</code> array is filled and invalid instructions are rejected.</p>	116 <p>This explains how the <code>valid_targets</code> array is filled and invalid instructions are rejected.</p>

83 <h2><div style="float:right"><a href="#TOC">▲</a></div><a name="3">3. “Not so special” instructions.</a></h2>	117 <h2><div style="float:right"><a href="#TOC">▲</a></div><a name="4">4. “Not so special” instructions.</a></h2>

84 <p>But of course there is <code>jump_dests</code>, too. Special instructions don't touch it, but something obviously fills the array, doesn't it? This can only be the result of processing normal instructions, thus we need to go deeper. Where does it all come from? To understand that we need to look at the [autogenerated] <code>validator_x86_32_instruction.rl</code> file. The file looks like this:<hr />	118 <p>But of course there is <code>jump_dests</code>, too. Special instructions don't touch it, but something obviously fills the array, doesn't it? This can only be the result of processing normal instructions, thus we need to go deeper. Where does it all come from? To understand that we need to look at the [autogenerated] <code>validator_x86_32_instruction.rl</code> file. The file looks like this:<hr />

85     ⋮<br />	119     ⋮<br />

86   <i>Semi-manual simple helper machines and actions</i><br />	120   <i>Semi-manual simple helper machines and actions</i><br />

87     ⋮<br />	121     ⋮<br />

88   <code>one_instruction =</code><br />	122   <code>one_instruction =</code><br />

89       ⋮<br />	123       ⋮<br />

90     <code>(branch_hint? 0x77 rel8) |</code><br />	124     <code>(branch_hint? 0x77 rel8) |</code><br />

91     <code>(branch_hint? (0x0f 0x87) rel32) |</code><br />	125     <code>(branch_hint? (0x0f 0x87) rel32) |</code><br />

92       ⋮<br />	126       ⋮<br />

93     <code>((0x0f 0x01 0xd0) @CPUFeature_FXSR) </code>;<hr />	127     <code>((0x0f 0x01 0xd0) @CPUFeature_FXSR) </code>;<hr />

94 </p>	128 </p>

95 <code>0x77</code> and <code>0x0f 0x87</code> are opcodes for the <code>ja</code> (aka <code>jnbe</code>) instruction, but what are <code>branch_hint?</code> and <code>rel8</code>/<code>rel32</code> doing here? Well, “<code>?</code>” means “optional” (like in most <a href="http://en.wikipedia.org/wiki/Regular_expression">RE</a>-engines) and both <code>branch_hint</code> and <code>rel8</code>/<code>rel32</code> are references to machines defined in the <i>semi-manual simple helper machines and actions</i> part of the <code>validator_x86_32_instruction.rl</code> file. The whole construct describes the part of the DFA which is designed to accept the <code>ja</code> (aka <code>jnbe</code>) instruction, complete with the optional P4-inspired branch prediction prefix. The definition of <code>branch_hint</code> is trivial and obvious (“<code>branch_hint = 0x2e | 0x3e;</code>” if you want to know), but <code>rel8</code>/<code>rel32</code> are somewhat more “interesting”:<hr />	129 <code>0x77</code> and <code>0x0f 0x87</code> are opcodes for the <code>ja</code> (aka <code>jnbe</code>) instruction, but what are <code>branch_hint?</code> and <code>rel8</code>/<code>rel32</code> doing here? Well, “<code>?</code>” means “optional” (like in most <a href="http://en.wikipedia.org/wiki/Regular_expression">RE</a>-engines) and both <code>branch_hint</code> and <code>rel8</code>/<code>rel32</code> are references to machines defined in the <i>semi-manual simple helper machines and actions</i> part of the <code>validator_x86_32_instruction.rl</code> file. The whole construct describes the part of the DFA which is designed to accept the <code>ja</code> (aka <code>jnbe</code>) instruction, complete with the optional P4-inspired branch prediction prefix. The definition of <code>branch_hint</code> is trivial and obvious (“<code>branch_hint = 0x2e | 0x3e;</code>” if you want to know), but <code>rel8</code>/<code>rel32</code> are somewhat more “interesting”:<hr />

96     <code>rel8 = any @rel8_operand;</code><br />	130     <code>rel8 = any @rel8_operand;</code><br />

97     <code>rel32 = any{4} @rel32_operand;</code><hr />	131     <code>rel32 = any{4} @rel32_operand;</code><hr />

98 It's “more interesting” not because it's complex or non-obvious. The interesting part here is the fact that the actions <code>rel8_operand</code>/<code>rel32_operand</code> are <b>not</b> present in <code>validator_x86_32_instruction.rl</code>, they are in the <code>validator_x86_32.rl</code> file! But the definition itself is pretty trivial:<hr />	132 It's “more interesting” not because it's complex or non-obvious. The interesting part here is the fact that the actions <code>rel8_operand</code>/<code>rel32_operand</code> are <b>not</b> present in <code>validator_x86_32_instruction.rl</code>, they are in the <code>validator_x86_32.rl</code> file! But the definition itself is pretty trivial:<hr />

99   <code>action rel8_operand {</code><br />	133   <code>action rel8_operand {</code><br />

100     <code>int8_t offset = (uint8_t) (p[0]);</code><br />	134     <code>int8_t offset = (uint8_t) (p[0]);</code><br />

101     <code>size_t jump_dest = offset + (p - data) + 1;</code><br /><br />	135     <code>size_t jump_dest = offset + (p - data) + 1;</code><br /><br />

102     <code>if (!MarkJumpTarget(jump_dest, jump_dests, size)) {</code><br />	136     <code>if (!MarkJumpTarget(jump_dest, jump_dests, size)) {</code><br />

103       <code>errors_detected |= DIRECT_JUMP_OUT_OF_RANGE;</code><br />	137       <code>instruction_info_collected |= DIRECT_JUMP_OUT_OF_RANGE;</code><br />

104     <code>}</code><br />	138     <code>}</code><br />

105   <code>}</code><br />	139   <code>}</code><br />

106   <code>action rel32_operand {</code><br />	140   <code>action rel32_operand {</code><br />

107     <code>int32_t offset =</code><br />	141     <code>int32_t offset =</code><br />

108         <code>(p[-3] + 256U * (p[-2] + 256U * (p[-1] + 256U * ((uint32_t) p[0]))));</code><br />	142         <code>(p[-3] + 256U * (p[-2] + 256U * (p[-1] + 256U * ((uint32_t) p[0]))));</code><br />

109     <code>size_t jump_dest = offset + (p - data) + 1;</code><br /><br />	143     <code>size_t jump_dest = offset + (p - data) + 1;</code><br /><br />

110     <code>if (!MarkJumpTarget(jump_dest, jump_dests, size)) {</code><br />	144     <code>if (!MarkJumpTarget(jump_dest, jump_dests, size)) {</code><br />

111       <code>errors_detected |= DIRECT_JUMP_OUT_OF_RANGE;</code><br />	145       <code>instruction_info_collected |= DIRECT_JUMP_OUT_OF_RANGE;</code><br />

112     <code>}</code><br />	146     <code>}</code><br />

113   <code>}</code><hr />	147   <code>}</code><hr />

114 We just check if the jump target passes the preliminary check (a direct jump to the outside of the region is always invalid) and if that's not so then we detect the error <code>DIRECT_JUMP_OUT_OF_RANGE</code>.</p>	148 We just check if the jump target passes the preliminary check (a direct jump to the outside of the region is always invalid) and if that's not so then we detect the error <code>DIRECT_JUMP_OUT_OF_RANGE</code>.</p>
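To make the arithmetic in these two actions concrete, here is a standalone C sketch of the <code>rel32</code> case. It is an assumption-laden illustration, not the patch's code: `DecodeRel32`, `JumpDest` and `MarkJumpTargetSketch` are hypothetical names, and the real `MarkJumpTarget` may treat boundary cases differently from the simple `[0, size)` range check assumed here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode a little-endian 32-bit displacement whose last byte is at p[0],
 * mirroring the p[-3]..p[0] indexing used in the rel32_operand action.
 * The unsigned-to-signed conversion is implementation-defined in C, but
 * yields the two's complement value on mainstream compilers. */
static int32_t DecodeRel32(const uint8_t *p) {
  uint32_t raw =
      p[-3] + 256U * (p[-2] + 256U * (p[-1] + 256U * (uint32_t)p[0]));
  return (int32_t)raw;
}

/* Destination = offset of the byte after the instruction + displacement.
 * A negative result wraps to a huge size_t and fails the range check. */
static size_t JumpDest(const uint8_t *data, const uint8_t *p, int32_t offset) {
  return (size_t)((ptrdiff_t)(p - data) + 1 + offset);
}

/* Hypothetical stand-in for MarkJumpTarget: record the destination in the
 * bitmap and return 1, or return 0 when it falls outside [0, size). */
static int MarkJumpTargetSketch(size_t jump_dest, uint8_t *jump_dests,
                                size_t size) {
  if (jump_dest >= size) return 0;
  jump_dests[jump_dest / 8] |= (uint8_t)(1u << (jump_dest % 8));
  return 1;
}
```

For example, the six bytes `0f 87 fa ff ff ff` at the start of a region encode `ja` with displacement -6, so the destination is offset 0 of the same region and the sketch accepts it.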

115	149

116 <h2><div style="float:right"><a href="#TOC">▲</a></div><a name="4">4. Features beyond minimal validation.</a></h2>	150 <h2><div style="float:right"><a href="#TOC">▲</a></div><a name="5">5. Features beyond minimal validation.</a></h2>

117 <p style="margin-bottom:0px;">This covers most of the functionality of the validator (we'll discuss the generation of the <code>validator_x86_32_instruction.rl</code> file later), but there are still some details not covered here:</p>	151 <p style="margin-bottom:0px;">This covers most of the functionality of the validator (we'll discuss the generation of the <code>validator_x86_32_instruction.rl</code> file later), but there are still some details not covered here:</p>

118 <ol style="margin-top:0px;">	152 <ol style="margin-top:0px;">

119 <li><a href="#4-1"><code>CPUID</code> support.</a></li>	153 <li><a href="#5-1"><code>CPUID</code> support.</a></li>

120 <li><a href="#4-2">Dynamic code creation support.</a></li>	154 <li><a href="#5-2">Dynamic code modification support.</a></li>

121 <li><a href="#4-3">Dynamic code modification support.</a></li>	155 <ol>

	156 <li><a href="#5-2-1">Replacement validation.</a></li>

	157 <li><a href="#5-2-2">Replacement copying.</a></li>

	158 </ol>

122 </ol>	159 </ol>

123	160

124 <h3><div style="float:right"><a href="#TOC">▲</a></div><a name="4-1">4.1. <code>CPUID</code> support.</a></h3>	161

	162 <h3><div style="float:right"><a href="#TOC">▲</a></div><a name="5-1">5.1. <code>CPUID</code> support.</a></h3>

125	163

126 <p><code>CPUID</code> support is implemented using a large set of actions embedded in the definitions of instructions (see, e.g. <code>@CPUFeature_FXSR</code> in the line for instruction <code>0x0f 0x01 0xd0</code> AKA <code>xgetbv</code>). CPUID-related actions are triggered when we know the identity of the instruction (which happens at different times for different instructions: some instructions are detected when the opcode is read, some use an <i>opcode extension</i>, etc.; the AMD/Intel manuals contain all the gory details), but the definitions of said actions in <code>validator_x86_32_instruction.rl</code> are very simple:<hr />	164 <p><code>CPUID</code> support is implemented using a large set of actions embedded in the definitions of instructions (see, e.g. <code>@CPUFeature_FXSR</code> in the line for instruction <code>0x0f 0x01 0xd0</code> AKA <code>xgetbv</code>). CPUID-related actions are triggered when we know the identity of the instruction (which happens at different times for different instructions: some instructions are detected when the opcode is read, some use an <i>opcode extension</i>, etc.; the AMD/Intel manuals contain all the gory details), but the definitions of said actions in <code>validator_x86_32_instruction.rl</code> are very simple:<hr />

127   <code>action CPUFeature_FXSR {</code><br />	165   <code>action CPUFeature_FXSR {</code><br />

128     <code>SET_CPU_FEATURE(CPUFeature_FXSR);</code><br />	166     <code>SET_CPU_FEATURE(CPUFeature_FXSR);</code><br />

129   <code>}</code><hr />	167   <code>}</code><hr />

130 This time the magic is in <code>validator_internal.h</code>. <code>SET_CPU_FEATURE</code> is defined as<hr />	168 This time the magic is in <code>validator_internal.h</code>. <code>SET_CPU_FEATURE</code> is defined as<hr />

131 <code>#define SET_CPU_FEATURE(F) \</code><br />	169   <code>if (!(F##_Allowed)) { \</code><br />

132   <code>if (!(F)) { \</code><br />	170     <code>instruction_info_collected |= UNRECOGNIZED_INSTRUCTION; \</code><br />

133     <code>errors_detected |= CPUID_UNSUPPORTED_INSTRUCTION; \</code><br />	171   <code>} \</code><br />

	172   <code>if (!(F)) { \</code><br />

	173     <code>instruction_info_collected |= CPUID_UNSUPPORTED_INSTRUCTION; \</code><br />

134   <code>}</code><hr />	174   <code>}</code><hr />

135 IOW: it's pretty straightforward and simple, but there is a twist: <code>CPUFeature_FXSR</code> is not the name of a variable, but the name of a macro definition. This is needed to handle special cases where <code>CPUFeature</code> does not correspond to a single <code>CPUID</code> bit. E.g. the <code>prefetch</code> instruction is available when <b>any one</b> of three bits is set: the <code>3DNow!</code> bit, the dedicated <code>Prefetch instruction</code> bit or the <code>LongMode</code> bit. On the other hand <code>vaesenc</code> is available when <b>both</b> the <code>AES</code> and <code>AVX</code> bits are set. And our ABI <a href="http://code.google.com/p/nativeclient/issues/detail?id=2869">permits <code>lzcnt</code> and <code>tzcnt</code> unconditionally</a> (thus <code>CPUFeature_LZCNT</code> does not check for anything but just returns <code>TRUE</code> in all cases).	175 IOW: it's pretty straightforward and simple, but there is a twist: <code>CPUFeature_FXSR</code> is not the name of a variable, but the name of a macro definition. This is needed to handle special cases where <code>CPUFeature</code> does not correspond to a single <code>CPUID</code> bit. E.g. the <code>prefetch</code> instruction is available when <b>any one</b> of two bits is set: <span title="AMD documentation also claims it's always available if LongMode bit is set but Intel documentation does not support this assertion.">the <code>3DNow!</code> bit or the dedicated <code>Prefetch instruction</code> bit</span>. On the other hand <code>vaesenc</code> is available when <b>both</b> the <code>AES</code> and <code>AVX</code> bits are set. And our ABI <a href="http://code.google.com/p/nativeclient/issues/detail?id=2869">permits <code>lzcnt</code> and <code>tzcnt</code> unconditionally</a> (thus <code>CPUFeature_LZCNT</code> does not check for anything but just returns <code>TRUE</code> in all cases).</p>

136 </p>

137	176

138 <h3><div style="float:right"><a href="#TOC">▲</a></div><a name="4-2">4.2. Dynamic code creation support.</a></h3>	177 <p>Note: there are two CPUID masks: a hardcoded one (it can be replaced if you link a different definition of the <code>validator_cpuid_features</code> global variable into your program) and a runtime-supplied one (usually obtained from an actual <code>CPUID</code> call in production, but hardcoded in tests). New instructions are first added in “production disabled” mode and must pass a security review before they can be used in Chrome.</p>
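Reviewer note: the feature macros and the two masks described in 5.1 can be illustrated with a minimal C sketch. Everything prefixed with `Ex`/`EX_`/`ex_` is hypothetical (flag values, struct fields, helper functions); only the shape of the macro follows the `SET_CPU_FEATURE` definition shown in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag values; the real ones live in the validator headers. */
#define EX_UNRECOGNIZED_INSTRUCTION      0x1u
#define EX_CPUID_UNSUPPORTED_INSTRUCTION 0x2u

/* Hypothetical feature bits; field names are illustrative only. */
struct ExCpuFeatures {
  bool amd_3dnow, prefetch, aes, avx, fxsr;
};

/* The two masks: what this build permits, and what the CPU reports. */
static struct ExCpuFeatures ex_allowed;   /* "hardcoded" mask */
static struct ExCpuFeatures ex_detected;  /* "runtime-supplied" mask */

/* Each feature is a macro (an expression), not a variable, so it can
 * combine several bits or ignore them altogether. */
#define ExFeature_FXSR             (ex_detected.fxsr)
#define ExFeature_FXSR_Allowed     (ex_allowed.fxsr)
#define ExFeature_PREFETCH         (ex_detected.amd_3dnow || ex_detected.prefetch)
#define ExFeature_PREFETCH_Allowed (ex_allowed.amd_3dnow || ex_allowed.prefetch)
#define ExFeature_AESAVX           (ex_detected.aes && ex_detected.avx)
#define ExFeature_AESAVX_Allowed   (ex_allowed.aes && ex_allowed.avx)
#define ExFeature_LZCNT            true
#define ExFeature_LZCNT_Allowed    true

/* Same shape as SET_CPU_FEATURE: F##_Allowed pastes the macro *name*,
 * while (F) expands the feature expression itself. */
#define EX_SET_CPU_FEATURE(F, info)                               \
  do {                                                            \
    if (!(F##_Allowed)) (info) |= EX_UNRECOGNIZED_INSTRUCTION;    \
    if (!(F)) (info) |= EX_CPUID_UNSUPPORTED_INSTRUCTION;         \
  } while (0)

static uint32_t ex_check_fxsr(void) {
  uint32_t info = 0;
  EX_SET_CPU_FEATURE(ExFeature_FXSR, info);
  return info;
}

static uint32_t ex_check_prefetch(void) {
  uint32_t info = 0;
  EX_SET_CPU_FEATURE(ExFeature_PREFETCH, info);
  return info;
}

static uint32_t ex_check_vaesenc(void) {
  uint32_t info = 0;
  EX_SET_CPU_FEATURE(ExFeature_AESAVX, info);
  return info;
}

static uint32_t ex_check_lzcnt(void) {
  uint32_t info = 0;
  EX_SET_CPU_FEATURE(ExFeature_LZCNT, info);
  return info;
}
```

Because the argument of `##` is not macro-expanded before pasting, a single macro argument serves both masks; that is presumably why the features have to be macros rather than variables.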

139	178

140 <p>TBD</p>	179 <h3><div style="float:right"><a href="#TOC">▲</a></div><a name="5-2">5.2. Dynami c code modification support.</a></h3>

141	180

142 <h3><div style="float:right"><a href="#TOC">▲</a></div><a name="4-3">4.3. Dynamic code modification support.</a></h3>	181 <p>Dynamic code modification support is implemented with the help of the <code>CALL_USER_CALLBACK_ON_EACH_INSTRUCTION</code> option. Normally the user callback is only called when some kind of error is detected, but if this option is set then the callback is called after <b>each</b> instruction. When that happens the callback has all the information needed to process the instruction: collected errors, information about immediates, etc.</p>

143	182

144 <p>TBD</p>	183 <p>All that information is squeezed in <code>instruction_info_collected</code> v ariable. <span title="Note that half of the information does not make sense for ia32 mode and is not collected by ValidateChunkIA32. It's included for completen ess.">It has the following format</code>:</p>

145	184

146 <h2><div style="float:right"><a href="#TOC">▲</a></div><a name="5">5. Validation for x86-64 mode.</a></h2>	185 <table width="100%"><tr><td align="left">31</td><td align="left">30</td><td alig n="left">29</td><td align="left">28</td><td align="left">27</td><td align="left" >26</td><td align="left">25</td><td align="left">24</td><td align="left">23</td> <td align="left">22</td><td align="left">21</td><td align="left">20</td><td alig n="left">19</td><td align="left">18</td><td align="left">17</td><td align="left" >16</td><td align="left">15</td><td align="left">14</td><td align="left">13</td> <td align="left">12</td><td align="right">8</td><td align="left">7</td><td align ="left">6</td><td align="right">5</td><td align="left">4</td><td align="left">3< /td><td align="right">0</td></tr>

	186 <tr><td align="left"> </td><td align="left"> </td><td align="left">&nb sp;</td><td align="left"> </td><td align="left"> </td><td align="left" > </td><td colspan="12" align="left" style="border: thin solid black;"><tab le width="100%"><tr><td align="left">⇤</td><td align="center"><code>VALIDATION_E RRORS_MASK</code></td><td align="right">⇥</td></table></td><td align="left">&nbs p;</td><td colspan="2" align="left" width="1%" style="border: thin solid black; background:lightgray;"><table width="100%"><tr><td align="left">⇤</td><td width= "1%" align="center"><code>RESTRICTED_REGISTER_MASK</code></td><td align="right"> ⇥</td></table></td><td align="left"> </td><td colspan="2" align="left" widt h="1%" style="border: thin solid black;"><table width="100%"><tr><td align="left ">⇤</td><td width="1%" align="center"><code>RESTRICTED_REGISTER_MASK</code></td> <td align="right">⇥</td></table></td><td align="left"> </td><td colspan="2" align="left" width="1%" style="border: thin solid black;"><table width="100%">< tr><td align="left">⇤</td><td width="1%" align="center"><code>IMMEDIATES_SIZE_MA SK</code></td><td align="right">⇥</td></table></td></tr>

	187 <tr><td style="border: thin solid black; background: gray;" width="1%" align="ce nter"> 0 </td><td style="border: thin solid black;" width="1%" align=" center">   </td><td style="border: thin solid black;" width="1%" align="center">   </td><td width="1%" style="border: thin solid b lack;" align="center">   </td><td style="border: thin solid black ; background: lightgray;" width="1%" align="center">   </td><td s tyle="border: thin solid black;" width="1%" align="center">   </t d><td style="border: thin solid black; background: lightgray;" width="1%" align= "center">   </td><td style="border: thin solid black; background: lightgray;" width="1%" align="center">   </td><td style="border: thin solid black; background: lightgray;" width="1%" align="center"> &nbsp ; </td><td style="border: thin solid black; background: lightgray;" width=" 1%" align="center">   </td><td style="border: thin solid black; b ackground: lightgray;" width="1%" align="center">   </td><td styl e="border: thin solid black; background: lightgray;" width="1%" align="center">& nbsp;  </td><td style="border: thin solid black; background: lightgray ;" width="1%" align="center">   </td><td style="border: thin soli d black; background: lightgray;" width="1%" align="center">   </t d><td style="border: thin solid black; background: lightgray;" width="1%" align= "center">   </td><td style="border: thin solid black;" width="1%" align="center">   </td><td style="border: thin solid black;" wid th="1%" align="center">   </td><td style="border: thin solid blac k;" width="1%" align="center">   </td><td style="border: thin sol id black; background: lightgray;" width="1%" align="center">   </ td><td colspan="2" style="border: thin solid black; background: lightgray;" alig n="center">   </td><td style="border: thin solid black;" width="1 %" align="center">   </td><td colspan="2" style="border: thin sol id black;" align="center">   </td><td style="border: thin solid b lack;" width="1%" align="center">   </td><td 
colspan="2" style="b order: thin solid black;" align="center">   </td><td>    </td></tr>

	188 <tr><td align="left">↑</td><td align="left">↑</td><td align="left">↑</td><td ali gn="left">↑</td><td align="left">↑</td><td align="left">↑</td><td align="left">↑ </td><td align="left">↑</td><td align="left">↑</td><td align="left">↑</td><td al ign="left">↑</td><td align="left">↑</td><td align="left">↑</td><td align="left"> ↑</td><td align="left">↑</td><td align="left">↑</td><td align="left">↑</td><td a lign="left">↑</td><td align="left">↑</td><td align="left">↑</td><td align="left" > </td><td align="left">↑</td><td align="left">↑</td><td align="left">&nbsp ;</td><td align="left">↑</td><td align="left">↑</td><td align="left"> </td> <td align="left"> </td></tr>

	189 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">&nbsp;</td><td align="left">┊</td><td align="left">┊</td><td align="left">&nbsp;</td><td align="left">┊</td><td align="left" colspan="100" >└ Cumulative size of <i title="Immediates, displacements, relative offsets.">anyfields</i>.</td></tr>

	190 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">&nbsp;</td><td align="left">┊</td><td align="left">┊</td><td align="left">&nbsp;</td><td align="left" colspan="100" >└ <span title="enter, extrq, insertq">Instruction has two immediates</span>.</td></tr>

	191 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td ali gn="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊ </td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td al ign="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left"> ┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td a lign="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left" > </td><td align="left">┊</td><td align="left" colspan="100" >└ <span title="00 == 0 bytes, 01 == 1 bytes, 10 = 2 bytes, 11 = 4 bytes">Instruction dis placement size</span>.</td></tr>

	192 <!--<tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="lef t">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><t d align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="le ft">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td>< td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="l eft"> </td><td align="left">┊</td><td align="left" colspan="100" >└ <s pan title="Top half of a last byte of an instruction is fourth register operand, two remaining bytes are reserved.">Instruction has 2bit immediate operation.</s pan></td></tr>-->

	193 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td ali gn="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊ </td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td al ign="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left"> ┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td a lign="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left" > </td><td align="left" colspan="100" >└ Instruction has relative offs et.</td></tr>

	194 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td ali gn="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊ </td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td al ign="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left"> ┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td a lign="left">┊</td><td align="left">┊</td><td align="left" colspan="100" >└  <span style="background: lightgray;">ia32 mode: reserved;</span> amd64 mode: <sp an title="NO_REG if instruction does not zero-extending one">register, zero-exte nded by the instruction.</span></td></tr>

	195 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td ali gn="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊ </td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td al ign="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left"> ┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td a lign="left">┊</td><td align="left" colspan="100" >└ <span style="background : lightgray;">ia32 mode: reserved;</span> amd64 mode: <span title="This means th at start of this instruction is not a valid jump target.">instruction is valid, but it accesses memory using register which is zero-extended by previous instruc tion.</span></td></tr>

	196 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td ali gn="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊ </td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td al ign="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left"> ┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td a lign="left" colspan="100" >└ <span title="Note that all unsupported instruc tions trigger this error. This includes mov by absolute 64bit address, system in structions like lidt or even call and jmp used not as part of superinstruction. If combined with CPUID_UNSUPPORTED_INSTRUCTION it means that instruction is not yet enabled in validator.">DFA error: invalid instruction. Validation then resum es from the next bundle.</span></td></tr>

	197 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td ali gn="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊ </td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td al ign="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left"> ┊</td><td align="left">┊</td><td align="left">┊</td><td align="left" colspan="10 0" >└ Unaligned direct jump to address outside of given region.</td></tr>

	198 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td ali gn="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊ </td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td al ign="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left"> ┊</td><td align="left">┊</td><td align="left" colspan="100" >└ Instruction is not supported for a given CPUID mask.</td></tr>

	199 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td ali gn="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊ </td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td al ign="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left"> ┊</td><td align="left" colspan="100" >└ <span style="background: lightgray; ">ia32 mode: reserved;</span> amd64 mode: base register is not <code>%rbp</code> , <code>%rsp</code>, or <code>%r15</code>.</td></tr>

	200 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td ali gn="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊ </td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td al ign="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left" colspan="100" >└ <span style="background: lightgray;">ia32 mode: reserved;< /span> amd64 mode: index register is not zero-extended by previous instruction.< /td></tr>

	201 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td ali gn="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊ </td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td al ign="left">┊</td><td align="left">┊</td><td align="left" colspan="100" >└ < span style="background: lightgray;">ia32 mode: reserved;</span> amd64 mode: inst ruction which zero-extends <code>%rbp</code> must be followed by <code>add %r15, %rbp</code>, <code>lea (%rbp,%r15,1),%rbp</code>, or <code>lea 0x0(%rbp,%r15,1), %rbp</code>.</td></tr>

	202 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td ali gn="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊ </td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td al ign="left">┊</td><td align="left" colspan="100" >└ <span style="background: lightgray;">ia32 mode: reserved;</span> amd64 mode: <code>add %r15,%rbp</code>, <code>lea (%rbp,%r15,1),%rbp</code>, or <code>lea 0x0(%rbp,%r15,1),%rbp</code> is used after instruction which does not zero-extend <code>%rbp</code>.</td></tr >

	203 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td ali gn="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊ </td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td al ign="left" colspan="100" >└ <span style="background: lightgray;">ia32 mode: reserved;</span> amd64 mode: instruction which zero-extends <code>%rsp</code> m ust be followed by <code>add %r15,%rsp</code> or <code>lea (%rsp,%r15,1),%rsp</c ode>.</td></tr>

	204 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td ali gn="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊ </td><td align="left">┊</td><td align="left">┊</td><td align="left" colspan="100 " >└ <span style="background: lightgray;">ia32 mode: reserved;</span> amd64 mode: <code>add %r15,%rsp</code> or <code>lea (%rsp,%r15,1),%rsp</code> is used after instruction which does not zero-extend <code>%rsp</code>.</td></tr>

	205 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td ali gn="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊ </td><td align="left">┊</td><td align="left" colspan="100" >└ <span style=" background: lightgray;">ia32 mode: reserved;</span> amd64 mode: <code>%r15b</cod e>, <code>%r15w</code>, <code>%r15d</code>, or <code>%r15</code> is modified. <c ode>%r15</code> is untouchable in amd64 mode.</td></tr>

	206 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td ali gn="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊ </td><td align="left" colspan="100" >└ <span style="background: lightgray;" >ia32 mode: reserved;</span> amd64 mode: <span title="Note that %ebp is not ment ioned. It can be modified by a regular instruction. But NEXT instruction must be special if that happened."><code>%bpl</code>, <code>%bp</code>, or <code>%rbp</ code> is incorrectly modified. Only <code>%rbp</code> can be modified and then o nly by special instruction.</span></td></tr>

	207 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td ali gn="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left" c olspan="100" >└ <span style="background: lightgray;">ia32 mode: reserved;</ span> amd64 mode: <span title="Note that %esp is not mentioned. It can be modifi ed by a regular instruction. But NEXT instruction must be special if that happen ed."><code>%spl</code>, <code>%sp</code>, or <code>%rsp</code> is incorrectly mo dified. Only <code>%rsp</code> can be modified and then only by special instruct ion.</span></td></tr>

	208 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left" colspan="100">└ Bad <code>call</code> alignment: <code>call</code> must end at the end of the bundle, since <code>nacljmp</code> can only jump to an aligned address.</td></tr>

	209 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td ali gn="left">┊</td><td align="left" colspan="100">└ <span style="background: l ightgray;">ia32 mode: reserved;</span> amd64 mode: <span title="Note: in ia32 mo de all non-special instructions are modifiable.">instruction is modifiable.</spa n></td></tr>

	210 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td ali gn="left" colspan="100">└ Special instruction (uses different validation ru les from the regular instruction). Can not be changed in ia32bit mode.</td></tr>

	211 <tr><td align="left">┊</td><td align="left">┊</td><td align="left" colspan="100" >└ Last byte is not immediate. It's either <span title="3DNow! instructions .">opcode</span>, <span title="Some AVX, FMA4, XOP instructions.">register numbe r</span> or <span title="vpermil2pd and vpermil2ps">register number and two-bit immediate</span>.</td></tr>

	212 <tr><td align="left">┊</td><td align="left" colspan="100">└ Invalid jump ta rget. When this flag is set <code>instruction_start</code> and <code>instructio n_end</code> both point to the <b>jump target</b> instruction, not to the <b>jum p</b> instruction itself.</td></tr>

	213 <tr><td align="left" colspan="100">└ Reserved.</td></tr>

	214 </table>
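Reviewer note: a callback would typically unpack this word with masks and bit tests. The sketch below shows that pattern only; the `EX_` constants and their bit positions are hypothetical stand-ins and do not claim to match the real `IMMEDIATES_SIZE_MASK`, `VALIDATION_ERRORS_MASK` or `SPECIAL_INSTRUCTION` values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit assignments, chosen only for this illustration. */
#define EX_ANYFIELDS_SIZE_MASK           0x0000000fu
#define EX_HAS_RELATIVE_OFFSET           0x00000080u
#define EX_UNRECOGNIZED_INSTRUCTION      0x00004000u
#define EX_DIRECT_JUMP_OUT_OF_RANGE      0x00008000u
#define EX_CPUID_UNSUPPORTED_INSTRUCTION 0x00010000u
#define EX_SPECIAL_INSTRUCTION           0x10000000u
#define EX_ERRORS_MASK                                            \
  (EX_UNRECOGNIZED_INSTRUCTION | EX_DIRECT_JUMP_OUT_OF_RANGE |    \
   EX_CPUID_UNSUPPORTED_INSTRUCTION)

/* Unpacked view of one info word. */
struct ExDecodedInfo {
  unsigned anyfields_size;   /* cumulative size of the anyfields, in bytes */
  bool has_relative_offset;  /* instruction carries a relative offset */
  bool is_special;           /* validated under the special rules */
  bool has_errors;           /* at least one error bit is set */
};

static struct ExDecodedInfo ex_decode_info(uint32_t info) {
  struct ExDecodedInfo decoded;
  decoded.anyfields_size = info & EX_ANYFIELDS_SIZE_MASK;
  decoded.has_relative_offset = (info & EX_HAS_RELATIVE_OFFSET) != 0;
  decoded.is_special = (info & EX_SPECIAL_INSTRUCTION) != 0;
  decoded.has_errors = (info & EX_ERRORS_MASK) != 0;
  return decoded;
}
```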

	215

	216 <p>Using this information you can determine if the given instruction follows <span title="Only “naclcall” and “nacljmp” in ia32 mode.">special rules</span>, if it includes <span title="Commands like jcc, jmp, loopcc, or call.">relative offsets</span>, <span title="Most commands which access memory support displacements.">displacements</span>, or <span title="Immediates are supported by many different commands. They can be combined with displacement if command accesses memory.">immediates</span>. Tests may use the information collected to precisely separate different <i title="Immediates, displacements, relative offsets.">anyfields</i>, but in production only a few bits are used to determine if the instruction can be changed or not: in ia32 mode only <span title="naclcall and nacljmp">special instructions</span> can not be changed, while in amd64 mode the situation is the opposite: <span title="Only “call” and “mov” can be changed.">most instructions can not be changed</span>.</p>

	217

	218 <h4><div style="float:right"><a href="#TOC">▲</a></div><a name="5-2-1">5.2.1. Re placement validation.</a></h4>

	219

	220 <p>As was said <a href="#5-2">above</a>, code replacement is not supported by the <code>ValidateChunkIA32</code> function directly. Instead it's done by a higher-level function in <code>dfa_validate_32.c</code>.</p>

	221

	222 <p>It uses the <code>CALL_USER_CALLBACK_ON_EACH_INSTRUCTION</code> option to compare the lengths of instructions in the two fragments in a callback, and the <code>SPECIAL_INSTRUCTION</code> flag passed to the callback to make sure special instructions are left unchanged.</p>

	223

	224 <p>One tricky thing there is the handling of relative jumps and calls: if a relative jump (or call) triggers <code>DIRECT_JUMP_OUT_OF_RANGE</code> <b>but</b> is bit-for-bit identical to the original instruction it's accepted anyway: this means that this particular <code>jump</code> (or <code>call</code>) jumps to (or calls) some valid position outside of the given range. If it must be changed then you need to pass a bigger region to the <code>ValidatorCodeReplacement_x86_32</code> function: <span title="This is, of course, not needed if landing point is bundle-aligned.">this way the validator will have a chance to check the landing place for validity</span>.</p>
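Reviewer note: the per-instruction decision described in 5.2.1 can be sketched as follows. This is an illustration under stated assumptions, not the code in `dfa_validate_32.c`: the function name, its signature and the `EX_` flag values are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical flag values for this sketch. */
#define EX_SPECIAL_INSTRUCTION      0x1u
#define EX_DIRECT_JUMP_OUT_OF_RANGE 0x2u
#define EX_OTHER_ERRORS             0x4u  /* stands for any other error bit */

/* Decide whether one instruction of the new fragment may replace the
 * instruction at the same offset in the old fragment. */
static bool ex_replacement_ok(const uint8_t *old_insn, size_t old_length,
                              const uint8_t *new_insn, size_t new_length,
                              uint32_t new_info) {
  bool identical;

  /* Instruction boundaries must line up in both fragments. */
  if (old_length != new_length) return false;
  identical = memcmp(old_insn, new_insn, new_length) == 0;

  /* Any other validation error rejects the replacement outright. */
  if (new_info & EX_OTHER_ERRORS) return false;
  /* Special instructions must stay exactly as they were. */
  if ((new_info & EX_SPECIAL_INSTRUCTION) && !identical) return false;
  /* An out-of-range jump is tolerated only if it was not touched. */
  if ((new_info & EX_DIRECT_JUMP_OUT_OF_RANGE) && !identical) return false;
  return true;
}
```

The last check is the tricky case from the paragraph above: an untouched jump that leaves the region was already validated when the code was first loaded, so it is safe to keep.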

	225

	226 <h4><div style="float:right"><a href="#TOC">▲</a></div><a name="5-2-2">5.2.2. Re placement copying.</a></h4>

	227

	228 <p>As was said <a href="#5-2">above</a>, code replacement is not supported by the <code>ValidateChunkIA32</code> function directly. Instead it's done by a higher-level function in <code>dfa_validate_32.c</code>.</p>

	229

	230 <p>This is done by a very simple function which uses the <code>CALL_USER_CALLBACK_ON_EACH_INSTRUCTION</code> option to process instructions one after another.</p>
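Reviewer note: a per-instruction copy driven by such a callback might look like the sketch below. The callback signature, the `ExCopyState` struct and the convention that `end` points one past the last byte are assumptions made for the example. The sketch also ignores how the bytes of live code must be written so that other threads never see a half-updated instruction.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical state handed to the callback. */
struct ExCopyState {
  const uint8_t *src_base;     /* start of the validated new code */
  uint8_t *dst_base;           /* start of the code being patched */
  size_t copied_instructions;  /* how many instructions were rewritten */
};

/* Hypothetical callback, invoked once per instruction of the new code.
 * [begin, end) delimits the instruction inside the new code. */
static bool ex_copy_instruction(const uint8_t *begin, const uint8_t *end,
                                uint32_t info, void *data) {
  struct ExCopyState *state = data;
  size_t offset = (size_t)(begin - state->src_base);
  size_t length = (size_t)(end - begin);
  (void)info;  /* validation already happened; the flags are not needed */

  /* Touch the destination only where the instruction really differs. */
  if (memcmp(state->dst_base + offset, begin, length) != 0) {
    memcpy(state->dst_base + offset, begin, length);
    state->copied_instructions++;
  }
  return true;
}
```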

	231

	232 <h2><div style="float:right"><a href="#TOC">▲</a></div><a name="6">6. Validation for x86-64 mode.</a></h2>

147	233

148 <p>While validator for ia32 mode is very simple and short (it also produces pret ty compact code) validator for x86-64 mode is different. It still has all the sa me properties validator for ia32 mode had (<code>valid_targets</code> and <code> jump_dests</code> arrays, “normal” and “special” instructions, bundles and <code >rel8_operand</code>/<code>rel32_operand</code> actions), but it adds quite a fe w additional twists to the whole scheme.</p>	234 <p>While validator for ia32 mode is very simple and short (it also produces pret ty compact code) validator for x86-64 mode is different. It still has all the sa me properties validator for ia32 mode had (<code>valid_targets</code> and <code> jump_dests</code> arrays, “normal” and “special” instructions, bundles and <code >rel8_operand</code>/<code>rel32_operand</code> actions), but it adds quite a fe w additional twists to the whole scheme.</p>

149	235

150 <h3><div style="float:right"><a href="#TOC">▲</a></div><a name="5-1">5.1. “Secon dary” states.</a></h3>	236 <p>It's created in a process which is similar to the process which creates the i a32 validator.</p>

151	237

152 <p>First of all: ia32 mode validator had one DFA in it and two arrays which kept track of the instruction boundaries but x86-64 has few more state variables. Mo st of them (<code>rex_prefix</code>, <code>vex_prefix2</code>, <code>vex_prefix3 </code>, <code>operand_states</code>, <code>base</code>, and <code>index</code>) keep track of the instruction parts (and thus they are cleared before each inst ruction), but one variable called <code>restricted_register</code> is used to ti e different instructions together. It keeps track of the <code>restricted_regist er</code> (if any). Note that not all restricted registers are born equal: most registers can be restricted and then forgotten (if you write to <code>%eax</code > and do nothing with the value before <code>call</code>), but <code>%esp</code> and <code>%ebp</code> are exceptions. If you write to the <code>%esp</code> the n the very next instruction must be <code>add %r15,%rbp</code> or <code>lea (%r1 5,%rbp,1),%rbp</code>. This means that if at the end of a bundle restricted regi ster is <code>%rsp</code> or <code>%rbp</code> then program is inavlid. For the same reason if then at beginning of a normal instruction (this includes first in struction in the “compound”) we see restricted <code>%rsp</code> or restricted < code>%rbp</code> then it's an error, too. On the other hand few rare special ins tructions which are used to restore the SFI invariant WRT <code>%rsp</code> or < code>%rbp</code> will only be accepted if restricted register is <code>%rsp</cod e> xor <code>%rbp</code> (depending on special instruction).</p>	238 <center><img src="files64.svg" height="90%"/><br />Gray elements are hand-writte n, white elements are generated and dark-gray are mixers.</center><br />

153	239

154 <h3><div style="float:right"><a href="#TOC">▲</a></div><a name="5-2">5.2. “Norma l” instructions.</a></h3>	240 <h3><div style="float:right"><a href="#TOC">▲</a></div><a name="6-1">6.1. “Secon dary” states.</a></h3>

155	241

156 <p>The hard part is, as before, in the DFA. First of all, main machine is similar to what we had in ia32 mode, but subtly different: it's “<code>(normal_instruction | special_instruction)*</code>” now. I.e.: <code>one_instruction</code> is replaced with <code>normal_instruction</code>. And what is <code>normal_instruction</code>? Why, it's “<code>one_instruction - special_instruction</code>”, of course! Well… this is unexpected: why will we want to remove <code>special_instruction</code>s from <code>normal_instruction</code>s only to add them back? The answer is related to actions: recall how <a href="#actions">actions</a> work. When we remove <code>special_instruction</code> from <code>one_instruction</code> we also remove the associated actions. This important in x86-64 case because some special instructions are just a normal instructions which are permitted to violate the usual rules! E.g. “special” instruction <code>and $~0x1f,%rsp</code> (which is used to align the stack pointer) changes the <code>%rsp</code> directly which is usually forbidden, but because of properties of <code>and $xxx,…</code> (for any <code>$xxx</code> < <code>0</code>) we know that invariants will not be violated.</p>	242 <p>First of all: the ia32 mode validator had one DFA in it and two arrays which kept track of the instruction boundaries, but x86-64 has a few more state variables. Most of them (<code>rex_prefix</code>, <code>vex_prefix2</code>, <code>vex_prefix3</code>, <code>operand_states</code>, <code>base</code>, and <code>index</code>) keep track of the instruction parts (and thus they are cleared before each instruction), but one variable called <code>restricted_register</code> is used to tie different instructions together. As the name implies it keeps track of the restricted register (if any). Note that not all restricted registers are born equal: most registers can be restricted and then forgotten (if you write to <code>%eax</code> and do nothing with the value before <code>call</code>), but <code>%esp</code> and <code>%ebp</code> are exceptions. If you write to <code>%esp</code> then the very next instruction must be <code>add %r15,%rsp</code> or <code>lea (%rsp,%r15,1),%rsp</code>; <code>%rbp</code> has similar requirements. This means that if at the end of a bundle the restricted register is <code>%rsp</code> or <code>%rbp</code> then the program is invalid. For the same reason if at the beginning of a normal instruction (this includes the first instruction in the “compound”) we see restricted <code>%rsp</code> or restricted <code>%rbp</code> then it's an error, too. On the other hand the few rare special instructions which are used to restore the SFI invariant WRT <code>%rsp</code> or <code>%rbp</code> will only be accepted if the restricted register is <code>%rsp</code> xor <code>%rbp</code> (depending on the special instruction).</p>
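Reviewer note: the `restricted_register` rules in this paragraph amount to a tiny state machine. The sketch below models them on an abstract instruction stream; the `ExInsn` description and the decision to reset the state at a bundle boundary are assumptions of the sketch, not a copy of the validator's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum ExReg { EX_NO_REG, EX_REG_RAX, EX_REG_RSP, EX_REG_RBP };

/* Hypothetical summary of one decoded instruction. */
struct ExInsn {
  enum ExReg zero_extends;  /* register restricted after this instruction */
  enum ExReg restores;      /* EX_REG_RSP/EX_REG_RBP for the special
                               add/lea with %r15, EX_NO_REG otherwise */
  bool ends_bundle;         /* last instruction of its bundle */
};

static bool ex_is_sp_or_bp(enum ExReg reg) {
  return reg == EX_REG_RSP || reg == EX_REG_RBP;
}

static bool ex_check_sequence(const struct ExInsn *insns, size_t count) {
  enum ExReg restricted = EX_NO_REG;
  size_t i;
  for (i = 0; i < count; i++) {
    const struct ExInsn *insn = &insns[i];
    if (insn->restores != EX_NO_REG) {
      /* The fixup is accepted only right after the matching write. */
      if (restricted != insn->restores) return false;
    } else if (ex_is_sp_or_bp(restricted)) {
      /* A normal instruction must not see restricted %rsp or %rbp. */
      return false;
    }
    restricted = insn->zero_extends;
    if (insn->ends_bundle) {
      /* %rsp/%rbp must not stay restricted across a bundle boundary. */
      if (ex_is_sp_or_bp(restricted)) return false;
      restricted = EX_NO_REG;  /* assumption: state resets per bundle */
    }
  }
  return !ex_is_sp_or_bp(restricted);
}
```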

	243

	244 <h3><div style="float:right"><a href="#TOC">▲</a></div><a name="6-2">6.2. “Norma l” instructions.</a></h3>

	245

	246 <p>The hard part is, as before, in the DFA. First of all, the main machine is similar to what we had in ia32 mode, but subtly different: it's “<code>(normal_instruction | special_instruction)*</code>” now. I.e.: <code>one_instruction</code> is replaced with <code>normal_instruction</code>. And what is <code>normal_instruction</code>? Why, it's “<code>one_instruction - special_instruction</code>”, of course! Well… this is unexpected: why would we want to remove <code>special_instruction</code>s from <code>normal_instruction</code>s only to add them back? The answer is related to actions: recall how <a href="#2-1">actions</a> work. When we remove <code>special_instruction</code> from <code>one_instruction</code> we also remove the associated actions. This is important in the x86-64 case because some special instructions are just normal instructions which are permitted to violate the usual rules! E.g. the “special” instruction <code>and $~0x1f,%rsp</code> (which is used to align the stack pointer) changes <code>%rsp</code> directly, which is usually forbidden, but because of the properties of <code>and $xxx,…</code> (for any <code>$xxx</code> < <code>0</code>) we know that invariants will not be violated.</p>
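Reviewer note: one way to read the "properties of `and $xxx,…`" remark is that a negative 8-bit immediate sign-extends to a mask with all upper bits set, so the result can only move the register down by less than 128 bytes. The sketch below checks that arithmetic; the helper names are hypothetical, and whether this is exactly the property the validator relies on is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Model of "and $imm8, %reg" on a 64-bit register: the 8-bit
 * immediate is sign-extended to 64 bits before the AND. */
static uint64_t ex_and_imm8(uint64_t reg, int8_t imm8) {
  uint64_t mask = (uint64_t)(int64_t)imm8;  /* sign extension */
  return reg & mask;
}

/* For a negative immediate only the low 7 bits can be cleared, so the
 * result is never above the input and never more than 127 below it. */
static bool ex_and_stays_close(uint64_t reg, int8_t imm8) {
  uint64_t result = ex_and_imm8(reg, imm8);
  return imm8 < 0 && result <= reg && reg - result <= 0x7f;
}
```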

157	247

158 <p>This approach works well, but only if violations are detected at the instruct ion end. E.g. the aforementioned <code>and $~0x1f,%rsp</code> instruction is enc oded as </code>0x48 0x83 0xe4 0xe0</code> and after we've read </ code>0x48 0x83 0xe4</code> we already know it's normal instruction (op code <code>0x83</code> means it's <code>and</code>) which writes to <code>%rsp</ code> (<code>0x48 </code><i>opcode</i><code> 0xe4</code> means it's so me instruction which accepts some kind of immediate and writes to <code>%rsp</co de>) and we'll signal the error at this point then the fact that later we'll fin d out it's <code>special_instruction</code> which is accepted anyway will not ma tter: <code>SPL_MODIFIED</code> error will be triggered which will mean that cod e is rejected!</p>	248 <p>This approach works well, but only if violations are detected at the instruct ion end. E.g. the aforementioned <code>and $~0x1f,%rsp</code> instruction is enc oded as </code>0x48 0x83 0xe4 0xe0</code> and after we've read </ code>0x48 0x83 0xe4</code> we already know it's normal instruction (op code <code>0x83</code> means it's <code>and</code>) which writes to <code>%rsp</ code> (<code>0x48 </code><i>opcode</i><code> 0xe4</code> means it's so me instruction which accepts some kind of immediate and writes to <code>%rsp</co de>) and we'll signal the error at this point then the fact that later we'll fin d out it's <code>special_instruction</code> which is accepted anyway will not ma tter: <code>SPL_MODIFIED</code> error will be triggered which will mean that cod e is rejected!</p>
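The byte-level reasoning above can be checked with a small sketch. It is not taken from the validator: `split_modrm`, `is_rex_w`, `sign_extend_imm8`, and `looks_like_imm8_op_on_rsp` are hypothetical helper names introduced only for illustration. It shows that after the first three bytes of `0x48 0x83 0xe4 0xe0` we already know the destination is `%rsp` (and, from the ModRM `reg` field being 4, that the group-1 operation is `and`), while the immediate that makes the instruction "special" is still unread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fields of an x86 ModRM byte: mod (bits 7-6), reg (bits 5-3), rm (bits 2-0). */
struct modrm_fields {
  unsigned mod;
  unsigned reg;
  unsigned rm;
};

/* Hypothetical helper (not part of the validator): split a ModRM byte. */
static struct modrm_fields split_modrm(uint8_t modrm) {
  struct modrm_fields f;
  f.mod = (modrm >> 6) & 3;
  f.reg = (modrm >> 3) & 7;
  f.rm = modrm & 7;
  return f;
}

/* Hypothetical helper: true if the byte is a REX prefix (0x40-0x4f) with W set. */
static int is_rex_w(uint8_t byte) {
  return (byte & 0xf8) == 0x48;
}

/* Hypothetical helper: sign-extend an imm8, as the CPU does for opcode 0x83. */
static int64_t sign_extend_imm8(uint8_t imm) {
  return imm >= 0x80 ? (int64_t)imm - 0x100 : (int64_t)imm;
}

/* Returns 1 when the first three bytes are REX.W (without REX.B), opcode 0x83,
 * and a register-direct ModRM byte whose rm field is 4, i.e. %rsp.
 * Only three bytes are inspected: the immediate is not needed for this check. */
static int looks_like_imm8_op_on_rsp(const uint8_t *bytes, size_t length) {
  struct modrm_fields f;
  assert(bytes != NULL);
  if (length < 3) {
    return 0;
  }
  f = split_modrm(bytes[2]);
  return is_rex_w(bytes[0]) && !(bytes[0] & 1) && bytes[1] == 0x83 &&
         f.mod == 3 && f.rm == 4;
}
```

The last byte, `0xe0`, sign-extends to `~0x1f`, which is why the check that distinguishes the special instruction cannot happen before the instruction end.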

159	249

160 <p>This means that we can not do an actual conditions checking till the very end of normal instruction (we can try to process some of them but not all of them b ut this approach will be quite complex and fragile—not something you want in the most critical security piece). But there are an exception: memory access. <b>Th is</b> one is checked inline: memory access outside of “40GiB safe area” is stri ctly forbidden no matter how “special” the instruction is. That's why it's check ed immediately after operands discovery. This is how relevant fragment for the < code>and</code> instruction look like:<hr />	250 <p>This means that we can not do an actual conditions checking till the very end of normal instruction (we can try to process some of them but not all of them b ut this approach will be quite complex and fragile—not something you want in the most critical security piece). But there are an exception: memory access. <b>Th is</b> one is checked inline: memory access outside of “40GiB safe area” is stri ctly forbidden no matter how “special” the instruction is. That's why it's check ed immediately after operands discovery. This is how relevant fragment for the < code>and</code> instruction look like:<hr />

161     <code>(0x83 (opcode_2 any* & any&nbs p;. any* & operand_disp @check_access) imm8 @proce ss_0_operands) \|</code><br />	251     <code>(0x83 (opcode_4 any* & any&nbs p;. any* & operand_disp @check_access) imm8 @proce ss_0_operands) \|</code><br />

162     <code>(0x83 (opcode_2 any* & any&nbs p;. any* & operand_rip @check_access) imm8 @proces s_0_operands) \|</code><br />	252     <code>(0x83 (opcode_4 any* & any&nbs p;. any* & operand_rip @check_access) imm8 @proces s_0_operands) \|</code><br />

163     <code>(REX_B? 0x83 (opcode_2 any* && nbsp;any . any* & single_register_memory @check_access)  imm8 @process_0_operands) \|</code><br />	253     <code>(REX_B? 0x83 (opcode_4 any* && nbsp;any . any* & single_register_memory @check_access)  imm8 @process_0_operands) \|</code><br />

164     <code>(REX_X? 0x83 (opcode_2 any* && nbsp;any . any* & operand_sib_pure_index @check_access)  imm8 @process_0_operands) \|</code><br />	254     <code>(REX_X? 0x83 (opcode_4 any* && nbsp;any . any* & operand_sib_pure_index @check_access)  imm8 @process_0_operands) \|</code><br />

165     <code>(REX_XB? 0x83 (opcode_2 any* &  any . any* & operand_sib_base_index @check_access ) imm8 @process_0_operands) \|</code><br />	255     <code>(REX_XB? 0x83 (opcode_4 any* &  any . any* & operand_sib_base_index @check_access ) imm8 @process_0_operands) \|</code><br />

166     <code>(lock 0x83 (opcode_2 any* &&nb sp;any . any* & operand_disp @check_access) imm8&n bsp;@process_0_operands) \|</code><br />	256     <code>(lock 0x83 (opcode_4 any* &&nb sp;any . any* & operand_disp @check_access) imm8&n bsp;@process_0_operands) \|</code><br />

167     <code>(lock 0x83 (opcode_2 any* &&nb sp;any . any* & operand_rip @check_access) imm8&nb sp;@process_0_operands) \|</code><br />	257     <code>(lock 0x83 (opcode_4 any* &&nb sp;any . any* & operand_rip @check_access) imm8&nb sp;@process_0_operands) \|</code><br />

168     <code>(lock REX_B? 0x83 (opcode_2 an y* & any . any* & single_register_memory @che ck_access) imm8 @process_0_operands) \|</code><br />	258     <code>(lock REX_B? 0x83 (opcode_4 an y* & any . any* & single_register_memory @che ck_access) imm8 @process_0_operands) \|</code><br />

169     <code>(lock REX_X? 0x83 (opcode_2 an y* & any . any* & operand_sib_pure_index @che ck_access) imm8 @process_0_operands) \|</code><br />	259     <code>(lock REX_X? 0x83 (opcode_4 an y* & any . any* & operand_sib_pure_index @che ck_access) imm8 @process_0_operands) \|</code><br />

170     <code>(lock REX_XB? 0x83 (opcode_2 a ny* & any . any* & operand_sib_base_index @ch eck_access) imm8 @process_0_operands) \|</code><br />	260     <code>(lock REX_XB? 0x83 (opcode_4 a ny* & any . any* & operand_sib_base_index @ch eck_access) imm8 @process_0_operands) \|</code><br />

171     <code>(REX_B? 0x83 (opcode_2 @operand0_32 bit any* & modrm_registers @operand0_from_modrm_rm) imm 8 @process_1_operands) \|</code><hr />	261     <code>(REX_B? 0x83 (opcode_4 @operand0_32 bit any* & modrm_registers @operand0_from_modrm_rm) imm 8 @process_1_operand) \|</code><hr />

172 As you can see <code>check_access</code> is triggered after parsing ModRM/SIB by tes, but before parsing <code>imm<i>NN</i></code> field while <code>process_<i>N </i>_operands</code> action is triggered at the very end of the “normal” instruc tion. Even if instruction does not use <code>imm<i>NN</i></code> field <code>che ck_access</code> action is <b>still</b> triggerded before <code>process_<i>N</i> _operands</code> action. This is important because <code>check_access</code> act ion actually depends on <b>previous</b> state of “secondary” DFA while <code>pro cess_<i>N</i>_operands</code> action does the transtions of “secondary” DFA. Not e that it's only triggered for “normal” instructions—“special” instructions eith er do the work themselves (e.g. <code>add %r15,%rsp</code>—which is only valid i f previous state of “secondary” DFA was <code>REG_RSP</code> and moves DFA to <c ode>kNoRestrictedReg</code> in case of succcess) or call the usual <code>process _<i>N</i>_operands</code> action (e.g. <code>mov %rsp,%rbp</code> calls <code>pr ocess_0_operands</code> which ensures that this operation is not called in <code >REG_RSP</code>/<code>REG_RBP</code> “secondary” DFA state and transtions it to <code>kNoRestrictedReg</code> state).</p>	262 As you can see <code>check_access</code> is triggered after parsing ModRM/SIB by tes, but before parsing <code>imm<i>NN</i></code> field while <code>process_<i>N </i>_operands</code> action is triggered at the very end of the “normal” instruc tion. Even if instruction does not use <code>imm<i>NN</i></code> field <code>che ck_access</code> action is <b>still</b> triggerded before <code>process_<i>N</i> _operands</code> action. This is important because <code>check_access</code> act ion actually depends on <b>previous</b> state of <code>restricted_register</code > variable while <code>process_<i>N</i>_operands</code> action changes <code>res tricted_register</code> variable. 
Note that it's only triggered for “normal” instructions: “special” instructions either do the work themselves (e.g. <code>add %r15,%rsp</code>, which is only valid if the previous value of the <code>restricted_register</code> variable was <code>REG_RSP</code> and which changes it to <code>NO_REG</code> on success) or call the usual <code>process_<i>N</i>_operands</code> action (e.g. <code>mov %rsp,%rbp</code> calls <code>process_0_operands</code>, which ensures that this operation is not used when <code>restricted_register</code> is set to <code>REG_RSP</code>/<code>REG_RBP</code> and resets it to <code>NO_REG</code>).</p>

173	263

174 <p>You can find yet another suprising thing in the snipped above: <code>and</cod e> instruction is handled either as instruction with zero operands or as instruc tion with one operand… but of course in reality it always has two operands! Some thing is strange here… Well, sure: the decoder part of validator is as streamlin ed as possible. We just ignore all non-register arguments and arguments which ar e not written to (but we <b>don't</b> ignore memory accesses if they happen here , of course). That's why <code>and</code> has either one or zero operands as far as validator is concerned.</p>	264 <p>You can find yet another suprising thing in the snippet above: <code>and</cod e> instruction is handled either as instruction with zero operands or as instruc tion with one operand… but of course in reality it always has two operands! Some thing is strange here… Well, sure: the decoder part of validator is as streamlin ed as possible. We just ignore all non-register arguments and arguments which ar e not written to (but we <b>don't</b> ignore memory accesses if they happen here , of course). That's why <code>and</code> has either one or zero operands as far as validator is concerned.</p>

175	265

176 <h3><div style="float:right"><a href="#TOC">▲</a></div><a name="5-3">5.3. Operan ds handling.</a></h3>	266 <h3><div style="float:right"><a href="#TOC">▲</a></div><a name="6-3">6.3. Operan ds handling.</a></h3>

177	267

178 <p>Operands handling as, again, is not that complex… if you are familiar with bi t operations. Initial version of the validator used simple array of records to s tore the information and everything worked well… with GCC, that is. MSVC produce d awful code which was almost 30% slower and also needed twenty minutes to do so thus we replaced this simple version with current macro-based one.</p>	268 <p>Operands handling as, again, is not that complex… if you are familiar with bi t operations. Initial version of the validator used simple array of records to s tore the information and everything worked well… with GCC, that is. MSVC produce d awful code which was almost 30% slower and also needed twenty minutes to do so thus we replaced this simple version with the current macro-based one.</p>

179	269

180 <p>All the information about encountered operands is collected in a single scala r variable <code>operand_states</code>. The layout of said variable looks like t his:</p>	270 <p>All the information about encountered operands is collected in a single scala r variable <code>operand_states</code>. The layout of said variable looks like t his:</p>

181 <table width="100%"><tr><td align="left">63</td><td align="right">39</td><td ali gn="left">38</td><td align="right">37</td><td align="left">36</td><td align="rig ht">32</td><td align="center">31</td><td align="left">30</td><td align="right">2 9</td><td align="left">28</td><td align="right">24</td><td align="center">23</td ><td align="left">22</td><td align="right">21</td><td align="left">20</td><td al ign="right">16</td><td align="center">15</td><td align="left">14</td><td align=" right">13</td><td align="left">12</td><td align="right">8</td><td align="center" >7</td><td align="left">6</td><td align="right">5</td><td align="left">4</td><td align="right">0</td></tr>	271 <table width="100%"><tr><td align="left">63</td><td align="right">39</td><td ali gn="left">38</td><td align="right">37</td><td align="left">36</td><td align="rig ht">32</td><td align="center">31</td><td align="left">30</td><td align="right">2 9</td><td align="left">28</td><td align="right">24</td><td align="center">23</td ><td align="left">22</td><td align="right">21</td><td align="left">20</td><td al ign="right">16</td><td align="center">15</td><td align="left">14</td><td align=" right">13</td><td align="left">12</td><td align="right">8</td><td align="center" >7</td><td align="left">6</td><td align="right">5</td><td align="left">4</td><td align="right">0</td></tr>

182 <tr><td colspan="2" style="border: thin solid black" width="100%" align="center" >padding</td><td colspan="2" style="border: thin solid black" align="center">ope rand4:<br />register_type</td><td colspan="2" style="border: thin solid black" a lign="center">operand4:<br />register_name</td><td style="border: thin solid bla ck" align="center">padding</td><td colspan="2" style="border: thin solid black" align="center">operand3:<br />register_type</td><td colspan="2" style="border: t hin solid black" align="center">operand3:<br />register_name</td><td style="bord er: thin solid black" align="center">padding</td><td colspan="2" style="border: thin solid black" align="center">operand2:<br />register_type</td><td colspan="2 " style="border: thin solid black" align="center">operand2:<br />register_name</ td><td style="border: thin solid black">padding</td><td colspan="2" style="borde r: thin solid black" align="center">operand1:<br />register_type</td><td colspan ="2" style="border: thin solid black" align="center">operand1:<br />register_nam e</td><td style="border: thin solid black" align="center">padding</td><td colspa n="2" style="border: thin solid black" align="center">operand0:<br />register_ty pe</td><td colspan="2" style="border: thin solid black" align="center">operand0: <br />register_name</td></tr><tr><td></td><td></td><td></td><td></td><td colspan ="2"> ↖<br />    0 if normal<br />   &nb sp;register</td><td></td><td></td><td></td><td colspan="2"> ↖<br /> &n bsp;  0 if normal<br />    register</td><td></td>< td></td><td></td><td colspan="2"> ↖<br />    0 if norma l<br />    register</td><td></td><td></td><td></td><td colsp an="2"> ↖<br />    0 if normal<br />   & nbsp;register</td><td></td><td></td><td></td><td colspan="2"> ↖<br />     0 if normal<br />    register</td></tr></t able>	272 <tr><td colspan="2" style="border: thin solid black;" width="100%" align="center ">padding</td><td colspan="2" style="border: thin solid black;" align="center">o 
perand4:<br />register_type</td><td colspan="2" style="border: thin solid black; " align="center">operand4:<br />register_name</td><td style="border: thin solid black;" align="center">padding</td><td colspan="2" style="border: thin solid bla ck;" align="center">operand3:<br />register_type</td><td colspan="2" style="bord er: thin solid black;" align="center">operand3:<br />register_name</td><td style ="border: thin solid black;" align="center">padding</td><td colspan="2" style="b order: thin solid black;" align="center">operand2:<br />register_type</td><td co lspan="2" style="border: thin solid black;" align="center">operand2:<br />regist er_name</td><td style="border: thin solid black;">padding</td><td colspan="2" st yle="border: thin solid black;" align="center">operand1:<br />register_type</td> <td colspan="2" style="border: thin solid black;" align="center">operand1:<br /> register_name</td><td style="border: thin solid black;" align="center">padding</ td><td colspan="2" style="border: thin solid black;" align="center">operand0:<br />register_type</td><td colspan="2" style="border: thin solid black;" align="ce nter">operand0:<br />register_name</td></tr>

	273 <tr><td></td><td></td><td></td><td></td><td colspan="2"> ↖<br /> &nbsp ;  0 if normal<br />    register</td><td></td><td> </td><td></td><td colspan="2"> ↖<br />    0 if normal<b r />    register</td><td></td><td></td><td></td><td colspan= "2"> ↖<br />    0 if normal<br />   &nbs p;register</td><td></td><td></td><td></td><td colspan="2"> ↖<br /> &nb sp;  0 if normal<br />    register</td><td></td><t d></td><td></td><td colspan="2"> ↖<br />    0 if normal <br />    register</td></tr></table>

183	274

184 <p>Register names are defined in <code>register_name</code> enum: first 16 are i dentical to the AMD/Intel names (from <code>REG_RAX</code> to <code>REG_R15</cod e>) while other 16 are used (partially) to describe non-register operands (memor y operand, immediate operand, <code>REG_RIP</code> and <code>REG_RIZ</code>, etc ). This means that if operand's name is >15 then it can be ignored. There are only four operand types: <code>OperandSandboxIrrelevant</code>, <code>OperandSa ndbox8bit</code>, <code>OperandSandboxRestricted</code>, and <code>OperandSandbo xUnrestricted</code>. First type is something not related to general purpose reg ister (x87, MMX, XMM, or YMM registers fall unto this category). We need to hand le 8bit operands specially because they are finicky: if <code>REX</code> byte is used they access <code>%spl</code>, <code>%bps</code>, <code>%sil</code>, and < code>%dil</code>, but when <code>REX</code> byte is not used the same numbers ar e reused for <code>%ah</code>, <code>%ch</code>, <code>%dh</code>, and <code>%bh </code>! Last two types are the most important: these are 32bit operands (which will make the appropriate register “frestricted”) or 16bit/64bit operands (these may affect register in question negatively if that's <code>%rbp</code>, <code>% rsp</code>, or <code>%r15</code>, but for other registers these are just ignored ). Note that if you assign <code>0</code> to this variable then all operands wil l be of <code>OperandSandboxIrrelevant</code> type.</p>	275 <p>Register names are defined in <code>register_name</code> enum: first 16 are i dentical to the AMD/Intel names (from <code>REG_RAX</code> to <code>REG_R15</cod e>) while other 16 are used (partially) to describe non-register operands (memor y operand, immediate operand, <code>REG_RIP</code> and <code>REG_RIZ</code>, etc ). This means that if operand's name is >15 then it can be ignored. 
There are only four operand types: <code>OperandSandboxIrrelevant</code>, <code>OperandSandbox8bit</code>, <code>OperandSandboxRestricted</code>, and <code>OperandSandboxUnrestricted</code>. The first type is for anything not related to a general purpose register (x87, MMX, XMM, or YMM registers fall into this category). We need to handle 8bit operands specially because they are finicky: if a <code>REX</code> byte is used they access <code>%spl</code>, <code>%bpl</code>, <code>%sil</code>, and <code>%dil</code>, but when no <code>REX</code> byte is used the same numbers are reused for <code>%ah</code>, <code>%ch</code>, <code>%dh</code>, and <code>%bh</code>! The last two types are the most important: these are 32bit operands (which make the appropriate register “restricted”) and 16bit/64bit operands (these may affect the register in question negatively if it's <code>%rbp</code>, <code>%rsp</code>, or <code>%r15</code>, but for other registers they are just ignored). Note that if you assign <code>0</code> to <code>operand_states</code> then all operands will be of the <code>OperandSandboxIrrelevant</code> type.</p>

185	276

186 <p>Now the set of macroses used to work with operands should look less mysteriou s:<hr />	277 <p>Now the set of macro used to work with operands should look less mysterious:< hr />

187 <code>#define SET_OPERAND_NAME(N, S) operand_states \|=  ((S) << ((N) << 3))</code><br />	278 <code>#define SET_OPERAND_NAME(N, S) operand_states \|=  ((S) << ((N) << 3))</code><br />

188 <code>#define SET_OPERAND_TYPE(N, T) SET_OPERAND_TYPE_ ##&nb sp;T(N)</code><br />	279 <code>#define SET_OPERAND_TYPE(N, T) SET_OPERAND_TYPE_ ##&nb sp;T(N)</code><br />

189 <code>#define SET_OPERAND_TYPE_OperandSize8bit(N) operand_states  \|= OperandSandbox8bit << (5 + ((N) <<& nbsp;3))</code><br />	280 <code>#define SET_OPERAND_TYPE_OperandSize8bit(N) operand_states  \|= OperandSandbox8bit << (5 + ((N) <<& nbsp;3))</code><br />

190 <code>#define SET_OPERAND_TYPE_OperandSize16bit(N) operand_states&nbsp ;\|= OperandSandboxUnrestricted << (5 + ((N)  << 3))</code><br />	281 <code>#define SET_OPERAND_TYPE_OperandSize16bit(N) operand_states&nbsp ;\|= OperandSandboxUnrestricted << (5 + ((N)  << 3))</code><br />

191 <code>#define SET_OPERAND_TYPE_OperandSize32bit(N) operand_states&nbsp ;\|= OperandSandboxRestricted << (5 + ((N) &l t;< 3))</code><br />	282 <code>#define SET_OPERAND_TYPE_OperandSize32bit(N) operand_states&nbsp ;\|= OperandSandboxRestricted << (5 + ((N) &l t;< 3))</code><br />

192 <code>#define SET_OPERAND_TYPE_OperandSize64bit(N) operand_states&nbsp ;\|= OperandSandboxUnrestricted << (5 + ((N)  << 3))</code><br />	283 <code>#define SET_OPERAND_TYPE_OperandSize64bit(N) operand_states&nbsp ;\|= OperandSandboxUnrestricted << (5 + ((N)  << 3))</code><br />

193 <code>#define CHECK_OPERAND(N, S, T) ((operand_states &  (0xff << ((N) << 3))) == ((S&nbs p;\| (T << 5)) << ((N) << 3) ))</code><hr />	284 <code>#define CHECK_OPERAND(N, S, T) ((operand_states &  (0xff << ((N) << 3))) == ((S&nbs p;\| (T << 5)) << ((N) << 3) ))</code><hr />

194 Calls like <code>SET_OPERAND_NAME(0, REG_RAX)</code> are used by actions to set name of the operand (this particular one is used by <code>operand0_rax</code> ac tion) while calls like <code>SET_OPERAND_TYPE(0, OperandSize2bit)</code> are use d by actions to set the type of operand (this particular one is used by <code>op erand0_2bit</code> action). Note that we <b>don't</b> handle 2bit operands in th e set of macroses above. This is not a mistake: 2bit operands are only ever used as immediate operands (and then only in two instructions: <code>vpermil2pd</cod e> and <code>vpermil2ps</code>) and we don't process immediate operands here. If they will be by some reason left in the <codeo>validator_x86_64_instruction.rl< /code> file this will lead to the compile-time error, not to some kind of weird overflow which may [potentially] produce security hole.</p>	285 Calls like <code>SET_OPERAND_NAME(0, REG_RAX)</code> are used by actions to set name of the operand (this particular one is used by <code>operand0_rax</code> ac tion) while calls like <code>SET_OPERAND_TYPE(0, OperandSize2bit)</code> are use d by actions to set the type of operand (this particular one is used by <code>op erand0_2bit</code> action). Note that we <b>don't</b> handle 2bit operands in th e set of macro above. This is not a mistake: 2bit operands are only ever used as immediate operands (and then only in two instructions: <code>vpermil2pd</code> and <code>vpermil2ps</code>) and we don't process immediate operands here. If th ey will be by some reason left in the <code>validator_x86_64_instruction.rl</cod e> file this will lead to the compile-time error, not to some kind of weird over flow which may [potentially] produce security hole.</p>
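To make the bit layout concrete, here is a minimal sketch that mirrors the macros above as plain functions operating on a 64-bit value. The function names are hypothetical, and the numeric values of the enums are assumptions: the text only guarantees that `0` means `OperandSandboxIrrelevant` and that the first 16 register names follow the AMD/Intel numbering.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values: only "0 == irrelevant" is stated in the text. */
enum operand_type {
  OperandSandboxIrrelevant = 0,
  OperandSandbox8bit = 1,
  OperandSandboxRestricted = 2,
  OperandSandboxUnrestricted = 3
};

/* Assumed to follow the hardware register numbering. */
enum register_name {
  REG_RAX = 0,
  REG_RSP = 4,
  REG_RBP = 5,
  REG_R15 = 15
};

/* Each operand owns one byte: bits 0-4 hold the name, bits 5-6 the type,
 * and bit 7 is padding.  Operand n lives at bit offset n * 8. */
static uint64_t set_operand_name(uint64_t states, unsigned n, unsigned name) {
  assert(n < 5 && name < 32);
  return states | ((uint64_t)name << (n << 3));
}

static uint64_t set_operand_type(uint64_t states, unsigned n, unsigned type) {
  assert(n < 5 && type < 4);
  return states | ((uint64_t)type << (5 + (n << 3)));
}

/* Counterpart of CHECK_OPERAND: compare the whole byte of operand n. */
static int check_operand(uint64_t states, unsigned n, unsigned name,
                         unsigned type) {
  assert(n < 5 && name < 32 && type < 4);
  return ((states >> (n << 3)) & 0xff) == (name | (type << 5));
}
```

The sketch casts to `uint64_t` before shifting because operand 4 sits above bit 31; whether the real macros need the same care depends on the declared type of `operand_states` and of the enum constants, which this section does not show.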

195	286

196 <p>Almost all manipulations with <code>operand_states</code> are done using macr oses described above, but there are one construct in <code>process_<i>N</i>_oper ands</code> function which accesses the <code>operand_states</code> direfctly:<h r />	287 <p>Almost all manipulations with <code>operand_states</code> are done using macr o described above, but there are one construct in <code>process_<i>N</i>_operand s</code> function which accesses the <code>operand_states</code> direfctly:<hr / >

197     <code>/* Take 2 bits of operand  type from operand_states as *restricted_register,</cod e><br />	288     <code>/* Take 2 bits of operand  type from operand_states as *restricted_register,</cod e><br />

198      <code>* make sure operand_states&nb sp;denotes a register (4th bit == 0). */</cod e><br />	289      <code>* make sure operand_states&nb sp;denotes a register (4th bit == 0). */</cod e><br />

199     <code>} else if ((operand_states &&n bsp;0x70) == (OperandSandboxRestricted << 5)) {</ code><br />	290     <code>} else if ((operand_states &&n bsp;0x70) == (OperandSandboxRestricted << 5)) {</ code><br />

200       <code>*restricted_register = opera nd_states & 0x0f;</code><br />	291       <code>*restricted_register = opera nd_states & 0x0f;</code><br />

201     <code>}</code><hr />	292     <code>}</code><hr />

202 If you take a look at the layout of <code>operand_states</code> then it's pretty easy to understand what goes on here: <code>(operand_states & 0x70) == (OperandSandboxRestricted << 5)</code> yields <code>TRUE</code> if and only if the zeroth operand is a “normal” register <b>and</b> it's of type <code>OperandSandboxRestricted</code>. This is actually the central piece of the “secondary” DFA handling: most other pieces just return this “secondary” DFA back to the <code>kNoRestrictedReg</code> state.</p>	293 If you take a look at the layout of <code>operand_states</code> then it's pretty easy to understand what goes on here: <code>(operand_states & 0x70) == (OperandSandboxRestricted << 5)</code> yields <code>TRUE</code> if and only if the zeroth operand is a “normal” register <b>and</b> it's of type <code>OperandSandboxRestricted</code>. This is actually the central piece of the <code>restricted_register</code> handling: most other pieces just return it back to the <code>NO_REG</code> state.</p>
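The mask test can be exercised in isolation. The sketch below is a simplification, not the real `process_<i>N</i>_operands` code: it keeps only the branch quoted above and falls back to `NO_REG` otherwise. The enum values, including the value chosen for `NO_REG`, are assumptions made for this example, and `make_operand0` and `next_restricted_register` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values: only "0 == irrelevant" is stated in the text. */
enum {
  OperandSandboxIrrelevant = 0,
  OperandSandbox8bit = 1,
  OperandSandboxRestricted = 2,
  OperandSandboxUnrestricted = 3
};

/* NO_REG's real value is not given in this section; 0x1f is a placeholder. */
enum { REG_RAX = 0, REG_RSP = 4, REG_RBP = 5, REG_R15 = 15, NO_REG = 0x1f };

/* Build the low byte of operand_states for operand 0. */
static uint64_t make_operand0(unsigned name, unsigned type) {
  assert(name < 32 && type < 4);
  return (uint64_t)name | ((uint64_t)type << 5);
}

/* Mask 0x70 covers bit 4 (set for non-register names 16..31) and the two
 * type bits 5-6.  Equality with (Restricted << 5) therefore requires both
 * "name < 16" and "type == OperandSandboxRestricted". */
static unsigned next_restricted_register(uint64_t operand_states) {
  if ((operand_states & 0x70) == ((uint64_t)OperandSandboxRestricted << 5)) {
    return (unsigned)(operand_states & 0x0f);
  }
  return NO_REG;
}
```

Folding the "is it a real register" test into the same mask as the type test is what lets a single comparison replace two.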

203	294

204 <p>Well… most, but not all. One exception happens in <code>process_<i>N</i>_oper ands</code> functions: if “secondary” DFA is in <code>kSandboxedRsi</code> state and we restrict the <code>%rdi</code> register then we go to the <code>kSandbox edRsiRestrictedRdi</code> state, not to the usual <code>REG_RDI</code> state. Ot her exceptions are related to “special” instructions: <code>lea (%r15,%rsi,1),%r si</code> may move us to <code>kSandboxedRsi</code> state and <code>lea (%r15,%r di,1),%rdi</code> may move us to either <code>kSandboxedRdi</code> or <code>kSan dboxedRsiSandboxedRdi</code> state.</p>	295 <h3><div style="float:right"><a href="#TOC">▲</a></div><a name="6-4">6.4. Dynami c code modification support.</a></h3>

205	296

206 <p>Yet another tricky piece of code can be found in <code>check_access</code> function. It's this piece of code:<hr />	297 <p>Dynamic code modification support is implemented similarly to ia32 mode: with the help of the <code>CALL_USER_CALLBACK_ON_EACH_INSTRUCTION</code> option. When this option is used, the callback has all the information needed to process the instruction: collected errors, information about immediates, etc.</p>

207     <code>if (index == (restricted_register&n bsp;& 0x1f)) {</code><br />

208       <code>BitmapClearBit(valid_targets, ins truction_start);</code><br />

209     <code>}</code><hr />

210 This is where we use not the full state of the “secondary” DFA, but just low fiv e bits (which describe if there are some restricted register and if it exist the n what register is restricted currently). All other places just use full state o f “secondary” DFA.</p>

211	298

212 <h2><div style="float:right"><a href="#TOC">▲</a></div><a name="6">6. Decoders.</a></h2>	299 <p>All that information is squeezed into the <code>instruction_info_collected</code> variable. It has the following format:</p>

213	300

214 <p>The only remaining issue (but a big one) is about generation of the actual de coders (<code>{decoder,validator}_x86_{32,64}_instruction.rl files)</code>. This is big part of the whole package, but, thankfully, it happens in significantly less hostily environment: decoder and validator must work even if they are proce ssing specially-crafted file created by clever adversary while <code>gen_dfa.cc< /code> processes data files created by us and should only correcly process certa in “good” files.</p>	301 <table width="100%"><tr><td align="left">31</td><td align="left">30</td><td alig n="left">29</td><td align="left">28</td><td align="left">27</td><td align="left" >26</td><td align="left">25</td><td align="left">24</td><td align="left">23</td> <td align="left">22</td><td align="left">21</td><td align="left">20</td><td alig n="left">19</td><td align="left">18</td><td align="left">17</td><td align="left" >16</td><td align="left">15</td><td align="left">14</td><td align="left">13</td> <td align="left">12</td><td align="right">8</td><td align="left">7</td><td align ="left">6</td><td align="right">5</td><td align="left">4</td><td align="left">3< /td><td align="right">0</td></tr>

	302 <tr><td align="left"> </td><td align="left"> </td><td align="left">&nb sp;</td><td align="left"> </td><td align="left"> </td><td align="left" > </td><td colspan="12" align="left" style="border: thin solid black;"><tab le width="100%"><tr><td align="left">⇤</td><td align="center"><code>VALIDATION_E RRORS_MASK</code></td><td align="right">⇥</td></table></td><td align="left">&nbs p;</td><td colspan="2" align="left" width="1%" style="border: thin solid black;" ><table width="100%"><tr><td align="left">⇤</td><td width="1%" align="center"><c ode>RESTRICTED_REGISTER_MASK</code></td><td align="right">⇥</td></table></td><td align="left"> </td><td colspan="2" align="left" width="1%" style="border: thin solid black;"><table width="100%"><tr><td align="left">⇤</td><td width="1%" align="center"><code>RESTRICTED_REGISTER_MASK</code></td><td align="right">⇥</t d></table></td><td align="left"> </td><td colspan="2" align="left" width="1 %" style="border: thin solid black;"><table width="100%"><tr><td align="left">⇤< /td><td width="1%" align="center"><code>IMMEDIATES_SIZE_MASK</code></td><td alig n="right">⇥</td></table></td></tr>

	303 <tr><td style="border: thin solid black; background: gray;" width="1%" align="ce nter"> 0 </td><td style="border: thin solid black;" width="1%" align=" center">   </td><td style="border: thin solid black;" width="1%" align="center">   </td><td width="1%" style="border: thin solid b lack;" align="center">   </td><td style="border: thin solid black ;" width="1%" align="center">   </td><td style="border: thin soli d black;" width="1%" align="center">   </td><td style="border: th in solid black;" width="1%" align="center">   </td><td style="bor der: thin solid black;" width="1%" align="center">   </td><td sty le="border: thin solid black;" width="1%" align="center">   </td> <td style="border: thin solid black;" width="1%" align="center">  &nbs p;</td><td style="border: thin solid black;" width="1%" align="center"> &nb sp; </td><td style="border: thin solid black;" width="1%" align="center">&n bsp;  </td><td style="border: thin solid black;" width="1%" align="cen ter">   </td><td style="border: thin solid black;" width="1%" ali gn="center">   </td><td style="border: thin solid black;" width=" 1%" align="center">   </td><td style="border: thin solid black;" width="1%" align="center">   </td><td style="border: thin solid b lack;" width="1%" align="center">   </td><td style="border: thin solid black;" width="1%" align="center">   </td><td style="border : thin solid black;" width="1%" align="center">   </td><td colspa n="2" style="border: thin solid black;" align="center">   </td><t d style="border: thin solid black;" width="1%" align="center">    </td><td colspan="2" style="border: thin solid black;" align="center"> &nbs p; </td><td style="border: thin solid black;" width="1%" align="center">&nb sp;  </td><td colspan="2" style="border: thin solid black;" align="cen ter">   </td><td>   </td></tr>

	304 <tr><td align="left">↑</td><td align="left">↑</td><td align="left">↑</td><td ali gn="left">↑</td><td align="left">↑</td><td align="left">↑</td><td align="left">↑ </td><td align="left">↑</td><td align="left">↑</td><td align="left">↑</td><td al ign="left">↑</td><td align="left">↑</td><td align="left">↑</td><td align="left"> ↑</td><td align="left">↑</td><td align="left">↑</td><td align="left">↑</td><td a lign="left">↑</td><td align="left">↑</td><td align="left">↑</td><td align="left" > </td><td align="left">↑</td><td align="left">↑</td><td align="left">&nbsp ;</td><td align="left">↑</td><td align="left">↑</td><td align="left"> </td> <td align="left"> </td></tr>

	305 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td ali gn="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊ </td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td al ign="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left"> ┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td a lign="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left" > </td><td align="left">┊</td><td align="left">┊</td><td align="left">&nbsp ;</td><td align="left">┊</td><td align="left" colspan="100" >└ Cumulutive s ize of <i title="Immediates, displacements, relative offsets.">anyfields</i>.</t d></tr>

	306 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left"> </td><td align="left">┊</td><td align="left">┊</td><td align="left">&nbsp;</td><td align="left" colspan="100" >└ <span title="enter, extrq, insertq">Instruction has two immediates</span>.</td></tr>

	307 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left"> </td><td align="left">┊</td><td align="left" colspan="100" >└ <span title="00 = 0 bytes, 01 = 1 byte, 10 = 2 bytes, 11 = 4 bytes">Instruction displacement size</span>.</td></tr>

	308 <!--<tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="lef t">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><t d align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="le ft">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td>< td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="l eft"> </td><td align="left">┊</td><td align="left" colspan="100" >└ <s pan title="Top half of a last byte of an instruction is fourth register operand, two remaining bytes are reserved.">Instruction has 2bit immediate operation.</s pan></td></tr>-->

	309 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td ali gn="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊ </td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td al ign="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left"> ┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td a lign="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left" > </td><td align="left" colspan="100" >└ Instruction has relative offs et.</td></tr>

	310 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left" colspan="100" >└ <span title="NO_REG if the instruction does not zero-extend any register">Register, zero-extended by the instruction.</span></td></tr>

	311 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left" colspan="100" >└ <span title="This means that the start of this instruction is not a valid jump target.">Instruction is valid, but it accesses memory using a register which is zero-extended by the previous instruction.</span></td></tr>

	312 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td ali gn="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊ </td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td al ign="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left"> ┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td a lign="left" colspan="100" >└ <span title="Note that all unsupported instruc tions trigger this error. This includes mov by absolute 64bit address, system in structions like lidt or even call and jmp used not as part of superinstruction. If combined with CPUID_UNSUPPORTED_INSTRUCTION it means that instruction is not yet enabled in validator.">DFA error: invalid instruction. Validation then resum es from the next bundle.</span></td></tr>

	313 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td ali gn="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊ </td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td al ign="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left"> ┊</td><td align="left">┊</td><td align="left">┊</td><td align="left" colspan="10 0" >└ Unaligned direct jump to address outside of given region.</td></tr>

	314 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td ali gn="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊ </td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td al ign="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left"> ┊</td><td align="left">┊</td><td align="left" colspan="100" >└ Instruction is not supported for a given CPUID mask.</td></tr>

	315 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td ali gn="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊ </td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td al ign="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left"> ┊</td><td align="left" colspan="100" >└ Base register is not <code>%rbp</co de>, <code>%rsp</code>, or <code>%r15</code>.</td></tr>

	316 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td ali gn="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊ </td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td al ign="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left" colspan="100" >└ Index register is not zero-extended by previous instructio n.</td></tr>

	317 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td ali gn="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊ </td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td al ign="left">┊</td><td align="left">┊</td><td align="left" colspan="100" >└ I nstruction which zero-extends <code>%rbp</code> must be followed by <code>add %r 15,%rbp</code>, <code>lea (%rbp,%r15,1),%rbp</code>, or <code>lea 0x0(%rbp,%r15, 1),%rbp</code>.</td></tr>

	318 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td ali gn="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊ </td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td al ign="left">┊</td><td align="left" colspan="100" >└ <code>add %r15,%rbp</cod e>, <code>lea (%rbp,%r15,1),%rbp</code>, or <code>lea 0x0(%rbp,%r15,1),%rbp</cod e> is used after instruction which does not zero-extend <code>%rbp</code>.</td>< /tr>

	319 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td ali gn="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊ </td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td al ign="left" colspan="100" >└ Instruction which zero-extends <code>%rsp</code > must be followed by <code>add %r15,%rsp</code> or <code>lea (%rsp,%r15,1),%rsp </code>.</td></tr>

	320 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td ali gn="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊ </td><td align="left">┊</td><td align="left">┊</td><td align="left" colspan="100 " >└ <code>add %r15,%rsp</code> or <code>lea (%rsp,%r15,1),%rsp</code> is u sed after instruction which does not zero-extend <code>%rsp</code>.</td></tr>

	321 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td ali gn="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊ </td><td align="left">┊</td><td align="left" colspan="100" >└ <code>%r15b</ code>, <code>%r15w</code>, <code>%r15d</code>, or <code>%r15</code> is modified. <code>%r15</code> is untouchable in amd64 mode.</td></tr>

	322 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td ali gn="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊ </td><td align="left" colspan="100" >└ <span title="Note that %ebp is not m entioned. It can be modified by a regular instruction. But NEXT instruction must be special if that happened."><code>%bpl</code>, <code>%bp</code>, or <code>%rb p</code> is incorrectly modified. Only <code>%rbp</code> can be modified and the n only by special instruction.</span></td></tr>

	323 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td ali gn="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left" c olspan="100" >└ <span title="Note that %esp is not mentioned. It can be mod ified by a regular instruction. But NEXT instruction must be special if that hap pened."><code>%spl</code>, <code>%sp</code>, or <code>%rsp</code> is incorrectly modified. Only <code>%rsp</code> can be modified and then only by special instr uction.</span></td></tr>

	324 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left" colspan="100">└ Bad <code>call</code> alignment: <code>call</code> must end at the end of the bundle, since <code>nacljmp</code> can only jump to an aligned address.</td></tr>

	325 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td ali gn="left">┊</td><td align="left" colspan="100">└ <span title="amd64 mode: i n ia32 mode all non-special instructions are modifiable">Instruction is modifiab le.</span></td></tr>

	326 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left" colspan="100">└ Special instruction (uses different validation rules from a regular instruction). Cannot be changed in ia32 mode.</td></tr>

	327 <tr><td align="left">┊</td><td align="left">┊</td><td align="left" colspan="100" >└ Last byte is not immediate. It's either <span title="3DNow! instructions .">opcode</span>, <span title="Some AVX, FMA4, XOP instructions.">register numbe r</span> or <span title="vpermil2pd and vpermil2ps">register number and two-bit immediate</span>.</td></tr>

	328 <tr><td align="left">┊</td><td align="left" colspan="100">└ Invalid jump ta rget. When this flag is set <code>instruction_start</code> and <code>instructio n_end</code> both point to the <b>jump target</b> instruction, not to the <b>jum p</b> instruction itself.</td></tr>

	329 <tr><td align="left" colspan="100">└ Reserved.</td></tr>

	330 </table>

	331

	332 <p>Using this information you can determine whether a given instruction follows <span title="A lot of different commands in amd64 mode: %rbp/%rsp modifications, string instructions, “naclcall”, and “nacljmp”.">special rules</span>, and whether it includes <span title="Commands like “jcc”, “jmp”, “loopcc”, or “call”.">relative offsets</span>, <span title="Most commands which access memory support displacements.">displacements</span>, or <span title="Immediates are supported by many different commands. They can be combined with a displacement if the command accesses memory.">immediates</span>. Tests may use the collected information to precisely separate different <i title="Immediates, displacements, relative offsets.">anyfields</i>, but in production only a few bits are used to determine whether the instruction can be changed: in amd64 mode <span title="Only “call” and “mov” can be changed.">most instructions cannot be changed</span>, and for those that can, only the <i title="Immediates, displacements, relative offsets.">anyfields</i> may change.</p>

	333

	334 <h4><div style="float:right"><a href="#TOC">▲</a></div><a name="6-4-1">6.4.1. Re placement validation.</a></h4>

	335

	336 <p>As noted <a href="#6-4">above</a>, code replacement is not supported by the <code>ValidateChunkAMD64</code> function directly. Instead it is done by a higher-level function in <code>dfa_validate_64.c</code>.</p>

	337

	338 <p>It uses the <code>CALL_USER_CALLBACK_ON_EACH_INSTRUCTION</code> option to compare, in a callback, the lengths of instructions in the two fragments, and the <code>MODIFIABLE_INSTRUCTION</code> flag passed to the callback to make sure that <span title="Currently only “call” and “mov” can be changed.">only a few hand-picked instructions can be changed</span>.</p>

	339

	340 <p>One tricky thing here is the handling of relative jumps and calls: if a relative jump (or call) triggers <code>DIRECT_JUMP_OUT_OF_RANGE</code> <b>but</b> is bit-for-bit identical to the original instruction, it is accepted anyway, because this means that this particular <code>jump</code> (or <code>call</code>) targets some valid position outside of the given range. If it must be changed, then you need to pass a bigger region to the <code>ValidatorCodeReplacement_x86_64</code> function, so that <span title="This is, of course, not needed if the landing point is bundle-aligned.">the validator has a chance to check the landing place for validity</span>.</p>
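<p>The rule for out-of-range jumps can be sketched roughly as follows. This is only an illustration: the function name, its signature, and the bit values of the two flags are assumptions made for this example, not the actual code in <code>dfa_validate_64.c</code>, and the sketch ignores the further restriction that only anyfields may differ.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical bit values, chosen only for this sketch. */
#define SKETCH_DIRECT_JUMP_OUT_OF_RANGE 0x1u
#define SKETCH_MODIFIABLE_INSTRUCTION   0x2u

/* Returns 1 if the new instruction may replace the old one, 0 otherwise.
 * Both instructions are assumed to have the same length. */
int sketch_accept_replacement(const uint8_t *old_insn,
                              const uint8_t *new_insn,
                              size_t length,
                              uint32_t info) {
  assert(old_insn != NULL && new_insn != NULL && length > 0);
  int identical = memcmp(old_insn, new_insn, length) == 0;
  if (info & SKETCH_DIRECT_JUMP_OUT_OF_RANGE) {
    /* The landing place cannot be checked, so only an untouched
     * jump (or call) is accepted. */
    return identical;
  }
  if (identical) {
    return 1;  /* Nothing changed, nothing to verify. */
  }
  /* A changed instruction must be one of the modifiable ones. */
  return (info & SKETCH_MODIFIABLE_INSTRUCTION) != 0;
}
```

<p>In this sketch a changed out-of-range jump is always rejected; the caller would have to retry with a larger region so that the flag is no longer raised.</p>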

	341

	342 <p style="margin-bottom:0px;">Another tricky bit is detecting the position of <i title="Immediates, displacements, relative offsets.">anyfields</i>: most instructions put them at the end, but some instructions use the last byte for:</p>

	343 <ul style="margin-top:0px; margin-bottom:0px;">

	344 <li><i>opcode extension</i>: 3DNow! instructions, <code>cmp<i>cc</i>sd</code>/<c ode>vcmp<i>cc</i>sd</code> and <code>cmp<i>cc</i>ss</code>/<code>vcmp<i>cc</i>ss </code>, and <code>pclmulqdq</code>/<code>vpclmulqdq</code>.</li>

	345 <li><i>fourth register operand</i>: some AVX instructions (such as <code>vblendv pd</code>/<code>vblendvps</code>), some FMA4 instructions (such as <code>vfmadds ubpd</code>), and some XOP instructions (such as <code>vpperm</code>).</li>

	346 <li><i>fourth register operand</i> <b>and</b> <i>fifth 2-bit immediate operand</ i>: <code>vpermil2pd</code>/<code>vpermil2ps</code>.</li>

	347 </ul>

	348 <p style="margin-top:0px;">All these instructions set the <code>LAST_BYTE_IS_NOT_IMMEDIATE</code> flag; the last form can be distinguished because it sets the <span title="Which actually includes the LAST_BYTE_IS_NOT_IMMEDIATE flag"><code>IMMEDIATE_2BIT</code> flag</span>.</p>
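<p>A minimal sketch of how a callback might use these two flags to find where the anyfields end. The function names and bit values are assumptions made for illustration; the only facts taken from the text above are that <code>LAST_BYTE_IS_NOT_IMMEDIATE</code> marks a non-anyfield last byte and that <code>IMMEDIATE_2BIT</code> includes it.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit values, chosen only for this sketch. */
#define SKETCH_LAST_BYTE_IS_NOT_IMMEDIATE 0x1u
/* IMMEDIATE_2BIT is described as including LAST_BYTE_IS_NOT_IMMEDIATE. */
#define SKETCH_IMMEDIATE_2BIT (0x2u | SKETCH_LAST_BYTE_IS_NOT_IMMEDIATE)

/* Offset (from the instruction start) one past the last anyfield byte. */
size_t sketch_anyfields_end(size_t instruction_length, uint32_t info) {
  assert(instruction_length > 0);
  if (info & SKETCH_LAST_BYTE_IS_NOT_IMMEDIATE) {
    /* The last byte is an opcode extension or a register number. */
    return instruction_length - 1;
  }
  return instruction_length;
}

/* True only for the form with a register number and a 2-bit immediate. */
int sketch_has_2bit_immediate(uint32_t info) {
  return (info & SKETCH_IMMEDIATE_2BIT) == SKETCH_IMMEDIATE_2BIT;
}
```

<p>Subtracting the cumulative anyfields size from the returned offset would then give the start of the anyfields, under the same assumptions.</p>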

	349

	350 <h4><div style="float:right"><a href="#TOC">▲</a></div><a name="6-4-2">6.4.2. Re placement copying.</a></h4>

	351

	352 <p>As noted <a href="#6-4">above</a>, code replacement is not supported by the <code>ValidateChunkAMD64</code> function directly. Instead it is done by a higher-level function in <code>dfa_validate_64.c</code>.</p>

	353

	354 <p>This is done by a very simple function which uses the <code>CALL_USER_CALLBACK_ON_EACH_INSTRUCTION</code> option to process instructions one after another.</p>
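<p>The instruction-by-instruction processing could look roughly like the sketch below. It is an assumption-laden illustration: the function name is hypothetical, and a plain array of instruction lengths stands in for the per-instruction callbacks that the validator would issue. It also uses an ordinary <code>memcpy</code>, whereas real code that patches live instructions would need a copy that is safe against concurrent execution.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copies only the instructions that differ from src into dst and
 * returns how many instructions were rewritten. lengths[i] is the size
 * of the i-th instruction, identical in both fragments. */
size_t sketch_copy_code(uint8_t *dst, const uint8_t *src,
                        const size_t *lengths, size_t count) {
  size_t offset = 0;
  size_t changed = 0;
  for (size_t i = 0; i < count; ++i) {
    assert(lengths[i] > 0);
    if (memcmp(dst + offset, src + offset, lengths[i]) != 0) {
      memcpy(dst + offset, src + offset, lengths[i]);
      ++changed;
    }
    offset += lengths[i];
  }
  return changed;
}
```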

	355

	356 <h2><div style="float:right"><a href="#TOC">▲</a></div><a name="7">7. Decoders.< /a></h2>

	357

	358 <p>The only remaining issue (but a big one) is the generation of the actual decoders (<code>{decoder,validator}_x86_{32,64}_instruction.rl</code> files). This is a big part of the whole package, but, thankfully, it happens in a significantly less hostile environment: the decoder and validator must work even when they are processing a specially crafted file created by a clever adversary, while <code>gen_dfa</code> processes data files created by us and only needs to correctly process certain “good” files.</p>

	359

	360 <p>To understand how it works, it is better to start with the decoders. Remember how we talked about “streamlined data structures”, “indispensable minimum of the information”, etc.? This approach produces a fast and [relatively] simple validator, but makes it hard to test and debug. To facilitate testing and debugging we create separate decoders: these return all the information about all the instructions they can parse and in fact can produce output identical to <a href="http://sourceware.org/binutils/docs/binutils/objdump.html#objdump">objdump</a>'s output.</p>

	361

	362 <p>They are used to verify the description of the instructions from <code>.def</code> files, with special attention to the lengths of said instructions.</p>

	363

	364 <p>Decoders are created using the familiar process.</p>

	365

	366 <center><img src="filesdecoder.svg" height="120%"/><br />Gray elements are hand- written, white elements are generated and dark-gray are mixers.</center><br />

	367

	368 <p></p>

	369 <p style="margin-bottom:0px;">There are a few big differences between the standalone decoders and the simplified decoders embedded in <code>ValidateChunkIA32</code>/<code>ValidateChunkAMD64</code>:</p>

	370 <ul style="margin-top:0px;">

	371 <li>Standalone decoders are pretty close to each other (the only differences are CPU-dictated, such as REX prefix handling), while simplified decoders are quite different (as dictated by the appropriate SFI models).</li>

	372 <li>Standalone decoders don't have hand-encoded “special” instructions, all the instructions they can decode come from <code>.def</code> files.</li>

	373 <li>Standalone decoders don't squeeze extracted information into a few flat variables. Instead they use <code>struct instruction</code>, which is common to both decoders.</li>

	374 </ul>

	375

	376 <p>All these facts mean that standalone decoders are significantly larger and slower, but also much easier to understand. And simplified decoders use <b>the exact same DFA</b> with only some actions changed or omitted.</p>

215	377

216 </body>	378 </body>

217	379
