OLD | NEW |
1 <head> | 1 <head> |
2 <title>Validator structure</title> | 2 <title>Validator structure</title> |
3 <meta http-equiv="content-type" content="text/html; charset=utf-8" /> | 3 <meta http-equiv="content-type" content="text/html; charset=utf-8" /> |
4 </head> | 4 </head> |
5 <body> | 5 <body> |
| 6 <div> |
| 7 <div style="width:20%; float:left; padding-right:5%;"><a href="http://en.wikiped
ia.org/wiki/File:Duesenberg.jpg"><img border="0" src="http://upload.wikimedia.or
g/wikipedia/commons/thumb/3/3a/Duesenberg.jpg/800px-Duesenberg.jpg" width="100%"
/></a><br /><center><span style="font-size:50%">Source: <a href="http://en.wiki
pedia.org/wiki/File:Duesenberg.jpg">http://en.wikipedia.org/wiki/File:Duesenberg
.jpg</a></span></center></div><div style="width:33%; float:right; padding-left:5
%;"><a href="http://en.wikipedia.org/wiki/File:Felipe_Massa_2011_Malaysia_FP1.jp
g"><img border="0" src="http://upload.wikimedia.org/wikipedia/commons/thumb/5/58
/Felipe_Massa_2011_Malaysia_FP1.jpg/800px-Felipe_Massa_2011_Malaysia_FP1.jpg" wi
dth="100%" /></a><center><span style="font-size:50%">Source: <a href="http://upl
oad.wikimedia.org/wikipedia/commons/thumb/5/58/Felipe_Massa_2011_Malaysia_FP1.jp
g/800px-Felipe_Massa_2011_Malaysia_FP1.jpg">http://upload.wikimedia.org/wikipedi
a/commons/thumb/5/58/Felipe_Massa_2011_Malaysia_FP1.jpg/800px-Felipe_Massa_2011_
Malaysia_FP1.jpg</a></span></center></div> |
| 8 <h1>New, DFA-based validator with 5-10x speed of the original one, or…<br /> |
| 9 <div style="text-align:right;">Luxury car to F1 car.</div></h1> |
| 10 <div style="position:relative; width:55%; left:10%;">Trust me: every problem in
computer science may be solved by an indirection, but those indirections are <b>
expensive</b>. Pointer chasing is just about the most expensive thing you can do
on modern CPU's.<br /><a href="http://lwn.net/Articles/509416/"><i>—Linus Torva
lds</i></a></div> |
| 11 <div> |
6 <a name="TOC"></a> | 12 <a name="TOC"></a> |
| 13 <ol style="clear:both;"> |
| 14 <li><a href="#1">DFA, Ragel, macro and inline functions, oh my…</a></li> |
| 15 <li><a href="#2">What is ragel and how it works.</a></li> |
7 <ol> | 16 <ol> |
8 <li><a href="#1">DFA, Ragel, macroses and inline functions, oh my…</a></li> | 17 <li><a href="#2-1">Ragel actions.</a></li> |
9 <li><a href="#2">“Special” instructions.</a></li> | 18 </ol> |
10 <li><a href="#3">“Not so special” instructions.</a></li> | 19 <li><a href="#3">“Special” instructions.</a></li> |
11 <li><a href="#4">Features beyond minimal validation.</a></li> | 20 <li><a href="#4">“Not so special” instructions.</a></li> |
| 21 <li><a href="#5">Features beyond minimal validation.</a></li> |
12 <ol> | 22 <ol> |
13 <li><a href="#4-1"><code>CPUID</code> support.</a></li> | 23 <li><a href="#5-1"><code>CPUID</code> support.</a></li> |
14 <li><a href="#4-2">Dynamic code creation support.</a></li> | 24 <li><a href="#5-2">Dynamic code modification support.</a></li> |
15 <li><a href="#4-3">Dynamic code modification support.</a></li> | 25 <ol> |
| 26 <li><a href="#5-2-1">Replacement validation.</a></li> |
| 27 <li><a href="#5-2-2">Replacement copying.</a></li> |
16 </ol> | 28 </ol> |
17 <li><a href="#5">Validation for x86-64 mode.</a></li> | 29 </ol> |
| 30 <li><a href="#6">Validation for x86-64 mode.</a></li> |
18 <ol> | 31 <ol> |
19 <li><a href="#5-1">“Secondary” states.</a></li> | 32 <li><a href="#6-1">“Secondary” states.</a></li> |
20 <li><a href="#5-2">“Normal” instructions.</a></li> | 33 <li><a href="#6-2">“Normal” instructions.</a></li> |
21 <li><a href="#5-3">Operands handling.</a></li> | 34 <li><a href="#6-3">Operands handling.</a></li> |
| 35 <li><a href="#6-4">Dynamic code modification support.</a></li> |
| 36 <ol> |
| 37 <li><a href="#6-4-1">Replacement validation.</a></li> |
| 38 <li><a href="#6-4-2">Replacement copying.</a></li> |
22 </ol> | 39 </ol> |
23 <li><a href="#6">Decoders.</a></li> | |
24 </ol> | 40 </ol> |
25 <h2><div style="float:right"><a href="#TOC">▲</a></div><a name="1">1. DFA, Ragel
, macroses and inline functions, oh my…</a></h2> | 41 <li><a href="#7">Decoders.</a></li> |
26 <p>To understand how DFA-based validators work it's best to start from the function
<code>ValidateChunkIA32</code> in <code>validator_x86_32.rl</code>. Said functio
n is very short and “simple”: it allocates a couple of arrays (<code>valid_targets
</code> and <code>jump_dests</code>), then loops over the code passed to it (proces
sing it in bundle-sized chunks) and at the end it compares valid jump targets an
d collected jump destinations… that's it. Oh, and it also includes a couple of cry
ptic lines right in the middle of the innermost loop:<hr /> | 42 </ol> |
| 43 <h2><div style="float:right"><a href="#TOC">▲</a></div><a name="1">1. DFA, Ragel
, macro and inline functions, oh my…</a></h2> |
| 44 |
| 45 <p>Contemporary computer systems are extremely powerful and most complex compone
nts and libraries are built like a <a href="http://en.wikipedia.org/wiki/Luxury_
vehicle">luxury car</a>: they include a lot of comfort and safety technologies w
hich are designed to improve the life of the user of said components. This also faci
litates <a href="http://en.wikipedia.org/wiki/Code_reuse">code reuse</a> via <a
href="http://en.wikipedia.org/wiki/Modular_programming">modular programming</a>
and generally improves <a href="http://en.wikipedia.org/wiki/Maintainability">ma
intainability</a>.</p> |
| 46 |
| 47 <p>Unfortunately these complex structures, improved comfort for the library user
and commendable flexibility have a flip side: they lead to a lot of additional
work at runtime! You first fill and then parse complex data structures—and this
takes time. You often produce a lot of information on the low levels which is ju
st not used on higher levels—and this work is also not free.</p> |
| 48 |
| 49 <p>The new validator is built differently. It only keeps around the indispensable min
imum of the information needed to prove (or disprove) that code is safe. Similar
ly to how <a href="http://en.wikipedia.org/wiki/Formula_One_car">F1 car</a> uses
<a href="http://www.youtube.com/watch?v=NsvWnGgT7Ok">custom-designed car seats<
/a> we use custom-designed data structures to push the data from one point of va
lidator to another one. <span title="Actually we collect slightly more than the
bare minimum to make testing possible.">We only collect the bare minimum of the
information</span>—and if the requirements are changing we often change all the
pieces: from <code>gen_dfa</code> input data format to the highest-level <code>d
fa_validate_32.c</code>/<code>dfa_validate_64.c</code> external API adapters.</p
> |
| 50 |
| 51 <p>This streamlining was one of the most important design goals of a new validat
or. And indeed the code which reaches the CPU is very simple: it does not contai
n complex data structures and multilayered functions while all the previous vali
dators had many layers and quite a few complex data structures. How can it be? W
ere all these structures superfluous and unnecessary? Well… not really. New vali
dator throws away all that complexity and trades it for a few comparisons and ju
mps. <b>Tens of thousands of comparisons and a similar number of jumps</b>, to be exa
ct. In a <b>single flat function</b>. Basically we trade runtime complexity for
build-time complexity. As you can guess it's practically not possible to write s
uch a function by hand—and even if someone will be able to write tens of thousan
ds of lines of code by hand it'll be impossible to understand and review. People
 are not CPUs! They can keep track of millions of lines of code in complex project
s if these are organized in modules and are nicely separated, but give them fift
y thousand lines of homogeneous code—and they'll be totally lost. But this is ba
sically what we have here in the end product—because CPU loves such code. To sol
ve this dilemma we employ three levels of filters to create the final code.</p> |
| 52 |
| 53 <center><img src="files32.svg" height="90%"/><br />Gray elements are hand-writte
n, white elements are generated and dark-gray are aforementioned mixers.</center
><br /> |
| 54 |
| 55 <p>To understand how the validator works it's best to start from the function <code>Vali
dateChunkIA32</code> in <code>validator_x86_32.rl</code>. Said function is very
short and “simple”: it allocates a couple of arrays (<code>valid_targets</code> an
d <code>jump_dests</code>), then loops over the code passed to it (processing it in
bundle-sized chunks) and at the end it compares valid jump targets and collecte
d jump destinations… that's it. Oh, and it also includes a couple of cryptic lines
right in the middle of the innermost loop:<hr /> |
27 <code>%% write init;</code><br /> | 56 <code>%% write init;</code><br /> |
28 <code>%% write exec;</code><hr /> | 57 <code>%% write exec;</code><hr /> |
29 Apparently collection of valid jump targets and actual target destinations happe
ns here. How?</p> | 58 Apparently collection of valid jump targets and actual target destinations happe
ns here. How?</p> |
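The final step described above (comparing collected jump destinations against valid jump targets) can be sketched in C as follows. This is only an illustration under assumptions: it treats both arrays as bitmaps with one bit per code byte, and the helper names `bit_is_set` and `all_jumps_land_on_valid_targets` are invented here, not taken from `validator_x86_32.rl`.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: one bit per byte of the validated region, least significant bit first. */
static bool bit_is_set(const uint8_t *bits, size_t index) {
  return ((bits[index / 8] >> (index % 8)) & 1u) != 0;
}

/* Hypothetical final check: every offset recorded as a direct jump destination
 * must also be recorded as the first byte of some instruction. */
static bool all_jumps_land_on_valid_targets(const uint8_t *valid_targets,
                                            const uint8_t *jump_dests,
                                            size_t size) {
  for (size_t i = 0; i < size; ++i) {
    if (bit_is_set(jump_dests, i) && !bit_is_set(valid_targets, i)) {
      return false;  /* a jump lands in the middle of an instruction */
    }
  }
  return true;
}
```

A real implementation would likely compare whole machine words at a time; the bit-by-bit loop is used here only because it is easy to read.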
30 <a name="ragel"></a><blockquote style="background:lightgray; font-size:90%;"> | 59 <h2><div style="float:right"><a href="#TOC">▲</a></div><a name="2">2. What is ra
gel and how it works.</a></h2> |
| 60 |
| 61 <blockquote style="background:lightgray; font-size:90%;"> |
| 62 |
31 <p>To understand that you need to know a little about DFA and Ragel. I'll not ex
plain what the DFA is (it's explained in CS course you've heard years back… or y
ou can refresh your knowledge on <a href="http://en.wikipedia.org/wiki/Determini
stic_finite_automaton">Wikipedia</a>). But I'll explain a little about Ragel. Ex
tensive documentation with all the gory details is <a href="http://www.complang.
org/ragel/">on Ragel's site</a>, but while it explains <b>how</b> to use Ragel i
t does not explain <b>what</b> it is and <b>why</b> you may want to use it.</p> | 63 <p>To understand that you need to know a little about DFA and Ragel. I'll not ex
plain what the DFA is (it's explained in CS course you've heard years back… or y
ou can refresh your knowledge on <a href="http://en.wikipedia.org/wiki/Determini
stic_finite_automaton">Wikipedia</a>). But I'll explain a little about Ragel. Ex
tensive documentation with all the gory details is <a href="http://www.complang.
org/ragel/">on Ragel's site</a>, but while it explains <b>how</b> to use Ragel i
t does not explain <b>what</b> it is and <b>why</b> you may want to use it.</p> |
32 | 64 |
33 <p>Let's start with the first question: <b>what</b> it is. Ragel is a compiler of
DFA machines… but with a twist. You describe DFA structure using simple <a href=
"http://en.wikipedia.org/wiki/Regular_expression">RE</a>-style format and Ragel
generates the corresponding code in C (D/Go/Java/Ruby/etc: Ragel supports a lot
of languages, but we are interested in C here). When you describe the DFA you jus
t write acceptable bytes and then use the following operations: concatenation (“
1 . 2” will accept “1” followed by “2”), union (“1 | 2” will accept eithe
r “1” or “2”), intersection (“('a'..'n') & ('m'..'z')” will accept either “m” or
“n”), difference (“('a'..'n') - ('m'..'z')” will accept everything between “a”
and “l”, but will not accept either “m” or “n”) and Kleene star (“(1 | 2)*” will
accept any number of “1” or “2”).</p> | 65 <p>Let's start with the first question: <b>what</b> it is. Ragel is a compiler of
DFA machines… but with a twist. You describe DFA structure using simple <a href=
"http://en.wikipedia.org/wiki/Regular_expression">RE</a>-style format and Ragel
generates the corresponding code in C (D/Go/Java/Ruby/etc: Ragel supports a lot
of languages, but we are interested in C here). When you describe the DFA you jus
t write acceptable bytes and then use the following operations: concatenation (“
1 . 2” will accept “1” followed by “2”), union (“1 | 2” will accept eithe
r “1” or “2”), intersection (“('a'..'n') & ('m'..'z')” will accept either “m” or
“n”), difference (“('a'..'n') - ('m'..'z')” will accept everything between “a”
and “l”, but will not accept either “m” or “n”) and Kleene star (“(1 | 2)*” will
accept any number of “1” or “2”).</p> |
34 | 66 |
35 <p>These operations can produce quite non-trivial results: e.g. “("b" . ("aa"+ |
"aaa"+))*” will produce the following DFA:</p> | 67 <p>These operations can produce quite non-trivial results: e.g. “("b" . ("aa"+ |
"aaa"+))*” will produce the following DFA:</p> |
36 <center><img src="sample1.svg" width="100%"/></center><br /> | 68 <center><img src="sample1.svg" width="100%"/></center><br /> |
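To make the example expression concrete, here is a small hand-written recognizer in C for the same language, ("b" . ("aa"+ | "aaa"+))*. It is a sketch only: it is not Ragel output, the state numbering is invented for illustration, and it is not claimed to match the pictured DFA node for node. It does show why the automaton has to follow both alternatives at once: the run length of "a" is tracked modulo 6, which covers the "multiple of 2" and "multiple of 3" branches in a single state.

```c
#include <stdbool.h>
#include <stddef.h>

/* States (illustrative numbering):
 *   0     = between groups (also the start state)
 *   1     = just consumed 'b', no 'a' seen yet
 *   2 + r = inside a run of 'a' whose length is r modulo 6 (length >= 1) */
static bool is_accepting(int state) {
  /* A run is complete when its length is a multiple of 2 or of 3. */
  return state == 0 || state == 2 + 0 || state == 2 + 2 ||
         state == 2 + 3 || state == 2 + 4;
}

/* Returns the next state, or -1 when the byte is rejected. */
static int next_state(int state, char c) {
  if (c == 'b') {
    /* A new group may only start where the previous one could end. */
    return is_accepting(state) ? 1 : -1;
  }
  if (c == 'a') {
    if (state == 1) return 2 + 1;
    if (state >= 2) return 2 + (state - 2 + 1) % 6;
  }
  return -1;
}

static bool matches(const char *s, size_t len) {
  int state = 0;
  for (size_t i = 0; i < len; ++i) {
    state = next_state(state, s[i]);
    if (state < 0) return false;  /* no rollback: a rejected byte is final */
  }
  return is_accepting(state);
}
```

With this sketch, `matches("baabaaa", 7)` succeeds while `matches("baaaaa", 5 + 1)` fails, because five is a multiple of neither 2 nor 3.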
37 <p>If, instead of “("aa"+ | "aaa"+)” in the example above you'll use something l
ike “("a"{5}+ | "a"{7}+ | "a"{11}+)” then the resulting DFA will include almost
four hundred nodes and over five hundred transitions! This limits applicabilit
y of DFA technology: e.g. it's possible to describe "valid code sequence" (inclu
ding bundles, "restricted registers" and everything else) as a DFA, but… said DF
A will include millions of nodes and billions of transitions!</p> | 69 <p>If, instead of “("aa"+ | "aaa"+)” in the example above you'll use something l
ike “("a"{5}+ | "a"{7}+ | "a"{11}+)” then the resulting DFA will include almost
four hundred nodes and over five hundred transitions! This limits applicabilit
y of DFA technology: e.g. it's possible to describe "valid code sequence" (inclu
ding bundles, "restricted registers" and everything else) as a DFA, but… said DF
A will include millions of nodes and billions of transitions!</p> |
38 | 70 |
39 <p><a name="actions">To overcome this problem Ragel offers so-called "actions":
pieces of code which are called when certain pieces in DFA are reached. E.g. we
can mark begin and end of “aa” (or “aaa”) in the example above—“("b" . (("aa" >b
egin @end)+ | ("aaa" >begin @end)+ ))*” produces the following DFA:</a></p> | 71 <h3><div style="float:right"><a href="#TOC">▲</a></div><a name="2-1">2.1. Ragel
actions.</a></h3> |
| 72 |
| 73 <p>To overcome this problem Ragel offers so-called "actions": pieces of code whi
ch are called when certain pieces in DFA are reached. E.g. we can mark begin and
end of “aa” (or “aaa”) in the example above—“("b" . (("aa" >begin @end)+ | ("aa
a" >begin @end)+ ))*” produces the following DFA:</p> |
40 <center><img src="sample2.svg" width="100%"/></center> | 74 <center><img src="sample2.svg" width="100%"/></center> |
41 <p style="margin-bottom:0px;">Let's see what happens if we'll feed it with “baaa
aaaaaa” sequence:</p> | 75 <p style="margin-bottom:0px;">Let's see what happens if we'll feed it with “baaa
aaaaaa” sequence:</p> |
42 <ul style="margin-top:0px;"> | 76 <ul style="margin-top:0px;"> |
43 <li><i>offset 0</i>: <i>nothing</i></li> | 77 <li><i>offset 0</i>: <i>nothing</i></li> |
44 <li><i>offset 1</i>: <code>begin</code></li> | 78 <li><i>offset 1</i>: <code>begin</code></li> |
45 <li><i>offset 2</i>: <code>end</code></li> | 79 <li><i>offset 2</i>: <code>end</code></li> |
46 <li><i>offset 3</i>: <code>begin</code> then <code>end</code></li> | 80 <li><i>offset 3</i>: <code>begin</code> then <code>end</code></li> |
47 <li><i>offset 4</i>: <code>end</code> then <code>begin</code></li> | 81 <li><i>offset 4</i>: <code>end</code> then <code>begin</code></li> |
48 <li><i>offset 5</i>: <code>begin</code></li> | 82 <li><i>offset 5</i>: <code>begin</code></li> |
49 <li><i>offset 6</i>: <code>end</code></li> | 83 <li><i>offset 6</i>: <code>end</code></li> |
(...skipping 15 matching lines...) |
65 <li><i>offset 7</i>: <code>begin2</code> then <code>begin3</code></li> | 99 <li><i>offset 7</i>: <code>begin2</code> then <code>begin3</code></li> |
66 <li><i>offset 8</i>: <code>end2</code></li> | 100 <li><i>offset 8</i>: <code>end2</code></li> |
67 <li><i>offset 9</i>: <code>begin2</code> then <code>end3</code></li> | 101 <li><i>offset 9</i>: <code>begin2</code> then <code>end3</code></li> |
68 </ul> | 102 </ul> |
69 <p style="margin-bottom:0px;">Ah-ha. Now everything is clear. DFA is DFA: it doe
s not support memory and it does not support rollbacks. This means that our DFA
is processing two branches simultaneously—both “"aa"+” and “"aaa"+”. We'll need
to keep this in mind. A couple of other observations: </p> | 103 <p style="margin-bottom:0px;">Ah-ha. Now everything is clear. DFA is DFA: it doe
s not support memory and it does not support rollbacks. This means that our DFA
is processing two branches simultaneously—both “"aa"+” and “"aaa"+”. We'll need
to keep this in mind. A couple of other observations: </p> |
70 <ol style="margin-top:0px;"> | 104 <ol style="margin-top:0px;"> |
71 <li>When we used just the <code>begin</code> action, <code>begin</code> was ca
lled once, but when we split it in two (<code>begin2</code> and <code>begin3</co
de>) both are called! By default Ragel merges actions.</li> | 105 <li>When we used just the <code>begin</code> action, <code>begin</code> was ca
lled once, but when we split it in two (<code>begin2</code> and <code>begin3</co
de>) both are called! By default Ragel merges actions.</li> |
72 <li>Actions are called in non-random order—take a look at <i>offset 4</i>: <code
>end2</code> is called before <code>begin3</code>. That's because <code>begin3</
code> has lower priority than <code>end2</code>! Note that in the previous example t
his same effect was observed, but it was quite mysterious there. The closer the
action is to the beginning of the source file the higher its priority is.</li> | 106 <li>Actions are called in non-random order—take a look at <i>offset 4</i>: <code
>end2</code> is called before <code>begin3</code>. That's because <code>begin3</
code> has lower priority than <code>end2</code>! Note that in the previous example t
his same effect was observed, but it was quite mysterious there. The closer the
action is to the beginning of the source file the higher its priority is.</li> |
73 </ol> | 107 </ol> |
74 </blockquote> | 108 </blockquote> |
75 <h2><div style="float:right"><a href="#TOC">▲</a></div><a name="2">2. “Special”
instructions.</a></h2> | 109 <h2><div style="float:right"><a href="#TOC">▲</a></div><a name="3">3. “Special”
instructions.</a></h2> |
76 <p>Now we can go back to machine description. Our main DFA is the same in all ca
ses, it's “<code>(one_instruction | special_instruction)*</code>”—i.e. it accept
s sequence of “normal” instructions and “special” instructions.</p> | 110 <p>Now we can go back to machine description. Our main DFA is the same in all ca
ses, it's “<code>(one_instruction | special_instruction)*</code>”—i.e. it accept
s sequence of “normal” instructions and “special” instructions.</p> |
77 | 111 |
78 <p>Also, just like in example above there are two actions: first one is triggere
d at the beginning of the <code>instruction</code> (“normal” or “special”)—it's
used to remember the beginning of the instruction, to clear the list of <code>er
rors_detected</code>, and to mark the first byte of the instruction as valid tar
get for the direct jump; second one is triggered at the final byte of the <code>
instruction</code> (“normal” or “special”)—and is used to report errors. And the
re is also one additional action which is declared as “<code>$err</code>”. This
is <i>error fallback action</i>: it's triggered whenever our machine rejects so
me byte (which means we've hit either forbidden instruction like <code>lgdt</cod
e> or some undefined byte sequence… in both cases <code> UNRECOGNIZED_INSTRUCTIO
N</code> error is reported and processing is stopped).</p> | 112 <p>Also, just like in example above there are two actions: first one is triggere
d at the beginning of the <code>instruction</code> (“normal” or “special”)—it's
used to remember the beginning of the instruction, to clear the <code>instructio
n_info_collected</code>, and to mark the first byte of the instruction as valid
target for the direct jump; second one is triggered at the final byte of the <co
de>instruction</code> (“normal” or “special”)—and is used to report errors. And
there is also one additional action which is declared as “<code>$err</code>”. T
his is <i>error fallback action</i>: it's triggered whenever our machine rejects
some byte (which means we've hit either forbidden instruction like <code>lgdt</
code> or some undefined byte sequence… in both cases <code> UNRECOGNIZED_INSTRUC
TION</code> error is reported and processing is stopped).</p> |
79 | 113 |
80 <p>There are three “special” instructions in IA32 case: <code>naclcall</code>, <
code>nacljmp</code> and <code title="mov %gs:0x0,%reg is part of public ABI,
mov %gs:0x4,%reg is used in IRT">mov %gs:0x0/0x4,%reg</code>. The last one is
declared as “special” instruction to simplify the validation logic (and DFA, to
o): instead of accepting all versions of <code>mov %gs:<i>something</i>,%reg</co
de> instruction followed by additional logic which rejects most possibilities (o
nly plain vanilla “zero” is allowed here as per ABI) we only describe this one
version of the instruction and ragel does the rest. <code>naclcall</code> and <c
ode>nacljmp</code> include special action which clears the “valid destination ad
dress” bit (remember the story with <code>begin</code> and <code>end</code> acti
ons above? when first byte of a second half of <code>naclcall</code>/<code>naclj
mp</code> is processed it's processed as <b>both</b> part of the <code>naclcall<
/code>/<code>nacljmp</code> <b>and</b> as a start of a regular instruction, too)
.</p> | 114 <p>There are three “special” instructions in IA32 case: <code>naclcall</code>, <
code>nacljmp</code> and <code title="mov %gs:0x0,%reg is part of public ABI,
mov %gs:0x4,%reg is used in IRT">mov %gs:0x0/0x4,%reg</code>. The last one is
declared as “special” instruction to simplify the validation logic (and DFA, to
o): instead of accepting all versions of <code>mov %gs:<i>something</i>,%reg</co
de> instruction followed by additional logic which rejects most possibilities (o
nly plain vanilla “zero” is allowed here as per ABI) we only describe this one
version of the instruction and ragel does the rest. <code>naclcall</code> and <c
ode>nacljmp</code> include special action which clears the “valid destination ad
dress” bit (remember the story with <code>begin</code> and <code>end</code> acti
ons above? when first byte of a second half of <code>naclcall</code>/<code>naclj
mp</code> is processed it's processed as <b>both</b> part of the <code>naclcall<
/code>/<code>nacljmp</code> <b>and</b> as a start of a regular instruction, too)
.</p> |
81 | 115 |
82 <p>This explains how <code>valid_targets</code> array is filled and invalid inst
ructions are rejected.</p> | 116 <p>This explains how <code>valid_targets</code> array is filled and invalid inst
ructions are rejected.</p> |
83 <h2><div style="float:right"><a href="#TOC">▲</a></div><a name="3">3. “Not so sp
ecial” instructions.</a></h2> | 117 <h2><div style="float:right"><a href="#TOC">▲</a></div><a name="4">4. “Not so sp
ecial” instructions.</a></h2> |
84 <p>But of course there are <code>jump_dests</code>, too. Special instructions do
n't touch it, but something obviously fills the array, doesn't it? This can only b
e the result of processing normal instructions, thus we need to go deeper. Where
does it all come from? To understand that we need to look at the [autogenerated] <code>v
alidator_x86_32_instruction.rl</code> file. The file looks like this:<hr /> | 118 <p>But of course there are <code>jump_dests</code>, too. Special instructions do
n't touch it, but something obviously fills the array, doesn't it? This can only b
e the result of processing normal instructions, thus we need to go deeper. Where
does it all come from? To understand that we need to look at the [autogenerated] <code>v
alidator_x86_32_instruction.rl</code> file. The file looks like this:<hr /> |
85 ⋮<br /> | 119 ⋮<br /> |
86 <i>Semi-manual simple helper machines and a
ctions</i><br /> | 120 <i>Semi-manual simple helper machines and a
ctions</i><br /> |
87 ⋮<br /> | 121 ⋮<br /> |
88 <code>one_instruction =</code><br /> | 122 <code>one_instruction =</code><br /> |
89 ⋮<br /> | 123 ⋮<br /> |
90 <code>(branch_hint? 0x77 rel8) |</code><b
r /> | 124 <code>(branch_hint? 0x77 rel8) |</code><b
r /> |
91 <code>(branch_hint? (0x0f 0x87) rel32)&nb
sp;|</code><br /> | 125 <code>(branch_hint? (0x0f 0x87) rel32)&nb
sp;|</code><br /> |
92 ⋮<br /> | 126 ⋮<br /> |
93 <code>((0x0f 0x01 0xd0) @CPUFeature_FXSR)
</code>;<hr /> | 127 <code>((0x0f 0x01 0xd0) @CPUFeature_FXSR)
</code>;<hr /> |
94 </p> | 128 </p> |
95 <code>0x77</code> and <code>0x0f 0x87</code> are opcodes for <code>ja</code
> (aka <code>jnbe</code>) instruction, but what are <code>branch_hint?</code> an
d <code>rel8</code>/<code>rel32</code> doing here? Well, “<code>?</code>” me
ans “optional” (like in most <a href="http://en.wikipedia.org/wiki/Regular_expre
ssion">RE</a>-engines) and both <code>branch_hint</code> and <code>rel8</code>/<
code>rel32</code> definitions are references to machines defined in the <i>semi-
manual simple helper machines and actions</i> part of <code>validator_x86_32_ins
truction.rl</code> file. The whole construct describes part of the DFA which is
designed to accept <code>ja</code> (aka <code>jnbe</code>) instruction—complete
with optional P4-inspired branch prediction prefix. Definition of <code>branch_h
int</code> is trivial and obvious (“<code>branch_hint = 0x2e | 0x3e;</code>” if
you want to know), but <code>rel8</code>/<code>rel32</code> are somewhat more “i
nteresting”:<hr /> | 129 <code>0x77</code> and <code>0x0f 0x87</code> are opcodes for <code>ja</code
> (aka <code>jnbe</code>) instruction, but what are <code>branch_hint?</code> an
d <code>rel8</code>/<code>rel32</code> doing here? Well, “<code>?</code>” me
ans “optional” (like in most <a href="http://en.wikipedia.org/wiki/Regular_expre
ssion">RE</a>-engines) and both <code>branch_hint</code> and <code>rel8</code>/<
code>rel32</code> definitions are references to machines defined in the <i>semi-
manual simple helper machines and actions</i> part of <code>validator_x86_32_ins
truction.rl</code> file. The whole construct describes part of the DFA which is
designed to accept <code>ja</code> (aka <code>jnbe</code>) instruction—complete
with optional P4-inspired branch prediction prefix. Definition of <code>branch_h
int</code> is trivial and obvious (“<code>branch_hint = 0x2e | 0x3e;</code>” if
you want to know), but <code>rel8</code>/<code>rel32</code> are somewhat more “i
nteresting”:<hr /> |
96 <code>rel8 = any @rel8_operand;</code><br
/> | 130 <code>rel8 = any @rel8_operand;</code><br
/> |
97 <code>rel32 = any{4} @rel32_operand;</cod
e><hr /> | 131 <code>rel32 = any{4} @rel32_operand;</cod
e><hr /> |
98 It's "more interesting" not because it's complex or non-obvious. The interesting
part here is the fact that actions <code>rel8_operand</code>/<code>rel32_operand
</code> are <b>not</b> present in <code>validator_x86_32_instruction.rl</code>,
they are in <code>validator_x86_32.rl</code> file! But the definition itself is
pretty trivial:<hr /> | 132 It's "more interesting" not because it's complex or non-obvious. The interesting
part here is the fact that actions <code>rel8_operand</code>/<code>rel32_operand
</code> are <b>not</b> present in <code>validator_x86_32_instruction.rl</code>,
they are in <code>validator_x86_32.rl</code> file! But the definition itself is
pretty trivial:<hr /> |
99 <code>action rel8_operand {</code><br /> | 133 <code>action rel8_operand {</code><br /> |
100 <code>int8_t offset = (uint8_t) (p[0
]);</code><br /> | 134 <code>int8_t offset = (uint8_t) (p[0
]);</code><br /> |
101 <code>size_t jump_dest = offset +&nb
sp;(p - data) + 1;</code><br /><br /> | 135 <code>size_t jump_dest = offset +&nb
sp;(p - data) + 1;</code><br /><br /> |
102 <code>if (!MarkJumpTarget(jump_dest, jump_dest
s, size)) {</code><br /> | 136 <code>if (!MarkJumpTarget(jump_dest, jump_dest
s, size)) {</code><br /> |
103 <code>errors_detected |= DIRECT_JU
MP_OUT_OF_RANGE;</code><br /> | 137 <code>instruction_info_collected |=&nbs
p;DIRECT_JUMP_OUT_OF_RANGE;</code><br /> |
104 <code>}</code><br /> | 138 <code>}</code><br /> |
105 <code>}</code><br /> | 139 <code>}</code><br /> |
106 <code>action rel32_operand {</code><br /> | 140 <code>action rel32_operand {</code><br /> |
107 <code>int32_t offset =</code><br /> | 141 <code>int32_t offset =</code><br /> |
108 <code>(p[-3] + 256U&nb
sp;* (p[-2] + 256U * (p[-1] + 256U *&nbs
p;((uint32_t) p[0]))));</code><br /> | 142 <code>(p[-3] + 256U&nb
sp;* (p[-2] + 256U * (p[-1] + 256U *&nbs
p;((uint32_t) p[0]))));</code><br /> |
109 <code>size_t jump_dest = offset +&nb
sp;(p - data) + 1;</code><br /><br /> | 143 <code>size_t jump_dest = offset +&nb
sp;(p - data) + 1;</code><br /><br /> |
110 <code>if (!MarkJumpTarget(jump_dest, jump_dest
s, size)) {</code><br /> | 144 <code>if (!MarkJumpTarget(jump_dest, jump_dest
s, size)) {</code><br /> |
111 <code>errors_detected |= DIRECT_JU
MP_OUT_OF_RANGE;</code><br /> | 145 <code>instruction_info_collected |=&nbs
p;DIRECT_JUMP_OUT_OF_RANGE;</code><br /> |
112 <code>}</code><br /> | 146 <code>}</code><br /> |
113 <code>}</code><hr /> | 147 <code>}</code><hr /> |
114 We just check if the jump target passes the preliminary check (a direct jump to the outsid
e of the region is always invalid) and if that's not so then we report the error <code>
DIRECT_JUMP_OUT_OF_RANGE</code>.</p> | 148 We just check if the jump target passes the preliminary check (a direct jump to the outsid
e of the region is always invalid) and if that's not so then we report the error <code>
DIRECT_JUMP_OUT_OF_RANGE</code>.</p> |
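The arithmetic inside these two actions can be tried out in isolation with a standalone C sketch. As in the actions above, `p` points at the last byte of the instruction, so the destination is "offset of the next instruction plus the signed displacement". The function names, the bitmap layout and the exact bounds rule (`jump_dest >= size` is rejected) are assumptions made for this illustration; the real `MarkJumpTarget` may differ.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for MarkJumpTarget: one bit per code byte. */
static bool mark_jump_target(size_t jump_dest, uint8_t *jump_dests, size_t size) {
  if (jump_dest >= size) return false;  /* assumed bounds rule */
  jump_dests[jump_dest / 8] |= (uint8_t)(1u << (jump_dest % 8));
  return true;
}

/* rel8: the last byte is a signed 8-bit displacement. */
static size_t rel8_dest(const uint8_t *data, const uint8_t *p) {
  int8_t offset = (int8_t)p[0];
  return (size_t)((ptrdiff_t)(p - data) + 1 + offset);
}

/* rel32: the last four bytes are a little-endian signed 32-bit displacement. */
static size_t rel32_dest(const uint8_t *data, const uint8_t *p) {
  uint32_t raw = (uint32_t)p[-3] | ((uint32_t)p[-2] << 8) |
                 ((uint32_t)p[-1] << 16) | ((uint32_t)p[0] << 24);
  int32_t offset = (int32_t)raw;
  return (size_t)((ptrdiff_t)(p - data) + 1 + offset);
}
```

A destination before the start of the region wraps around to a very large `size_t` value, so the single `jump_dest >= size` comparison rejects jumps in both directions.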
115 | 149 |
116 <h2><div style="float:right"><a href="#TOC">▲</a></div><a name="4">4. Features b
eyond minimal validation.</a></h2> | 150 <h2><div style="float:right"><a href="#TOC">▲</a></div><a name="5">5. Features b
eyond minimal validation.</a></h2> |
117 <p style="margin-bottom:0px;">This covers most of the functionality of the valid
ator (we'll discuss the generation of <code>validator_x86_32_instruction.rl</cod
e> file later), but there are still some details not covered here:</p> | 151 <p style="margin-bottom:0px;">This covers most of the functionality of the valid
ator (we'll discuss the generation of <code>validator_x86_32_instruction.rl</cod
e> file later), but there are still some details not covered here:</p> |
118 <ol style="margin-top:0px;"> | 152 <ol style="margin-top:0px;"> |
119 <li><a href="#4-1"><code>CPUID</code> support.</a></li> | 153 <li><a href="#5-1"><code>CPUID</code> support.</a></li> |
120 <li><a href="#4-2">Dynamic code creation support.</a></li> | 154 <li><a href="#5-2">Dynamic code modification support.</a></li> |
121 <li><a href="#4-3">Dynamic code modification support.</a></li> | 155 <ol> |
| 156 <li><a href="#5-2-1">Replacement validation.</a></li> |
| 157 <li><a href="#5-2-2">Replacement copying.</a></li> |
| 158 </ol> |
122 </ol> | 159 </ol> |
123 | 160 |
124 <h3><div style="float:right"><a href="#TOC">▲</a></div><a name="4-1">4.1. <code>
CPUID</code> support.</a></h3> | 161 |
| 162 <h3><div style="float:right"><a href="#TOC">▲</a></div><a name="5-1">5.1. <code>
CPUID</code> support.</a></h3> |
125 | 163 |
126 <p><code>CPUID</code> support is implemented using large set of actions embedded
in definition of instructions (see, e.g. <code>@CPUFeature_FXSR</code> in the l
ine for instruction <code>0x0f 0x01 0xd0</code> AKA <code>xgetbv</code>). CPUID-
related actions are triggered when we know the identity of the instruction (whic
h happens at different times for different instructions: some instructions are d
etected when opcode is read, some use <i>opcode extension</i>, etc—AMD/Intel man
uals contain all the gory details), but the definitions for said actions in <code
>validator_x86_32_instruction.rl</code> are very simple<hr /> | 164 <p><code>CPUID</code> support is implemented using large set of actions embedded
in definition of instructions (see, e.g. <code>@CPUFeature_FXSR</code> in the l
ine for instruction <code>0x0f 0x01 0xd0</code> AKA <code>xgetbv</code>). CPUID-
related actions are triggered when we know the identity of the instruction (whic
h happens at different times for different instructions: some instructions are d
etected when opcode is read, some use <i>opcode extension</i>, etc—AMD/Intel man
uals contain all the gory details), but the definition for said actions in <code
>validator_x86_32_instruction.rl</code> are very simple<hr /> |
127 <code>action CPUFeature_FXSR {</code><br /> | 165 <code>action CPUFeature_FXSR {</code><br /> |
128 <code>SET_CPU_FEATURE(CPUFeature_FXSR);</code><br /> | 166 <code>SET_CPU_FEATURE(CPUFeature_FXSR);</code><br /> |
129 <code>}</code><hr /> | 167 <code>}</code><hr /> |
130 This time the magic is in <code>validator_internal.h</code>. <code>SET_CPU_FEATURE</code> is defined as:<hr /> | 168 This time the magic is in <code>validator_internal.h</code>. <code>SET_CPU_FEATURE</code> is defined as:<hr /> |
131 <code>#define SET_CPU_FEATURE(F) \</code><br /> | 169 <code>if (!(F##_Allowed)) { \</code><br /> |
132 <code>if (!(F)) { \</code><br /> | 170 <code>instruction_info_collected |= UNRECOGNIZED_INSTRUC
TION; \</code><br /> |
133 <code>errors_detected |= CPUID_UNSUPPORTED_INS
TRUCTION; \</code><br /> | 171 <code>} \</code><br /> |
| 172 <code>if (!(F)) { \</code><br /> |
| 173 <code>instruction_info_collected |= CPUID_UNSUPPORTED_IN
STRUCTION; \</code><br /> |
134 <code>}</code><hr /> | 174 <code>}</code><hr /> |
135 IOW: it's pretty straightforward and simple, but there is a twist: <code>CPUFeature_FXSR</code> is not the name of a variable, but the name of a macro definition. This is needed to handle special cases where a <code>CPUFeature</code> does not correspond to a single <code>CPUID</code> bit. E.g. the <code>prefetch</code> instruction is available when <b>any one</b> of three bits is set: the <code>3DNow!</code> bit, the dedicated <code>Prefetch instruction</code> bit, or the <code>LongMode</code> bit. On the other hand, <code>vaesenc</code> is available only when <b>both</b> the <code>AES</code> and <code>AVX</code> bits are set. And our ABI <a href="http://code.google.com/p/nativeclient/issues/detail?id=2869">permits <code>lzcnt</code> and <code>tzcnt</code> unconditionally</a> (thus <code>CPUFeature_LZCNT</code> does not check for anything but just returns <code>TRUE</code> in all cases). | 175 IOW: it's pretty straightforward and simple, but there is a twist: <code>CPUFeature_FXSR</code> is not the name of a variable, but the name of a macro definition. This is needed to handle special cases where a <code>CPUFeature</code> does not correspond to a single <code>CPUID</code> bit. E.g. the <code>prefetch</code> instruction is available when <b>any one</b> of two bits is set: <span title="AMD documentation also claims it's always available if LongMode bit is set but Intel documentation does not support this assertion.">the <code>3DNow!</code> bit or the dedicated <code>Prefetch instruction</code> bit</span>. On the other hand, <code>vaesenc</code> is available only when <b>both</b> the <code>AES</code> and <code>AVX</code> bits are set. And our ABI <a href="http://code.google.com/p/nativeclient/issues/detail?id=2869">permits <code>lzcnt</code> and <code>tzcnt</code> unconditionally</a> (thus <code>CPUFeature_LZCNT</code> does not check for anything but just returns <code>TRUE</code> in all cases).</p> |
136 </p> | |
137 | 176 |
138 <h3><div style="float:right"><a href="#TOC">▲</a></div><a name="4-2">4.2. Dynamic code creation support.</a></h3> | 177 <p>Note: there are two CPUID masks: a hardcoded one (it can be replaced if you link a different definition of the <code>validator_cpuid_features</code> global variable into your program) and a runtime-supplied one (usually obtained from an actual <code>CPUID</code> call in production, but hardcoded in tests). New instructions are first added in “production disabled” mode and must pass a security review before they can be used in Chrome.</p> |
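<p>The idea that a feature name is an expression over several <code>CPUID</code> bits, not a single bit, can be sketched roughly as follows. This is only an illustration: the <code>CpuBits</code> structure, the <code>SKETCH_</code>-prefixed names and the flag value are assumptions made for this example and are not taken from <code>validator_internal.h</code>.</p>

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical snapshot of individual CPUID bits (names are assumptions). */
typedef struct {
  bool has_3dnow;
  bool has_prefetch; /* dedicated "Prefetch instruction" bit */
  bool has_aes;
  bool has_avx;
} CpuBits;

/* Illustrative flag value, not the validator's real constant. */
#define SKETCH_CPUID_UNSUPPORTED_INSTRUCTION 0x1u

/* A feature is a macro, so it can combine several bits freely. */
#define SKETCH_FEATURE_PREFETCH(c) ((c)->has_3dnow || (c)->has_prefetch) /* any one */
#define SKETCH_FEATURE_VAES(c)     ((c)->has_aes && (c)->has_avx)        /* both */
#define SKETCH_FEATURE_LZCNT(c)    ((void)(c), true)                     /* always allowed */

/* Same shape as SET_CPU_FEATURE: record an error bit when the feature is missing. */
static uint32_t check_feature(bool available, uint32_t info) {
  if (!available) {
    info |= SKETCH_CPUID_UNSUPPORTED_INSTRUCTION;
  }
  return info;
}
```

<p>With this shape the action attached to an instruction stays a one-liner, and all special cases live in the feature definitions.</p>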
139 | 178 |
140 <p>TBD</p> | 179 <h3><div style="float:right"><a href="#TOC">▲</a></div><a name="5-2">5.2. Dynami
c code modification support.</a></h3> |
141 | 180 |
142 <h3><div style="float:right"><a href="#TOC">▲</a></div><a name="4-3">4.3. Dynamic code modification support.</a></h3> | 181 <p>Dynamic code modification support is implemented with the help of the <code>CALL_USER_CALLBACK_ON_EACH_INSTRUCTION</code> option. Normally the user callback is only called when some kind of error is detected, but if this option is set then the callback is called after <b>each</b> instruction. When that happens the callback has all the information needed to process the instruction: collected errors, information about immediates, etc.</p> |
143 | 182 |
144 <p>TBD</p> | 183 <p>All that information is squeezed into the <code>instruction_info_collected</code> variable. <span title="Note that half of the information does not make sense for ia32 mode and is not collected by ValidateChunkIA32. It's included for completeness.">It has the following format</span>:</p> |
145 | 184 |
146 <h2><div style="float:right"><a href="#TOC">▲</a></div><a name="5">5. Validation
for x86-64 mode.</a></h2> | 185 <table width="100%"><tr><td align="left">31</td><td align="left">30</td><td alig
n="left">29</td><td align="left">28</td><td align="left">27</td><td align="left"
>26</td><td align="left">25</td><td align="left">24</td><td align="left">23</td>
<td align="left">22</td><td align="left">21</td><td align="left">20</td><td alig
n="left">19</td><td align="left">18</td><td align="left">17</td><td align="left"
>16</td><td align="left">15</td><td align="left">14</td><td align="left">13</td>
<td align="left">12</td><td align="right">8</td><td align="left">7</td><td align
="left">6</td><td align="right">5</td><td align="left">4</td><td align="left">3<
/td><td align="right">0</td></tr> |
| 186 <tr><td align="left"> </td><td align="left"> </td><td align="left">&nb
sp;</td><td align="left"> </td><td align="left"> </td><td align="left"
> </td><td colspan="12" align="left" style="border: thin solid black;"><tab
le width="100%"><tr><td align="left">⇤</td><td align="center"><code>VALIDATION_E
RRORS_MASK</code></td><td align="right">⇥</td></table></td><td align="left">&nbs
p;</td><td colspan="2" align="left" width="1%" style="border: thin solid black;
background:lightgray;"><table width="100%"><tr><td align="left">⇤</td><td width=
"1%" align="center"><code>RESTRICTED_REGISTER_MASK</code></td><td align="right">
⇥</td></table></td><td align="left"> </td><td colspan="2" align="left" widt
h="1%" style="border: thin solid black;"><table width="100%"><tr><td align="left
">⇤</td><td width="1%" align="center"><code>RESTRICTED_REGISTER_MASK</code></td>
<td align="right">⇥</td></table></td><td align="left"> </td><td colspan="2"
align="left" width="1%" style="border: thin solid black;"><table width="100%"><
tr><td align="left">⇤</td><td width="1%" align="center"><code>IMMEDIATES_SIZE_MA
SK</code></td><td align="right">⇥</td></table></td></tr> |
| 187 <tr><td style="border: thin solid black; background: gray;" width="1%" align="ce
nter"> 0 </td><td style="border: thin solid black;" width="1%" align="
center"> </td><td style="border: thin solid black;" width="1%"
align="center"> </td><td width="1%" style="border: thin solid b
lack;" align="center"> </td><td style="border: thin solid black
; background: lightgray;" width="1%" align="center"> </td><td s
tyle="border: thin solid black;" width="1%" align="center"> </t
d><td style="border: thin solid black; background: lightgray;" width="1%" align=
"center"> </td><td style="border: thin solid black; background:
lightgray;" width="1%" align="center"> </td><td style="border:
thin solid black; background: lightgray;" width="1%" align="center">  
; </td><td style="border: thin solid black; background: lightgray;" width="
1%" align="center"> </td><td style="border: thin solid black; b
ackground: lightgray;" width="1%" align="center"> </td><td styl
e="border: thin solid black; background: lightgray;" width="1%" align="center">&
nbsp; </td><td style="border: thin solid black; background: lightgray
;" width="1%" align="center"> </td><td style="border: thin soli
d black; background: lightgray;" width="1%" align="center"> </t
d><td style="border: thin solid black; background: lightgray;" width="1%" align=
"center"> </td><td style="border: thin solid black;" width="1%"
align="center"> </td><td style="border: thin solid black;" wid
th="1%" align="center"> </td><td style="border: thin solid blac
k;" width="1%" align="center"> </td><td style="border: thin sol
id black; background: lightgray;" width="1%" align="center"> </
td><td colspan="2" style="border: thin solid black; background: lightgray;" alig
n="center"> </td><td style="border: thin solid black;" width="1
%" align="center"> </td><td colspan="2" style="border: thin sol
id black;" align="center"> </td><td style="border: thin solid b
lack;" width="1%" align="center"> </td><td colspan="2" style="b
order: thin solid black;" align="center"> </td><td>
</td></tr> |
| 188 <tr><td align="left">↑</td><td align="left">↑</td><td align="left">↑</td><td ali
gn="left">↑</td><td align="left">↑</td><td align="left">↑</td><td align="left">↑
</td><td align="left">↑</td><td align="left">↑</td><td align="left">↑</td><td al
ign="left">↑</td><td align="left">↑</td><td align="left">↑</td><td align="left">
↑</td><td align="left">↑</td><td align="left">↑</td><td align="left">↑</td><td a
lign="left">↑</td><td align="left">↑</td><td align="left">↑</td><td align="left"
> </td><td align="left">↑</td><td align="left">↑</td><td align="left"> 
;</td><td align="left">↑</td><td align="left">↑</td><td align="left"> </td>
<td align="left"> </td></tr> |
| 189 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td ali
gn="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊
</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td al
ign="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">
┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td a
lign="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left"
> </td><td align="left">┊</td><td align="left">┊</td><td align="left"> 
;</td><td align="left">┊</td><td align="left" colspan="100" >└&nbsp;Cumulative size of <i title="Immediates, displacements, relative offsets.">anyfields</i>.</td></tr> |
| 190 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td ali
gn="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊
</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td al
ign="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">
┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td a
lign="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left"
> </td><td align="left">┊</td><td align="left">┊</td><td align="left"> 
;</td><td align="left" colspan="100" >└&nbsp;<span title="enter, extrq, insertq">Instruction has two immediates</span>.</td></tr> |
| 191 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td ali
gn="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊
</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td al
ign="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">
┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td a
lign="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left"
> </td><td align="left">┊</td><td align="left" colspan="100" >└ <span
title="00 = 0 bytes, 01 = 1 byte, 10 = 2 bytes, 11 = 4 bytes">Instruction displacement size</span>.</td></tr> |
| 192 <!--<tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td
align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="lef
t">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><t
d align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="le
ft">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><
td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="l
eft"> </td><td align="left">┊</td><td align="left" colspan="100" >└ <s
pan title="Top half of a last byte of an instruction is fourth register operand,
two remaining bytes are reserved.">Instruction has 2bit immediate operation.</s
pan></td></tr>--> |
| 193 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td ali
gn="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊
</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td al
ign="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">
┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td a
lign="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left"
> </td><td align="left" colspan="100" >└ Instruction has relative offs
et.</td></tr> |
| 194 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td ali
gn="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊
</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td al
ign="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">
┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td a
lign="left">┊</td><td align="left">┊</td><td align="left" colspan="100" >└
<span style="background: lightgray;">ia32 mode: reserved;</span> amd64 mode: <span title="NO_REG if the instruction does not zero-extend any register">register, zero-extended by the instruction.</span></td></tr> |
| 195 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td ali
gn="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊
</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td al
ign="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">
┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td a
lign="left">┊</td><td align="left" colspan="100" >└ <span style="background
: lightgray;">ia32 mode: reserved;</span> amd64 mode: <span title="This means th
at start of this instruction is not a valid jump target.">instruction is valid,
but it accesses memory using register which is zero-extended by previous instruc
tion.</span></td></tr> |
| 196 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td ali
gn="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊
</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td al
ign="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">
┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td a
lign="left" colspan="100" >└ <span title="Note that all unsupported instruc
tions trigger this error. This includes mov by absolute 64bit address, system in
structions like lidt or even call and jmp used not as part of superinstruction.
If combined with CPUID_UNSUPPORTED_INSTRUCTION it means that instruction is not
yet enabled in validator.">DFA error: invalid instruction. Validation then resum
es from the next bundle.</span></td></tr> |
| 197 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td ali
gn="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊
</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td al
ign="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">
┊</td><td align="left">┊</td><td align="left">┊</td><td align="left" colspan="10
0" >└ Unaligned direct jump to address outside of given region.</td></tr> |
| 198 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td ali
gn="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊
</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td al
ign="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">
┊</td><td align="left">┊</td><td align="left" colspan="100" >└ Instruction
is not supported for a given CPUID mask.</td></tr> |
| 199 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td ali
gn="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊
</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td al
ign="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">
┊</td><td align="left" colspan="100" >└ <span style="background: lightgray;
">ia32 mode: reserved;</span> amd64 mode: base register is not <code>%rbp</code>
, <code>%rsp</code>, or <code>%r15</code>.</td></tr> |
| 200 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td ali
gn="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊
</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td al
ign="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left"
colspan="100" >└ <span style="background: lightgray;">ia32 mode: reserved;<
/span> amd64 mode: index register is not zero-extended by previous instruction.<
/td></tr> |
| 201 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td ali
gn="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊
</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td al
ign="left">┊</td><td align="left">┊</td><td align="left" colspan="100" >└ <
span style="background: lightgray;">ia32 mode: reserved;</span> amd64 mode: inst
ruction which zero-extends <code>%rbp</code> must be followed by <code>add %r15,
%rbp</code>, <code>lea (%rbp,%r15,1),%rbp</code>, or <code>lea 0x0(%rbp,%r15,1),
%rbp</code>.</td></tr> |
| 202 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td ali
gn="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊
</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td al
ign="left">┊</td><td align="left" colspan="100" >└ <span style="background:
lightgray;">ia32 mode: reserved;</span> amd64 mode: <code>add %r15,%rbp</code>,
<code>lea (%rbp,%r15,1),%rbp</code>, or <code>lea 0x0(%rbp,%r15,1),%rbp</code>
is used after instruction which does not zero-extend <code>%rbp</code>.</td></tr
> |
| 203 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td ali
gn="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊
</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td al
ign="left" colspan="100" >└ <span style="background: lightgray;">ia32 mode:
reserved;</span> amd64 mode: instruction which zero-extends <code>%rsp</code> m
ust be followed by <code>add %r15,%rsp</code> or <code>lea (%rsp,%r15,1),%rsp</c
ode>.</td></tr> |
| 204 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td ali
gn="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊
</td><td align="left">┊</td><td align="left">┊</td><td align="left" colspan="100
" >└ <span style="background: lightgray;">ia32 mode: reserved;</span> amd64
mode: <code>add %r15,%rsp</code> or <code>lea (%rsp,%r15,1),%rsp</code> is used
after instruction which does not zero-extend <code>%rsp</code>.</td></tr> |
| 205 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td ali
gn="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊
</td><td align="left">┊</td><td align="left" colspan="100" >└ <span style="
background: lightgray;">ia32 mode: reserved;</span> amd64 mode: <code>%r15b</cod
e>, <code>%r15w</code>, <code>%r15d</code>, or <code>%r15</code> is modified. <c
ode>%r15</code> is untouchable in amd64 mode.</td></tr> |
| 206 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td ali
gn="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊
</td><td align="left" colspan="100" >└ <span style="background: lightgray;"
>ia32 mode: reserved;</span> amd64 mode: <span title="Note that %ebp is not ment
ioned. It can be modified by a regular instruction. But NEXT instruction must be
special if that happened."><code>%bpl</code>, <code>%bp</code>, or <code>%rbp</
code> is incorrectly modified. Only <code>%rbp</code> can be modified and then o
nly by special instruction.</span></td></tr> |
| 207 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td ali
gn="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left" c
olspan="100" >└ <span style="background: lightgray;">ia32 mode: reserved;</
span> amd64 mode: <span title="Note that %esp is not mentioned. It can be modifi
ed by a regular instruction. But NEXT instruction must be special if that happen
ed."><code>%spl</code>, <code>%sp</code>, or <code>%rsp</code> is incorrectly mo
dified. Only <code>%rsp</code> can be modified and then only by special instruct
ion.</span></td></tr> |
| 208 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td ali
gn="left">┊</td><td align="left">┊</td><td align="left" colspan="100">└ Bad
<code>call</code> alignment: <code>call</code> must end at the end of the bundle, since <code>nacljmp</code> can only jump to an aligned address.</td></tr> |
| 209 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td ali
gn="left">┊</td><td align="left" colspan="100">└ <span style="background: l
ightgray;">ia32 mode: reserved;</span> amd64 mode: <span title="Note: in ia32 mo
de all non-special instructions are modifiable.">instruction is modifiable.</spa
n></td></tr> |
| 210 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td ali
gn="left" colspan="100">└&nbsp;Special instruction (uses validation rules different from those for regular instructions). Cannot be changed in ia32 mode.</td></tr> |
| 211 <tr><td align="left">┊</td><td align="left">┊</td><td align="left" colspan="100"
>└ Last byte is not immediate. It's either <span title="3DNow! instructions
.">opcode</span>, <span title="Some AVX, FMA4, XOP instructions.">register numbe
r</span> or <span title="vpermil2pd and vpermil2ps">register number and two-bit
immediate</span>.</td></tr> |
| 212 <tr><td align="left">┊</td><td align="left" colspan="100">└ Invalid jump ta
rget. When this flag is set <code>instruction_start</code> and <code>instructio
n_end</code> both point to the <b>jump target</b> instruction, not to the <b>jum
p</b> instruction itself.</td></tr> |
| 213 <tr><td align="left" colspan="100">└ Reserved.</td></tr> |
| 214 </table> |
| 215 |
| 216 <p>Using this information you can determine if the given instruction follows <span title="Only “naclcall” and “nacljmp” in ia32 mode.">special rules</span>, if it includes <span title="Commands like jcc, jmp, loopcc, or call.">relative offsets</span>, <span title="Most commands which access memory support displacements.">displacements</span>, or <span title="Immediates are supported by many different commands. They can be combined with displacement if command accesses memory.">immediates</span>. Tests may use the information collected to precisely separate different <i title="Immediates, displacements, relative offsets.">anyfields</i>, but in production only a few bits are used to determine if the instruction can be changed or not: in ia32 mode only <span title="naclcall and nacljmp">special instructions</span> cannot be changed, while in amd64 mode the situation is the opposite: <span title="Only “call” and “mov” can be changed.">most instructions cannot be changed</span>.</p> |
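<p>As an illustration, the low bits of <code>instruction_info_collected</code> could be unpacked as sketched below. The bit positions are read off the table above (bits 3..0, 4, 6..5, 7 and 12..8); all <code>SKETCH_</code>-prefixed names and the helper functions are assumptions made for this example, not the validator's actual definitions.</p>

```c
#include <stdint.h>

/* Bit positions as read from the table above; names are hypothetical. */
#define SKETCH_IMMEDIATES_SIZE_MASK  0x0000000fu /* bits 3..0: cumulative size of anyfields */
#define SKETCH_TWO_IMMEDIATES        0x00000010u /* bit 4: instruction has two immediates */
#define SKETCH_DISPLACEMENT_MASK     0x00000060u /* bits 6..5: displacement size code */
#define SKETCH_DISPLACEMENT_SHIFT    5
#define SKETCH_RELATIVE_OFFSET       0x00000080u /* bit 7: instruction has relative offset */
#define SKETCH_RESTRICTED_REG_MASK   0x00001f00u /* bits 12..8: zero-extended register */
#define SKETCH_RESTRICTED_REG_SHIFT  8

/* Cumulative size (in bytes) of immediates, displacements and relative offsets. */
static unsigned anyfields_size(uint32_t info) {
  return info & SKETCH_IMMEDIATES_SIZE_MASK;
}

/* Size code 00, 01, 10, 11 maps to 0, 1, 2, 4 bytes. */
static unsigned displacement_bytes(uint32_t info) {
  static const unsigned sizes[4] = {0, 1, 2, 4};
  return sizes[(info & SKETCH_DISPLACEMENT_MASK) >> SKETCH_DISPLACEMENT_SHIFT];
}

static int has_two_immediates(uint32_t info) {
  return (info & SKETCH_TWO_IMMEDIATES) != 0;
}

static int has_relative_offset(uint32_t info) {
  return (info & SKETCH_RELATIVE_OFFSET) != 0;
}

/* Register number stored in bits 12..8 (meaningful in amd64 mode only). */
static unsigned restricted_register(uint32_t info) {
  return (info & SKETCH_RESTRICTED_REG_MASK) >> SKETCH_RESTRICTED_REG_SHIFT;
}
```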
| 217 |
| 218 <h4><div style="float:right"><a href="#TOC">▲</a></div><a name="5-2-1">5.2.1. Re
placement validation.</a></h4> |
| 219 |
| 220 <p>As was said <a href="#5-2">above</a>, code replacement is not supported by the <code>ValidateChunkIA32</code> function directly. Instead it's done by a higher-level function in <code>dfa_validate_32.c</code>.</p> |
| 221 |
| 222 <p>It uses the <code>CALL_USER_CALLBACK_ON_EACH_INSTRUCTION</code> option to compare the lengths of instructions in the two fragments inside the callback, and the <code>SPECIAL_INSTRUCTION</code> flag passed to the callback to make sure special instructions are left unchanged.</p> |
| 223 |
| 224 <p>One tricky thing there is the handling of relative jumps and calls: if a relative jump (or call) triggers <code>DIRECT_JUMP_OUT_OF_RANGE</code> <b>but</b> is bit-for-bit identical to the original instruction it's accepted anyway: this means that this particular <code>jump</code> (or <code>call</code>) jumps to (or calls) some valid position outside of the given range. If it must be changed then you need to pass a bigger region to the <code>ValidatorCodeReplacement_x86_32</code> function: <span title="This is, of course, not needed if landing point is bundle-aligned.">this way the validator will have a chance to check the landing place for validity</span>.</p> |
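<p>The per-instruction decision described above might look roughly like the sketch below. It is not the code from <code>dfa_validate_32.c</code>: the function name, its signature and the flag values are assumptions made for this example, and the real callback receives its information differently.</p>

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Illustrative flag values, not the validator's real constants. */
#define SKETCH_SPECIAL_INSTRUCTION      0x1u
#define SKETCH_DIRECT_JUMP_OUT_OF_RANGE 0x2u
#define SKETCH_OTHER_ERRORS             0x4u

/* Decide whether one instruction of the new fragment may replace the bytes
 * at the same offset in the old fragment. Both spans have the same length,
 * i.e. instruction boundaries are assumed to match already. */
static bool replacement_ok(const uint8_t *old_bytes, const uint8_t *new_bytes,
                           size_t length, uint32_t info) {
  bool identical = memcmp(old_bytes, new_bytes, length) == 0;

  if (info & SKETCH_OTHER_ERRORS) {
    return false; /* any other validation error rejects the replacement */
  }
  if ((info & SKETCH_SPECIAL_INSTRUCTION) && !identical) {
    return false; /* special instructions must stay unchanged */
  }
  if ((info & SKETCH_DIRECT_JUMP_OUT_OF_RANGE) && !identical) {
    return false; /* an out-of-range jump is only tolerated if untouched */
  }
  return true;
}
```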
| 225 |
| 226 <h4><div style="float:right"><a href="#TOC">▲</a></div><a name="5-2-2">5.2.2. Re
placement copying.</a></h4> |
| 227 |
| 228 <p>As was said <a href="#5-2">above</a>, code replacement is not supported by the <code>ValidateChunkIA32</code> function directly. Instead it's done by a higher-level function in <code>dfa_validate_32.c</code>.</p> |
| 229 |
| 230 <p>This is done by a very simple function which uses the <code>CALL_USER_CALLBACK_ON_EACH_INSTRUCTION</code> option to process instructions one after another.</p> |
| 231 |
| 232 <h2><div style="float:right"><a href="#TOC">▲</a></div><a name="6">6. Validation
for x86-64 mode.</a></h2> |
147 | 233 |
148 <p>While the validator for ia32 mode is very simple and short (it also produces pretty compact code), the validator for x86-64 mode is different. It still has all the same properties the validator for ia32 mode had (<code>valid_targets</code> and <code>jump_dests</code> arrays, “normal” and “special” instructions, bundles and <code>rel8_operand</code>/<code>rel32_operand</code> actions), but it adds quite a few additional twists to the whole scheme.</p> | 234 <p>While the validator for ia32 mode is very simple and short (it also produces pretty compact code), the validator for x86-64 mode is different. It still has all the same properties the validator for ia32 mode had (<code>valid_targets</code> and <code>jump_dests</code> arrays, “normal” and “special” instructions, bundles and <code>rel8_operand</code>/<code>rel32_operand</code> actions), but it adds quite a few additional twists to the whole scheme.</p> |
149 | 235 |
150 <h3><div style="float:right"><a href="#TOC">▲</a></div><a name="5-1">5.1. “Secon
dary” states.</a></h3> | 236 <p>It's created in a process which is similar to the process which creates the i
a32 validator.</p> |
151 | 237 |
152 <p>First of all: ia32 mode validator had one DFA in it and two arrays which kept
track of the instruction boundaries but x86-64 has few more state variables. Mo
st of them (<code>rex_prefix</code>, <code>vex_prefix2</code>, <code>vex_prefix3
</code>, <code>operand_states</code>, <code>base</code>, and <code>index</code>)
keep track of the instruction parts (and thus they are cleared before each inst
ruction), but one variable called <code>restricted_register</code> is used to ti
e different instructions together. It keeps track of the <code>restricted_regist
er</code> (if any). Note that not all restricted registers are born equal: most
registers can be restricted and then forgotten (if you write to <code>%eax</code
> and do nothing with the value before <code>call</code>), but <code>%esp</code>
and <code>%ebp</code> are exceptions. If you write to the <code>%esp</code> the
n the very next instruction must be <code>add %r15,%rbp</code> or <code>lea (%r1
5,%rbp,1),%rbp</code>. This means that if at the end of a bundle restricted regi
ster is <code>%rsp</code> or <code>%rbp</code> then program is inavlid. For the
same reason if then at beginning of a normal instruction (this includes first in
struction in the “compound”) we see restricted <code>%rsp</code> or restricted <
code>%rbp</code> then it's an error, too. On the other hand few rare special ins
tructions which are used to restore the SFI invariant WRT <code>%rsp</code> or <
code>%rbp</code> will only be accepted if restricted register is <code>%rsp</cod
e> xor <code>%rbp</code> (depending on special instruction).</p> | 238 <center><img src="files64.svg" height="90%"/><br />Gray elements are hand-writte
n, white elements are generated and dark-gray are mixers.</center><br /> |
153 | 239 |
154 <h3><div style="float:right"><a href="#TOC">▲</a></div><a name="5-2">5.2. “Norma
l” instructions.</a></h3> | 240 <h3><div style="float:right"><a href="#TOC">▲</a></div><a name="6-1">6.1. “Secon
dary” states.</a></h3> |
155 | 241 |
156 <p>The hard part is, as before, in the DFA. First of all, main machine is simila
r to what we had in ia32 mode, but subtly different: it's “<code>(normal_instruc
tion | special_instruction)*</code>” now. I.e.: <code>one_instruction</code> is
replaced with <code>normal_instruction</code>. And what is <code>normal_instruct
ion</code>? Why, it's “<code>one_instruction - special_instruction</code>”, of c
ourse! Well… this is unexpected: why will we want to remove <code>special_instru
ction</code>s from <code>normal_instruction</code>s only to add them back? The a
nswer is related to actions: recall how <a href="#actions">actions</a> work. Whe
n we remove <code>special_instruction</code> from <code>one_instruction</code> w
e also remove the associated actions. This important in x86-64 case because some
special instructions are just a normal instructions which are permitted to viol
ate the usual rules! E.g. “special” instruction <code>and $~0x1f,%rsp</code> (wh
ich is used to align the stack pointer) changes the <code>%rsp</code> directly w
hich is usually forbidden, but because of properties of <code>and $xxx,…</code>
(for any <code>$xxx</code> < <code>0</code>) we know that invariants will not
be violated.</p> | 242 <p>First of all: the ia32 mode validator had one DFA in it and two arrays which kept track of the instruction boundaries, but x86-64 has a few more state variables. Most of them (<code>rex_prefix</code>, <code>vex_prefix2</code>, <code>vex_prefix3</code>, <code>operand_states</code>, <code>base</code>, and <code>index</code>) keep track of the instruction parts (and thus they are cleared before each instruction), but one variable called <code>restricted_register</code> is used to tie different instructions together. As the name implies it keeps track of the restricted register (if any). Note that not all restricted registers are born equal: most registers can be restricted and then forgotten (if you write to <code>%eax</code> and do nothing with the value before <code>call</code>), but <code>%esp</code> and <code>%ebp</code> are exceptions. If you write to <code>%esp</code> then the very next instruction must be <code>add %r15,%rsp</code> or <code>lea (%rsp,%r15,1),%rsp</code>; <code>%rbp</code> has similar requirements. This means that if at the end of a bundle the restricted register is <code>%rsp</code> or <code>%rbp</code> then the program is invalid. For the same reason if at the beginning of a normal instruction (this includes the first instruction in the “compound”) we see restricted <code>%rsp</code> or restricted <code>%rbp</code> then it's an error, too. On the other hand the few rare special instructions which are used to restore the SFI invariant WRT <code>%rsp</code> or <code>%rbp</code> will only be accepted if the restricted register is <code>%rsp</code> xor <code>%rbp</code> (depending on the special instruction).</p> |
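<p>The rules for <code>restricted_register</code> described above can be summarized in a small sketch. The enum, its <code>SK_</code>-prefixed names, the value chosen for the "no register" case and the helper functions are all assumptions made for this illustration; the real validator encodes these checks inside its DFA actions.</p>

```c
#include <stdbool.h>

/* x86-64 register numbers; the value used for "no register" is an assumption. */
enum SketchReg { SK_RAX = 0, SK_RSP = 4, SK_RBP = 5, SK_R15 = 15, SK_NO_REG = 16 };

/* %rsp and %rbp must be fixed up by the very next instruction. */
static bool needs_immediate_fixup(enum SketchReg r) {
  return r == SK_RSP || r == SK_RBP;
}

/* A normal instruction may not start while %rsp or %rbp is restricted. */
static bool normal_instruction_allowed(enum SketchReg restricted) {
  return !needs_immediate_fixup(restricted);
}

/* A bundle may not end while %rsp or %rbp is restricted. */
static bool bundle_end_allowed(enum SketchReg restricted) {
  return !needs_immediate_fixup(restricted);
}

/* A special restoring instruction (e.g. add %r15,%rsp) is accepted only
 * when the register it restores is exactly the restricted one. */
static bool restore_instruction_allowed(enum SketchReg restricted,
                                        enum SketchReg restored) {
  return needs_immediate_fixup(restored) && restricted == restored;
}
```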
| 243 |
| 244 <h3><div style="float:right"><a href="#TOC">▲</a></div><a name="6-2">6.2. “Norma
l” instructions.</a></h3> |
| 245 |
| 246 <p>The hard part is, as before, in the DFA. First of all, the main machine is similar to what we had in ia32 mode, but subtly different: it's “<code>(normal_instruction | special_instruction)*</code>” now. I.e.: <code>one_instruction</code> is replaced with <code>normal_instruction</code>. And what is <code>normal_instruction</code>? Why, it's “<code>one_instruction - special_instruction</code>”, of course! Well… this is unexpected: why would we want to remove <code>special_instruction</code>s from <code>normal_instruction</code>s only to add them back? The answer is related to actions: recall how <a href="#2-1">actions</a> work. When we remove <code>special_instruction</code> from <code>one_instruction</code> we also remove the associated actions. This is important in the x86-64 case because some special instructions are just normal instructions which are permitted to violate the usual rules! E.g. the “special” instruction <code>and $~0x1f,%rsp</code> (which is used to align the stack pointer) changes <code>%rsp</code> directly, which is usually forbidden, but because of the properties of <code>and $xxx,…</code> (for any <code>$xxx</code> &lt; <code>0</code>) we know that the invariants will not be violated.</p> |
157 | 247 |
158 <p>This approach works well, but only if violations are detected at the instruct
ion end. E.g. the aforementioned <code>and $~0x1f,%rsp</code> instruction is enc
oded as </code>0x48 0x83 0xe4 0xe0</code> and after we've read </
code>0x48 0x83 0xe4</code> we already know it's normal instruction (op
code <code>0x83</code> means it's <code>and</code>) which writes to <code>%rsp</
code> (<code>0x48 </code><i>opcode</i><code> 0xe4</code> means it's so
me instruction which accepts some kind of immediate and writes to <code>%rsp</co
de>) and we'll signal the error at this point then the fact that later we'll fin
d out it's <code>special_instruction</code> which is accepted anyway will not ma
tter: <code>SPL_MODIFIED</code> error will be triggered which will mean that cod
e is rejected!</p> | 248 <p>This approach works well, but only if violations are detected at the instruct
ion end. E.g. the aforementioned <code>and $~0x1f,%rsp</code> instruction is enc
oded as <code>0x48 0x83 0xe4 0xe0</code> and after we've read <
code>0x48 0x83 0xe4</code> we already know it's normal instruction (op
code <code>0x83</code> means it's <code>and</code>) which writes to <code>%rsp</
code> (<code>0x48 </code><i>opcode</i><code> 0xe4</code> means it's so
me instruction which accepts some kind of immediate and writes to <code>%rsp</co
de>) and we'll signal the error at this point then the fact that later we'll fin
d out it's <code>special_instruction</code> which is accepted anyway will not ma
tter: <code>SPL_MODIFIED</code> error will be triggered which will mean that cod
e is rejected!</p> |
159 | 249 |
160 <p>This means that we can not do an actual conditions checking till the very end
of normal instruction (we can try to process some of them but not all of them b
ut this approach will be quite complex and fragile—not something you want in the
most critical security piece). But there are an exception: memory access. <b>Th
is</b> one is checked inline: memory access outside of “40GiB safe area” is stri
ctly forbidden no matter how “special” the instruction is. That's why it's check
ed immediately after operands discovery. This is how relevant fragment for the <
code>and</code> instruction look like:<hr /> | 250 <p>This means that we can not do an actual conditions checking till the very end
of normal instruction (we can try to process some of them but not all of them b
ut this approach will be quite complex and fragile—not something you want in the
most critical security piece). But there is an exception: memory access. <b>Th
is</b> one is checked inline: memory access outside of “40GiB safe area” is stri
ctly forbidden no matter how “special” the instruction is. That's why it's check
ed immediately after operand discovery. This is what the relevant fragment for the <
code>and</code> instruction looks like:<hr /> |
161 <code>(0x83 (opcode_2 any* & any&nbs
p;. any* & operand_disp @check_access) imm8 @proce
ss_0_operands) |</code><br /> | 251 <code>(0x83 (opcode_4 any* & any&nbs
p;. any* & operand_disp @check_access) imm8 @proce
ss_0_operands) |</code><br /> |
162 <code>(0x83 (opcode_2 any* & any&nbs
p;. any* & operand_rip @check_access) imm8 @proces
s_0_operands) |</code><br /> | 252 <code>(0x83 (opcode_4 any* & any&nbs
p;. any* & operand_rip @check_access) imm8 @proces
s_0_operands) |</code><br /> |
163 <code>(REX_B? 0x83 (opcode_2 any* &&
nbsp;any . any* & single_register_memory @check_access)
imm8 @process_0_operands) |</code><br /> | 253 <code>(REX_B? 0x83 (opcode_4 any* &&
nbsp;any . any* & single_register_memory @check_access)
imm8 @process_0_operands) |</code><br /> |
164 <code>(REX_X? 0x83 (opcode_2 any* &&
nbsp;any . any* & operand_sib_pure_index @check_access)
imm8 @process_0_operands) |</code><br /> | 254 <code>(REX_X? 0x83 (opcode_4 any* &&
nbsp;any . any* & operand_sib_pure_index @check_access)
imm8 @process_0_operands) |</code><br /> |
165 <code>(REX_XB? 0x83 (opcode_2 any* &
any . any* & operand_sib_base_index @check_access
) imm8 @process_0_operands) |</code><br /> | 255 <code>(REX_XB? 0x83 (opcode_4 any* &
any . any* & operand_sib_base_index @check_access
) imm8 @process_0_operands) |</code><br /> |
166 <code>(lock 0x83 (opcode_2 any* &&nb
sp;any . any* & operand_disp @check_access) imm8&n
bsp;@process_0_operands) |</code><br /> | 256 <code>(lock 0x83 (opcode_4 any* &&nb
sp;any . any* & operand_disp @check_access) imm8&n
bsp;@process_0_operands) |</code><br /> |
167 <code>(lock 0x83 (opcode_2 any* &&nb
sp;any . any* & operand_rip @check_access) imm8&nb
sp;@process_0_operands) |</code><br /> | 257 <code>(lock 0x83 (opcode_4 any* &&nb
sp;any . any* & operand_rip @check_access) imm8&nb
sp;@process_0_operands) |</code><br /> |
168 <code>(lock REX_B? 0x83 (opcode_2 an
y* & any . any* & single_register_memory @che
ck_access) imm8 @process_0_operands) |</code><br /> | 258 <code>(lock REX_B? 0x83 (opcode_4 an
y* & any . any* & single_register_memory @che
ck_access) imm8 @process_0_operands) |</code><br /> |
169 <code>(lock REX_X? 0x83 (opcode_2 an
y* & any . any* & operand_sib_pure_index @che
ck_access) imm8 @process_0_operands) |</code><br /> | 259 <code>(lock REX_X? 0x83 (opcode_4 an
y* & any . any* & operand_sib_pure_index @che
ck_access) imm8 @process_0_operands) |</code><br /> |
170 <code>(lock REX_XB? 0x83 (opcode_2 a
ny* & any . any* & operand_sib_base_index @ch
eck_access) imm8 @process_0_operands) |</code><br /> | 260 <code>(lock REX_XB? 0x83 (opcode_4 a
ny* & any . any* & operand_sib_base_index @ch
eck_access) imm8 @process_0_operands) |</code><br /> |
171 <code>(REX_B? 0x83 (opcode_2 @operand0_32
bit any* & modrm_registers @operand0_from_modrm_rm) imm
8 @process_1_operands) |</code><hr /> | 261 <code>(REX_B? 0x83 (opcode_4 @operand0_32
bit any* & modrm_registers @operand0_from_modrm_rm) imm
8 @process_1_operand) |</code><hr /> |
172 As you can see <code>check_access</code> is triggered after parsing ModRM/SIB by
tes, but before parsing <code>imm<i>NN</i></code> field while <code>process_<i>N
</i>_operands</code> action is triggered at the very end of the “normal” instruc
tion. Even if instruction does not use <code>imm<i>NN</i></code> field <code>che
ck_access</code> action is <b>still</b> triggerded before <code>process_<i>N</i>
_operands</code> action. This is important because <code>check_access</code> act
ion actually depends on <b>previous</b> state of “secondary” DFA while <code>pro
cess_<i>N</i>_operands</code> action does the transtions of “secondary” DFA. Not
e that it's only triggered for “normal” instructions—“special” instructions eith
er do the work themselves (e.g. <code>add %r15,%rsp</code>—which is only valid i
f previous state of “secondary” DFA was <code>REG_RSP</code> and moves DFA to <c
ode>kNoRestrictedReg</code> in case of succcess) or call the usual <code>process
_<i>N</i>_operands</code> action (e.g. <code>mov %rsp,%rbp</code> calls <code>pr
ocess_0_operands</code> which ensures that this operation is not called in <code
>REG_RSP</code>/<code>REG_RBP</code> “secondary” DFA state and transtions it to
<code>kNoRestrictedReg</code> state).</p> | 262 As you can see <code>check_access</code> is triggered after parsing ModRM/SIB by
tes, but before parsing <code>imm<i>NN</i></code> field while <code>process_<i>N
</i>_operands</code> action is triggered at the very end of the “normal” instruc
tion. Even if instruction does not use <code>imm<i>NN</i></code> field <code>che
ck_access</code> action is <b>still</b> triggered before <code>process_<i>N</i>
_operands</code> action. This is important because <code>check_access</code> act
ion actually depends on <b>previous</b> state of <code>restricted_register</code
> variable while <code>process_<i>N</i>_operands</code> action changes <code>res
tricted_register</code> variable. Note that it's only triggered for “normal” ins
tructions—“special” instructions either do the work themselves (e.g. <code>add %
r15,%rsp</code>—which is only valid if previous state of <code>restricted_regist
er</code> variable was <code>REG_RSP</code> and changes it to <code>NO_REG</code
> in case of success) or call the usual <code>process_<i>N</i>_operands</code>
action (e.g. <code>mov %rsp,%rbp</code> calls <code>process_0_operands</code> wh
ich ensures that this operation is not called when <code>restricted_register</co
de> is set to <code>REG_RSP</code>/<code>REG_RBP</code> state and transitions it
to <code>NO_REG</code> state).</p> |
173 | 263 |
174 <p>You can find yet another suprising thing in the snipped above: <code>and</cod
e> instruction is handled either as instruction with zero operands or as instruc
tion with one operand… but of course in reality it always has two operands! Some
thing is strange here… Well, sure: the decoder part of validator is as streamlin
ed as possible. We just ignore all non-register arguments and arguments which ar
e not written to (but we <b>don't</b> ignore memory accesses if they happen here
, of course). That's why <code>and</code> has either one or zero operands as far
as validator is concerned.</p> | 264 <p>You can find yet another surprising thing in the snippet above: <code>and</cod
e> instruction is handled either as instruction with zero operands or as instruc
tion with one operand… but of course in reality it always has two operands! Some
thing is strange here… Well, sure: the decoder part of validator is as streamlin
ed as possible. We just ignore all non-register arguments and arguments which ar
e not written to (but we <b>don't</b> ignore memory accesses if they happen here
, of course). That's why <code>and</code> has either one or zero operands as far
as validator is concerned.</p> |
175 | 265 |
176 <h3><div style="float:right"><a href="#TOC">▲</a></div><a name="5-3">5.3. Operan
ds handling.</a></h3> | 266 <h3><div style="float:right"><a href="#TOC">▲</a></div><a name="6-3">6.3. Operan
ds handling.</a></h3> |
177 | 267 |
178 <p>Operands handling as, again, is not that complex… if you are familiar with bi
t operations. Initial version of the validator used simple array of records to s
tore the information and everything worked well… with GCC, that is. MSVC produce
d awful code which was almost 30% slower and also needed twenty minutes to do so
thus we replaced this simple version with current macro-based one.</p> | 268 <p>Operand handling, again, is not that complex… if you are familiar with bi
t operations. Initial version of the validator used simple array of records to s
tore the information and everything worked well… with GCC, that is. MSVC produce
d awful code which was almost 30% slower and also needed twenty minutes to do so
thus we replaced this simple version with the current macro-based one.</p> |
179 | 269 |
180 <p>All the information about encountered operands is collected in a single scala
r variable <code>operand_states</code>. The layout of said variable looks like t
his:</p> | 270 <p>All the information about encountered operands is collected in a single scala
r variable <code>operand_states</code>. The layout of said variable looks like t
his:</p> |
181 <table width="100%"><tr><td align="left">63</td><td align="right">39</td><td ali
gn="left">38</td><td align="right">37</td><td align="left">36</td><td align="rig
ht">32</td><td align="center">31</td><td align="left">30</td><td align="right">2
9</td><td align="left">28</td><td align="right">24</td><td align="center">23</td
><td align="left">22</td><td align="right">21</td><td align="left">20</td><td al
ign="right">16</td><td align="center">15</td><td align="left">14</td><td align="
right">13</td><td align="left">12</td><td align="right">8</td><td align="center"
>7</td><td align="left">6</td><td align="right">5</td><td align="left">4</td><td
align="right">0</td></tr> | 271 <table width="100%"><tr><td align="left">63</td><td align="right">39</td><td ali
gn="left">38</td><td align="right">37</td><td align="left">36</td><td align="rig
ht">32</td><td align="center">31</td><td align="left">30</td><td align="right">2
9</td><td align="left">28</td><td align="right">24</td><td align="center">23</td
><td align="left">22</td><td align="right">21</td><td align="left">20</td><td al
ign="right">16</td><td align="center">15</td><td align="left">14</td><td align="
right">13</td><td align="left">12</td><td align="right">8</td><td align="center"
>7</td><td align="left">6</td><td align="right">5</td><td align="left">4</td><td
align="right">0</td></tr> |
182 <tr><td colspan="2" style="border: thin solid black" width="100%" align="center"
>padding</td><td colspan="2" style="border: thin solid black" align="center">ope
rand4:<br />register_type</td><td colspan="2" style="border: thin solid black" a
lign="center">operand4:<br />register_name</td><td style="border: thin solid bla
ck" align="center">padding</td><td colspan="2" style="border: thin solid black"
align="center">operand3:<br />register_type</td><td colspan="2" style="border: t
hin solid black" align="center">operand3:<br />register_name</td><td style="bord
er: thin solid black" align="center">padding</td><td colspan="2" style="border:
thin solid black" align="center">operand2:<br />register_type</td><td colspan="2
" style="border: thin solid black" align="center">operand2:<br />register_name</
td><td style="border: thin solid black">padding</td><td colspan="2" style="borde
r: thin solid black" align="center">operand1:<br />register_type</td><td colspan
="2" style="border: thin solid black" align="center">operand1:<br />register_nam
e</td><td style="border: thin solid black" align="center">padding</td><td colspa
n="2" style="border: thin solid black" align="center">operand0:<br />register_ty
pe</td><td colspan="2" style="border: thin solid black" align="center">operand0:
<br />register_name</td></tr><tr><td></td><td></td><td></td><td></td><td colspan
="2"> ↖<br /> 0 if normal<br /> &nb
sp;register</td><td></td><td></td><td></td><td colspan="2"> ↖<br /> &n
bsp; 0 if normal<br /> register</td><td></td><
td></td><td></td><td colspan="2"> ↖<br /> 0 if norma
l<br /> register</td><td></td><td></td><td></td><td colsp
an="2"> ↖<br /> 0 if normal<br /> &
nbsp;register</td><td></td><td></td><td></td><td colspan="2"> ↖<br />
0 if normal<br /> register</td></tr></t
able> | 272 <tr><td colspan="2" style="border: thin solid black;" width="100%" align="center
">padding</td><td colspan="2" style="border: thin solid black;" align="center">o
perand4:<br />register_type</td><td colspan="2" style="border: thin solid black;
" align="center">operand4:<br />register_name</td><td style="border: thin solid
black;" align="center">padding</td><td colspan="2" style="border: thin solid bla
ck;" align="center">operand3:<br />register_type</td><td colspan="2" style="bord
er: thin solid black;" align="center">operand3:<br />register_name</td><td style
="border: thin solid black;" align="center">padding</td><td colspan="2" style="b
order: thin solid black;" align="center">operand2:<br />register_type</td><td co
lspan="2" style="border: thin solid black;" align="center">operand2:<br />regist
er_name</td><td style="border: thin solid black;">padding</td><td colspan="2" st
yle="border: thin solid black;" align="center">operand1:<br />register_type</td>
<td colspan="2" style="border: thin solid black;" align="center">operand1:<br />
register_name</td><td style="border: thin solid black;" align="center">padding</
td><td colspan="2" style="border: thin solid black;" align="center">operand0:<br
/>register_type</td><td colspan="2" style="border: thin solid black;" align="ce
nter">operand0:<br />register_name</td></tr> |
| 273 <tr><td></td><td></td><td></td><td></td><td colspan="2"> ↖<br />  
; 0 if normal<br /> register</td><td></td><td>
</td><td></td><td colspan="2"> ↖<br /> 0 if normal<b
r /> register</td><td></td><td></td><td></td><td colspan=
"2"> ↖<br /> 0 if normal<br /> &nbs
p;register</td><td></td><td></td><td></td><td colspan="2"> ↖<br /> &nb
sp; 0 if normal<br /> register</td><td></td><t
d></td><td></td><td colspan="2"> ↖<br /> 0 if normal
<br /> register</td></tr></table> |
183 | 274 |
184 <p>Register names are defined in <code>register_name</code> enum: first 16 are i
dentical to the AMD/Intel names (from <code>REG_RAX</code> to <code>REG_R15</cod
e>) while other 16 are used (partially) to describe non-register operands (memor
y operand, immediate operand, <code>REG_RIP</code> and <code>REG_RIZ</code>, etc
). This means that if operand's name is >15 then it can be ignored. There are
only four operand types: <code>OperandSandboxIrrelevant</code>, <code>OperandSa
ndbox8bit</code>, <code>OperandSandboxRestricted</code>, and <code>OperandSandbo
xUnrestricted</code>. First type is something not related to general purpose reg
ister (x87, MMX, XMM, or YMM registers fall unto this category). We need to hand
le 8bit operands specially because they are finicky: if <code>REX</code> byte is
used they access <code>%spl</code>, <code>%bps</code>, <code>%sil</code>, and <
code>%dil</code>, but when <code>REX</code> byte is not used the same numbers ar
e reused for <code>%ah</code>, <code>%ch</code>, <code>%dh</code>, and <code>%bh
</code>! Last two types are the most important: these are 32bit operands (which
will make the appropriate register “frestricted”) or 16bit/64bit operands (these
may affect register in question negatively if that's <code>%rbp</code>, <code>%
rsp</code>, or <code>%r15</code>, but for other registers these are just ignored
). Note that if you assign <code>0</code> to this variable then all operands wil
l be of <code>OperandSandboxIrrelevant</code> type.</p> | 275 <p>Register names are defined in <code>register_name</code> enum: first 16 are i
dentical to the AMD/Intel names (from <code>REG_RAX</code> to <code>REG_R15</cod
e>) while other 16 are used (partially) to describe non-register operands (memor
y operand, immediate operand, <code>REG_RIP</code> and <code>REG_RIZ</code>, etc
). This means that if operand's name is >15 then it can be ignored. There are
only four operand types: <code>OperandSandboxIrrelevant</code>, <code>OperandSa
ndbox8bit</code>, <code>OperandSandboxRestricted</code>, and <code>OperandSandbo
xUnrestricted</code>. First type is something not related to general purpose reg
ister (x87, MMX, XMM, or YMM registers fall into this category). We need to hand
le 8bit operands specially because they are finicky: if <code>REX</code> byte is
used they access <code>%spl</code>, <code>%bpl</code>, <code>%sil</code>, and <
code>%dil</code>, but when <code>REX</code> byte is not used the same numbers ar
e reused for <code>%ah</code>, <code>%ch</code>, <code>%dh</code>, and <code>%bh
</code>! Last two types are the most important: these are 32bit operands (which
will make the appropriate register “restricted”) or 16bit/64bit operands (these
may affect register in question negatively if that's <code>%rbp</code>, <code>%r
sp</code>, or <code>%r15</code>, but for other registers these are just ignored)
. Note that if you assign <code>0</code> to this variable then all operands will
be of <code>OperandSandboxIrrelevant</code> type.</p> |
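<p>The reason 8bit operands are finicky can be shown with a tiny lookup. This is a hedged sketch of the x86-64 naming rule only; <code>reg8_name</code> is a hypothetical helper and not part of the validator, and it covers just register numbers 0 to 7 (no <code>REX.B</code> extension).</p>

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: name of the 8-bit register selected by a 3-bit
 * register number, depending on whether a REX prefix is present. */
static const char *reg8_name(unsigned number, bool rex_present) {
  static const char *const low[4] = {"%al", "%cl", "%dl", "%bl"};
  static const char *const legacy_high[4] = {"%ah", "%ch", "%dh", "%bh"};
  static const char *const rex_low[4] = {"%spl", "%bpl", "%sil", "%dil"};

  assert(number < 8);
  if (number < 4) {
    return low[number];               /* same register with or without REX */
  }
  /* Numbers 4..7 change meaning: with REX they are the low bytes of
   * %rsp/%rbp/%rsi/%rdi, without REX they are the legacy high bytes. */
  return rex_present ? rex_low[number - 4] : legacy_high[number - 4];
}
```

<p>The same number can therefore touch <code>%rsp</code> or merely <code>%rax</code>, depending on a prefix byte, which is why a separate operand type for 8bit operands is useful.</p>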
185 | 276 |
186 <p>Now the set of macroses used to work with operands should look less mysteriou
s:<hr /> | 277 <p>Now the set of macros used to work with operands should look less mysterious:<
hr /> |
187 <code>#define SET_OPERAND_NAME(N, S) operand_states |=
((S) << ((N) << 3))</code><br /> | 278 <code>#define SET_OPERAND_NAME(N, S) operand_states |=
((S) << ((N) << 3))</code><br /> |
188 <code>#define SET_OPERAND_TYPE(N, T) SET_OPERAND_TYPE_ ##&nb
sp;T(N)</code><br /> | 279 <code>#define SET_OPERAND_TYPE(N, T) SET_OPERAND_TYPE_ ##&nb
sp;T(N)</code><br /> |
189 <code>#define SET_OPERAND_TYPE_OperandSize8bit(N) operand_states
|= OperandSandbox8bit << (5 + ((N) <<&
nbsp;3))</code><br /> | 280 <code>#define SET_OPERAND_TYPE_OperandSize8bit(N) operand_states
|= OperandSandbox8bit << (5 + ((N) <<&
nbsp;3))</code><br /> |
190 <code>#define SET_OPERAND_TYPE_OperandSize16bit(N) operand_states 
;|= OperandSandboxUnrestricted << (5 + ((N)
<< 3))</code><br /> | 281 <code>#define SET_OPERAND_TYPE_OperandSize16bit(N) operand_states 
;|= OperandSandboxUnrestricted << (5 + ((N)
<< 3))</code><br /> |
191 <code>#define SET_OPERAND_TYPE_OperandSize32bit(N) operand_states 
;|= OperandSandboxRestricted << (5 + ((N) &l
t;< 3))</code><br /> | 282 <code>#define SET_OPERAND_TYPE_OperandSize32bit(N) operand_states 
;|= OperandSandboxRestricted << (5 + ((N) &l
t;< 3))</code><br /> |
192 <code>#define SET_OPERAND_TYPE_OperandSize64bit(N) operand_states 
;|= OperandSandboxUnrestricted << (5 + ((N)
<< 3))</code><br /> | 283 <code>#define SET_OPERAND_TYPE_OperandSize64bit(N) operand_states 
;|= OperandSandboxUnrestricted << (5 + ((N)
<< 3))</code><br /> |
193 <code>#define CHECK_OPERAND(N, S, T) ((operand_states &
(0xff << ((N) << 3))) == ((S&nbs
p;| (T << 5)) << ((N) << 3)
))</code><hr /> | 284 <code>#define CHECK_OPERAND(N, S, T) ((operand_states &
(0xff << ((N) << 3))) == ((S&nbs
p;| (T << 5)) << ((N) << 3)
))</code><hr /> |
194 Calls like <code>SET_OPERAND_NAME(0, REG_RAX)</code> are used by actions to set
name of the operand (this particular one is used by <code>operand0_rax</code> ac
tion) while calls like <code>SET_OPERAND_TYPE(0, OperandSize2bit)</code> are use
d by actions to set the type of operand (this particular one is used by <code>op
erand0_2bit</code> action). Note that we <b>don't</b> handle 2bit operands in th
e set of macroses above. This is not a mistake: 2bit operands are only ever used
as immediate operands (and then only in two instructions: <code>vpermil2pd</cod
e> and <code>vpermil2ps</code>) and we don't process immediate operands here. If
they will be by some reason left in the <codeo>validator_x86_64_instruction.rl<
/code> file this will lead to the compile-time error, not to some kind of weird
overflow which may [potentially] produce security hole.</p> | 285 Calls like <code>SET_OPERAND_NAME(0, REG_RAX)</code> are used by actions to set
name of the operand (this particular one is used by <code>operand0_rax</code> ac
tion) while calls like <code>SET_OPERAND_TYPE(0, OperandSize2bit)</code> are use
d by actions to set the type of operand (this particular one is used by <code>op
erand0_2bit</code> action). Note that we <b>don't</b> handle 2bit operands in th
e set of macros above. This is not a mistake: 2bit operands are only ever used as
immediate operands (and then only in two instructions: <code>vpermil2pd</code>
and <code>vpermil2ps</code>) and we don't process immediate operands here. If th
ey will be by some reason left in the <code>validator_x86_64_instruction.rl</cod
e> file this will lead to the compile-time error, not to some kind of weird over
flow which may [potentially] produce security hole.</p> |
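<p>To make the bit layout concrete, here is a minimal sketch that does the same packing with plain functions instead of macros. The function names are made up for this illustration, and the numeric enum values are assumptions: the text only guarantees that <code>OperandSandboxIrrelevant</code> is <code>0</code> and that the first 16 register names follow the AMD/Intel order.</p>

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for illustration; only "Irrelevant == 0" is stated above. */
enum register_name { REG_RAX = 0, REG_RCX = 1, REG_RSP = 4, REG_RBP = 5 };
enum operand_type {
  OperandSandboxIrrelevant = 0,
  OperandSandbox8bit = 1,
  OperandSandboxRestricted = 2,
  OperandSandboxUnrestricted = 3
};

/* Each operand owns one byte: bits 0..4 hold the name, bits 5..6 the type. */
static uint64_t set_operand_name(uint64_t states, unsigned n, uint64_t name) {
  assert(n < 5 && name < 32);
  return states | (name << (n << 3));
}

static uint64_t set_operand_type(uint64_t states, unsigned n, uint64_t type) {
  assert(n < 5 && type < 4);
  return states | (type << (5 + (n << 3)));
}

/* Non-zero if operand n has exactly this name and this type. */
static int check_operand(uint64_t states, unsigned n,
                         uint64_t name, uint64_t type) {
  assert(n < 5);
  return (states & ((uint64_t)0xff << (n << 3))) ==
         ((name | (type << 5)) << (n << 3));
}
```

<p>With these assumed values, marking operand 1 as a 32bit write to <code>%rsp</code> sets byte 1 of the variable to <code>0x44</code> and leaves every other byte zero, so the remaining operands stay <code>OperandSandboxIrrelevant</code>.</p>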
195 | 286 |
196 <p>Almost all manipulations with <code>operand_states</code> are done using macr
oses described above, but there are one construct in <code>process_<i>N</i>_oper
ands</code> function which accesses the <code>operand_states</code> direfctly:<h
r /> | 287 <p>Almost all manipulations with <code>operand_states</code> are done using macros
described above, but there is one construct in <code>process_<i>N</i>_operand
s</code> function which accesses the <code>operand_states</code> directly:<hr /
> |
197 <code>/* Take 2 bits of operand
type from operand_states as *restricted_register,</cod
e><br /> | 288 <code>/* Take 2 bits of operand
type from operand_states as *restricted_register,</cod
e><br /> |
198 <code>* make sure operand_states&nb
sp;denotes a register (4th bit == 0). */</cod
e><br /> | 289 <code>* make sure operand_states&nb
sp;denotes a register (4th bit == 0). */</cod
e><br /> |
199 <code>} else if ((operand_states &&n
bsp;0x70) == (OperandSandboxRestricted << 5)) {</
code><br /> | 290 <code>} else if ((operand_states &&n
bsp;0x70) == (OperandSandboxRestricted << 5)) {</
code><br /> |
200 <code>*restricted_register = opera
nd_states & 0x0f;</code><br /> | 291 <code>*restricted_register = opera
nd_states & 0x0f;</code><br /> |
201 <code>}</code><hr /> | 292 <code>}</code><hr /> |
202 If you'll take a look on the layout of <code>operand_states</code> then it's pre
tty easy to understand what goes on here: <code>(operand_states & 0x70) == (
OperandSandboxRestricted << 5)</code> yeilds <code>TRUE</code> if and only
if zeroth operand is “normal” register <b>and</b> it's of type <code>OperandSan
dboxRestricted</code>. This is actually central piece of the “Secondary” DFA han
dling—most other pieces just return this “secondary” DFA back to <code>kNoRestri
ctedReg</code> state.</p> | 293 If you take a look at the layout of <code>operand_states</code> then it's pre
tty easy to understand what goes on here: <code>(operand_states & 0x70) == (
OperandSandboxRestricted << 5)</code> yields <code>TRUE</code> if and only
if zeroth operand is “normal” register <b>and</b> it's of type <code>OperandSan
dboxRestricted</code>. This is actually central piece of the <code>restricted_re
gister</code> handling—most other pieces just return it back to <code>NO_REG</co
de> state.</p> |
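<p>The mask trick can be tried in isolation. The sketch below covers only this one branch and is not the real <code>process_<i>N</i>_operands</code> code: the function name, the constant names, the value <code>2</code> for the restricted type and the value of the "nothing restricted" marker are all assumptions made for illustration.</p>

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values, for illustration only. */
enum {
  SKETCH_TYPE_RESTRICTED = 2,   /* stands in for OperandSandboxRestricted */
  SKETCH_TYPE_UNRESTRICTED = 3, /* stands in for OperandSandboxUnrestricted */
  SKETCH_NO_REG = 16            /* stands in for NO_REG */
};

/* Mask 0x70 selects bit 4 (the high bit of the name, 0 for a normal
 * register) together with bits 5..6 (the type) of operand 0.  One compare
 * thus checks "normal register" and "restricted type" at the same time. */
static unsigned restricted_after(uint64_t operand_states) {
  if ((operand_states & 0x70) == ((uint64_t)SKETCH_TYPE_RESTRICTED << 5)) {
    return (unsigned)(operand_states & 0x0f);  /* register number 0..15 */
  }
  return SKETCH_NO_REG;
}
```

<p>A non-register name (16 or above) sets bit 4 and therefore fails the comparison even when the type bits say "restricted", which is exactly the property the comment in the original fragment relies on.</p>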
203 | 294 |
204 <p>Well… most, but not all. One exception happens in <code>process_<i>N</i>_oper
ands</code> functions: if “secondary” DFA is in <code>kSandboxedRsi</code> state
and we restrict the <code>%rdi</code> register then we go to the <code>kSandbox
edRsiRestrictedRdi</code> state, not to the usual <code>REG_RDI</code> state. Ot
her exceptions are related to “special” instructions: <code>lea (%r15,%rsi,1),%r
si</code> may move us to <code>kSandboxedRsi</code> state and <code>lea (%r15,%r
di,1),%rdi</code> may move us to either <code>kSandboxedRdi</code> or <code>kSan
dboxedRsiSandboxedRdi</code> state.</p> | 295 <h3><div style="float:right"><a href="#TOC">▲</a></div><a name="6-4">6.4. Dynami
c code modification support.</a></h3> |
205 | 296 |
206 <p>Yet another tricky piece of code can be found in <code>check_access</code> fu
nction. It's this piece of code:<hr /> | 297 <p>Dynamic code modification support is implemented similarly to ia32 mode—with
the help of <code>CALL_USER_CALLBACK_ON_EACH_INSTRUCTION</code> option. When that
happens the callback has all the information needed to process the instruction:
collected errors, information about immediates, etc.</p> |
207 <code>if (index == (restricted_register&n
bsp;& 0x1f)) {</code><br /> | |
208 <code>BitmapClearBit(valid_targets, ins
truction_start);</code><br /> | |
209 <code>}</code><hr /> | |
210 This is where we use not the full state of the “secondary” DFA, but just low fiv
e bits (which describe if there are some restricted register and if it exist the
n what register is restricted currently). All other places just use full state o
f “secondary” DFA.</p> | |
211 | 298 |
212 <h2><div style="float:right"><a href="#TOC">▲</a></div><a name="6">6. Decoders.<
/a></h2> | 299 <p>All that information is squeezed in <code>instruction_info_collected</code> v
ariable. It has the following format:</p> |
213 | 300 |
214 <p>The only remaining issue (but a big one) is about generation of the actual de
coders (<code>{decoder,validator}_x86_{32,64}_instruction.rl files)</code>. This
is big part of the whole package, but, thankfully, it happens in significantly
less hostily environment: decoder and validator must work even if they are proce
ssing specially-crafted file created by clever adversary while <code>gen_dfa.cc<
/code> processes data files created by us and should only correcly process certa
in “good” files.</p> | 301 <table width="100%"><tr><td align="left">31</td><td align="left">30</td><td alig
n="left">29</td><td align="left">28</td><td align="left">27</td><td align="left"
>26</td><td align="left">25</td><td align="left">24</td><td align="left">23</td>
<td align="left">22</td><td align="left">21</td><td align="left">20</td><td alig
n="left">19</td><td align="left">18</td><td align="left">17</td><td align="left"
>16</td><td align="left">15</td><td align="left">14</td><td align="left">13</td>
<td align="left">12</td><td align="right">8</td><td align="left">7</td><td align
="left">6</td><td align="right">5</td><td align="left">4</td><td align="left">3<
/td><td align="right">0</td></tr> |
| 302 <tr><td align="left"> </td><td align="left"> </td><td align="left">&nb
sp;</td><td align="left"> </td><td align="left"> </td><td align="left"
> </td><td colspan="12" align="left" style="border: thin solid black;"><tab
le width="100%"><tr><td align="left">⇤</td><td align="center"><code>VALIDATION_E
RRORS_MASK</code></td><td align="right">⇥</td></table></td><td align="left">&nbs
p;</td><td colspan="2" align="left" width="1%" style="border: thin solid black;"
><table width="100%"><tr><td align="left">⇤</td><td width="1%" align="center"><c
ode>RESTRICTED_REGISTER_MASK</code></td><td align="right">⇥</td></table></td><td
align="left"> </td><td colspan="2" align="left" width="1%" style="border:
thin solid black;"><table width="100%"><tr><td align="left">⇤</td><td width="1%"
align="center"><code>RESTRICTED_REGISTER_MASK</code></td><td align="right">⇥</t
d></table></td><td align="left"> </td><td colspan="2" align="left" width="1
%" style="border: thin solid black;"><table width="100%"><tr><td align="left">⇤<
/td><td width="1%" align="center"><code>IMMEDIATES_SIZE_MASK</code></td><td alig
n="right">⇥</td></table></td></tr> |
| 303 <tr><td style="border: thin solid black; background: gray;" width="1%" align="ce
nter"> 0 </td><td style="border: thin solid black;" width="1%" align="
center"> </td><td style="border: thin solid black;" width="1%"
align="center"> </td><td width="1%" style="border: thin solid b
lack;" align="center"> </td><td style="border: thin solid black
;" width="1%" align="center"> </td><td style="border: thin soli
d black;" width="1%" align="center"> </td><td style="border: th
in solid black;" width="1%" align="center"> </td><td style="bor
der: thin solid black;" width="1%" align="center"> </td><td sty
le="border: thin solid black;" width="1%" align="center"> </td>
<td style="border: thin solid black;" width="1%" align="center"> &nbs
p;</td><td style="border: thin solid black;" width="1%" align="center"> &nb
sp; </td><td style="border: thin solid black;" width="1%" align="center">&n
bsp; </td><td style="border: thin solid black;" width="1%" align="cen
ter"> </td><td style="border: thin solid black;" width="1%" ali
gn="center"> </td><td style="border: thin solid black;" width="
1%" align="center"> </td><td style="border: thin solid black;"
width="1%" align="center"> </td><td style="border: thin solid b
lack;" width="1%" align="center"> </td><td style="border: thin
solid black;" width="1%" align="center"> </td><td style="border
: thin solid black;" width="1%" align="center"> </td><td colspa
n="2" style="border: thin solid black;" align="center"> </td><t
d style="border: thin solid black;" width="1%" align="center">
</td><td colspan="2" style="border: thin solid black;" align="center"> &nbs
p; </td><td style="border: thin solid black;" width="1%" align="center">&nb
sp; </td><td colspan="2" style="border: thin solid black;" align="cen
ter"> </td><td> </td></tr> |
| 304 <tr><td align="left">↑</td><td align="left">↑</td><td align="left">↑</td><td ali
gn="left">↑</td><td align="left">↑</td><td align="left">↑</td><td align="left">↑
</td><td align="left">↑</td><td align="left">↑</td><td align="left">↑</td><td al
ign="left">↑</td><td align="left">↑</td><td align="left">↑</td><td align="left">
↑</td><td align="left">↑</td><td align="left">↑</td><td align="left">↑</td><td a
lign="left">↑</td><td align="left">↑</td><td align="left">↑</td><td align="left"
> </td><td align="left">↑</td><td align="left">↑</td><td align="left"> 
;</td><td align="left">↑</td><td align="left">↑</td><td align="left"> </td>
<td align="left"> </td></tr> |
| 305 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td ali
gn="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊
</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td al
ign="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">
┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td a
lign="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left"
> </td><td align="left">┊</td><td align="left">┊</td><td align="left"> 
;</td><td align="left">┊</td><td align="left" colspan="100" >└ Cumulative
size of <i title="Immediates, displacements, relative offsets.">anyfields</i>.</td></tr> |
| 306 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td ali
gn="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊
</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td al
ign="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">
┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td a
lign="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left"
> </td><td align="left">┊</td><td align="left">┊</td><td align="left"> 
;</td><td align="left" colspan="100" >└ <span title="enter, extrq, insertq"
>Instruction has two immediates</span>.</td></tr> |
| 307 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td ali
gn="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊
</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td al
ign="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">
┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td a
lign="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left"
> </td><td align="left">┊</td><td align="left" colspan="100" >└ <span
title="00 == 0 bytes, 01 == 1 byte, 10 == 2 bytes, 11 == 4 bytes">Instruction
displacement size</span>.</td></tr> |
| 308 <!--<tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td
align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="lef
t">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><t
d align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="le
ft">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><
td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="l
eft"> </td><td align="left">┊</td><td align="left" colspan="100" >└ <s
pan title="Top half of a last byte of an instruction is fourth register operand,
two remaining bytes are reserved.">Instruction has 2bit immediate operation.</s
pan></td></tr>--> |
| 309 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td ali
gn="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊
</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td al
ign="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">
┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td a
lign="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left"
> </td><td align="left" colspan="100" >└ Instruction has relative offs
et.</td></tr> |
| 310 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td ali
gn="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊
</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td al
ign="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">
┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td a
lign="left">┊</td><td align="left">┊</td><td align="left" colspan="100" >└
<span title="NO_REG if the instruction does not zero-extend any register">Register
zero-extended by the instruction.</span></td></tr> |
| 311 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td ali
gn="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊
</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td al
ign="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">
┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td a
lign="left">┊</td><td align="left" colspan="100" >└ <span title="This means
that the start of this instruction is not a valid jump target.">Instruction is
valid, but it accesses memory using a register which is zero-extended by the
previous instruction.</span></td></tr> |
| 312 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td ali
gn="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊
</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td al
ign="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">
┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td a
lign="left" colspan="100" >└ <span title="Note that all unsupported instruc
tions trigger this error. This includes mov by absolute 64bit address, system in
structions like lidt or even call and jmp used not as part of superinstruction.
If combined with CPUID_UNSUPPORTED_INSTRUCTION it means that instruction is not
yet enabled in validator.">DFA error: invalid instruction. Validation then resum
es from the next bundle.</span></td></tr> |
| 313 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td ali
gn="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊
</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td al
ign="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">
┊</td><td align="left">┊</td><td align="left">┊</td><td align="left" colspan="10
0" >└ Unaligned direct jump to address outside of given region.</td></tr> |
| 314 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td ali
gn="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊
</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td al
ign="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">
┊</td><td align="left">┊</td><td align="left" colspan="100" >└ Instruction
is not supported for a given CPUID mask.</td></tr> |
| 315 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td ali
gn="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊
</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td al
ign="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">
┊</td><td align="left" colspan="100" >└ Base register is not <code>%rbp</co
de>, <code>%rsp</code>, or <code>%r15</code>.</td></tr> |
| 316 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td ali
gn="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊
</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td al
ign="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left"
colspan="100" >└ Index register is not zero-extended by previous instructio
n.</td></tr> |
| 317 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td ali
gn="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊
</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td al
ign="left">┊</td><td align="left">┊</td><td align="left" colspan="100" >└ I
nstruction which zero-extends <code>%rbp</code> must be followed by <code>add %r
15,%rbp</code>, <code>lea (%rbp,%r15,1),%rbp</code>, or <code>lea 0x0(%rbp,%r15,
1),%rbp</code>.</td></tr> |
| 318 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td ali
gn="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊
</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td al
ign="left">┊</td><td align="left" colspan="100" >└ <code>add %r15,%rbp</cod
e>, <code>lea (%rbp,%r15,1),%rbp</code>, or <code>lea 0x0(%rbp,%r15,1),%rbp</cod
e> is used after instruction which does not zero-extend <code>%rbp</code>.</td><
/tr> |
| 319 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td ali
gn="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊
</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td al
ign="left" colspan="100" >└ Instruction which zero-extends <code>%rsp</code
> must be followed by <code>add %r15,%rsp</code> or <code>lea (%rsp,%r15,1),%rsp
</code>.</td></tr> |
| 320 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td ali
gn="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊
</td><td align="left">┊</td><td align="left">┊</td><td align="left" colspan="100
" >└ <code>add %r15,%rsp</code> or <code>lea (%rsp,%r15,1),%rsp</code> is u
sed after instruction which does not zero-extend <code>%rsp</code>.</td></tr> |
| 321 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td ali
gn="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊
</td><td align="left">┊</td><td align="left" colspan="100" >└ <code>%r15b</
code>, <code>%r15w</code>, <code>%r15d</code>, or <code>%r15</code> is modified.
<code>%r15</code> is untouchable in amd64 mode.</td></tr> |
| 322 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td ali
gn="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left">┊
</td><td align="left" colspan="100" >└ <span title="Note that %ebp is not m
entioned. It can be modified by a regular instruction. But NEXT instruction must
be special if that happened."><code>%bpl</code>, <code>%bp</code>, or <code>%rb
p</code> is incorrectly modified. Only <code>%rbp</code> can be modified and the
n only by special instruction.</span></td></tr> |
| 323 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td ali
gn="left">┊</td><td align="left">┊</td><td align="left">┊</td><td align="left" c
olspan="100" >└ <span title="Note that %esp is not mentioned. It can be mod
ified by a regular instruction. But NEXT instruction must be special if that hap
pened."><code>%spl</code>, <code>%sp</code>, or <code>%rsp</code> is incorrectly
modified. Only <code>%rsp</code> can be modified and then only by special instr
uction.</span></td></tr> |
| 324 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td ali
gn="left">┊</td><td align="left">┊</td><td align="left" colspan="100">└ Bad
 <code>call</code> alignment: <code>call</code> must end at the end of the
bundle, since <code>nacljmp</code> can only jump to an aligned address.</td></tr> |
| 325 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td ali
gn="left">┊</td><td align="left" colspan="100">└ <span title="amd64 mode: i
n ia32 mode all non-special instructions are modifiable">Instruction is modifiab
le.</span></td></tr> |
| 326 <tr><td align="left">┊</td><td align="left">┊</td><td align="left">┊</td><td ali
gn="left" colspan="100">└ Special instruction (uses different validation
rules from regular instructions). Cannot be changed in ia32 mode.</td></tr> |
| 327 <tr><td align="left">┊</td><td align="left">┊</td><td align="left" colspan="100"
>└ Last byte is not immediate. It's either <span title="3DNow! instructions
.">opcode</span>, <span title="Some AVX, FMA4, XOP instructions.">register numbe
r</span> or <span title="vpermil2pd and vpermil2ps">register number and two-bit
immediate</span>.</td></tr> |
| 328 <tr><td align="left">┊</td><td align="left" colspan="100">└ Invalid jump ta
rget. When this flag is set <code>instruction_start</code> and <code>instructio
n_end</code> both point to the <b>jump target</b> instruction, not to the <b>jum
p</b> instruction itself.</td></tr> |
| 329 <tr><td align="left" colspan="100">└ Reserved.</td></tr> |
| 330 </table> |
| 331 |
| 332 <p>Using this information you can determine whether the given instruction
follows <span title="A lot of different commands in amd64 mode: %rbp/%rsp
modifications, string instructions, “naclcall”, and “nacljmp”.">special
rules</span>, and whether it includes <span title="Commands like “jcc”, “jmp”,
“loopcc”, or “call”.">relative offsets</span>, <span title="Most commands which
access memory support displacements.">displacements</span>, or <span
title="Immediates are supported by many different commands. They can be combined
with a displacement if the command accesses memory.">immediates</span>. Tests
may use the collected information to precisely separate different <i
title="Immediates, displacements, relative offsets.">anyfields</i>, but in
production only a few bits are used to determine whether the instruction can be
changed: in amd64 mode <span title="Only “call” and “mov” can be changed.">most
instructions cannot be changed</span>, and in those that can, only the <i
title="Immediates, displacements, relative offsets.">anyfields</i> may change.</p> |
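<p>A minimal sketch of how a test harness might unpack such an info word. Only
<code>MODIFIABLE_INSTRUCTION</code> is a name taken from this document; the
other names (<code>ANYFIELDS_SIZE_MASK</code>, <code>RELATIVE_PRESENT</code>,
<code>SPECIAL_INSTRUCTION</code>, <code>struct InfoSummary</code>,
<code>SummarizeInfo</code>) and <b>all</b> numeric values are assumptions made
for illustration and may not match the validator's real headers.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder encoding for this sketch only: the real bit values are defined
 * by the validator and are NOT reproduced here. */
#define ANYFIELDS_SIZE_MASK    0x0000000fu /* hypothetical name and value */
#define RELATIVE_PRESENT       0x00000080u /* hypothetical name and value */
#define SPECIAL_INSTRUCTION    0x10000000u /* hypothetical name and value */
#define MODIFIABLE_INSTRUCTION 0x08000000u /* name from the text, value assumed */

/* What a test might want to know about one instruction. */
struct InfoSummary {
  unsigned anyfields_size;   /* cumulative size of anyfields, in bytes */
  int has_relative_offset;   /* 1 if a relative offset is present */
  int is_special;            /* 1 if special validation rules apply */
  int is_modifiable;         /* 1 if code replacement may touch it */
};

/* Splits the packed info word into separate fields. */
static void SummarizeInfo(uint32_t info, struct InfoSummary *out) {
  assert(out != NULL);
  out->anyfields_size = info & ANYFIELDS_SIZE_MASK;
  out->has_relative_offset = (info & RELATIVE_PRESENT) != 0;
  out->is_special = (info & SPECIAL_INSTRUCTION) != 0;
  out->is_modifiable = (info & MODIFIABLE_INSTRUCTION) != 0;
}
```

<p>In production only the last of these fields would matter; the rest is
useful mainly for tests.</p>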
| 333 |
| 334 <h4><div style="float:right"><a href="#TOC">▲</a></div><a name="6-4-1">6.4.1. Re
placement validation.</a></h4> |
| 335 |
| 336 <p>As noted <a href="#6-4">above</a>, code replacement is not supported by the
<code>ValidateChunkAMD64</code> function directly. Instead it is done by a
higher-level function in <code>dfa_validate_64.c</code>.</p> |
| 337 |
| 338 <p>It uses the <code>CALL_USER_CALLBACK_ON_EACH_INSTRUCTION</code> option to
compare the lengths of instructions in the two fragments in a callback, and the
<code>MODIFIABLE_INSTRUCTION</code> flag passed to the callback to make sure that
<span title="Currently only “call” and “mov” can be changed.">only a few
hand-picked instructions can be changed</span>.</p> |
| 339 |
| 340 <p>One tricky thing there is the handling of relative jumps and calls: if a
relative jump (or call) triggers <code>DIRECT_JUMP_OUT_OF_RANGE</code> <b>but</b>
is bit-for-bit identical to the original instruction, it is accepted anyway,
because this means that this particular <code>jump</code> (or <code>call</code>)
targets some valid position outside of the given range. If it must be changed,
you need to pass a bigger region to the
<code>ValidatorCodeReplacement_x86_64</code> function; <span title="This is, of
course, not needed if the landing point is bundle-aligned.">this way the
validator will have a chance to check the landing place for validity</span>.</p> |
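<p>The acceptance rule for out-of-range jumps can be sketched roughly as
follows. The helper name <code>OutOfRangeJumpIsAcceptable</code>, its
parameters, and the numeric value of <code>DIRECT_JUMP_OUT_OF_RANGE</code> are
assumptions made for this illustration; this is not the code from
<code>dfa_validate_64.c</code>.</p>

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Name from the text; the numeric value is a placeholder for this sketch. */
#define DIRECT_JUMP_OUT_OF_RANGE 0x00008000u

/* Returns 1 if the new instruction passes the out-of-range rule, 0 otherwise.
 * old_begin/new_begin point at the same instruction in the existing and the
 * replacement code; length is its size in bytes. */
static int OutOfRangeJumpIsAcceptable(uint32_t info,
                                      const uint8_t *old_begin,
                                      const uint8_t *new_begin,
                                      size_t length) {
  assert(old_begin != NULL && new_begin != NULL);
  if ((info & DIRECT_JUMP_OUT_OF_RANGE) == 0)
    return 1;  /* target lies inside the region: the rule does not apply */
  /* Target lies outside the region: accept only an unchanged instruction. */
  return memcmp(old_begin, new_begin, length) == 0;
}
```

<p>An unchanged jump is accepted on the strength of the earlier validation of
the old code, while a changed one is rejected unless the caller widens the
region.</p>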
| 341 |
| 342 <p style="margin-bottom:0px;">Another tricky bit is detecting the position of
<i title="Immediates, displacements, relative offsets.">anyfields</i>: most
instructions put them at the end, but some instructions use the last byte for:</p> |
| 343 <ul style="margin-top:0px; margin-bottom:0px;"> |
| 344 <li><i>opcode extension</i>: 3DNow! instructions, <code>cmp<i>cc</i>sd</code>/<c
ode>vcmp<i>cc</i>sd</code> and <code>cmp<i>cc</i>ss</code>/<code>vcmp<i>cc</i>ss
</code>, and <code>pclmulqdq</code>/<code>vpclmulqdq</code>.</li> |
| 345 <li><i>fourth register operand</i>: some AVX instructions (such as <code>vblendv
pd</code>/<code>vblendvps</code>), some FMA4 instructions (such as <code>vfmadds
ubpd</code>), and some XOP instructions (such as <code>vpperm</code>).</li> |
| 346 <li><i>fourth register operand</i> <b>and</b> <i>fifth 2-bit immediate operand</
i>: <code>vpermil2pd</code>/<code>vpermil2ps</code>.</li> |
| 347 </ul> |
| 348 <p style="margin-top:0px;">All these instructions set the
<code>LAST_BYTE_IS_NOT_IMMEDIATE</code> flag; the last form can be distinguished
because it sets the <span title="Which actually includes the
LAST_BYTE_IS_NOT_IMMEDIATE flag"><code>IMMEDIATE_2BIT</code> flag</span>.</p> |
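<p>A rough sketch of locating the anyfields inside an instruction. It assumes
that the cumulative anyfields size sits in the low bits of the info word and
does <b>not</b> count the special last byte; both points, the helper
<code>AnyfieldsRange</code>, the name <code>ANYFIELDS_SIZE_MASK</code>, and all
numeric values are assumptions made for this illustration.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder encoding for this sketch only. */
#define ANYFIELDS_SIZE_MASK        0x0000000fu /* hypothetical name and value */
#define LAST_BYTE_IS_NOT_IMMEDIATE 0x20000000u /* name from the text, value assumed */

/* Computes the half-open byte range [*begin, *end) that the anyfields occupy,
 * as offsets from the start of the instruction. Returns 0 if the sizes are
 * inconsistent with the instruction length, 1 otherwise. */
static int AnyfieldsRange(uint32_t info, size_t instruction_length,
                          size_t *begin, size_t *end) {
  size_t size = info & ANYFIELDS_SIZE_MASK;
  /* One trailing byte is skipped when it holds an opcode or a register. */
  size_t tail = (info & LAST_BYTE_IS_NOT_IMMEDIATE) ? 1 : 0;
  assert(begin != NULL && end != NULL);
  if (size + tail > instruction_length)
    return 0;
  *end = instruction_length - tail;
  *begin = *end - size;
  return 1;
}
```

<p>A replacement check could then compare every byte outside this range and
let only the bytes inside it differ.</p>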
| 349 |
| 350 <h4><div style="float:right"><a href="#TOC">▲</a></div><a name="6-4-2">6.4.2. Re
placement copying.</a></h4> |
| 351 |
| 352 <p>As noted <a href="#6-4">above</a>, code replacement is not supported by the
<code>ValidateChunkAMD64</code> function directly. Instead it is done by a
higher-level function in <code>dfa_validate_64.c</code>.</p> |
| 353 |
| 354 <p>This is done by a very simple function which uses the
<code>CALL_USER_CALLBACK_ON_EACH_INSTRUCTION</code> option to process
instructions one after another.</p> |
| 355 |
| 356 <h2><div style="float:right"><a href="#TOC">▲</a></div><a name="7">7. Decoders.<
/a></h2> |
| 357 |
| 358 <p>The only remaining issue (but a big one) is the generation of the actual
decoders (<code>{decoder,validator}_x86_{32,64}_instruction.rl</code> files).
This is a big part of the whole package but, thankfully, it happens in a
significantly less hostile environment: the decoder and validator must work even
when processing a specially crafted file created by a clever adversary, while
<code>gen_dfa</code> processes data files created by us and only has to process
certain “good” files correctly.</p> |
| 359 |
| 360 <p>To understand how it works it's better to start with the decoders. Remember
how we talked about “streamlined data structures”, “indispensable minimum of the
information”, etc.? This approach produces a fast and [relatively] simple
validator, but makes it hard to test and debug. To facilitate testing and
debugging we create separate decoders: these return all the information about
all the instructions they can parse and in fact can produce output identical to
<a href="http://sourceware.org/binutils/docs/binutils/objdump.html#objdump">objdump</a>'s
output.</p> |
| 361 |
| 362 <p>They are used to verify the description of the instructions from
<code>.def</code> files, with special attention to the length of said
instructions.</p> |
| 363 |
| 364 <p>Decoders are created using a familiar process.</p> |
| 365 |
| 366 <center><img src="filesdecoder.svg" height="120%"/><br />Gray elements are hand-
written, white elements are generated and dark-gray are mixers.</center><br /> |
| 367 |
| 368 <p></p> |
| 369 <p style="margin-bottom:0px;">There are a few big differences between
standalone decoders and the simplified decoders embedded in
<code>ValidateChunkIA32</code>/<code>ValidateChunkAMD64</code>:</p> |
| 370 <ul style="margin-top:0px;"> |
| 371 <li>Standalone decoders are pretty close to each other (the only differences are
CPU-dictated differences such as REX prefix handling)—simplified decoders are q
uite different (as dictated by appropriate SFI models).</li> |
| 372 <li>Standalone decoders don't have hand-encoded “special” instructions: all
the instructions they can decode come from <code>.def</code> files.</li> |
| 373 <li>Standalone decoders don't squeeze extracted information into a few flat
variables. Instead they use <code>struct instruction</code>, which is common to
both decoders.</li> |
| 374 </ul> |
| 375 |
| 376 <p>All these facts mean that standalone decoders are significantly larger and
slower, but also much easier to understand. The simplified decoders use <b>the
exact same DFA</b> with only some actions changed or omitted.</p> |
215 | 377 |
216 </body> | 378 </body> |
217 | 379 |
OLD | NEW |