I/O Reader http://www.ioreader.com/ Peter Goodman's blog about computer programming. Tue, 10 Sep 2013 22:03:40 GMT en <![CDATA[Granary]]> http://www.ioreader.com/2013/09/10/granary http://www.ioreader.com/2013/09/10/granary#comments Tue, 10 Sep 2013 22:03:40 GMT Peter Goodman 2v The Granary source code is now publicly available. I have also started a new blog all about Granary. Hopefully I will get around to publishing the first post soon. The first post will be about register allocation for the Behavioural Watchpoints framework.

<![CDATA[Python GNU C99 Parser]]> http://www.ioreader.com/2013/02/12/python-gnu-c99-parser http://www.ioreader.com/2013/02/12/python-gnu-c99-parser#comments Tue, 12 Feb 2013 22:44:52 GMT Peter Goodman 2u As part of Granary, I have developed a sort-of GNU C99 type and function declaration parser, which is now hosted on GitHub. I am releasing this code because others might find it useful. There are a number of other implementations out there; however, when I last tried them, none met my exact needs (parsing glibc headers, Darwin libc headers, and Linux kernel headers).

This parser is not particularly novel, and likely contains bugs (I am 100% sure that the cprinter.py file has bugs). However, it has also been very useful.

I welcome feedback, bug fixes, feature requests, or feature additions to this parser from interested third parties.

<![CDATA[Tracking Data with Function Pointers]]> http://www.ioreader.com/2012/10/14/tracking-data-with-function-pointers http://www.ioreader.com/2012/10/14/tracking-data-with-function-pointers#comments Sun, 14 Oct 2012 22:03:02 GMT Peter Goodman 2s Recently I presented a poster at OSDI'12. The poster outlined our use of dynamic binary translation (DBT) for analysing operating system (OS) kernel modules. One novelty of our approach is that we ensure that only module code is analysed; non-module kernel code is never translated. This restriction entails taking control when module code executes (so that it can be translated) and relinquishing control when non-module kernel code executes. To regain control when kernel code invokes module code, we proactively search for and change function pointers in shared data structures.

Proactively changing function pointers that potentially point into module code is achieved by interposing on the interface between modules and the kernel. Modules and the kernel share data structures, and those data structures can contain function pointers. Finding and changing function pointers requires recursively applying a replacement function to the fields of data structures, starting from the "root" function arguments. Without guards in place, this recursive process might not terminate (e.g. a cyclic data structure). In the case of deeply linked data structures (e.g. trees), this recursive process might be expensive. To avoid this expense, we apply the replacement function only to those data structures that have changed.
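The guarded recursion can be sketched as follows. This is an illustrative sketch of my own, not Granary's actual code: the names (`visit`, `wrap_fn`, `struct node`) and the fixed-size visited set are assumptions, and a real implementation would wrap with translated-code addresses rather than returning the pointer unchanged.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_VISITED 64

static const void *visited[MAX_VISITED];
static size_t num_visited = 0;

/* Guard against cycles: returns 0 if this object was already wrapped. */
static int visit(const void *obj) {
    for (size_t i = 0; i < num_visited; ++i)
        if (visited[i] == obj) return 0;   /* already seen; stop recursing */
    visited[num_visited++] = obj;
    return 1;
}

typedef void (*fn_t)(void);

/* In the real system this would return a pointer into translated code. */
static fn_t wrap_fn(fn_t fn) {
    return fn;
}

struct node {
    fn_t callback;
    struct node *next;   /* potentially cyclic */
};

/* Recursively apply the replacement function, starting from a "root". */
static void wrap_node(struct node *n) {
    if (!n || !visit(n)) return;
    n->callback = wrap_fn(n->callback);
    wrap_node(n->next);
}
```

Note that the visited check is what makes the pass terminate on cyclic structures such as doubly linked lists.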

Suppose we have the following code that defines a function called func_name and a function pointer field called func_ptr in a struct foo.

#include <stdio.h>

void func_name(void) {
    printf("hello world!\n");
}

struct foo {
    void (*func_ptr)(void);
};

int main(void) {
    struct foo bar;
    bar.func_ptr = &func_name;
    return 0;
}
The code in the main function roughly corresponds to the following objects in memory:

On the left, we have func_ptr, which contains the address (0xBEEF) of the func_name function. When func_ptr is invoked, control transfers into the func_name function. This is signalled by the instruction pointer (%rip) changing to 0xBEEF.

Recall that the goal is to detect changes to data structures. Suppose that we have no control over the allocation (static, stack, or heap) or layout/structure/semantics of the data structures that we want to track. Given these constraints, there does not appear to be a convenient way to embed information inside of a structure.

Two immediate approaches come to mind: i) embed some information in pointers to the data structures that we want to track, or ii) use a map to associate addresses of data structures with their tracking meta-information.

The first solution is undesirable for four reasons:

  1. One must ensure that all instances of the original pointer are altered.
  2. One must ensure that the altered data structure pointer is correctly used.
  3. The meta-information of distinct instances of the altered pointer might get out of sync.
  4. There is a limited number of useful bits for meta-information in pointers.

The second solution has many desirable properties and would work. However, if one is tracking many data structures, then the cost of maintaining the map might be undesirable.

A third solution exists if we know the types of the fields of the data structures to track. As previously stated, we have no control over the allocation or semantics of a data structure. This implies that we cannot extend the structure to contain meta-information (or a pointer thereto), and that we cannot arbitrarily change values in the structure (lest we break the program semantics). Suppose, however, that we knew that a structure contained a function pointer. Then changing that function pointer is fine, so long as the control-flow behaviour when invoking that function pointer remains the same. Consider the following:

Above, we introduced extra indirection in the form of a jmp instruction at address 0xFEED. When func_ptr is invoked, control transfers to 0xFEED, which then jmps to 0xBEEF.

But, instructions are just another form of data. There is no reason (at least on x86) that we can't just put some meta-information beside the newly inserted jmp. For example:

#include <stdint.h>

struct meta {
    uint8_t jmp_code[5] __attribute__((aligned (8)));
    /* … meta-information fields follow … */
} __attribute__((packed));

struct meta *func_name_meta = …;

Suppose that func_name_meta is initialized to a pointer to a struct meta object, and that object is located in executable memory. Further, suppose that the jmp_code of func_name_meta is initialized to a 5-byte jmp instruction that transfers control to func_name (0xBEEF). Then we can swap func_ptr with func_name_meta and still expect the same control-flow behaviour. Why?

The first five bytes of *func_name_meta are machine code, and the entire structure lives in executable memory. The next N bytes of the struct meta object contain meta-information (used for detecting changes). The address of the struct meta object (func_name_meta) is also the address of the first field (jmp_code) within the object. As a result, replacing a pointer to func_name with func_name_meta is valid insofar as we are changing one code pointer with another. When control transfers to func_name_meta, the jmp instruction in the jmp_code field transfers control to func_name.
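As a concrete sketch of how the jmp_code bytes might be produced: on x86, a 5-byte direct jmp is opcode 0xE9 followed by a little-endian 32-bit displacement, measured from the end of the instruction. The encode_jmp helper below is my own illustration, not Granary's code, and the memcpy assumes a little-endian host.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Fill code[0..4] with "jmp rel32" so that executing the bytes at
 * code_addr transfers control to target. The rel32 displacement is
 * relative to the address of the *next* instruction (code_addr + 5). */
static void encode_jmp(uint8_t code[5], uint64_t code_addr, uint64_t target) {
    int32_t rel = (int32_t)(target - (code_addr + 5));
    code[0] = 0xE9;                      /* opcode: jmp rel32 */
    memcpy(&code[1], &rel, sizeof rel);  /* little-endian displacement */
}
```

For the figures in this post, encoding a jmp at 0xFEED targeting 0xBEEF produces a negative (backward) displacement.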

Storing and extracting the meta-information is convenient:

struct meta *meta_of_bar = (struct meta *) bar.func_ptr;

Again, taking advantage of the layout of struct meta, we can now cast function pointers into struct meta pointers and operate on function pointers as if they were pointers to objects (because they are!).

There is a bit more going on behind the scenes in Granary, in particular: the allocation of the executable code, what meta-information is kept, how untracked objects are detected and handled, and garbage collection of object trackers. However, I think this article has outlined the salient points of the approach, which I believe is more general than simple object tracking. Hopefully you will find this technique as fun/evil as I do!

<![CDATA[Traditional Parsing Methods]]> http://www.ioreader.com/2012/05/09/traditional-parsing-methods http://www.ioreader.com/2012/05/09/traditional-parsing-methods#comments Wed, 09 May 2012 16:13:04 GMT Peter Goodman 2r One parsing technique that I sometimes use is Top Down Operator Precedence Parsing (TDOP). TDOP parsers have been discussed in many other places as well. Unfortunately, I have not seen TDOP described in terms of left-corner parsing (except for a passing comment in this thesis).

The purpose of this post is to set the stage for a later discussion about TDOP parsing. This post will introduce top-down and bottom-up parsing, then combine the two methods to introduce left-corner parsing. Also, the top-down parsing language (TDPL) will be briefly mentioned as its semantics relate to TDOP.

Traditional Parsing Methods

Before getting into TDOP, it's important to have at least some background in non-TDOP parsing methods. This is because TDOP can be understood as a combination of several different parsing methods.

Parsing is a language acceptance problem. That is, a parser is a function that accepts or rejects a string. If a parser accepts a string then we say that string is in some language. The opposite is said of rejection. A string in this case means a sequence of zero or more symbols. In the English language, symbols are Latin/alphabetic characters. In the C programming language, symbols are reserved words, variables, literals, and punctuation (e.g. void, "foo", >, etc.).

Typically, a parser accepts the language generated by a context-free grammar (CFG). CFGs are a formalism for describing some languages. The following is an example CFG that generates simple arithmetic expressions:

E → "(" E ")"
E → A

A → M "+" A
A → M "-" A
A → "-" A
A → M

M → N "×" M
M → N "÷" M
M → N

N → "0"
N → "1"
…
N → "9"

Note: ignore the unusual placement of the parentheses and the right-associativity of the operators described by the grammar.

The name to the left of the → is called a variable or a non-terminal. Something in quotes is called a token, or terminal. Both terminals and non-terminals are considered symbols. Terminals can be thought of as the letters of one's language.

The → itself is a relation which says that the non-terminal on the left-hand side can generate the language on the right-hand side. This combination is called a production.

Note: the rest of this article will focus on parsing strings from left-to-right. The following examples detailing various parsing methods assume that our parsers always guess correctly. Finally, we assume that our grammars are ε-free. That is, the right-hand side of a production is never empty (with one exception).

Top-Down Parsing

As its name implies, top-down parsing proceeds top-down. In the case of the above expression grammar, the "top" starts off as E. The action of going "down" involves one of two things:

  1. Replacing a non-terminal with something that it is related to (the right-hand side of a production).
  2. Consuming a terminal.

Right-hand sides of productions contain both terminals and non-terminals. In replacing a non-terminal with one of its right-hand sides, we set up expectations about the structure of later parts of the string. For example, suppose we want to parse "(2 × 3)". Parsing will proceed as follows:

Step Action Expectations Remainder of string
1 start E (2 × 3)
2 replace E → "(" E ")" "(" E ")" (2 × 3)
3 consume "(" E ")" 2 × 3)
4 replace E → A A ")" 2 × 3)
5 replace A → M M ")" 2 × 3)
6 replace M → N "×" M N "×" M ")" 2 × 3)
7 replace N → "2" "2" "×" M ")" 2 × 3)
8 consume "2" "×" M ")" × 3)
9 consume "×" M ")" 3)
10 replace M → N N ")" 3)
11 replace N → "3" "3" ")" 3)
12 consume "3" ")" )
13 consume ")"
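With the oracle's guesses replaced by one symbol of lookahead, the same top-down process can be written as a recursive-descent parser. This is a sketch of my own, using ASCII '*' and '/' in place of × and ÷, and single digits for N; each function corresponds to a non-terminal, and the pending calls play the role of the expectations column above.

```c
#include <assert.h>
#include <ctype.h>

static const char *in;            /* cursor into the input string */

static int parse_E(void);

static int parse_N(void) {        /* N -> "0" | ... | "9" */
    if (!isdigit((unsigned char)*in)) return 0;
    ++in;
    return 1;
}

static int parse_M(void) {        /* M -> N "*" M | N "/" M | N */
    if (!parse_N()) return 0;
    if (*in == '*' || *in == '/') { ++in; return parse_M(); }
    return 1;
}

static int parse_A(void) {        /* A -> "-" A | M "+" A | M "-" A | M */
    if (*in == '-') { ++in; return parse_A(); }
    if (!parse_M()) return 0;
    if (*in == '+' || *in == '-') { ++in; return parse_A(); }
    return 1;
}

static int parse_E(void) {        /* E -> "(" E ")" | A */
    if (*in == '(') {
        ++in;
        if (!parse_E() || *in != ')') return 0;
        ++in;
        return 1;
    }
    return parse_A();
}

/* Accept iff E derives the whole string. */
static int accepts(const char *s) {
    in = s;
    return parse_E() && *in == '\0';
}
```

Like the grammar, this parser makes the operators right-associative, because each operator production recurses on the right.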

If—as a side-effect of parsing a string—one wanted to build a parse tree, then the order of constructing nodes in the parse tree would be as follows:

Top-Down Parsing Language

Brief mention needs to be given to the top-down parsing language (TDPL). The TDPL formalizes the behavior of many top-down parsers. A key difference between a TDPL grammar and a CFG is that productions are totally ordered in a TDPL grammar.

For example, if the productions of the above CFG were totally ordered according to their text order, then a parser cannot try the second production (E → A) without first failing to parse according to the first production (E → "(" E ")").
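The ordered-choice behaviour can be sketched as a tiny combinator: alternatives are tried strictly in text order, the cursor is restored before each new attempt, and the first alternative to succeed wins. This is my own illustration (the names ordered_choice, lit_a, and lit_b are toys, not from any formal TDPL system).

```c
#include <assert.h>

/* Toy alternatives: each consumes one specific character on success. */
static int lit_a(const char **s) {
    if (**s == 'a') { ++*s; return 1; }
    return 0;
}

static int lit_b(const char **s) {
    if (**s == 'b') { ++*s; return 1; }
    return 0;
}

/* Try each alternative in order; commit to the first that succeeds. */
static int ordered_choice(int (*alts[])(const char **), int n,
                          const char **s) {
    for (int i = 0; i < n; ++i) {
        const char *save = *s;       /* remember cursor for backtracking */
        if (alts[i](s)) return 1;
        *s = save;                   /* undo any partial consumption */
    }
    return 0;
}
```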

Bottom-Up Parsing

We can characterize top-down parsers as making "global" decisions. Their expectations about the future structure of the as-yet-unseen parts of the string are evidence of this. On the other hand, bottom-up parsers operate "locally". That is, they make decisions based only on the structure of the part of the string that they have already seen.

The consequence of local decision making is that bottom-up parsers discover sub-structures of the parsed string before they discover super-structures. In theory, a bottom-up parser has no expectations about the remainder of the string to be parsed. In practice, common bottom-up parsers implicitly make use of top-down information.

Bottom-up parsers typically perform two main actions: shift and reduce.

  1. Shifting is similar to consuming to the extent that our cursor into the string being parsed moves forward by one symbol. This is equivalent to removing the first symbol of the input string.

    Unlike top-down parsers, bottom-up parsers do not maintain a sequence of expectations. Instead, they operate on a partially parsed substring of the input string.

    Shifting involves taking the first symbol from the remainder of the input string and appending it to the end of the partially parsed string.
  2. Reducing operates on a suffix of the partially parsed string. A reduction involves taking a suffix of the partially parsed string, matching it against the right-hand side of a production, and then replacing it with the left-hand side of a production (non-terminal).
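A single reduce action can be sketched as a suffix match against a production's right-hand side. Symbols are plain strings here, and the function and its signature are my own illustration of the idea, not taken from any parser generator.

```c
#include <assert.h>
#include <string.h>

/* If rhs (of length rhs_len) matches a suffix of the partial parse,
 * replace that suffix with the non-terminal lhs. Returns the new length
 * of the partial parse, or 0 if the suffix does not match. */
static size_t reduce(const char **parse, size_t len,
                     const char **rhs, size_t rhs_len, const char *lhs) {
    if (len < rhs_len) return 0;
    for (size_t i = 0; i < rhs_len; ++i)
        if (strcmp(parse[len - rhs_len + i], rhs[i]) != 0) return 0;
    parse[len - rhs_len] = lhs;       /* overwrite the start of the suffix */
    return len - rhs_len + 1;         /* new partial-parse length */
}
```

For instance, reducing the suffix N "×" M of the partial parse "(" N "×" M by M → N "×" M yields "(" M, as in step 9 of the table below.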

For example, suppose we want to parse "(2 × 3)". Parsing will proceed as follows:

Step Action Partial parse Remainder of string
1 start (2 × 3)
2 shift "(" "(" 2 × 3)
3 shift "2" "(" "2" × 3)
4 reduce N → "2" "(" N × 3)
5 shift "×" "(" N "×" 3)
6 shift "3" "(" N "×" "3" )
7 reduce N → "3" "(" N "×" N )
8 reduce M → N "(" N "×" M )
9 reduce M → N "×" M "(" M )
10 reduce A → M "(" A )
11 reduce E → A "(" E )
12 shift ")" "(" E ")"
13 reduce E → "(" E ")" E

If—as a side-effect of parsing a string—one wanted to build a parse tree, then the order of constructing nodes in the parse tree would be as follows:

Left-Corner Parsing

Left-corner parsing (LC) is a parsing technique that makes decisions based on top-down and bottom-up information.

In the case of the bottom-up parser above, it appears that we were lucky that the sequence of shifts and reductions ended up reducing the entire string to an E. Strictly speaking, the goal of the above bottom-up parser was exactly that: reduce a string to E. If our expression were very long, then it wouldn't be clear until near the end of a bottom-up parse that our parser might have a chance of reaching its goal of E.

An LC parser attempts to satisfy multiple goals, including the end goal of reducing the string to E. An LC parser predicts substructures present in the remainder of the string, and attempts to parse those sub-structures bottom-up. But the prediction step sets up expectations about the structure of unseen parts of the string, which is a top-down approach.

In fact, LC parsers alternate between bottom-up and top-down parsing. Alternation is possible because an LC parser maintains a list of goals (analogous to our top-down expectations), a list of predictions, and a partial parse of the input string (as in a bottom-up parser). An LC parser operates on its input string and these three lists in the following way:

  1. Repeat:
    1. If the head of the goal list is a terminal, then consume the terminal and shift the first symbol of the remainder of the input string onto the end of the partial parse. If the goal terminal does not match the first symbol of the string then reject.

      If the head of the goal list is a non-terminal, then attempt to reduce a suffix of the partial parse to the goal non-terminal. If such a reduction is possible, then remove the non-terminal from the head of the goal list and update the partial parse accordingly.

      This step is repeated until the goal list remains unchanged.
    2. If β is the last symbol of the partial parse, then find a production of the form "α → β γ" where γ is a string of zero-or-more symbols. β is said to be a left corner of α. Left corners can be both terminals and non-terminals. If we weren't restricting ourselves to ε-free CFGs, then left corners would not necessarily appear immediately following the "→"!

      Place γ and α on the head of the goal list, so that the first symbol (if any) of γ is our next goal.

      If the goals list is changed then return to step 1.1.
    3. If neither of the previous two steps changed the goals list, then shift a symbol from the remainder of the input string onto the end of the partial parse.

      If no such symbol can be shifted, then reject the string. Otherwise, return to step 1.2.
  2. Stop when the goal list is empty.
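The corner-finding step (1.2) amounts to scanning the productions for one whose right-hand side begins with the symbol just completed. The sketch below uses a fragment of the expression grammar from this post; the table layout and find_corner are mine, not from any parsing library.

```c
#include <assert.h>
#include <string.h>

struct production {
    const char *lhs;
    const char *rhs[4];   /* NULL-terminated right-hand side */
};

/* A fragment of the expression grammar, in text order. */
static const struct production grammar[] = {
    { "E", { "(", "E", ")", NULL } },
    { "E", { "A", NULL } },
    { "A", { "M", "+", "A", NULL } },
    { "A", { "M", NULL } },
    { "M", { "N", "×", "M", NULL } },
    { "M", { "N", NULL } },
    { "N", { "2", NULL } },
};

/* Return the index of the first production whose right-hand side begins
 * with beta (i.e. beta is a left corner of that production's head),
 * or -1 if there is none. */
static int find_corner(const char *beta) {
    for (int i = 0; i < (int)(sizeof grammar / sizeof grammar[0]); ++i)
        if (strcmp(grammar[i].rhs[0], beta) == 0)
            return i;
    return -1;
}
```

A real LC parser would also need the remainder γ of the matched production, to push onto the goal list ahead of α.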

For example, suppose we want to parse "(2 × 3)". Parsing will proceed as follows:

Step Action Goals Partial parse Remainder of string
1 start (2 × 3)
2 (1.1) no change to goals list
(1.2) no change to goals list
shift "(" "(" 2 × 3)
3 (1.1) no change to goals list
corner E → "(" E ")" E ")" E "(" 2 × 3)
4 (1.1) no change to goals list
(1.2) no change to goals list
shift "2" E ")" E "(" "2" × 3)
5 (1.1) no change to goals list
corner N → "2" N E ")" E "(" "2" × 3)
6 reduce N → "2" E ")" E "(" N × 3)
7 (1.1) no change to goals list
corner M → N "×" M "×" M M E ")" E "(" N × 3)
8 consume "×" M M E ")" E "(" N "×" 3)
9 (1.1) no change to goals list
(1.2) no change to goals list
shift "3" M M E ")" E "(" N "×" "3" )
10 (1.1) no change to goals list
corner N → "3" N M M E ")" E "(" N "×" "3" )
11 reduce N → "3" M M E ")" E "(" N "×" N )
12 reduce M → N M E ")" E "(" N "×" M )
13 reduce M → N "×" M E ")" E "(" M )
14 (1.1) no change to goals list
corner A → M A E ")" E "(" M )
15 reduce A → M E ")" E "(" A )
16 reduce E → A ")" E "(" E )
17 consume ")" E "(" E ")"
18 reduce E → "(" E ")" E

If—as a side-effect of parsing a string—one wanted to build a parse tree, then the order of constructing nodes in the parse tree would be as follows:

Compared to the other two methods, this seems like a lot of work for nothing! Also, there is some amount of magic happening: recall that we are operating under the assumption that every action taken will be the correct one. In practice, one constructs a table and "cheats" when deciding which actions to take.


Top-down and bottom-up parsing were covered to set the stage for left-corner parsing and the TDPL, which provide context for the behavior of TDOP parsers. My next post will go into TDOP and how it relates to left-corner parsing and the TDPL.

<![CDATA[Symbolic Interpretation]]> http://www.ioreader.com/2012/04/07/symbolic-interpretation http://www.ioreader.com/2012/04/07/symbolic-interpretation#comments Sat, 07 Apr 2012 23:09:05 GMT Peter Goodman 2q Recently I worked on a project for my Optimizing Compilers course. The purpose of this project was to implement Loop-invariant Code Motion and any other compiler optimizations that we chose. The project is competitive because one's mark is based on how one's compiler improves the mean execution time on a small set of static, pre-determined test cases. Given that the test cases do not change, it is natural to specialize one's optimizations to the code being tested. Realistically, this might not be the best approach as code tends to change and compiler optimizations are not always transparent.


So far I have implemented the following optimizations. This post will focus on the last optimization, symbolic interpretation (labeled EVAL).

  • Copy propagation
  • Constant folding (with local constant propagation)
  • Loop-invariant code motion
  • Dead code elimination (with unreachable code elimination, block merging, and local constant de-duplication)
  • Common subexpression elimination
  • Symbolic interpretation (based on abstract interpretation)

These optimizations were arranged into the following pipeline, where dashed edges are followed when a pass changes something and solid edges are followed when no changes are made:

Pipeline of optimization passes


This project uses Stanford's SimpleSUIF compiler infrastructure. SimpleSUIF's intermediate representation (IR) is a linked list of instructions, including such things as basic arithmetic, bitwise operators, memory/constant load/store, and calling/branching operations. The IR is register based, with three register classes: machine, pseudo, and temporary. For our purposes, machine registers are never used. Temporary registers represent single-definition and single-use registers, where both the definition and use (if any) must reside in the same basic block. Temporary registers often hold loaded constants. Pseudo registers behave like general purpose registers. Finally, all registers are typed.

One quirk of how we use SimpleSUIF is that there is no apparent way to access the IR for an arbitrary function within the same compilation unit. As such, interprocedural optimizations such as function inlining and compile-time execution are not possible. This was unfortunate as there was one particular test case that would have benefitted from interprocedural optimization.

Test case

Below is one of the functions in the test case of interest. Two lines (the assignments to j) are struck out because the dead code elimination optimization pass regards them as useless.

float f1(float b, float c){
   int i;
   float j, k;

   j = c;        /* struck out by dead code elimination */
   for(i = 0; i < 2; i++) {
      k = b * i;
      j += k;    /* struck out by dead code elimination */
   }
   return k;
}

Looking closely at this example, it is clear that only the initialization of i to 0, the last iteration of the loop, and the value of b are important to the output of f1. However, this is difficult to tell from the perspective of the IR without running through the program. With more information (e.g. about loop induction variables or loop dependencies), we might be able to make smarter decisions, but only in some really restricted cases. Unfortunately, it's not clear how one should go about "executing" this program in the absence of a particular value for b. This is where symbolic interpretation comes in.

Symbolic interpretation

Symbolic interpretation is similar to local value numbering in that we operate on concrete and symbolic values. For simplicity, I restricted this optimization pass to a subset of the provably pure functions. Because information about other functions was absent, I considered a pure function to be any function that does not:

  • Load from or store to a memory location.
  • Call any functions. Note: this constraint can be relaxed in the case of a recursive function call. The test cases I focused on did not include recursive function calls; however, this method can easily be extended to apply to that case.
  • Copy from one memory location to another memory location.

Thus, a function is considered pure if it depends only on constants, local variables, and function arguments, and performs no operation that could generate a side-effect.

The following control-flow graph (does not include some edges because I am lazy with SVG) is an interactive symbolic executor of the SimpleSUIF-like IR representing the above function. Below I describe how each step of the evaluator is performed.

The symbolic interpreter behaves similarly to something that performs a combination of constant folding and constant propagation, with the exception that when an operation is performed on an expression containing a symbol, a new symbol is generated.

For example, if one performs an ldc operation to load the constant 0 into register t6, then we can assign to t6 the value 0. If a copy (cpy) operation is performed, then the value of the right-hand register is assigned to be the new value of the left-hand register. For example, cpy r3 = t6 assigns to r3 the value 0.

Sometimes a register is used before it is defined. For example, r1 in mul t8 = r1, r3 is never defined in the above code. This is because r1 represents one of the arguments to the function. In this case, r1 is given a new symbolic value that is distinct from every other symbolic value. In the above simulator, the symbolic value assigned to r1 is named r1. Being able to identify the "origin" of a symbolic value will be useful for code generation.

When a symbolic value participates in an expression, as in mul t8 = r1, r3, a new and unique symbolic value is generated that represents the expression. If any of the components of the expression are constants (known at compile time) then we want to store those constants as part of the symbolic expression. For example, in the first iteration of the loop, t8 is assigned the symbolic expression r1 * 0. In the second iteration of the loop, t8 is assigned the symbolic expression r1 * 1.

Something not touched on in this example is a branch that depends on a symbolic value. In this case, we cannot follow the branch as we don't know in which direction it will go at runtime. We are concerned with cases in which we can statically determine the direction of the branch.
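The value representation the evaluator works over can be sketched as a small tagged union: a register's value is either a known constant, a fresh symbol (e.g. an argument register), or a symbolic expression built from other values. Multiplying two constants folds; anything touching a symbol allocates a new expression node. All the names below are my own; SimpleSUIF's actual IR types differ.

```c
#include <assert.h>

enum kind { CONST, SYM, EXPR };

struct value {
    enum kind kind;
    long c;                       /* valid when kind == CONST */
    char op;                      /* valid when kind == EXPR */
    const struct value *l, *r;    /* operands when kind == EXPR */
};

/* A trivial bump allocator stands in for real memory management. */
static struct value pool[64];
static int pool_used = 0;

static struct value *mk_const(long c) {
    struct value *v = &pool[pool_used++];
    v->kind = CONST; v->c = c;
    return v;
}

/* A fresh symbol, distinct from every other value. */
static struct value *mk_sym(void) {
    struct value *v = &pool[pool_used++];
    v->kind = SYM;
    return v;
}

/* Evaluate mul: fold constants, otherwise build a symbolic expression. */
static struct value *eval_mul(const struct value *a, const struct value *b) {
    if (a->kind == CONST && b->kind == CONST)
        return mk_const(a->c * b->c);      /* constant folding */
    struct value *v = &pool[pool_used++];
    v->kind = EXPR; v->op = '*'; v->l = a; v->r = b;
    return v;
}
```

Under this scheme, mul t8 = r1, r3 with r3 holding the constant 0 yields an expression node recording r1 * 0, exactly the kind of value the first loop iteration assigns to t8.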

Code Generation

The focus of symbolic evaluation has been to end up with some symbolic or constant expression for each register. In fact, for this optimization, only the returned register (r5) ends up being useful. If the returned register contained a constant value then the function is necessarily constant, and so the function's code can be replaced with a ldc followed by a ret.

In the case that the returned register is a symbolic expression, we can walk the expression tree and output for each subexpression the instructions needed to compute that subexpression. The leaves of the expression tree will be symbolic register values (named according to their register) or constants.

Using the above expression tree walking strategy, the symbolic expression of r5 can be converted to the following sequence of instructions:

ldc t1 = 1
mul t2 = r1, t1
ret t2

Here we have generated new registers to hold temporaries, but left symbolic registers alone. This new sequence of instructions takes the place of the old, larger sequence of instructions.