Preliminaries

Most nodes in the parse tree contain symbol references. Uno tags the nodes with flags that indicate how each symbol is being used. The names of the most relevant flags are: The first row of flags give the standard def-use flags. A DEF flag is present, for instance, if the symbol receives a new value at this node, e.g., on the left-hand side of an assignment, and a USE flag is present if the current value of the symbol was evaluated, e.g., on the right-hand side of an assignment. In the expression "y = x++;" the variable x will get a USE and a DEF flag, and y only a DEF flag. The next line gives various type of declaration contexts. All newly declared symbols get the flag DECL at the point of declaration. If array brackets are used, the flag ARRAY_DECL is added. If a star (indicating a pointer) is used, the flag IS_PTR is added. If the declaration appears in the formal parameter list in a function definition, the flag PARAM is used (and this flag will persist across all further appearances of this symbol that follow). If the symbol was declared as an enum constant the flag UNO_CONST is added.

The last line contains three flags that are added if, respectively, the symbol appears inside a condition, as part of the argument of a sizeof() operation, or if the symbol represents a function call.

Two special symbols can be used to match either no flags (NONE), or any possible flag (ANY).

Tracking Symbols: Building State Machines

To track a symbol across several operations, it first has to be selected by UNO and then marked. You can think of the marking operation as the initialization of a state machine for the selected symbols. Markings can be any integer value, and it is easy to think of these markings as state numbers. The initial marking is typically 1, and as the symbol is tracked across relevant operations it's state (marking) can increase until an error condition can be declared. Once we lose interest in a symbol its state machine can be discarded by removing the marking from the symbol. To do so we first again select() the symbol and then unmark() it.

Example

As an example, consider the following UNO property for checking possible dereference operations on uninitialized pointer variables, and the possible use of free() on uninitialized pointer variables (line numbers added). We are looking for uninitialized pointer variables here, so we initialize a state machine for each such variable by selecting all symbols that appear inside a declaration, excluding array variables (ARRAY_DECL, and excluding those variables that are initialized inside the declaration (DEF). This happens on line 3. All these symbols are marked 1, meaning that they are placed in state 1 of a 3-state machine that we will build. The marking happens on line 4.

We could have looked for those declarations that also contain the IS_PTR flag, but this is neither necessary nor sufficient. It is not necessary since we will be looking later for dereference operations, which naturally will only happen for pointer variables anyway (we are assuming a syntactically correct program).
It is not sufficient either because the IS_PTR tag would be missing if a new type was introduced through a typedef definition that would hide the explicit pointer-tag, for instance as in "typedef int * pint; pint x;".

The symbols move from state 1 to state 2 on line 25-28, once they are assigned a new value, as evidenced by the presence of a DEF flag. First we select all symbols with such a DEF tag, and then we intersect that set with the once in state 1 of the state machine -- the resulting symbols get a new mark of 2 on line 27.

The symbols can move back from state 2 to state 1 on lines 35-39, if we detect that the memory pointed to is being freed, with the symbols used in the arguments to this function call. The procedure for the state machine update is again in two main steps: first the symbols in the context of interest are selected (lines 35-36), and then we intersect that selection with the symbols in state machine state 2 (line 37).

On line 7 we select all symbols that are involved in a derefence operation, provided that there is no alias operator as well. We then intersect that set with the set of symbols for which we initiated a state machine, and where the symbols are in state 1 of that state machine: line 8. A dereference of these uninitialized symbols is an error, unless UNO can tell through path-analysis that the symbol is necessary non-zero. This would be the case, for instance, if the dereference was preceded by a null-test.

The other parts of the UNO property definition should now be easy to read. See the full user manual for a more detailed description.


UNO homepage