Verifier Tutorial

by Daniel Wilkerson

Rough Draft

Need to say something about how time works in the declarative world. Refer to it in the VCGlue section. There seem to be three kinds of time:

Must have lots more about how Simplify works and examples using just it by itself. You have to know what is possible, what is not, and what will put it into an infinite loop.

Need a section on conventions for how to translate some common programmer concepts.

Introduction

As anyone who has given directions knows, one small mistake can make a big difference as to whether your friend makes it to your house or not. However, after a few tries, most people can make it to their own house quite reliably. This is because people do not simply remember the imperative directions, but also "landmarks": facts they can check at various points to know that progress so far has been without error. Landmarks are a declarative language that usefully complements the imperative language of the directions.

Most software is written in the imperative mood: fancy rearrangements of "do". However, a computer is like an idiot savant friend: he will follow the directions very reliably but knows not what they mean; if they say to go off a cliff, so he will without hesitation. Such a situation is not a recipe for producing reliable software. What is needed is a declarative complement language to the usual imperative code.

The imperative and declarative interact: the declarative says what we know about the state of the world; the imperative, run in that context, changes that state to the next one. Many correctness properties can be stated as declarative sentences that are always true. The ultimate goal of adding declarative language is to check that the imperative will never cause such a sentence to be false.

Our earlier landmark story elided one important distinction: dynamic versus static time. People reading directions can use landmarks as they are driving; this is called "dynamic" time. People checking directions on a piece of paper can only check them against, say, a map, as they are not actually driving yet; this is called "static" time. Imposing the requirement that it be possible to verify the correctness of directions at static time is stronger, since we know less. A static time system also requires no modification to the directions themselves, as run-time checks would. Though note that in a world where error and noise may be added to the directions as they are run, dynamic landmarks would be indispensable.

Imperative programming languages that engineers actually use (as opposed to academic fantasies) evolved bottom up. This seems to be the natural order of things in real engineering: you can only build on abstractions that are grounded themselves in other grounded abstractions or in reality itself. The sequence was generally: wires, stored program machine language, assembly language, structured language: C, OO language: C++. Note that at each layer, the abstractions are rather straightforwardly rendered into the next lower layer; we do not allow ourselves to entertain fantasies of artificial intelligence.

For this all to be interesting it should be possible in reality. Any such system should work with the existing imperative languages that are actually used: C and C++. To prove the strongest facts without modification to the program it should be static-time. Staying grounded in reality, it must evolve bottom up, starting as a "declarative assembly language". The verifier is this system.

Logic

The underlying logic engine for the declarative language is forward-chaining first order logic.

Declarative annotations to C

C programs may be annotated with declarative statements which are parsed by the partial grammar in annot.gr. The annotations consist of

VCExpr0

VCExpr0 is the executable fragment of VCExpr.

VCExpr1

VCExpr1 is the first-order logic, non-executable fragment of VCExpr.

Note that the "pre" annotation before a parameter or global variable name X is an abbreviation for X0, given that you said "X0 = X" at the top.

Modeling memory

There are two special function symbols "upd" and "sel".

They are defined by the following axioms (in selupd.model):

  // reading from last location written
  forall(int obj, int index, int val).
    sel( upd(obj, index, val) , index) == val;

  // reading from a different location
  forall(int obj, int index1, int index2, int val).
    index1!=index2 ==>
      sel( upd(obj, index1, val) , index2 ) == sel( obj, index2 );

They are so frequently used that they have their own notation: sel(obj, index) is written obj[index], and upd(obj, index, val) is written obj{index := val}.

In this notation, the axioms become
  // reading from last location written
  forall(int obj, int index, int val).
    obj{index := val}[index] == val;

  // reading from a different location
  forall(int obj, int index1, int index2, int val).
    index1!=index2 ==>
      obj{index1 := val}[index2] == obj[index2];

VCGlue

VCExprs may occur in various places in the C grammar with various consequences.

Verification Modeling Language

Scott says that "VML is a semantic model of the original program" or that it is annotated C "without pointers". C with annotations is translated into VML with a shallow semantic depth; that is, the translation amounts to preprocessing. Here are the differences.

Tool chain

As described in index.html, runvml runs the entire verifier toolchain. We start with a .c file, which is actually annotated C.
  1. C Preprocessing: cpp converts .c -> .i
  2. C to VML: c2vml converts .i -> .vml (below)
  3. Verification Condition generation: vml generates the VCs (below), but they are not always realized as a file. They are left in (current dir)/simp.log
  4. Prove the VCs: Simplify or Kettle proves the VCs; it either succeeds or fails. (Kettle is George's theorem prover.)

Examples

Here is a simple example just using assert.

./runvml tutorial/assert1.c; echo $?

Now, uncomment the wrong assertion at the end and run again; the verifier will complain.

Major feature requests

The following are improvements that would be needed to make this tool usable for the general programmer.

Rules for preventing infinite loops in the logic.

If you get the changes wrong for a function, you are assumed to be correct. That is, the tool will assume something did not change if you fail to mention that it changes.

A way to debug Simplify theorem proving computations.

A tool for removing the annotations and producing C that compiles so that only one source file need be maintained.