Up Next

1 Introduction

This a fork of the CIL tool maintained by the team behind the static analyzer Goblint, which we refer to as Goblint-CIL (CIL for short).

CIL (C Intermediate Language) is a high-level representation along with a set of tools that permit easy analysis and source-to-source transformation of C programs.

The original CIL has a Source Forge page: http://sourceforge.net/projects/cil.

goblint-cil is developed on Github: https://github.com/goblint/cil/.

Compared with the original CIL there are some breaking changes:

CIL is both lower-level than abstract-syntax trees, by clarifying ambiguous constructs and removing redundant ones, and also higher-level than typical intermediate languages designed for compilation, by maintaining types and a close relationship with the source program. The main advantage of CIL is that it compiles all valid C programs into a few core constructs with a very clean semantics. Also CIL has a syntax-directed type system that makes it easy to analyze and manipulate C programs. Furthermore, the CIL front-end is able to process not only ANSI-C programs but also those using GNU C extensions. If you do not use CIL and want instead to use just a C parser and analyze programs expressed as abstract-syntax trees then your analysis will have to handle a lot of ugly corners of the language (let alone the fact that parsing C itself is not a trivial task). See Section ?? for some examples of such extreme programs that CIL simplifies for you.

In essence, CIL is a highly-structured, “clean” subset of C. CIL features a reduced number of syntactic and conceptual forms. For example, all looping constructs are reduced to a single form, all function bodies are given explicit return statements, syntactic sugar like "->" is eliminated and function arguments with array types become pointers. (For an extensive list of how CIL simplifies C programs, see Section 3.) This reduces the number of cases that must be considered when manipulating a C program. CIL also separates type declarations from code and flattens scopes within function bodies. This structures the program in a manner more amenable to rapid analysis and transformation. CIL computes the types of all program expressions, and makes all type promotions and casts explicit. CIL supports some GCC extensions (a notable exception being nested functions). Finally, CIL organizes C’s imperative features into expressions, instructions and statements based on the presence and absence of side-effects and control-flow. Every statement can be annotated with successor and predecessor information. Thus CIL provides an integrated program representation that can be used with routines that require an AST (e.g. type-based analyses and pretty-printers), as well as with routines that require a CFG (e.g., dataflow analyses).

CIL comes accompanied by a number of Perl scripts that perform generally useful operations on code:

CIL is relatively independent on the underlying machine and compiler. When you build it CIL will configure itself according to the underlying compiler. However, CIL has only been tested on Intel x86 using the gcc compiler on Linux and cygwin and using the MS Visual C compiler. (See below for specific versions of these compilers that we have used CIL for.)

The largest application the authors of the original CIL have used CIL for is CCured, a compiler that compiles C code into type-safe code by analyzing your pointer usage and inserting runtime checks in the places that cannot be guaranteed statically to be type safe.

You can also use CIL to “compile” code that uses GCC extensions (e.g. the Linux kernel) into standard C code.

If you want to cite CIL in your research writings, please refer to the paper “CIL: Intermediate Language and Tools for Analysis and Transformation of C Programs” by George C. Necula, Scott McPeak, S.P. Rahul and Westley Weimer, in “Proceedings of Conference on Compiler Construction”, 2002 by the authors of the original CIL.


Up Next