STATOR:Establishing code-base

From STATOR
Jump to navigation Jump to search

Here we list pros and cons of all our approaches towards establishing STATOR's code-base. All pros/cons are listed in the order from the most important to the least one.

  • TODO:
    • Update information in all pros/cons points in all subsections below such that free tools like PAGAI, FramaC, UFO, CPROVER, CBMC, KLEE, Bugst, etc. are included.


Perspective 1: Main implementation language/platform

Java

  • pros
    • We can take a lot of code from tools like CPAchecker and Stanse with minimal effort.
    • Huge standard library
    • There several very good IDEs for development of Java applications.
    • It is stable and portable to many different platforms.
    • George already implemented its analysis there and no issues appeared.
    • Many students know it.
  • cons
    • Implementation of some program analyses may be slower then the one in programming languages like OCaml or F#. Nevertheless, it is quite difficult to predict impact of this potentially increased programming effort.

F# (.NET/Mono)

  • pros
    • May provide faster implementation of some program analyses then in Java, because symbolic manipulations of particular data structures (like syntax trees, expressions, BDD,...)
    • We can take a C front end from tools like CPAchecker, Stanse, Bugst, or Predator with minimal effort. It only requires to dump internal CFG representations of the tools to disc and load them into equivalent data structures syntactically expressed in F#.
    • When we are in .NET we can freely use any other .NET language (C#, Visual basic, managed C++, ...). Since many students know C# and C# is similar to (or better then) Java, so from this point of view F# is better choice then Java. Moreover, there is also implemented integration of non-managed languages, but portability is then a question.
    • We can move to Windows and thus develop STATOR in Visual Studio. (Visual Studio can really speed-up development (e.g. excellent debugging, edit&continue, ...)
  • cons
    • Development in Mono can be less comfortable. Also support of F# in Mono known to us yet. Similarly, state of standard library is not know.
    • Portability mostly depends on qualities of Mono.
    • Not many people knows it, but the situation improves.

OCaml

  • pros
    • May provide faster implementation of some program analyses then in Java, because symbolic manipulations of particular data structures (like syntax trees, expressions, BDD,...)
    • We can take a C front end from tools like CPAchecker, Stanse, Bugst, or Predator with minimal effort. It only requires to dump internal CFG representations of the tools to disc and load them into equivalent data structures syntactically expressed in OCaml.
  • cons
    • OCaml development environments are still immature and primitive. This can slow down whole development.
    • Not many people knows it, but the situation improves.
    • OCaml is not that portable as Java.

C++

  • pros
    • There is software like PAGAI, KLEE, Predator, Bugst, Symbiotic from which we can take code. Moreover, if we use F# (i.e. .NET) then calls to native C++ libraries are no problem (they are quite nicely supported in .NET).
    • TODO
  • cons
    • Students generally do not like this language.
    • TODO

Perspective 2: Programs in what languages should STATOR analyse

Using dedicated C front-end

  • pros
    • Such front-ends are already available in tools like CPAchecker, Stanse, Predator, ...
    • In CPAchecker they are implementing front-end to Java. So we can take their code, when it is ready. There might further be implemented front-ends to other languages in some other free tools. But then we could use them as external translators from their internal CFG representation into our automata files.
  • cons
    • Extension of STATOR to other languages will be complicated, if others (like Dirk's team) terminates development of Java front-end or we fail to find free tool which provides front-end to language we are interested in.

Using LLVM front-end

  • pros
    • LLVM project intends to compile several languages into LLVM byte code (C,C++, Objective C, Java,...). So, this might be very valuable for future of STATOR. But if development of STATOR will formally end in 2017. If there are no successor projects, then this feature does not matter.
  • cons
    • I know only one tool, Bugst, which translates LLVM into an internal control-flow graph representation. So, only this tool can be used for F# or OCaml platform for STATOR. But Bugst is not finished, so some LLVM features are not supported. Other LLVM based tools like PAGAI, KLEE do not transform LLVM into an internal CFG representation.

Perspective 3: What sort of programs we consider

There should mostly be considered benchmarks from SV_COMP. We are further considering some (probably smaller) open-source projects and some SW for drones and airplanes.

It would be really nice to try C front-ends from CPAchecker, Stanse, Bugst, etc. to few representative programs from each category.

Perspective 4: Build STATOR on top on a single freely available tool

The tools are listed in no particular order (just as they came into my mind). Did not we forget some interesting open-source tool?

CPAchecker

  • pros
    • George already implemented its analysis there and no issues appeared.
    • TODO
  • cons
    • TODO

PAGAI

  • pros
    • TODO
  • cons
    • TODO

FramaC

  • pros
    • TODO
  • cons
    • TODO

UFO

  • pros
    • TODO
  • cons
    • TODO

Stanse

  • pros
    • TODO
  • cons
    • TODO

Bugst

  • pros
    • TODO
  • cons
    • TODO

Predator

  • pros
    • TODO
  • cons
    • TODO

KLEE

  • pros
    • TODO
  • cons
    • TODO

CBMC

  • pros
    • TODO
  • cons
    • TODO

CPROVER

  • pros
    • TODO
  • cons
    • TODO

Perspective 5: Build STATOR on top of a code collected from several different sources (tools)

Here we consider possibilities to take useful code from different open-source tools. This code may differ in implementation language. The goal here is:

  • Make I list of code we would like to have (like front-end to C/LLVM, ...)
  • Search open-source tools for their implementations of items in our list (we can extend the list if we find some interesting functionality).
  • Prune the result s.t. variant with minimal compatibility issue (i.e. language differences and type incompatibility). For the pruning it is crucial a good estimation of amount of work which has to be done to resolve incompatibility issue (e.g. where we can use dynamic libraries, where inter-language call, what will have to be reimplemented, and for what items in the list we have nothing).

TODO: complete the section!

Open questions

How much time/programming effort can we actually save when we use languages like F# or OCaml relative to other languages like Java or C++?

  •  ???

How much we can actually benefit (i.e. time/programming effort savings) from the "new-school" control-flow graph representation relative to the use of the "old-school" one?

  •  ???

How much is important for as that CPAchecker already provides several program analyses like predicate abstraction?

  •  ???

How much restrictive is CPAchecker's concept of CPAs with respect to the kind of program analyses we intend to implement in STATOR? (Of course, there is a possibility to circumvent their concept, but it is against tool's philosophy, i.e. it is bad.)

  •  ???

Results