Paper: "From Uncertainty to Belief: Inferring the Specification Within"
Ted Kremenek and Paul Twohey, Stanford University; Godmar Back, Virginia Polytechnic Institute and State University; Andrew Ng and Dawson Engler, Stanford University
Abstract
Automatic tools for finding software errors require a set of specifications before they can check code: if they do not know what to check, they cannot find bugs. This paper presents a novel framework based on factor graphs for automatically inferring specifications directly from programs. The key strength of the approach is that it can incorporate many disparate sources of evidence, allowing us to squeeze significantly more information from our observations than previously published techniques.We illustrate the strengths of our approach by applying it to the problem of inferring what functions in C programs allocate and release resources. We evaluated its effectiveness on five codebases: SDL, OpenSSH, GIMP, and the OS kernels for Linux and Mac OS X (XNU). For each codebase, starting with zero initially provided annotations, we observed an inferred annotation accuracy of 80-90%, with often near perfect accuracy for functions called as little as five times. Many of the inferred allocator and deallocator functions are functions for which we both lack the implementation and are rarely called—in some cases functions with at most one or two callsites. Finally, with the inferred annotations we quickly found both missing and incorrect properties in a specification used by a commercial static bug-finding tool.
3 Comments:
From Uncertainty to Belief: Inferring the Specification Within
Ted Kremenek and Paul Twohey, Stanford University; Godmar Back, Virginia Polytechnic Institute and State University; Andrew Ng and Dawson Engler, Stanford University
Presented by Ted Kremenek
All systems have correctness rules such as to not leak memory or to acquire some lock before accessing data. We can check these rules using program analysis but the problem is that missed rules lead to missed bugs. The specification of a system is the set of these rules and invariants. The problem addressed in this talk is how to find errors that violate these rules when we don't know what the rules are or more precisely, how can we infer the specification. The talk presented a general framework to do so and the technique is described using an example: finding resource leaks by inferring allocators and deallocators.
There are many sources of knowledge that can be used to infer system rules, such as behavioral tendencies (i.e. programs are generally correct), function names, etc. but there isn't a way to bound all of this information together. To address this issue, the authors present an approach based on Annotation Factor Graph (AFG), a form of probabilistic graphical modeling. This approach reduces all forms of knowledge, either from evidence or intuitions, to probabilities. The idea is to express program properties to infer as annotations variables and infer these annotations by combining scores obtained from factors. Factors represent models of domain specific knowledge and are used to score assignments of values to annotation variables based on the belief that an assignment is relatively more or less likely to be correct.
The talk focused on how to infer resource ownership using AFGs. First, functions return values and parameters are annotated. The domain for a return value annotation variable is to return or not return ownership and the domain for a function parameter is to claim or not claim ownership. Ted Kremenek then described some of the factors used to infer resource ownership. Some use static analysis based on common programming idioms such as a resources should never be claimed twice, others are based on ad-hoc knowledge such as function names.
The evaluation was based on inferring annotations on five projects: SDL, OpenSSH, GIMP, XNU and the Linux Kernel. The authors' technique obtained an 90%+ accuracy on the top 20 ranked annotations. It was also able to infer allocators unknown or misclassified by Coverity Prevent. Using their technique, the authors were able to find a memory leak in the GIMP library.
Someone from Yahoo asked how well the tool performs when allocator that are simply wrappers around malloc are factored out. Ted answered that although many allocators call malloc, the tool doesn't use this knowledge. Brad Chen from Google asked if the tool would be confused by realloc. Ted answered that the tool did infer that realloc was both a allocator and a deallocator. Some corner cases such as this one that do not fit the model must be dealt with separately. They had a similar issue with certain functions in the Linux kernel. Such outliers have to be modelled explicitly but are fairly rare.
Micah Brodsky from MIT asked if AFGs are an instance of Bayesian Net. Ted answered that both AFGs and Bayes Nets are part of the same family of probability modelling. The authors actually started with Bayes Net but found them to be too rigid. AFG were much easier to work with. Someone asked what other things could be modelled. Ted answered that locks are very behavioral so their approach would work well. He stated that the temporal relationship could be expressed as a grammar.
The technique found multiple bugs besides just the memory leak we presented from the GIMP. We presented this single bug in the talk because it of its complexity from a program analysis point of view. We found previously undetected memory leaks in all five projects we analyzed.
Outliers, such as IS_ERR_PTR in the Linux kernel, are automatically detected by the model (there is a section in the paper on this). However, because the model doesn't understand the semantics of such outliers, it effectively ignores such outliers by removing them from the model. These outliers, however, can be easily presented to whoever is using the AFG so that they can modify the static analysis or the AFG to directly model the semantics of this unknown idiom, thus allowing the technique to gleam information from such outliers. In practice we found only a few cases where something grossly did not fit the ownership idiom, and these were identified immediately by the technique.
Post a Comment
<< Home