OpenRefactory fixers have a 70% fix rate on SAMATE benchmark programs

October 11th, 2017 Posted by News, Technical Merit

OpenRefactory released its first set of Beta fixers in September 2017. There are 11 fixers that target security and compliance bugs in C programs.

The OpenRefactory team ran the fixers on the SAMATE benchmark created by the National Institute of Standards and Technology (NIST) to evaluate their bug-fixing capability. “The NIST Software Assurance Metrics And Tool Evaluation (SAMATE) project is dedicated to improving software assurance by developing methods to enable software tool evaluations, measuring the effectiveness of tools and techniques, and identifying gaps in tools and methods” [1]. SAMATE’s Juliet test suite contains programs that show the nuances of vulnerabilities in C and C++.

Here is a snippet from an example program from the Juliet test suite that demonstrates a buffer overflow vulnerability.

#define SRC_STR "0123456789abcde0123"

typedef struct _charVoid
{
    char charFirst[16];
    void * voidSecond;
    void * voidThird;
} charVoid;

#ifndef OMITBAD

void CWE122_Heap_Based_Buffer_Overflow__char_type_overrun_memcpy_02_bad()
{
    charVoid * structCharVoid = (charVoid *)malloc(sizeof(charVoid));
    structCharVoid->voidSecond = (void *)SRC_STR;
    /* Print the initial block pointed to by structCharVoid->voidSecond */
    printLine((char *)structCharVoid->voidSecond);
    /* FLAW: Use the sizeof(*structCharVoid) which will overwrite the pointer y */
    memcpy(structCharVoid->charFirst, SRC_STR, sizeof(*structCharVoid));
    structCharVoid->charFirst[(sizeof(structCharVoid->charFirst)/sizeof(char))-1] = '\0'; /* null terminate the string */
    printLine((char *)structCharVoid->charFirst);
    printLine((char *)structCharVoid->voidSecond);
}

#endif /* OMITBAD */

Each of the SAMATE benchmark programs has a good function and a bad function. The bad function demonstrates a deviation from the good function that leads to a vulnerability. In this case, the deviation is a heap-based buffer overflow (CWE 122). The memcpy call (line 43 in the original listing) copies from SRC_STR, a string that occupies 20 bytes including its terminating null, into structCharVoid->charFirst, an array of 16 bytes (declared on line 26 of the original listing). The call appears to bound the number of bytes copied, but it uses the size of the wrong object: the size of the entire structure instead of the size of the char array inside the structure. This allows the buffer overflow.
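The size mismatch can be checked directly. The following is a minimal sketch, not part of the Juliet test suite: the helper names requested_bytes, destination_capacity, and overrun_bytes are hypothetical and exist only for illustration. On a typical 64-bit platform the structure is 32 bytes, so the flawed call would write 16 bytes past the end of charFirst (and also read past the end of the 20-byte source).

```c
#include <assert.h>
#include <stddef.h>

#define SRC_STR "0123456789abcde0123"

typedef struct _charVoid
{
    char charFirst[16];
    void * voidSecond;
    void * voidThird;
} charVoid;

/* Bytes the flawed call asks memcpy to write: the whole structure. */
size_t requested_bytes(void)
{
    return sizeof(charVoid);
}

/* Bytes that actually fit in the destination member. */
size_t destination_capacity(void)
{
    /* sizeof does not evaluate its operand, so no null pointer is dereferenced. */
    return sizeof(((charVoid *)0)->charFirst);
}

/* Bytes that the flawed call would write past the end of charFirst. */
size_t overrun_bytes(void)
{
    /* The source literal is 19 characters plus a terminating null. */
    assert(sizeof(SRC_STR) == 20);
    return requested_bytes() - destination_capacity();
}
```

Calling overrun_bytes() returns at least the size of the two pointers that follow charFirst, which is why voidSecond is corrupted in the bad function.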

OpenRefactory’s Safe Library Replacement (SLR) fixer fixes the memcpy call (line 43) by rewriting it as:

	  /* FLAW: Use the sizeof(*structCharVoid) which will overwrite the pointer y */
	  memcpy(structCharVoid->charFirst, SRC_STR, (sizeof(*structCharVoid) > sizeof(structCharVoid->charFirst) ? sizeof(structCharVoid->charFirst) : sizeof(*structCharVoid)));

Here, a new check is introduced that enforces the correct size for memcpy. Instead of always using the size of the entire structure, the fixed call compares the size of the structure with the size of the struct member and passes the smaller of the two. This prevents the buffer overflow vulnerability. A video on YouTube shows the effect of the modification.
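To make the effect of the size check concrete, here is a small self-contained sketch. It is not OpenRefactory's generated code: clamp_size and pointer_survives_clamped_copy are hypothetical names, and the source string is held in a named array (an assumption made so that the pointer comparison is well defined). The sketch applies the same "smaller of the two sizes" rule and confirms that voidSecond is left untouched.

```c
#include <stdlib.h>
#include <string.h>

#define SRC_STR "0123456789abcde0123"

typedef struct _charVoid
{
    char charFirst[16];
    void * voidSecond;
    void * voidThird;
} charVoid;

/* Hypothetical helper: the smaller of the requested size and the capacity. */
size_t clamp_size(size_t requested, size_t capacity)
{
    return requested > capacity ? capacity : requested;
}

/* Returns 1 if voidSecond is intact after the clamped copy, 0 if it was
   overwritten or the copied text is unexpected, -1 if allocation failed. */
int pointer_survives_clamped_copy(void)
{
    static const char source[] = SRC_STR;
    int intact;
    charVoid * structCharVoid = (charVoid *)malloc(sizeof(charVoid));

    if (structCharVoid == NULL)
    {
        return -1;
    }
    structCharVoid->voidSecond = (void *)source;

    /* Same rule as the fixed call: never copy more than charFirst can hold. */
    memcpy(structCharVoid->charFirst, source,
           clamp_size(sizeof(*structCharVoid), sizeof(structCharVoid->charFirst)));
    structCharVoid->charFirst[sizeof(structCharVoid->charFirst) - 1] = '\0';

    intact = (structCharVoid->voidSecond == (void *)source)
          && (strcmp(structCharVoid->charFirst, "0123456789abcde") == 0);

    free(structCharVoid);
    return intact;
}
```

With the clamp in place, only the first 16 bytes of the source are copied, so the string is truncated to 15 characters plus the terminating null and the pointer members keep their values.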

Here are the overall results. The Safe Library Replacement fixer impacted the programs that had buffer overflows on the stack and on the heap (CWE 121 and CWE 122). For CWE 121, 61 files were patched out of 94 (65%). For CWE 122, 19 files were patched out of 36 (53%). This gives a 62% true positive fix rate overall.

Another fixer that impacted several SAMATE benchmark CWEs is the Change Integer Type (CIT) fixer. If the declared type of an integer is different from the type with which it is used, the declared type is modified to match the used type. This fixes the integer issues demonstrated by CWE 194, CWE 195, and CWE 196. For CWE 194, 80 files were patched out of 112 (71%). For CWE 195, 80 files were patched out of 112 (71%). For CWE 196, 18 files were patched out of 18 (100%). This gives a 74% true positive fix rate overall. The following table summarizes the results.
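As an illustration of the declared-versus-used type mismatch that CIT targets, consider a length that is declared as a short but used where a size_t is expected. The sketch below is an assumption about the general shape of such a bug and its fix, not output from the CIT fixer; BUFFER_SIZE and all function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define BUFFER_SIZE 100

/* Before: the length is declared as a short, so a negative value
   passes the upper-bound check. */
int length_accepted_before(short length)
{
    return length < BUFFER_SIZE;
}

/* The same short is later used as a size_t (for example, as a copy size).
   A negative value is sign-extended and becomes a very large number. */
size_t length_as_used(short length)
{
    return (size_t)length;
}

/* After: the declared type matches the used type, so the same
   upper-bound check now rejects the out-of-range value. */
int length_accepted_after(size_t length)
{
    return length < BUFFER_SIZE;
}
```

With a declared type of short, the value -1 passes the check and is then used as SIZE_MAX. Once the declaration is size_t, the value that reaches the check is the value that is actually used, and the check rejects it.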

Also interesting is the low false positive rate. Technically, OpenRefactory fixers have zero false positives because every fix addresses an actual bug in the code. However, many of these fixes are ones that people may not care about. For example, the Change Integer Type fixer may impact many integers in a C program. In fact, previous research showed that even in mature C programs, one out of three integers is declared with a type that is different from its used type [2]. This is because the weak type semantics of C allow it. But it would be annoying to change every third variable declaration. OpenRefactory’s fixers are supported by taint analysis, so they only target the parts of the code that may actually require fixing. For CWE 194, CIT generated 94 patches, of which 80 targeted bad functions (15% FP). For CWE 195, CIT generated 86 patches, of which 80 targeted bad functions (7% FP).
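The FP percentages above are the share of patches that landed outside the bad functions. A minimal sketch of that arithmetic follows; the function names are hypothetical, and rounding to the nearest whole percent is an assumption about how the figures were reported.

```c
/* Percentage of part within total, rounded to the nearest integer.
   total must be nonzero. */
unsigned rounded_percent(unsigned part, unsigned total)
{
    return (100u * part + total / 2u) / total;
}

/* Share of generated patches that did not target a bad function. */
unsigned false_positive_percent(unsigned patches, unsigned on_bad_functions)
{
    return rounded_percent(patches - on_bad_functions, patches);
}
```

For CWE 194, false_positive_percent(94, 80) evaluates 14 out of 94; for CWE 195, false_positive_percent(86, 80) evaluates 6 out of 86.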

Note that the extra patches are all correct changes; they just apply to parts of the code that the developers may not have wanted to fix in the first place. This brings us to a philosophical debate about what false positives are in the case of fixers. The false positive concept is borrowed from the bug detection tools space. For bug detection tools, it makes sense to report fewer bugs because developers have to spend time and effort to fix each one. Most of the bug detection tools on the market have a false positive rate between 25% and 80%. In contrast, when tools fix bugs automatically, should we allow them to fix parts of the code that developers do not necessarily care about? Isn’t proactive prevention better than a reactive fix? Imagine the improvement in the overall health of the program.

For more information about how to use OpenRefactory fixers in your project, contact:

1. Introduction to SAMATE.
2. Z. Coker and M. Hafiz. Program Transformations to Fix C Integers. In Proceedings of ICSE 2013.