[SC-L] [WEB SECURITY] SATE?

Jim Manico jim at manico.net
Wed Jun 9 15:40:06 EDT 2010


Fantastic SATE reply from Steven M. Christey:
>
> I participated in SATE 2008 and SATE 2009, much more actively in the 
> 2008 effort.  I'm not completely sure of the 2009 results and final 
> publication, as I've been otherwise occupied lately :-/ Looks like a 
> final report has been delayed till June (the SATE 2008 report didn't 
> get published till July 2009).
>
> For SATE 2008, we did not release final results because the human 
> analysis itself had too many false positives - so sometimes we claimed 
> a false positive when, in fact, the issue was a true positive.  Given 
> this and other data-quality problems (e.g. we only covered ~12% of the 
> more than 49,000 items), we believed that to release the raw data 
> would make it way too easy for people to make completely wrong 
> conclusions about the tools.
>
>> The problems that the data would have revealed are:
>>
>> 1) false positive rates from these tools are overwhelming
>
> As covered extensively in the 2008 SATE report (see my section for 
> example), there is no clear definition of "false positive" especially 
> when it comes to proving that a specific finding is a vulnerability.
>
> For example: suppose a tool reports a buffer overflow in a function. 
> To prove the finding is a vulnerability, you have to dig back through 
> all the data flow, sometimes going 20 levels deep.  It is not feasible 
> for a human evaluator to determine whether there's really a 
> vulnerability.  Or, maybe the overflow happens when you're reading a 
> configuration file that's only under the control of the 
> administrator.  These could be regarded as false positives.  However, 
> the finding may be "locally true" - i.e. the function itself might not 
> do any validation at all, so *if* it's called incorrectly, an overflow 
> will occur.  My suspicion is that a lot of the "false positives" 
> people complain about are actually "locally true." And, as we saw in 
> SATE 2008 (and 2009 I suspect), sometimes the human evaluator is 
> actually wrong, and the finding is correct.  Hopefully we'll account 
> for "locally true" in the design of SATE 2010.
>
>> 2) the workload to triage results from ONE of these tools was 
>> man-years
>
> This was also covered (albeit estimated) in the 2008 SATE report, both 
> the original section and my section.
>
>> 3) by every possible measurement, manual review was more cost-effective
>
> There was no consideration of cost in this sense.
>
> One lost opportunity for SATE 2008, however, was in comparing the 
> results from the manual-review participants (e.g. Aspect) versus the 
> tools in terms of what kinds of problems got reported.  (This also had 
> major implications for how to count the number of results.)  I believe 
> that such a focused effort would have shown some differences in what 
> got reported.  At least, that information is in the raw data, since it 
> shows who claimed to have found what.
>
> While the SATE 2008 report is quite long, mostly thanks to my excessive 
> verbiage, I believe people who read that document will see that SATE 
> has been steadily improving its design over the years.  The reality is 
> that any study of this type is going to suffer from limited manpower 
> in evaluating the results.
>
> http://samate.nist.gov/docs/NIST_Special_Publication_500-279.pdf
>
>> The coverage was limited ONLY to injection and data flow problems 
>> that tools have a chance of finding. In fact, the NIST team chose 
>> only a small percentage of the automated findings to review, since it 
>> would have taken years to review everything due to the massive number 
>> of false positives. Get the problem here?
>
> While there were focused efforts in various types of issues, there was 
> also random sampling to get some exposure to the wide range of problems 
> being reported by the tools.  Your critique of SATE with respect to 
> its focus on tools versus manual methods is understandable, but SATE 
> (and its parent SAMATE project) are really about understanding tools, 
> so this focus should not be a surprise.  After all, the first three 
> letters of SATE expand to "Static Analysis Tool."
>
> - Steve
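
To make the "locally true" idea from the reply above concrete, here is a
minimal C sketch. It is not taken from SATE or from any tool's output;
every name in it (NAME_BUF_SIZE, set_name_unchecked, load_default_name,
set_name_checked) is hypothetical. The first function does no validation,
so a tool that flags its strcpy call is correct about the function in
isolation. The only caller shown passes a short constant, so in this
particular program the overflow cannot happen, and a reviewer might call
the same finding a false positive.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NAME_BUF_SIZE 16 /* hypothetical fixed buffer size */

/*
 * "Locally true" finding: nothing here checks the length of src, so
 * *if* a caller passes 16 or more characters, dst overflows.
 */
void set_name_unchecked(char dst[NAME_BUF_SIZE], const char *src)
{
    strcpy(dst, src);
}

/*
 * The only caller in this sketch. It passes a constant that fits, so
 * the overflow is unreachable along this path. Proving the same thing
 * in a real code base means tracing every caller's data flow.
 */
void load_default_name(char dst[NAME_BUF_SIZE])
{
    static const char default_name[] = "admin";

    assert(sizeof default_name <= NAME_BUF_SIZE);
    set_name_unchecked(dst, default_name);
}

/*
 * Checked variant: safe regardless of the caller, so the finding goes
 * away locally. Returns 0 on success, -1 if src does not fit.
 */
int set_name_checked(char *dst, size_t dst_size, const char *src)
{
    size_t len = strlen(src);

    if (dst_size == 0 || len >= dst_size)
        return -1;
    memcpy(dst, src, len + 1); /* len + 1 copies the terminating NUL */
    return 0;
}
```

Whether the report against set_name_unchecked counts as a true or false
positive depends on whether the evaluator judges the function by itself
or the program as a whole, which is the ambiguity the reply describes.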


