[SC-L] BSIMM update (informIT)
Gary McGraw
gem at cigital.com
Wed Feb 3 15:04:43 EST 2010
Hi Steve (and sc-l),
I'll invoke my skiing with Eli excuse again on this thread as well...
On Tue, 2 Feb 2010, Wall, Kevin wrote:
> To study something scientifically goes _beyond_ simply gathering
> observable and measurable evidence. Not only does data need to be
> collected, but it also needs to be tested against a hypothesis that offers
> a tentative *explanation* of the observed phenomena;
> i.e., the hypothesis should offer some predictive value.
On 2/2/10 4:12 PM, "Steven M. Christey" <coley at linus.mitre.org> wrote:
>>I believe that the cross-industry efforts like BSIMM, ESAPI, top-n lists,
>>SAMATE, etc. are largely at the beginning of the data collection phase.
I agree 100%. It's high time we gathered some data to back up our claims. I would love to see the top-n lists do more with data.
Here's an example. In the BSIMM, 10 of 30 firms have built top-n bug lists based on their own data culled from their own code. I would love to see how those top-n lists compare to the OWASP top ten or the CWE-25. I would also love to see whether the union of these lists is even remotely interesting. One of my (many) worries about top-n lists that are NOT bound to a particular code base is that the lists are so generic as to be useless and maybe even unhelpful if adopted wholesale without understanding what's actually going on in a code base. [see <http://www.informit.com/articles/article.aspx?p=1322398>].
Note for the record that "asking lots of people what they think should be in the top-10" is not quite the same as taking the union of particular top-n lists which are tied to particular code bases. Popularity contests are not the kind of data we should count on. But maybe we'll make some progress on that one day.
>Ultimately, I would love to see the kind of linkage between the collected
>data ("evidence") and some larger goal ("higher security" whatever THAT
>means in quantitative terms) but if it's out there, I don't see it
Neither do I, and that is a serious issue with models like the BSIMM that measure "second order" effects like activities. Do the activities actually do any good? Important question!
>The 2010 OWASP Top 10 RC1 is more data-driven than previous versions; same
>with the 2010 Top 25 (whose release has been delayed to Feb 16, btw).
>Unlike last year's Top 25 effort, this time I received several sources of
>raw prevalence data, but unfortunately it wasn't in sufficiently
>consumable form to combine.
I was with you up until that last part. Combining the prevalence data is something you guys should definitely do. BTW, how is the 2010 CWE-25 (which doesn't yet exist) more data-driven??
>I for one am pretty satisfied with the rate at which things are
>progressing and am delighted to see that we're finally getting some raw
>data, as good (or as bad) as it may be. The data collection process,
>source data, metrics, and conclusions associated with the 2010 Top 25 will
>probably be controversial, but at least there's some data to argue about.
Cool!
>So in that sense, I see Gary's article not so much as a clarion call for
>action to a reluctant and primitive industry, but an early announcement of
>a shift that is already underway.
Well put.
gem
company www.cigital.com
podcast www.cigital.com/~gem
blog www.cigital.com/justiceleague
book www.swsec.com