CDM vs .docx

Making useful security compliance docs

CDM and the scourge of the semi-structured Microsoft Word security compliance documentation

As a developer and occasional devops practitioner, I have always been bothered by .docx files being the sole repository of useful security control information. Microsoft Word is a wonderful word processing tool, but its files should not be the persistence mechanism for what should be structured data. Such data should be stored in a ready-made structured format like YAML, JSON, or XML, or even in a database. The .docx, OCS, and PDF formats are fabulous as aesthetically pleasing presentation mechanisms, but they leave much to be desired as a common data interchange format.

The Continuous Diagnostics and Mitigation (CDM) program is backed by a federal mandate that says, among other things, that an agency must constantly find, mitigate, and air its dirty risk laundry with DHS. Ultimately, this is pushing agencies towards the concept of ongoing authorization. Critical to the implementation of CDM is the ability to readily pull and parse security control and compliance data. If you can’t parse the data, you can’t compute intelligent risk scores, and the whole thing goes downhill from there, so it’s important to be able to ferret out and parse the data from the various systems of record. For the past few months my company has been helping a large federal agency define the operational future state for CDM and, unfortunately, we keep running into the same issue: the security compliance data we need is locked up in Word documents.

Shockingly, a structured format is much easier for a computer to parse. But what structure is appropriate for this “control” data? Enter the Open Control object schema. It’s a schema for documenting “controls”, whether technical, operational, or management. It allows compliance data to be held in a meaningful format that enables easy integration with downstream applications: CDM, for example, or more specifically, Splunk. It’s not hard to imagine how a set of Open Control stores could help automate continuous diagnostics and mitigation and enable ongoing authorization through an automated devops build pipeline. Open Control object stores would be the source of application-specific security control and management information. This would give the correlation tool, Splunk, a much simpler task in matching Hardware Asset Management (HWAM), Software Asset Management (SWAM), Vulnerability Management (VUL), and Configuration Settings Management (CSM) data with application-specific compliance data.
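To make the contrast with a Word file concrete, here is a minimal Python sketch of what “control data as structured data” can look like. The field names, the example system name, and the sample narratives are my own illustrative assumptions, not the official Open Control schema; the point is only that a record like this can be serialized, queried, and handed to a downstream tool without any document scraping.

```python
import json
from dataclasses import dataclass, asdict, field


@dataclass
class ControlRecord:
    """One documented control. Field names are illustrative, not the Open Control schema."""
    control_id: str   # e.g. a control identifier from the applicable catalog
    family: str       # "technical", "operational", or "management"
    narrative: str    # how the system satisfies the control
    system: str       # hypothetical name of the application or system boundary
    tags: list = field(default_factory=list)


def to_json(records):
    """Serialize records to JSON that a downstream tool could ingest."""
    return json.dumps([asdict(r) for r in records], indent=2, sort_keys=True)


def by_system(records, system):
    """Return the control ids documented for one system, in input order."""
    return [r.control_id for r in records if r.system == system]


# Made-up sample data for illustration only.
records = [
    ControlRecord("AC-2", "technical", "Accounts are provisioned through the identity service.", "payroll-app"),
    ControlRecord("CM-6", "operational", "Baseline settings are enforced at build time.", "payroll-app"),
    ControlRecord("AC-2", "technical", "Accounts are reviewed quarterly.", "hr-portal"),
]

if __name__ == "__main__":
    print(by_system(records, "payroll-app"))
    print(to_json(records))
```

Once the data is in this shape, answering “which controls does this application document?” is a one-line filter rather than a manual read through a .docx file.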

OK, so what’s it going to take to parse my semi-structured FISMA and compliance docs into these Open Control objects? Well, the level of structure in your documents will be the major variable in how complex it is to parse out the security controls. C# offers a pretty good NuGet package for parsing .docx files: “DocX”. It allows you to stream the file contents as well as get metadata about the document, such as the electronic signature, which in many cases would belong to the Authorizing Official for a given system boundary. The Authorizing Official is a critical piece of data in the CDM object model.
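As a rough illustration of the parsing step, here is a sketch in Python that uses only the standard library rather than the C# DocX package mentioned above. It relies on the fact that a .docx file is a ZIP archive whose body text lives in word/document.xml. The control-id pattern (two capital letters, a hyphen, a number, and an optional enhancement in parentheses) and the rule “a paragraph that starts with a control id opens a new control” are assumptions about how a document might be laid out; a real document will likely need its own rules. Signature metadata is not covered here.

```python
import re
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by WordprocessingML elements inside word/document.xml.
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"

# Assumed control-id shape, e.g. "AC-2" or "AC-2(1)". Adjust for your documents.
CONTROL_ID = re.compile(r"([A-Z]{2}-\d{1,2}(?:\(\d+\))?)")


def docx_paragraphs(source):
    """Return the text of each non-empty paragraph in a .docx file or file-like object."""
    with zipfile.ZipFile(source) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for para in root.iter(W_NS + "p"):
        # A paragraph's text is split across runs; join the text nodes back together.
        text = "".join(node.text or "" for node in para.iter(W_NS + "t")).strip()
        if text:
            paragraphs.append(text)
    return paragraphs


def group_by_control(paragraphs):
    """Attach each paragraph to the most recent paragraph that began with a control id."""
    controls = {}
    current = None
    for text in paragraphs:
        match = CONTROL_ID.match(text)
        if match:
            current = match.group(1)
            controls.setdefault(current, [])
            rest = text[match.end():].strip(" :-")
            if rest:
                controls[current].append(rest)
        elif current is not None:
            controls[current].append(text)
        # Paragraphs before the first control id are ignored in this sketch.
    return controls
```

Calling group_by_control(docx_paragraphs("ssp.docx")), where ssp.docx is a hypothetical file name, would return a dictionary keyed by control id. How clean that dictionary comes out depends entirely on how consistently the source document was written.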

Ultimately, the limiting factor in how far you can go in migrating to open controls is the quality of data contained in existing documentation.

In my next article I’ll discuss turning those Open Control repositories into aesthetically pleasing documents using the Compliance Masonry framework.