• DoD correspondence log converted from pix to spreadsheets

    I’m posting, in an Excel spreadsheet, the congressional correspondence logs covering the first three months of 2007, which we got a while back from the Office of the Secretary of Defense in a less-than-user-friendly format. Here’s a sample of what we got in response to our FOIA request: a .tif (tagged image file format) file. I picked one at random, but we have a CD-ROM with 189 files just like it.

    Anu turned the files over to Scott Wells, our multi-talented office administrator, who used a program called ocrad (it runs on Linux) to convert them to a text file, which Anu posted here. Here’s a sample of what the converted .tif files looked like:

    OSD CONTROL NUMBER: OSD 03257-07 DOCUMENTTYPE: INCOMING DOC: 2128/2007 DOR: 31212007
    FROM: uss LEVIN, c TO: SECDEF
    SUBJECT: REauEsT FOR NUMBER OF IRAal INDIVIDUALS WHO HAVE HELPED THE u.s. SUSTAIN AND MANAGE ITS PRESENCE IN IRAa
    AGENCY: JCS TASK: PRS SUSPENSE: 3/1312007 ACD:
    FILE NUMBER: IRAa

    OSD CONTROL NUMBER: OSD 03288-07 DOCUMENT TYPE: INCOMING DOC: 212812007 DOR: 3/2/2007
    FROM: uss VOINOVICH, G TO: SECDEF
    ___’_√öBJEIT_CLAIM AGAINSTTHE FEDERAL GOVERNMENT FOR COSTS INCURRED AS A RESULT OF A TERMINATION OF CONTRACT _
    AGENCY: SA - TASK: RD SUSPENSE: 3113/2007 ACD: 3/13/2007 _
    FILE NUMBER: 160

    OSD CONTROL NUMBER: OSD 03443-07 DOCUMENT TYPE: INCOMING DOC: 212812007 DOR: 3/6/2007
    FROM: uss CANTWELL, M TO: LA
    SUBJECT: REauEsT YOUR SUPPORT IN EXPEDITING MY INVESTIGATION _
    AGENCY:SA TASK:RD SUSPENSE: ACD:
    FILE NUMBER: T-

    Not perfect, but at least digitized and searchable. But still, somehow unsatisfying. I fooled around with the text file and was able to convert it to a tab-delimited form (all those years coding agate at the Philadelphia Inquirer really came in handy).
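
    For readers who want to try something similar, here is a minimal Python sketch of one way a text-to-tab-delimited conversion could work. It is not the process used for the spreadsheet posted here. It assumes that records are separated by blank lines and that each field starts with one of the labels visible in the sample above; the function names (parse_record, split_records, to_tsv) are made up for illustration.

```python
import csv
import io
import re

# Field labels as they appear in the OCR sample, in output column order.
LABELS = ["OSD CONTROL NUMBER", "DOCUMENT TYPE", "DOC", "DOR", "FROM", "TO",
          "SUBJECT", "AGENCY", "TASK", "SUSPENSE", "ACD", "FILE NUMBER"]


def _label_pattern(label):
    # Let the spaces inside a label be missing, since the OCR sometimes
    # fused words together (e.g. "DOCUMENTTYPE").
    return r"\s*".join(re.escape(word) for word in label.split())


# A label is one of the names above followed by a colon.
LABEL_RE = re.compile(r"\b(" + "|".join(_label_pattern(l) for l in LABELS) + r")\s*:")
# Map a label with its spaces removed back to its canonical spelling.
CANONICAL = {label.replace(" ", ""): label for label in LABELS}


def split_records(text):
    # Assumption: one blank line between records.
    return [b for b in re.split(r"\n\s*\n", text) if b.strip()]


def parse_record(block):
    # Collapse the record onto one line, then cut it at each label.
    flat = " ".join(block.split())
    matches = list(LABEL_RE.finditer(flat))
    row = dict.fromkeys(LABELS, "")
    for m, nxt in zip(matches, matches[1:] + [None]):
        end = nxt.start() if nxt else len(flat)
        label = CANONICAL[re.sub(r"\s+", "", m.group(1))]
        row[label] = flat[m.end():end].strip()
    return row


def to_tsv(text):
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=LABELS, delimiter="\t",
                            lineterminator="\n")
    writer.writeheader()
    for block in split_records(text):
        writer.writerow(parse_record(block))
    return out.getvalue()


SAMPLE = """\
OSD CONTROL NUMBER: OSD 03257-07 DOCUMENTTYPE: INCOMING DOC: 2128/2007 DOR: 31212007
FROM: uss LEVIN, c TO: SECDEF
SUBJECT: REauEsT FOR NUMBER OF IRAal INDIVIDUALS WHO HAVE HELPED THE u.s. SUSTAIN AND MANAGE ITS PRESENCE IN IRAa
AGENCY: JCS TASK: PRS SUSPENSE: 3/1312007 ACD:
FILE NUMBER: IRAa

OSD CONTROL NUMBER: OSD 03443-07 DOCUMENT TYPE: INCOMING DOC: 212812007 DOR: 3/6/2007
FROM: uss CANTWELL, M TO: LA
SUBJECT: REauEsT YOUR SUPPORT IN EXPEDITING MY INVESTIGATION _
AGENCY:SA TASK:RD SUSPENSE: ACD:
FILE NUMBER: T-
"""

if __name__ == "__main__":
    print(to_tsv(SAMPLE))
```

    The sketch copies values through exactly as the OCR produced them, garble included. A label that was itself garbled (like the mangled SUBJECT in the second sample record above) is not recognized, so its text stays attached to the preceding field and still needs fixing by hand.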

    Now, a few explanations. There are three sheets in the spreadsheet. Sheet one is the cleanest version of the data, with some value-added fields; sheet two has every field, both the enhanced ones and the original ones; and sheet three has only the original fields from the text file. I ended up ignoring some fields, in part because DoD stopped sending them to us in response to subsequent requests, and in part because we’ve been unable to learn from DoD what those fields mean. The ones I didn’t really touch were Agency, Task, Suspense and File Number. There’s also a pair of columns called “Extra One” and “Extra Two”: some of the data got bumped further to the right, and it was hard to tell which column to assign the extras to.

    There’s also some very messy data. For example, there’s a lot of garble like this: WOULD LIKE7a REIOhR_REPID_R POSITION IN DOD, and this: OSD “j2o5-07. The latter is from the OSD control number field, which one could cite in a Freedom of Information request to more easily get a copy of the actual letter to which it refers. Those numbers are supposed to look like this: OSD 00136-07.
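
    A check along these lines could pick out the control numbers that need a second look. This is only a sketch: the pattern (the letters OSD, a space, five digits, a hyphen and two digits) is my reading of the clean examples in this post, not a format DoD has confirmed, and the function names are hypothetical.

```python
import re

# Assumed shape of a clean control number, inferred from examples such as
# "OSD 00136-07": "OSD", a space, five digits, a hyphen, two digits.
CONTROL_RE = re.compile(r"OSD \d{5}-\d{2}")


def is_clean_control_number(value):
    # fullmatch rejects values with stray characters before or after.
    return CONTROL_RE.fullmatch(value.strip()) is not None


def flag_garbled(values):
    # Return (row number, value) pairs for entries that fail the check.
    # Row numbers start at 1 to line up with spreadsheet rows of data.
    return [(i, v) for i, v in enumerate(values, start=1)
            if not is_clean_control_number(v)]


if __name__ == "__main__":
    column = ["OSD 00136-07", "OSD “j2o5-07", "OSD 03257-07"]
    for row, value in flag_garbled(column):
        print(f"row {row}: {value!r} needs checking against the .tif")
```

    A value that passes is only well-formed, not necessarily right: an OCR error that swaps one digit for another would slip through, so this narrows the proofreading rather than replacing it.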

    Now, to get this data into better shape, I need to go back to those .tif files, print them all out (there are 189 of them) and painstakingly go through them, comparing each page to the corresponding lines in the spreadsheet, fixing garble and double-checking numbers, names and dates.

    The really ridiculous thing about all this is that the Office of the Secretary of Defense keeps its records in a form not dissimilar to the one that we’ve managed to put together here. To respond to our FOIA request, someone at DoD (probably a contractor) printed pages from a database, turned the printouts into .tif files, copied those onto a CD-ROM and sent it to us. I’ll have more to say about that aspect of this later.



This work by Sunlight Foundation is licensed under a Creative Commons Attribution 3.0 United States License.