Exploring trends in the CIA's CREST database

Google Trends, except for what the Agency is talking about

Written by
Edited by JPat Brown

When the Central Intelligence Agency (CIA) released its CREST database online, it created a historical treasure trove of 13 million pages, more than any one researcher is likely to ever comb through.

Fortunately, we’re able to have a computer do that for us, showing trends in what the Agency was paying attention to in a given year.

The database officially starts in 1917, decades before the CIA was even formed, with an old recipe for disappearing ink. It includes documents that are at least 25 years old, so it currently caps out at 1994. (Due to processing oddities, some documents are listed with publication dates much newer than that, but we’ve discarded those from our analysis.)
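As a rough illustration of that date cleanup, here is a minimal Python sketch. It assumes each document has been reduced to a (year, title) pair, which is a hypothetical shape rather than the actual layout of the metadata, and it treats the 1917 and 1994 bounds mentioned above as hard cutoffs, which may not be exactly how our analysis drew the line.

```python
# Assumed bounds, taken from the years mentioned in the text.
FIRST_YEAR = 1917
LAST_YEAR = 1994


def keep_plausible(records):
    """Drop (year, title) pairs whose year falls outside the expected range."""
    return [(year, title) for year, title in records
            if FIRST_YEAR <= year <= LAST_YEAR]


# Made-up rows for illustration only; these are not real CREST titles.
sample_rows = [
    (1917, "SAMPLE TITLE A"),
    (1975, "SAMPLE TITLE B"),
    (2016, "SAMPLE TITLE C"),  # implausibly new, so it gets discarded
]

cleaned_rows = keep_plausible(sample_rows)
```

Anything dated after the cutoff is simply dropped rather than corrected, since there is no reliable way to guess the true date from the metadata alone.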

Huge thanks go to MuckRock contributor Emma Best and the awesome team at Data.World, who helped wrangle all the data. You can dig more deeply into the CREST database metadata at Data.World. Be sure to check out the “On This Day in CIA History” bot we built with the data.

First, I looked at keywords that were close to home.

Chart of terms used in the CREST database

No real surprises, and a good chance to mention a major caveat: The parser I built looks for the term even if it’s surrounded by other letters, so if the CIA included “Hydromedusa” or “Crusader” in the title of a report, that would also boost the numbers for “USA.”
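To make that caveat concrete, here is a small Python sketch of the difference between substring matching and whole-word matching. It is not the parser itself: the function name, the (year, title) record shape, and the sample titles are all made up for illustration, and the case-insensitive matching is an assumption.

```python
import re
from collections import Counter


def count_by_year(records, term, whole_word=False):
    """Count, per year, the titles that mention `term` (case-insensitive)."""
    pattern_text = re.escape(term)
    if whole_word:
        # \b only matches at a word boundary, so "Crusader" no longer counts.
        pattern_text = r"\b" + pattern_text + r"\b"
    pattern = re.compile(pattern_text, re.IGNORECASE)

    counts = Counter()
    for year, title in records:
        if pattern.search(title):
            counts[year] += 1
    return counts


# Made-up titles for illustration only.
sample = [
    (1962, "USA AND USSR TRADE FIGURES"),
    (1962, "NOTES ON HYDROMEDUSA"),
    (1963, "OPERATION CRUSADER SUMMARY"),
]

loose = count_by_year(sample, "USA")                    # substring match
strict = count_by_year(sample, "USA", whole_word=True)  # whole-word match
```

With these three sample titles, the substring version counts all of them toward “USA,” while the whole-word version counts only the first. The stricter match trades false positives for the risk of missing hyphenated or run-together forms.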

One other word did rocket off the charts, however: Bush. Former President George H.W. Bush, of course, served as director of the Agency from January 30, 1976 to January 20, 1977, and it’s not surprising that the surname of two recent presidents would garner a number of mentions. For comparison:

It’s also interesting to track what kind of weaponry catches the Agency’s attention. Overall, I was surprised by how rarely key terms showed up: generally fewer than ten times in any given year, with only one exception.

Speaking of danger, it will be no surprise that the Agency paid a lot of attention to the Soviet Union, although, oddly, it almost never used that name, preferring the term USSR. It also paid a lot of attention to Communism generally.

Interest in China was comparatively muted.

I also looked at a variety of other terms, from murder and poison to some of the world’s major religions, but the results were usually zero or a handful of mentions a year, with no real discernible trends. If you’d like to dig deeper, check out the code and all the data used in the examples above on GitHub, or suggest other terms we should check out via email, on Twitter, or on Facebook.