Release Notes: The return of project embeds and much more on DocumentCloud

Towards the end of 2020, we made a number of improvements to our document analysis platform

Written by
Edited by Beryl Lipton

Towards the end of (a very long) 2020, we pushed out a bunch of feature improvements to DocumentCloud, our document hosting and analysis service.

Right now, DocumentCloud uploading is limited to verified working journalists, but we’re working on offering broader access later this year. If you qualify, you can request an account here. Everyone, however, can search and browse the public repository of almost 2 million primary source documents.

Most of the changes were behind the scenes, including improvements to general speed, search, and caching, but we also launched a few major updates we want to highlight.

For previous site improvements, check out all of MuckRock’s release notes, and if you’d like updates emailed to you — along with ways to help contribute to the site’s development yourself — subscribe to our developer newsletter here.

Return of DocumentCloud project embeds

One of DocumentCloud’s marquee features has always been the ability to embed documents within articles and other webpages. In fact, since shortly after MuckRock launched, we have used the DocumentCloud embedded viewer to show responsive documents and letters from agencies right within request pages.

But sometimes you want to show a collection of documents rather than just one. For that, DocumentCloud offered somewhat experimental project embeds. We say experimental because parts of the implementation left DocumentCloud’s search servers vulnerable to being overloaded when these project embeds appeared on high-traffic pages.

While project embeds took a while to get back and working, they’re now live again on the DocumentCloud Beta. To create a project embed, click the pencil icon next to one of your projects, and then click “Share / Embed Project.”

You can also see an example in the wild with the Center for Public Integrity’s collection of weekly reports from the White House Coronavirus Task Force.

We plan to keep expanding and building on this functionality in the coming months, but we’re excited to launch a more stable, scalable version of embeds to help highlight important collections of documents.

DocumentCloud keyboard shortcuts

We’re working to make it easy to get a lot done right from the keyboard in DocumentCloud, and to that end we’ve added a variety of shortcuts, focusing first on the document editing view. Shortcuts you can use:

  • A: Start annotating a document.
  • R: Start redacting a document.
  • S: Add or edit page sections.
  • Ctrl/Cmd+F: Start searching through the page.
  • Esc: Cancel the current action.

Multi-word search for organizations, names, and other special queries

Many organizations and people have multiple words in their name, but our special searches for organizations and individual uploaders didn’t do a great job of supporting that. We’ve now fixed that, so if you type in organization: or user: and then multiple words, it knows to search on both the first word and the following words until you pick your selection. We try to be smart about how this is implemented, so if we detect you’ve stopped typing in a name we don’t keep searching for one, but now it’s a lot easier to pull up documents from a specific newsroom.

Screenshot of searching for documents from the Daily Bruin by typing in the query “organization:Daily Br”
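
To illustrate the idea, here is a rough TypeScript sketch of pulling a multi-word name out of a partially typed query such as “organization:Daily Br”. The `parseSpecialQuery` function and `SpecialQuery` type are hypothetical, it assumes at most one special prefix per query, and it does not model how DocumentCloud decides that you have stopped typing a name.

```typescript
// Hypothetical shape for a parsed special query.
interface SpecialQuery {
  field: "organization" | "user";
  term: string;
}

// Find an "organization:" or "user:" prefix and treat everything after
// it, spaces included, as the partial name to look up.
function parseSpecialQuery(input: string): SpecialQuery | null {
  const match = /(?:^|\s)(organization|user):\s*(.*)$/i.exec(input);
  if (match === null) {
    return null;
  }
  const rawField = (match[1] ?? "").toLowerCase();
  const field = rawField === "user" ? "user" : "organization";
  const term = (match[2] ?? "").trim();
  return { field, term };
}
```

With this sketch, parsing “organization:Daily Br” yields the field `organization` and the term `Daily Br`, which could then be sent to an autocomplete lookup.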

Tip of the Day and Tips and Tricks page

We wanted to make it easier for DocumentCloud users to learn about new features, upcoming trainings, and other pertinent information, so we’ve updated the green feedback bar to be a Tip of the Day bar, which will highlight what DocumentCloud users need to know.

We’ve also added a Tips and Tricks page detailing how to tackle common questions or highlighting some of the less obvious DocumentCloud features.

DocumentCloud embeds hide back button more consistently

Depending on how you embedded a DocumentCloud document, sometimes a back button would appear, which could create a somewhat confusing experience for readers and casual browsers. We now detect when a document is embedded via an iframe and hide the back button.
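
A common way to make this kind of check in the browser is to compare a window with its topmost ancestor. The TypeScript sketch below shows that general technique, not necessarily the exact check DocumentCloud uses; `isEmbeddedInIframe` and `WindowLike` are hypothetical names chosen for illustration.

```typescript
// The two properties of a browser window that this check relies on.
interface WindowLike {
  self: unknown;
  top: unknown;
}

// A page is running inside a frame when its own window is not the
// topmost window in the browsing context.
function isEmbeddedInIframe(win: WindowLike): boolean {
  return win.self !== win.top;
}
```

In a browser you would pass the global `window` object and show the back button only when the function returns false.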

Seeking feedback: Entity extractions

We’re currently wrapping up a very early version of entity extraction, which automatically detects names, places, dates, and more, and we hope to have a public preview in the coming weeks. Your ideas and thoughts on how entity extraction would be useful to you are really helpful to us.

If you’ve used entity extraction for reporting or analysis in the past (whether with DocumentCloud or another service), we’d love examples. Please get in touch at info@documentcloud.org or tweet us.

Reporting bugs and feature requests

We’re continuing to tweak and improve all our digital tools, including MuckRock and DocumentCloud. If you spot a bug or have a feature request, you can help by emailing us at info@muckrock.com.

It’s particularly helpful if you can provide details about when the issue crops up or what you think is causing the problem. If it’s a feature request, let us know some specifics about the use case you have in mind or the problem you’re trying to solve.


Image via Wikimedia Commons