FOIA Request for all digital images and text metadata created through NARA's public-private digitization partnership program

Reclaim The Records filed this request with the National Archives and Records Administration of the United States of America.
Tracking #: NGC21-028, NARA-NGC-2021-000052
Due: Dec. 9, 2020
Est. Completion: Oct. 16, 2024
Status: Awaiting Response

Communications

From: Reclaim The Records

To Whom It May Concern:

This is a request under the Freedom of Information Act.

I represent a 501(c)(3) non-profit organization called Reclaim The Records. We are an activist group of genealogists, historians, journalists, teachers, and open government advocates. We acquire genealogical and historical databases and images from government sources, including government archives, often through the use of Freedom of Information laws. We then upload those records to the Internet, without any copyright or usage restrictions or paywalls, making them freely available to the public and returning these taxpayer-funded materials to the public domain.

PART I: BACKGROUND FOR THIS REQUEST

The United States National Archives and Records Administration (NARA) has for several years managed an innovative public-private partnership program to digitize many of the important historical documents they hold, particularly records that would be useful for family history research. These include multiple enumerations of the United States Federal Census (through 1940), immigration and naturalization records, military and veteran records, tax assessment lists, and more.

More than four hundred of these important historical record sets have been digitized so far under this long-running partnership program, with each of those record sets containing hundreds of thousands, or more often millions, of individual documents. A likely-incomplete listing of these record sets is available on the NARA web page "Microfilm Publications and Original Records Digitized by Our Digitization Partners" located at https://www.archives.gov/digitization/digitized-by-partners . The total number of unique historical documents digitized and transcribed through this program is probably in the billions.

In exchange for having private corporations and non-profit organizations agree to become "partners" and digitize these historical records from their original paper or microfilm formats -- a massive task that would be largely cost-prohibitive for NARA to conduct on its own -- NARA agreed to let these partners have the exclusive use of those newly-digitized materials on their own websites for a certain amount of time, an "embargo period".

This grant of a supposedly exclusive entitlement to public records was meant to induce these partners to spend their time and money to conduct the records digitization and transcription at their own expense, instead of at the taxpayer's expense. But while well-intentioned, it also meant that these original historical records were often completely removed from public access while the companies worked on them, making the records functionally unavailable to researchers, sometimes for years.

And even once the digitization and transcription work was finally completed, the exclusivity period for each newly-created digital record set was also supposed to be time-limited. Once the stated embargo period ended for each record set, usually within five years but sometimes within three, NARA would be able to freely disseminate the now-digitized versions of these public documents, both the images and the text metadata that accompanied them. NARA's own policies state that the agency could and would publish the digital copies through NARA's own website, in its official online Catalog, through its official API access, or through other means. See item number two from "NARA Principles for Partnerships to Digitize Archival Materials" at https://www.archives.gov/digitization/principles.html :

"2. After an agreed-upon period of time, otherwise known as an embargo period, NARA gains unrestricted rights to the digital copies and the associated metadata transmitted to NARA by the partner, including the right to give or sell digital copies in whole or part to other entities, if NARA so chooses. If resources permit, we will try to make the digital materials available in our online catalog within the same year they are no longer in the embargo period."

But in practice, this simply hasn't happened. NARA has never actually posted online the vast majority of these records that were digitized through their partnership program, not to their Catalog nor indeed anywhere else where the public might be able to freely access and download the now-digital records. This remains the case today, even when the embargo periods for many of these record sets have been expired for more than a decade, sometimes two decades. A small number of the records are now finally online in the NARA Catalog, but even there, the data sets are still not available to the general public as bulk image or bulk data downloads and are cumbersome to search or use individually.

Instead, literally billions of these historical American records remain solely in the hands of NARA's primary digitization program partner, Ancestry.com. Ancestry is a private corporation, previously co-owned by a private equity firm and the government of Singapore's sovereign wealth fund, until they were sold to a different private equity firm for $4.7 billion in August 2020. Ancestry has purchased several smaller companies in the genealogy and family history space over the past few years, including the companies Fold3.com and Archives.com, both of which had previously independently been included in NARA's digitization partnership program. Thus, the vast majority of the billions of records digitized through NARA's partnership program are now available only behind Ancestry's subscription paywall, or through companies now owned by Ancestry with their own additional subscription paywalls. Annual subscriptions to these websites can cost hundreds of dollars per year per person.

NARA surely did not mean to create a de facto monopoly on nearly all digital copies of important American historical documents like the Census and immigration records and military files, all for the benefit of a single private corporation. But by not making the no-longer-embargoed documents available to the public anywhere else, not even on NARA's own website, and leaving them solely in the hands of their mostly-commercial partner organizations, that is exactly what has happened.

NARA's own "Principles for Partnerships to Digitize Archival Materials", as referenced above, clearly states in item number seven that:

"Public access to publicly owned resources will remain free. Partners may develop and charge for value-added features, but access to the digital copies ultimately should be readily accessible and free...NARA will have unrestricted ownership of these copies, including the right to make these copies freely available online for download."

However, in practice, NARA has also repeatedly denied independent requests for copies of even subsets of this voluminous partnership-created digital data. We are aware of at least three different entities, two genealogy-related corporations and one non-profit organization, none of which were NARA digitization partners, that each independently requested copies of this data through e-mails, phone calls, meetings, and other discussions with NARA leadership. In all three cases, NARA denied the requests, saying that NARA would put the records online themselves, through their Catalog or API...eventually.

Thus, the end result of NARA's digitization partnership program has been that billions of important American historical documents were successfully digitized and transcribed -- but then were mostly not made available to the public for decades in any way other than by requiring the public to buy expensive annual data subscriptions benefiting private corporations, primarily a single multi-billion-dollar conglomerate, whose previous owners included a foreign government.

We at Reclaim The Records would now like to make an official request for open public access to these important American historical records.

PART II: OUR REQUEST

Under the Freedom of Information Act, we at Reclaim The Records request copies of the following:

1) We request every single record created under NARA's public-private digitization partnership with the entities Ancestry.com, Fold3.com (formerly known as Footnote, now owned by Ancestry), Archives.com (now owned by Ancestry), and FamilySearch (a non-profit organization). We do not request any records that were created through NARA's partnership with other smaller entities, such as the Daughters of the American Revolution (the DAR). Specifically:

1a) We request all of the digital images, in their original, full-size, uncompressed, and non-watermarked versions.

1b) We request all of the associated text metadata (names, dates, places, etc.) also created under the partnership agreement, which goes along with those images, making them searchable. For example, a spreadsheet or database may have been created for each data set that lists the name of each person referenced in each image, along with the date, the location, or other extracted information such as place of birth, marital status, volume number, census enumeration district, microfilm reel number, or any other text information relevant to that particular data set and/or each individual image.

1c) We request all copies of finding aids, training materials, handbooks, checklists, formatting guidelines, data dictionaries, data templates, data lists, or other internal documentation that explains more about the digitization of these images and the transcription and compilation of their associated text metadata, and how they relate to each individual data set.

2) We also request any records that were digitized under NARA's partnership program that may not have been properly delivered or returned to NARA after their digitization was completed. We have heard stories about records that remain solely in the possession of certain partner corporations, for which NARA never collected the files upon completion of the image scanning and the text metadata entry. We therefore request copies of all the partnership-created digital images, associated text metadata, and finding aids (or data dictionaries, documentation, templates, etc.) for those previously-undelivered files, as well. To be clear, we contend that NARA is required to collect these records from these companies and produce them to us in response to our request and we are requesting that NARA do so.

PART III: FORMAT OF PRODUCTION

We request that all of these files, the images and text metadata and finding aids and data dictionaries and so on, be turned over to us in their original digital formats, as they were delivered to NARA by the partners, or turned over for the first time if the partner never delivered the final files to NARA as they should have.

We would like to receive our copies of this information on portable USB drives. We are willing to pay the costs for purchasing those drives and for their insured and trackable domestic shipping. However, we believe some of this data may already be stored online in the Amazon Web Services (AWS) S3 Glacier system, which we believe NARA uses for its internal file storage. If this is the case, then for any data sets that are already completely online in AWS S3, we would consider receiving just the online versions of those specific data sets, by having that data copied directly from NARA's AWS S3 bucket(s) into Reclaim The Records' AWS S3 bucket(s), and those data sets would then not need to be downloaded to a USB drive.

Please inform us of all fees in advance of fulfilling our order.

PART IV: REQUEST FOR FEE WAIVER

We also request to be treated as a "media requester" for the purposes of calculating the fees for this FOIA request. We are a non-profit organization, not a commercial entity. We do not charge for copies of any of the tens of millions of records we have already acquired from government agencies and released to the public. We are one of the largest open records organizations in the United States. As of October 1, 2020, our e-mail newsletter, which has been published several times a year for the past six years, now has a circulation of over 7,500 subscribers. Our social media outlets such as our Facebook page have more than 11,000 followers, and our Twitter account has more than 6,100 followers.

We have even created several free standalone websites to both disseminate and discuss the data that we receive from government entities. As just one example, please see our website https://www.MissouriDeathIndex.com/ and our multiple associated newsletter issues linked from that website. We don't just release data sets, we discuss them too, using our editorial skills and discretion, and then disseminate those discussions to our readers.

Therefore, under 45 CFR 1602.2, we believe that we properly meet the legal qualifications as a "media requester" entity, and so we would need to pay only any duplication fees after the first 100 pages of material, and we should not need to pay any search fees or review fees.

Thank you for your consideration, and we look forward to your timely response within twenty business days, as the statute requires.

Sincerely yours,

Brooke Schreier Ganz, on behalf of Reclaim The Records
info@reclaimtherecords.org
https://www.reclaimtherecords.org/

From: National Archives and Records Administration

Dear Brooke Ganz <requests@muckrock.com>,
Your password has been updated per your request. You can change your password in the future from your
profile page.
If you believe you received this email in error or need additional assistance, please contact the FOIAonline Help Desk. (mailto:foia.help@epa.gov)
Thank you!
FOIAonline Team (mailto:foia.help@epa.gov)

From: National Archives and Records Administration

This message is to confirm your request submission to the FOIAonline application: View Request. Request information is as follows: (https://www.foiaonline.gov/foiaonline/action/public/submissionDetails?trackingNumber=NARA-NGC-2021-000052&type=request)

* Tracking Number: NARA-NGC-2021-000052
* Requester Name: Brooke Ganz
* Date Submitted: 10/17/2020
* Request Status: Submitted
* Description:
Please See attachment for description.

From: National Archives and Records Administration

Dear Ms. Ganz:

This is in response to your email of August 16, 2021 asking for an
estimated completion date for FOIA request NGC21-028 [FOIAonline
NARA-NGC-2021-000052]. Currently, NGC21-028 is #215 in our complex FOIA
queue. Our estimated time to completion is 18 months from today. This is
a conservative estimate, and cases may move more quickly through the
queue. If you do not receive a response within this estimated completion
time, please contact us again for a status update.

In future correspondence, please cite both tracking numbers NGC21-028 and
FOIAonline NARA-NGC-2021-000052.

Sincerely,

Susan Gillett
Government Information Specialist
Office of General Counsel
National Archives and Records Administration
College Park, MD 20740-6001
susan.gillett@nara.gov

From: National Archives and Records Administration

Dear Mr. Ganz:

Our office gave you an estimated completion time in an email on August 16,
2021. That time period has not passed yet. Please contact us once that
time period has passed if you have not received a response.

Thank you.
Sincerely,

Susan Gillett
Government Information Specialist
Office of General Counsel
National Archives and Records Administration
College Park, MD 20740-6001
susan.gillett@nara.gov

From: National Archives and Records Administration

Your FOIAonline user account for requests@muckrock.com will be deactivated in 5 days due to inactivity.

From: National Archives and Records Administration

Sent via Email <requests@muckrock.com>

MuckRock News

DEPT MR 129755

263 Huntington Ave

Boston, MA 02115

RE: Freedom of Information Act Request NGC21-028

Dear MuckRock News,

This is in response to your email dated September 16, 2022 asking for a
status update for FOIA request NGC21-028 [FOIAonline NARA-NGC-2021-000052].
Currently, NGC21-028 is #189 in our complex FOIA queue. Our estimated time
to completion is 15 months from today. This is a conservative estimate, and
cases may move more quickly through the queue. If you do not receive a
response within this estimated completion time, please contact us again for
a status update.

In future correspondence, please cite both tracking numbers NGC21-028 and
FOIAonline NARA-NGC-2021-000052.

Sincerely,

Ashley A. Bryan

Government Information Specialist

Office of General Counsel (NGC)

National Archives and Records Administration

8601 Adelphi Road

College Park, Maryland, 20740-6001

301-837-3642

From: National Archives and Records Administration

Sent via Email <requests@muckrock.com>

October 17, 2022

MuckRock News

DEPT MR 103767

263 Huntington Ave

Boston, MA 02115

RE: Freedom of Information Act Request NGC21-028

To Whom It May Concern:

This is in response to your email dated October 17, 2022 asking for a
status update for FOIAonline request NARA-NGC-2021-000052, internal
tracking number NGC21-028. A status update to this request was provided on
September 16, 2022, via email. In that response it was stated NGC21-028
was #189 in our complex FOIA queue. The estimated time to completion is 15
months from September 16, 2022. This is a conservative estimate, and cases
may move more quickly through the queue. If you do not receive a response
within this estimated completion time, please contact us again for a status
update.

In future correspondence, please cite both tracking numbers;
NARA-NGC-2021-000052 and NGC21-028.

Sincerely,

Ashley A. Bryan

Government Information Specialist

Office of General Counsel (NGC)

National Archives and Records Administration

8601 Adelphi Road

College Park, Maryland, 20740-6001

301-837-3642

From: National Archives and Records Administration

Sent via Email <requests@muckrock.com>

November 16, 2022

MuckRock News

DEPT MR 103767

263 Huntington Ave

Boston, MA 02115

RE: Freedom of Information Act Request NGC21-028

To Whom It May Concern:

This is in response to your email dated November 16, 2022 asking for a
status update for FOIAonline request NARA-NGC-2021-000052, internal
tracking number NGC21-028. A status update to this request was provided on
September 16, 2022, via email. In that response it was stated NGC21-028 was
#189 in our complex FOIA queue. The estimated time to completion is 15
months from September 16, 2022. This is a conservative estimate, and cases
may move more quickly through the queue. If you do not receive a response
within this estimated completion time, please contact us again for a status
update.

In future correspondence, please cite both tracking numbers;
NARA-NGC-2021-000052 and NGC21-028.

Sincerely,

Ashley A. Bryan

Government Information Specialist

Office of General Counsel (NGC)

National Archives and Records Administration

8601 Adelphi Road

College Park, Maryland, 20740-6001

301-837-3642

From: National Archives and Records Administration

Sent via Email <requests@muckrock.com>

January 17, 2023

Brooke Schreier Ganz

MuckRock News

DEPT MR 103767

263 Huntington Ave

Boston, MA 02115

RE: Freedom of Information Act Request NGC21-028

Dear Ms. Ganz:

This is in response to your email dated January 16, 2023 asking for a
status update for FOIA request NGC21-028. Currently, NGC21-028 is #189 in
our complex FOIA queue. Our estimated time to completion is 6 months from
today. This is a conservative estimate, and cases may move more quickly
through the queue. If you do not receive a response within this estimated
completion time, please contact us again for a status update.

In future correspondence, please cite tracking number NGC21-028.

Sincerely,

NGC FOIA Team

Office of General Counsel

National Archives and Records Administration

foia@nara.gov

301-837-3642

From: National Archives and Records Administration

February 16, 2023

Brooke Schreier Ganz

MuckRock News

DEPT MR 103767

263 Huntington Ave

Boston, MA 02115

RE: Freedom of Information Act Request NGC21-028

Dear Ms. Ganz:

This is in response to your email dated January 16, 2023 asking for a
status update for FOIA request NGC21-028. Currently, NGC21-028 is #183 in
our complex FOIA queue. Our estimated time to completion is 6 months from
today. This is a conservative estimate, and cases may move more quickly
through the queue. If you do not receive a response within this estimated
completion time, please contact us again for a status update.

In future correspondence, please cite tracking number NGC21-028.

Sincerely,

NGC FOIA Team

Office of General Counsel

National Archives and Records Administration

foia@nara.gov

301-837-3642

From: National Archives and Records Administration

Sent via Email <requests@muckrock.com>

July 19, 2023

Brooke Schreier Ganz

MuckRock News
DEPT MR 103767
263 Huntington Ave
Boston, MA 02115

RE: Freedom of Information Act Request NGC21-028

Dear Ms. Ganz:

This is in response to your email dated July 19, 2023 asking for a status
update for FOIA request NGC21-028. Currently, NGC21-028 is in our complex
FOIA queue. Our estimated time to completion is 8 months from today. This
is a conservative estimate, and cases may move more quickly through the
queue. If you do not receive a response within this estimated completion
time, please contact us again for a status update.

In future correspondence, please cite tracking number NGC21-028.

Sincerely,

NGC FOIA Team

Office of General Counsel

National Archives and Records Administration

foia@nara.gov

301-837-3642

From: National Archives and Records Administration

Sent via Email <requests@muckrock.com>

February 16, 2024

Brooke Schreier Ganz

MuckRock News
DEPT MR 103767
263 Huntington Ave
Boston, MA 02115

RE: Freedom of Information Act Request NGC21-028

Dear Brooke Schreier Ganz:

This is in response to your email dated February 16, 2024 asking for a
status update for FOIA request NGC21-028. Currently, NGC21-028 is #98 in
our complex FOIA queue. Our estimated time to completion is 8 months from
today. This is a conservative estimate, and cases may move more quickly
through the queue. If you do not receive a response within this estimated
completion time, please contact us again for a status update.

In future correspondence, please cite tracking number NGC21-028.

Sincerely,

NGC FOIA Team

Office of General Counsel

National Archives and Records Administration

foia@nara.gov

301-837-3642

Files

There are no files associated with this request.