How you can use eviction data and public records to report on housing during the pandemic

Two years ago, the COVID-19 pandemic devastated the economic livelihoods of many Americans. Stopgap local, state and federal measures made it hard for landlords to push people out of their homes, but experts warned that tens of millions were still at risk of facing eviction.

At Searchlight New Mexico, we wanted to know the extent to which the government’s aid programs made a difference and kept people housed.

We looked at the entire housing system, which dangled the prospect of aid but still allowed landlords to file evictions, and judges to force people out of their homes.

This isn’t new. Tenants often didn’t get a fair shake in housing courts, even before the pandemic. But COVID-19 shone a brighter light on the problem.

This guide shows how we reported two of our stories: one on rental assistance and the other on the CARES Act eviction moratorium.

Both stories rely on tenant-level eviction data. New Mexico courts post the dockets for eviction cases online, but there’s no way to easily download the bulk data. So we wrote a computer program to open every docket and download the details into a spreadsheet.

We scraped fields including the names of the tenants and landlords, the addresses and the dates cases were filed. You can find our code here as an example, or reach out to the team at MuckRock and Stanford’s Big Local News, which has built a Python library for scraping local electronic court records websites.
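Every court website is laid out differently, so no scraper transfers directly from one to another. As a rough sketch of the parsing step only, the example below pulls label and value pairs out of a made-up docket page using Python’s standard library and writes them as one spreadsheet row. The HTML layout, field labels and sample values are all invented for illustration. This is not Searchlight’s scraper or the Big Local News library, and a real scraper would also need to fetch each page and follow the court website’s terms of use.

```python
import csv
import io
from html.parser import HTMLParser


class DocketParser(HTMLParser):
    """Collect label/value pairs from <th>label</th><td>value</td> table rows."""

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._tag = None    # the <th> or <td> cell we are currently inside, if any
        self._label = None  # the most recent <th> text still waiting for its value

    def handle_starttag(self, tag, attrs):
        if tag in ("th", "td"):
            self._tag = tag

    def handle_endtag(self, tag):
        if tag in ("th", "td"):
            self._tag = None

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        if self._tag == "th":
            self._label = text
        elif self._tag == "td" and self._label:
            self.fields[self._label] = text
            self._label = None


# Hypothetical docket page. Real court sites will use different markup.
SAMPLE = """
<table>
  <tr><th>Case Number</th><td>CASE-0001</td></tr>
  <tr><th>Filed</th><td>2020-05-04</td></tr>
  <tr><th>Plaintiff</th><td>Example Apartments LLC</td></tr>
  <tr><th>Defendant</th><td>Maria Garcia</td></tr>
</table>
"""


def parse_docket(html):
    """Return the docket's fields as a dictionary."""
    parser = DocketParser()
    parser.feed(html)
    return parser.fields


row = parse_docket(SAMPLE)

# Write the parsed docket as a CSV row (here to memory; a file works the same way).
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(row))
writer.writeheader()
writer.writerow(row)
print(buffer.getvalue())
```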

If you need help finding where your local courts store eviction records, a housing attorney or the court system might help.

When you’ve got that data, you can acquire other datasets and compare them. In general, our framework followed three basic steps:

Acquire other datasets with variables that overlap with the eviction data, for example location-based datasets (code violations by address, average income by ZIP code, etc.) or person-based datasets (other types of court records, rental assistance awards, etc.).
Match up the eviction data and the other data based on common variables.
Manually verify the findings.

Here are two projects where we did just that.

Evictions and the Emergency Rental Assistance Program

The federal government’s $46.5 billion Emergency Rental Assistance Program was a massive pandemic aid effort aimed at helping people pay their rent and stay housed. But we wanted to know whether the program actually worked in New Mexico or if it just benefitted landlords who evicted tenants anyway.

To answer this question, we needed to obtain the ERAP data, match it up with the eviction records and verify its accuracy.

First, we identified the public agencies in New Mexico that administered the ERAP programs. We sent public records requests for the data, which were eventually fulfilled. We put the ERAP data and the eviction data into the same Microsoft Excel file. We cross-referenced the names of ERAP awardees and eviction case defendants to identify people who received rental assistance and whose landlord also tried to evict them. Finally, we reviewed court records to verify those matches were correct.

Step 1: Get the ERAP data

In New Mexico, ERAP was administered by one state and two local agencies. We sent public records requests to each agency for all of the data they had on individual ERAP applications and awards. In the end, we decided to only use data from the state agency because it was provided in an Excel spreadsheet that made it easy to analyze.

It took some wrangling to get the data from the state agency – at first, they told us they would not provide tenant-level information. So, we requested a data dictionary and then negotiated with their staff to identify what data they could share. They ultimately provided us with tenant-level data. To make sure we understood what the data showed, we talked extensively with the agency’s public information officer.

Once we got the data, we needed to formulate a quantitative question that it could answer. The data showed who applied for and received ERAP, and when they applied and received it. We wanted to know if the ERAP money kept them housed, but there aren’t any public records showing where people live at different times. This is where the eviction data is useful.

With tenant-level eviction data, we re-framed the question: How many tenants were in the process of applying for rental assistance when they received an eviction notice from their landlord?

Step 2: Match the ERAP data with the eviction data

First, we determined how to match up the ERAP data and the eviction data. There wasn’t a universal variable linking the two datasets. We considered using addresses but the addresses were often written in different formats (St vs. Street, Unit 1 vs. Unit #1, etc.). So we decided to use tenant names. This could lead to false positives (tenants with the same name who were actually different people) but we knew we could catch these when manually verifying the matches.
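Names can be formatted inconsistently too, so it may help to reduce each name to a simple comparison key before matching. The sketch below is one possible approach, not the method used for the story. It assumes that court records sometimes list names as “LAST, FIRST”; the function name and the sample names are hypothetical.

```python
import re
import unicodedata


def name_key(raw_name: str) -> str:
    """Reduce a name to a lowercase 'first last' key with no punctuation or accents."""
    # Assumption: a comma means the name is written "LAST, FIRST".
    if "," in raw_name:
        last, first = raw_name.split(",", 1)
        raw_name = f"{first} {last}"
    # Split accented letters into base letter + accent mark, then drop the marks.
    text = unicodedata.normalize("NFKD", raw_name)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    # Replace anything that is not a letter or whitespace with a space.
    text = re.sub(r"[^a-z\s]", " ", text.lower())
    # Collapse runs of whitespace.
    return " ".join(text.split())


print(name_key("GARCIA, Maria "))
print(name_key("María  García"))
```

Both sample names reduce to the same key, so they would be treated as a candidate match. A key like this makes false positives more likely, not less, which is why the manual verification step still matters.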

Then, we merged the datasets into one spreadsheet. The datasets were small enough that we could analyze them completely in Excel. However, to reduce the file size and make them easier to manage, we deleted extraneous information (data on utility assistance payments, information about interim steps in the application process, etc.).

Finally, we matched up the data. We used Excel’s XLOOKUP() function to match the names from rental assistance awards with the names in eviction cases. If there was a match, the formula copied over the eviction court case number, as well as the date the case was filed. We used the DATEDIF() function to calculate the time between the application date, the eviction case filing date and the award payment date.

We also filtered out ERAP applications that were not approved. Isolating and calling out landlords for trying to evict a tenant who was behind on rent and ineligible for assistance didn’t seem fair.
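If your datasets outgrow a spreadsheet, the same lookup, date arithmetic and filtering can be scripted. The following is a minimal sketch in Python, not the workflow used for the story. All field names, statuses and sample rows are invented for illustration, and the names are assumed to have been cleaned into a common format already.

```python
from datetime import date

# Hypothetical sample rows standing in for the two datasets.
erap_awards = [
    {"name": "maria garcia", "status": "approved",
     "applied": date(2021, 5, 3), "paid": date(2021, 7, 12)},
    {"name": "john smith", "status": "denied",
     "applied": date(2021, 6, 1), "paid": None},
    {"name": "ana lopez", "status": "approved",
     "applied": date(2021, 4, 20), "paid": date(2021, 6, 2)},
]
evictions = [
    {"name": "maria garcia", "case_number": "CASE-0001", "filed": date(2021, 6, 15)},
    {"name": "john smith", "case_number": "CASE-0002", "filed": date(2021, 6, 20)},
]

# Index eviction cases by tenant name. Each name maps to a list,
# because one tenant can have more than one case.
cases_by_name = {}
for case in evictions:
    cases_by_name.setdefault(case["name"], []).append(case)

matches = []
for award in erap_awards:
    # Skip applications that were not approved.
    if award["status"] != "approved":
        continue
    for case in cases_by_name.get(award["name"], []):
        paid = award["paid"]
        matches.append({
            "name": award["name"],
            "case_number": case["case_number"],
            # Day counts between the three key dates.
            "days_application_to_filing": (case["filed"] - award["applied"]).days,
            "days_filing_to_payment": (paid - case["filed"]).days if paid else None,
        })

for match in matches:
    print(match)
```

Keeping a list of cases per name is a deliberate choice: by default, XLOOKUP() returns only the first match it finds, so a tenant with several eviction cases would otherwise surface only once.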

Step 3: Verify

Next, we needed to verify that the person who applied for rental assistance was actually the same person who had an eviction filed against them. We manually looked up each person’s name in public court records, a search that returned every eviction case filed against them, and verified that the address on the eviction filing was the same as the address on the rental assistance application. If we identified a case that Excel’s XLOOKUP() function missed, we manually added it to our database.

We also needed to verify that the landlord tried to evict the tenant for failing to pay rent, and not for something else (like damaging the apartment or breaking a condition of the lease). We opened up the case documents for every eviction filing to check the reason for eviction.

Additionally, we recorded the outcome of each case to note if the tenant was forced out, the case was dismissed, or something else happened.

Evictions and the CARES Act

Federal and state governments took an unprecedented step during the pandemic to ban evictions against some tenants who couldn’t pay rent. We wanted to know if landlords followed these rules.

Answering that question meant counting how many evictions violated the rules. We decided to do this for the moratorium that was part of the CARES Act, in part because the question was relatively easy to answer using available documents. (By contrast, the CDC’s order required tenants to prove during a hearing that they couldn’t pay because of the pandemic but, in most cases in New Mexico, eviction hearings are not recorded, so there’s no way to assess whether the court followed the CDC’s protocols.)

Between late March and July 2020, the law, which covered about a third of the country’s rental properties, banned landlords from filing to evict tenants when two conditions were met: The eviction was filed because the tenant didn’t pay rent, and the property was at least partially federally backed.
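Those conditions can be written as a simple screening test to flag cases worth a closer look. This is a sketch under stated assumptions: the start and end dates below are the commonly cited window for the CARES Act moratorium (March 27 through July 24, 2020) and should be confirmed against the law itself, and the function name and the “nonpayment” label are hypothetical. A flagged case is a lead to verify, not a finding.

```python
from datetime import date

# Assumption: commonly cited moratorium window; confirm before relying on it.
MORATORIUM_START = date(2020, 3, 27)
MORATORIUM_END = date(2020, 7, 24)


def possibly_covered(filed: date, reason: str, federally_backed: bool) -> bool:
    """Flag a filing that appears to meet all conditions of the moratorium."""
    in_window = MORATORIUM_START <= filed <= MORATORIUM_END
    # "nonpayment" is a hypothetical label for cases filed over unpaid rent.
    return in_window and reason == "nonpayment" and federally_backed


print(possibly_covered(date(2020, 5, 4), "nonpayment", True))
print(possibly_covered(date(2020, 8, 15), "nonpayment", True))
```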

Further reading: How Policymakers (and Courts) Sabotaged Eviction Moratoria

Step 1: Get data on federally-backed properties in your area

The term “federally-backed” means that the property was either directly subsidized by the federal government or its mortgage was securitized by agencies like Fannie Mae and Freddie Mac. While the CARES Act moratorium was in effect, the National Low Income Housing Coalition (NLIHC) published and maintained a database listing apartments in federally-backed buildings. (The data doesn’t include single-family units.) NLIHC staff can provide you with spreadsheets of the data that show which buildings were federally backed during specific time periods.

Further reading: Can You Be Evicted During Coronavirus? Here’s How to Find Out.

Step 2: Match the eviction addresses with addresses of federally-backed properties

There are a few ways to match eviction addresses to addresses from the NLIHC database, keeping in mind the data is not perfect – some addresses might be misspelled or the same building might go by different names. We found OpenRefine helpful for clustering properties by address and name.
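Whichever matching method you choose, it can help to standardize the addresses first so that trivial differences (St vs. Street, Unit 1 vs. Unit #1) don’t block a match. Here is a minimal sketch of that idea in Python. The abbreviation table is a small, illustrative sample rather than a complete list, and the function name is hypothetical.

```python
import re

# Illustrative sample of spelled-out words and the abbreviations to use instead.
ABBREVIATIONS = {
    "street": "st", "avenue": "ave", "boulevard": "blvd", "road": "rd",
    "drive": "dr", "apartment": "apt",
    "northeast": "ne", "northwest": "nw", "southeast": "se", "southwest": "sw",
}


def normalize_address(raw: str) -> str:
    """Lowercase an address, drop punctuation and abbreviate common words."""
    text = raw.lower()
    # Replace punctuation such as "#", "." and "," with spaces.
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    words = [ABBREVIATIONS.get(word, word) for word in text.split()]
    return " ".join(words)


print(normalize_address("123 Main Street, Unit #1"))
print(normalize_address("123 MAIN ST. UNIT 1"))
```

Both sample addresses come out in the same form, so an exact comparison would now treat them as equal.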

Match the data using programming languages:

Before you go any further, try joining the data using a tool like SQLite or Microsoft Access. These tools allow you to tell your computer to look at data in one spreadsheet and match it to data in another. If you have very clean eviction data, with only one address for each property, you may get a lot of matches.
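As an illustration of what such a join looks like, the sketch below uses Python’s built-in sqlite3 module with a temporary in-memory database. The table names, column names and rows are all invented; in practice you would import your own spreadsheets into the two tables.

```python
import sqlite3

# A temporary database that lives only in memory.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE evictions (case_number TEXT, address TEXT, filed TEXT);
    CREATE TABLE backed_properties (property_name TEXT, address TEXT);
""")

# Hypothetical sample rows.
conn.executemany(
    "INSERT INTO evictions VALUES (?, ?, ?)",
    [
        ("CASE-0001", "100 Example Ave NE", "2020-05-04"),
        ("CASE-0002", "42 Sample St SW", "2020-06-10"),
    ],
)
conn.executemany(
    "INSERT INTO backed_properties VALUES (?, ?)",
    [
        ("Example Apartments", "100 example ave ne "),
        ("Other Place", "7 Nowhere Rd"),
    ],
)

# Join the tables where the addresses agree, ignoring case and stray spaces.
rows = conn.execute("""
    SELECT e.case_number, p.property_name, e.filed
    FROM evictions AS e
    JOIN backed_properties AS p
      ON lower(trim(e.address)) = lower(trim(p.address))
    ORDER BY e.case_number
""").fetchall()

for row in rows:
    print(row)
```

An exact join like this only finds addresses that are identical after light cleanup, so any eviction address it misses still needs to be checked another way.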

Another way to do this is to “fuzzy match” or match two pieces of information that are very similar to each other but may differ in capitalization, spelling errors or order. Max Harlow of the Financial Times created a great tool to do that called CSVMatch and tutorials to teach you how to use it.
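To show the general idea behind fuzzy matching, here is a short sketch using difflib from Python’s standard library. This is not CSVMatch and does not reproduce how that tool works. The 0.85 threshold, the function names and the sample addresses are assumptions chosen for illustration, and a sensible threshold will depend on your data.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a score from 0 to 1 for how alike two strings are, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def best_match(target: str, candidates: list, threshold: float = 0.85):
    """Return (closest candidate, score), or (None, score) if below the threshold."""
    if not candidates:
        return None, 0.0
    score, candidate = max((similarity(target, c), c) for c in candidates)
    if score >= threshold:
        return candidate, score
    return None, score


# Hypothetical addresses of federally-backed properties.
backed_addresses = ["100 Example Ave NE", "42 Sample St SW"]

# A misspelled eviction address still finds its counterpart.
print(best_match("100 Exmaple Ave NE", backed_addresses))
# An unrelated address falls below the threshold.
print(best_match("999 Unrelated Blvd", backed_addresses))
```

Treat every fuzzy match as a candidate to verify by hand, since two different buildings can have very similar addresses.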

The last way to do this is less glamorous and more time-consuming: Match the data manually. This ever-effective technology is not to be underestimated. In the end, we decided to split the work between two people and go through each federally-backed address in Albuquerque manually to find the corresponding properties and addresses in the eviction data. We even double-checked addresses using Google Maps and, in a few cases, drove by the property to see how far its addresses extended past the rental office address.

At the end of this step, we had a list of federally-backed properties whose landlords appeared to have illegally evicted tenants while the CARES Act moratorium was in effect.

Step 3: Verify

Next, we needed to verify that the property was federally backed at the time of the eviction filing. This information is in a property’s actual mortgage documents, which you can obtain for each property to double-check the data you received from the NLIHC. We found building-specific mortgage information at the county clerk’s office by looking at the property’s history and checking when the mortgage began (the “assignment”) and ended (the “release”). We went to the county clerk’s office in person to print the documents and then read through them with a housing attorney who better understood the legal language.

With the actual mortgage documents in hand and the dates double-checked, your data work should be “bulletproof,” or able to hold up to fact-checking standards. On the other hand, if a property had a federally-backed mortgage that ended before or during the CARES Act moratorium, you’ll have to adjust to determine which evictions at the property fall within the correct time frame.
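Once you have the assignment and release dates, checking each filing against them is a simple date comparison. The sketch below is one way to express it. The function name and sample dates are hypothetical, and treating the release date itself as still covered is an assumption you should confirm with a housing attorney.

```python
from datetime import date
from typing import Optional


def backed_on(filed: date, assignment: date, release: Optional[date]) -> bool:
    """Return True if the mortgage was in place on the day the eviction was filed."""
    if filed < assignment:
        # The filing predates the mortgage.
        return False
    # No release date means the mortgage is assumed to be still active.
    # Assumption: a filing on the release date itself counts as covered.
    return release is None or filed <= release


# Mortgage still active when the eviction was filed.
print(backed_on(date(2020, 5, 4), date(2015, 1, 10), None))
# Mortgage released a month before the filing.
print(backed_on(date(2020, 5, 4), date(2015, 1, 10), date(2020, 4, 1)))
```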

Then, we needed to verify that the tenant was evicted for failing to pay rent, and not for another reason. The CARES Act moratorium only applied to situations where the landlord tried to evict the tenant for failing to pay rent. In order to verify this, we looked up each case and downloaded the complaint the landlord filed.

A word of caution: A docket may say “rent due,” but we found rent-due cases in which the tenant was actually being evicted for breaking their lease in other ways, which disqualified the eviction from CARES Act protection.

Always check your data with outside sources to be sure the data mean what you think they do.