L-AI Bot, a chatbot that hones Brazilian information requests before they’re filed • MuckRock

When someone sends a records request, one of the biggest questions is whether it will be completed. Many records laws around the world give agencies at least a month to respond, and no one wants to waste time asking for information that will be denied.

How do we better gauge in advance whether a request for information will be accepted or denied? And how do we improve the process for requests for information to increase the chances of a successful response?

It was with these questions in mind that I began developing a project for the Brazilian Association of Investigative Journalism (Abraji) to help journalists interested in data and documents from Brazilian public authorities.

The good news is that, in Brazil, all information requests and appeals sent to and answered by any federal government agency are proactively published on government websites, in both CSV and PDF formats.

The project I created organizes this publicly available information and makes it easier to access, using technology and journalistic curation, so that anyone interested can see how the Brazilian government has decided public records requests on a given subject.

I have developed two main platforms for the public: an interactive dashboard, built by the Data Fixers project, and a chatbot that gives tips on access to information precedents, the L-AI Bot (LAI is the Brazilian Portuguese acronym for the Law of Access to Information).

The dashboard lets any user search by keyword to find government decisions, and it provides general transparency statistics by agency. The chatbot answers questions on specific topics of interest to users, such as “What was decided about the transparency of data on visitors to Palácio do Planalto?” (the headquarters of the Presidency of the Republic in Brazil) or “How can I get data on access to weapons in Brazil?” In some cases, the chatbot gives direct access to the information, but most often it reports the “transparency status” of that information: the government’s understanding of whether a given document is public, based on previously answered requests.
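As a rough illustration of the two dashboard features described above (keyword search and per-agency statistics), here is a minimal Python sketch. It is not the project's actual code: the column names `agency`, `subject`, and `decision`, the sample rows, and the helper functions are all hypothetical, and the real government CSV exports use their own schema.

```python
import csv
import io
from collections import Counter, defaultdict

# Hypothetical sample data; the real exports have different columns and values.
SAMPLE_CSV = """agency,subject,decision
Army Command,Access to weapon registration data,granted
Army Command,Internal audit report,denied
Presidency,Visitor logs for Palácio do Planalto,denied
Presidency,Visitor logs for Palácio do Planalto,granted
"""

def load_rows(text):
    """Parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))

def search(rows, keyword):
    """Return rows whose subject contains the keyword, ignoring case."""
    needle = keyword.casefold()
    return [row for row in rows if needle in row["subject"].casefold()]

def grant_rate_by_agency(rows):
    """Compute the share of decisions marked 'granted' for each agency."""
    counts = defaultdict(Counter)
    for row in rows:
        counts[row["agency"]][row["decision"]] += 1
    return {
        agency: tally["granted"] / sum(tally.values())
        for agency, tally in counts.items()
    }

rows = load_rows(SAMPLE_CSV)
hits = search(rows, "visitor")      # decisions mentioning visitor logs
rates = grant_rate_by_agency(rows)  # fraction of granted decisions per agency
```

A plain substring match is the simplest possible search; a real dashboard would likely need accent-insensitive matching and an index to handle the full volume of published requests.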

The first tests of the chatbot were error-prone, both because of the hallucination problems we all know from platforms like ChatGPT and because the decision texts are very long and contain details of no interest to users. I was also bothered by the lack of sources in the answers. Basically, the user would need to “take the chatbot at its word,” without direct access to the government’s decision.

For this reason, I reached out to MuckRock to find a way to summarize the text of each decision on requests for information from the Brazilian federal government. We used the **GPT 3.5 Turbo Add-On** and gave it the following instructions, which it had to follow for more than 800 decisions in PDF format hosted as documents on DocumentCloud:

“Create formal and technical summaries of Brazil’s agencies decisions on appeals of information requests registered under the Brazilian Access to Information Law (LAI). Process the texts with these topics in the summary:

1 - Title containing the following information: name of the agency, followed by a summary with the main information and decision (example: Army Command: access to weapon data is granted);

2 - Protocol number, which always has a similar pattern (example: 23546.060760/2023-17);

3 - Date of decision (always in the format DD/MM/YYYY)

3 - Information requested (the information that was initially requested by the applicant);

4 - Decision on the appeal (whether it was granted, partially granted, denied, etc., and the main reason for this decision);

5 - Initial response from the agency (the reason the agency used to deny information);

6 - Precedent (Succinctly, say whether the decision generates any transparency precedent - do not use adjectives such as important or historic, just say the facts).”
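Because the instructions fix the shape of the protocol number and the date, a generated summary can be spot-checked mechanically. The Python sketch below shows one way to do that. It is an illustration rather than part of the project: the protocol pattern is inferred from the single example in the prompt (`23546.060760/2023-17`) and may not cover every real protocol number, and the function names are hypothetical.

```python
import re
from datetime import datetime

# Assumption: protocol numbers look like the one example in the prompt,
# i.e. 5 digits, a dot, 6 digits, a slash, a 4-digit year, a hyphen, 2 digits.
PROTOCOL_RE = re.compile(r"\b\d{5}\.\d{6}/\d{4}-\d{2}\b")

# Strict DD/MM/YYYY shape, checked before calendar validation.
DATE_SHAPE_RE = re.compile(r"\d{2}/\d{2}/\d{4}")

def find_protocol(text):
    """Return the first protocol-like number found in the text, or None."""
    match = PROTOCOL_RE.search(text)
    return match.group(0) if match else None

def parse_decision_date(value):
    """Parse a DD/MM/YYYY string; return None for a wrong shape or impossible date."""
    value = value.strip()
    if not DATE_SHAPE_RE.fullmatch(value):
        return None
    try:
        return datetime.strptime(value, "%d/%m/%Y").date()
    except ValueError:
        return None

protocol = find_protocol("Protocol number: 23546.060760/2023-17")
good_date = parse_decision_date("17/08/2023")
bad_date = parse_decision_date("31/02/2023")  # February has no 31st day
```

A check like this only confirms that a field is well formed, not that the model copied the right value from the decision, so it complements reading the source document rather than replacing it.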

I then started feeding the chatbot the decisions summarized along these points and specifically instructed it to always include in its response the link to the full government decision, hosted on DocumentCloud. The answers became much more precise and now come with a link to the source document. This way, journalists can check the chatbot’s tip on their own and think more strategically about the requests for information they will file.
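The article does not describe how the summaries are passed to the chatbot, so the following Python sketch is only one plausible arrangement: pick the summaries that share the most words with the question, then build a prompt that pairs each summary with its source link and tells the model to cite it. The `Summary` class, the word-overlap ranking, the sample texts, and the `example.org` URLs are all invented placeholders.

```python
import re
from dataclasses import dataclass

@dataclass
class Summary:
    title: str
    precedent: str
    source_url: str  # link to the full decision (placeholder values below)

def tokens(text):
    """Split text into a set of lowercase word tokens."""
    return set(re.findall(r"\w+", text.casefold()))

def overlap(question, summary):
    """Count distinct words shared by the question and the summary text."""
    return len(tokens(question) & tokens(f"{summary.title} {summary.precedent}"))

def build_prompt(question, summaries, top_k=2):
    """Build a prompt from the top_k most relevant summaries and their links."""
    ranked = sorted(summaries, key=lambda s: overlap(question, s), reverse=True)
    blocks = [
        f"- {s.title}\n  Precedent: {s.precedent}\n  Source: {s.source_url}"
        for s in ranked[:top_k]
    ]
    return (
        "Answer using only the decisions below and always include the Source link.\n"
        + "\n".join(blocks)
        + f"\nQuestion: {question}"
    )

# Invented sample summaries with placeholder URLs.
SUMMARIES = [
    Summary(
        "Army Command: access to weapon data is granted",
        "Weapon registration totals are treated as public.",
        "https://example.org/decisions/1",
    ),
    Summary(
        "Presidency: visitor records are denied",
        "Visitor logs were treated as restricted.",
        "https://example.org/decisions/2",
    ),
]

prompt = build_prompt(
    "How can I get data on access to weapons in Brazil?", SUMMARIES, top_k=1
)
```

Keeping the link next to each summary in the prompt means the model never has to recall a URL from memory, which is the part of an answer most prone to hallucination.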

You can see the interactive dashboard and the L-AI chatbot for yourself, and apply similar approaches in other countries to help requesters globally. Visit the DataFixers homepage to learn more about our work and other projects.