la nacion
1 Project
DockIns: Machine Learning on Deadline for Journalists
4 Articles

How to run Sidekick
Have you ever had a pile of documents and wanted to quickly focus on a particular part of the material? Would you like help working only on the contracts, or perhaps the police reports that detail a certain type of encounter, or quickly separating the letters of support from the letters of opposition sent to a politician on a key issue?

Named Entity Recognition (NER) on Spanish texts
As journalists working with documents and databases, we find that the most interesting information is hidden in documents that are long, unstructured, or incomplete.

Testing two Named Entity Recognition models on Spanish documents
As journalists dealing with data and document sets, we find that the most interesting information is usually hidden in large, unstructured, and incomplete sets of documents. This is especially true of public contracts: what the government is buying, how much money is being spent, and who the suppliers are. To answer these questions, four media organizations joined forces under the JournalismAI Collab and experimented with different machine learning tools and techniques to build a platform that helps investigative reporters process unstructured documents and extract useful insights. That platform became “DockIns”.

Categorize DocumentCloud collections in real-time with SideKick
Ever get a pile of documents and want to start quickly homing in on a certain segment of material? Wish you had a little help pulling out just contracts, or maybe police reports that detail a certain type of encounter? With MuckRock’s DocumentCloud platform, that’s a challenge we know all too well, and we have a new solution to help.