Crowdsourcing enables media outlets to carry out investigations that were previously impossible. For the Argentine VozData team, keeping the crowd engaged was one of the challenges.
When Florencia Coelho, a blogger and journalist at Argentina’s La Nación newspaper, read about how the British newspaper the Guardian had analyzed the expenses of members of parliament by using crowdsourcing to digitize the data, she was excited. She imagined the impact a similar project could have in Argentina. Once the discussion started circulating at La Nación, there was no going back. The newspaper founded VozData, its data digitization project. Coelho, the project manager of VozData, decided to put their own government’s documents under scrutiny by digitizing senate expenses in order to hold those in power to account. Although the Guardian was the inspiration, copying the model was not enough to make VozData work in Argentina. They had to create their own path.
VozData used crowdsourcing to digitize over 30,000 senate expense documents from PDF files between 2011 and 2014. In 2015 they expanded their activities and digitized over 20,000 telegrams from the election primaries. They are also carrying out private investigations in which over 40,000 audio files have been transcribed.
When the project was set up they couldn’t find an appropriate software program to convert the PDF files. So Coelho and her team decided to turn to the crowd for help: in their first investigation alone, over 30,000 files needed to be manually entered and uploaded.
Altogether, VozData recruited over 1,000 volunteers from all over the country, drawn from universities, transparency NGOs and the general public. VozData has a fortunate position in Argentina's media landscape as a project of the prestigious national broadsheet, La Nación. This gives VozData a level of protection that smaller independent projects might not have. It also provides a well-established outlet and audience with which to share the released files.
“I think that being a media house helped us,” says Coelho. “We used the main La Nación Twitter account to get volunteers and we also used the Facebook account.” Using La Nación's social media accounts gave VozData access to a huge audience from which to start their recruitment campaign. Without them, building such a big network would have taken even more time and effort.
Building robust virtual communities
It quickly became clear that there was a core group of about 200 people who each revised more than 50 documents. When there was a need to build momentum to finish the first tranche of senate accounts before a national holiday, Coelho turned to social media again and made a hashtag: “#the news that people want to know is open data, here is something you can do for your country for this national holiday, besides wearing a pin of the flag.” This attracted a lot more help. A datathon at La Nación and events at two universities enabled volunteers to come together. For VozData it was a chance to meet some of their “super-users.”
“It was good to get to know the people behind the computer,” says Coelho. “I tried to create a social hub and realized sometimes you need a physical event to build up speed. I then tried to access hacker events and hackathons to be involved in these kinds of networks.”
Nevertheless, keeping volunteers engaged while ensuring quality proved to be a challenge for Coelho and her team. It became clear that many security measures had to be in place to protect those involved and the integrity of the data being digitized. “We make sure that our platform is very secure with users having to log in and ensure that everything is password protected,” insists Coelho. “You need a process where you can get help from a huge community but have protocols in place for the information that you get from them.” Screening out people who want to sabotage the project is not an option. “What we do instead is ensure that every document needs to be verified by three different users. We also use a software tool called Canonical that is able to detect typos and helps suggest alternatives to avoid mistakes.”
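The three-user rule Coelho describes can be illustrated with a minimal sketch. This is not VozData's code: the function names, the agreement threshold and the normalization step are assumptions made for illustration, and the typo suggestion uses Python's standard `difflib` as a stand-in rather than the Canonical tool mentioned above.

```python
from collections import Counter
from difflib import get_close_matches
from typing import Dict, List, Optional

# Assumption: a field is accepted once three distinct users entered the same value.
REQUIRED_AGREEMENT = 3


def normalize(value: str) -> str:
    """Collapse whitespace and case so trivial differences do not block agreement."""
    return " ".join(value.split()).casefold()


def verified_value(submissions: Dict[str, str]) -> Optional[str]:
    """Return the normalized value that enough distinct users agree on, else None.

    `submissions` maps a user id to the value that user typed for one field,
    so each user counts at most once.
    """
    counts = Counter(normalize(v) for v in submissions.values())
    if not counts:
        return None
    value, votes = counts.most_common(1)[0]
    return value if votes >= REQUIRED_AGREEMENT else None


def suggest(value: str, known_values: List[str]) -> List[str]:
    """Suggest close matches from already verified values (stand-in typo check)."""
    return get_close_matches(value, known_values, n=3, cutoff=0.8)
```

With a scheme like this, a single saboteur or a single slip of the keyboard cannot get a value accepted, because the entry stays unverified until three separate accounts agree on it.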
Changing the public sphere
As a result, VozData has grown in momentum and capacity since Coelho’s initial idea in 2009. It now runs a team-based crowdsourcing data release project, the only one of its kind in Argentina, covering a wide variety of topics.
“Argentina has no freedom of information laws and this is the first time crowdsourcing has been used in the country or elsewhere in Latin America to open up government data,” says Coelho. VozData was in a unique position when the project was initiated, as there was no precedent for this kind of project on freedom of information and access to government data. There were therefore few official obstacles, and the data within the expense forms was relatively detailed. La Nación published dozens of articles about the senate expenses.
But what makes this project exciting to Coelho is also what makes it potentially damaging in the eyes of those in power. The government reacted quickly by changing the information provided in the expense reports that followed the 2010-2012 period. “These were amended to show less information,” exclaims Coelho. “Instead of explaining travel expenses (who, how, what, when, where), they now only put ‘See Annex’ that is not published.”
There have also been some more positive outcomes. In 2015 VozData digitized 20,000 election telegrams from the primaries, sent in by local authorities.
“These telegrams contain information on how many people voted, how many votes went to each party, if there were any void votes etc.,” explains Coelho. By carrying out this activity, VozData was able to identify areas where fraud could occur, where gaps in the electoral sheets could provide opportunities for additional figures to be added or where numbers could be changed.
“The National Election Authority recognized our crowdsourcing efforts,” says Coelho. “They decided to change the forms used for primaries and general elections to avoid mistakes and to foster transparency monitoring.”
As Coelho explains, the people involved in these crowdsourced activities were considered to be so well versed in election monitoring and transparency that they were officially taken on board to help in the national elections. “VozData volunteers were trained as election monitors as, having gone through so many telegrams, they were able to determine what makes a good one and what needs to be correctly completed.” No blank spaces where anything can be added later, a total at the end and the number of envelopes inside the ballot box are only some of the criteria the project identified. “The aim is to make the system as hard to break as possible,” says Coelho. “We have contributed to democracy in Argentina. It is much harder to commit fraud after these changes.”