A Chinese data company has harvested information on millions of people, allegedly on behalf of Beijing's intelligence services. Analysts say democracies should pay more attention to the strategic use of open source data.
More than 2 million people around the world have had their personal data collected on behalf of Chinese intelligence services, according to a leak of a dataset made public earlier this month by an Australian cybersecurity consultancy.
The list includes prominent political figures like Boris Johnson and Narendra Modi and their families, business leaders like Ratan Tata, US military members of all ranks, senior diplomats, academics, celebrities, ordinary people, and even gangsters.
The information was scraped mostly from open sources like social media profiles by a Chinese big-data harvesting company called Shenzhen Zhenhua Data Information Technology.
Zhenhua provides data-based intelligence services. According to analysts, its two main clients are China's Ministry of State Security and the People's Liberation Army.
At first glance, the information is seemingly innocuous: birth dates, political associations and marital statuses, family trees, bank details, job applications, communications between think-tanks, military service records, most of it scraps from Twitter, LinkedIn or Facebook pages.
However, Christopher Balding, an American academic who was responsible for leaking the database, told DW the information is being strategically compiled by Chinese intelligence services for large-scale information campaigns aimed at influencing global public opinion.
"The reality is that open source intelligence is very valuable. It probably provides the majority of actionable materials that governments or intelligence agencies use," said Balding, who had been based in Shenzhen but fled to Vietnam over security concerns.
The database was leaked to Balding by an anonymous source inside China connected to Zhenhua. Balding shared it with the Australian cybersecurity firm Internet 2.0, which then interpreted and analyzed the data.
"China is known to be building a techno-surveillance authoritarian state domestically," Balding and Internet 2.0 founder Rob Potter said in a joint report on the findings, which claims to contain "the first direct evidence" of data collected by China to "monitor foreign individuals and institutions for purposes of intelligence and influence operations."
'Key information' at one click
Zhenhua, which also operates under the name "China Revival," named the database the Overseas Key Information Database (OKIDB).
The data includes 2.4 million individuals and 650,000 organizations from around the world. The information was harvested from 2.3 billion news articles and 2.1 billion social media posts.
Analysis of the database showed it contains information on at least 52,000 Americans, 35,000 Australians, 10,000 Britons and 10,000 Indians.
The OKIDB appears to focus on individuals and institutions China deems influential or important, Balding said. Analysis by Internet 2.0 showed that the database used a "scoring" algorithm to track importance.
"Zhenhua is clearly classifying individuals and institutions based upon targeting," Balding said, adding that the database wasn't meant for storage, but rather was built to make it easier to search and link individuals though relationship and network mapping.
"It's clearly not a random assortment of people. We need to understand that this database was drawn up with very clear objectives in mind," he said. "It almost looks like they are drawing up a list of targets with a directory of how to access technology or influence institutions."
Up to 20% of the data was not publicly available from open sources, according to analysis by Internet 2.0.
"We have reason to believe some of the data comes from unauthorized data access such as hacking but we cannot be certain," the joint report said.
What is 'hybrid warfare'?
Zhenhua is one of many Chinese data collection companies, part of what Balding and the analysts at Internet 2.0 call a "unique blend of civil-military fusion pushed by China that works with private firms to engage in state policy activities such as intelligence gathering."
A report on Chinese "hybrid warfare" operations released in July 2020 by the Center for Strategic and International Studies in Washington roughly defined the concept, in the context of the US-China rivalry, as operations that "advance Chinese interests without ever escalating competition to a conventional battle."
Under President Xi Jinping, China has tried to expand its global influence by reinforcing China-friendly narratives abroad with increased influence operations.
Beijing's propaganda apparatus, under the "International Liaison Department," has also been reorganized to focus on promoting the Communist Party's narrative through information campaigns and building rapport with foreign officials and business leaders.
Big data is everywhere
Big data collection is, of course, not limited to China. Collection of open source data is also carried out by the US and other governments, along with global corporations and tech companies.
Zhenhua's activities could be compared to the 2016 Cambridge Analytica scandal, which involved data from as many as 87 million Facebook accounts being harvested without users' permission.
However, analysts point out the difference with China is that the data collection is being carried out on behalf of a government that demonstrates authoritarian tendencies.
"Facebook is collecting data on you so they can sell you things. They can say 'click on the pair of shoes that you were looking at the other day.' The data that Zhenhua is collecting is most certainly not used for that purpose," Balding said.
"We can link this company very closely to Chinese security and military intelligence. One of the things that we see in Zhenhua's database is their ability to geo-locate a large number of people based on a variety of data."
What can the data be used for?
Analysts are exploring the possibility that Chinese intelligence is using picture location features on Facebook or Twitter to track US military members.
"We are talking about hundreds of thousands of very low-level personnel. They are following their Twitter and Facebook so they can geo-locate them," Balding said.
"I think their ability to extract usable information from those types of environments is very interesting and should be worrisome to a lot of people."
Balding warns that institutions and individuals in the West are underestimating the scale of the Chinese surveillance state and the resources Beijing is pouring into influence operations.
"When democratic countries are faced with authoritarian threats that are seeking to influence individuals, politicians or universities, they should probably rethink the standards of data privacy and data security for citizens," he said.
"There ways to provide greater protection without ending users' freedom to share things with others."
Additional reporting by William Yang.