The Department of the Navy has posted a solicitation asking contractors to bid on a project that would amass a staggering 350 billion social media posts dating from 2014 through 2016. The data will be taken from a single social media platform – but the solicitation does not specify which one.
“We seek to acquire a large-scale global historical archive of social media data, providing the full text of all public social media posts, across all countries and languages covered by the social media platform,” the contract synopsis reads. The Navy said that the archive would be used in “ongoing research efforts” into “the evolution of linguistic communities” and “emerging modes of collective expression, over time and across countries.”
The archive will draw from publicly available social media posts, and “no private communications or private user data” will be included in the database. However, every record must include the time and date at which the message was sent and the public user handle associated with it. Each record must also include all publicly available metadata associated with the original post, including country, language, hashtags, location, handle, timestamp, and URLs.
The data must be collected from at least 200 million unique users in at least 100 countries, with no single country accounting for more than 30 percent of users, the advert says.
While the stated intentions of the project may sound benign, the US government has previously expressed interest in collecting social media data for more eyebrow-raising purposes. Last year, the US Department of Homeland Security issued a notice asking contractors to bid on a database that tracks 290,000 global news sources in over 100 languages. The contract also mentioned the ability to keep tabs on “influencers,” prompting speculation in some reports that the proposed database could be used to monitor journalists.