On Thursday, Reddit announced a new policy aimed at balancing user privacy against its ambition to license its content to larger tech companies such as Google.
The newly announced “Public Content Policy” will sit alongside Reddit’s existing privacy policy and content policy to govern how commercial entities and other partners access and use Reddit’s data. The company also announced a new subreddit dedicated to researchers who use Reddit’s data.
The announcement comes close to Reddit’s initial public offering, as the company positions itself to increase revenue from its corpus of data, in addition to the advertisements that run on its platform and developers’ use of its API.
According to the company’s IPO prospectus, it has generated $203 million in revenue thus far from data licensing agreements and anticipates further growth.
Although Reddit had not previously restricted access to its data for AI training, it made a shift in stance late last year. It made no sense for Reddit to continue providing “all of that value to some of the largest companies in the world for free,” Reddit CEO Steve Huffman told The New York Times, indicating the company’s intention to enter the data licensing market.
Building on that shift, the new Public Content Policy prohibits unauthorized access to Reddit’s data. (Reddit claims it is merely publicizing a policy it has had in place internally for some time and is not introducing any new restrictions.)
Reddit writes in its blog, “Unfortunately, an increasing number of commercial entities are misusing authorized access or gaining unauthorized access to amass large quantities of public data, including Reddit public content.”
“Add to that, these entities believe their usage of that data is unrestricted, and they disregard reasonable user removal requests, safety concerns, and legal and privacy requests.”
“We must do more to restrict access to Reddit public content at scale to trusted actors who have agreed to adhere to our policies while we continue to block known bad actors. However, we must also maintain access controls for users, moderators, researchers, and other non-commercial actors acting in good faith.”
Access to Reddit data will remain open for research and non-commercial uses. However, entities that want to use Reddit’s data for other purposes, such as AI training, must pay for it. In a blog post containing a graphic, Reddit specifies that a contract is required for businesses that wish to use Reddit data to “power, augment, or enhance their product for any commercial purposes.”
Advertisers, by contrast, are directed to an ads API, which supports campaign management and performance monitoring.
Because Reddit is essentially a large website that search engines can index, the new policy is meant to protect Reddit content from unauthorized collection while upholding users’ rights.
Reddit, for example, requires its partners to honor the deletion decisions of users who have posted content. Users should be able to opt out of having their confidential posts used as fuel for future AI engines.
The new policy also prohibits partners from utilizing Reddit’s content to discern the identities of individuals or their personal information, including for ad targeting.
Partners are also forbidden from spamming or harassing Reddit users and from conducting “background checks, facial recognition, government surveillance, or assisting law enforcement in any of the aforementioned.”
Additionally, access to pornographic media is restricted, and Reddit guarantees that its users’ personal information will not be sold, as stated in the policy.
The company also specifies that it will never license non-public material, including private messages and non-public account information such as users’ emails or browsing history.
In support of researchers seeking to utilize Reddit data for non-commercial purposes, the organization has created a supplementary subreddit known as r/reddit4researchers.
The company says it is collaborating with OpenMined on a program to guide and expand researchers’ collaboration with Reddit.