Research: AI could expose the identities of anonymous internet accounts.
A new study by researchers from Anthropic and ETH Zurich shows that modern artificial intelligence systems can identify the real-world identities behind seemingly anonymous internet accounts. The research, published as a preprint on arXiv, demonstrates that large language models (LLMs) are capable of analyzing online activity and linking pseudonymous profiles to real people at scale.
The study, titled 'Large-scale online deanonymization with LLMs,' examines how AI agents can automate the process of 'deanonymization' – that is, linking anonymous or pseudonymous accounts to real identities.
Previously, this process typically required analysts to perform many manual investigative steps, including searching for posts, analyzing writing styles, and following scattered clues on the internet. However, the research team showed that modern AI models can automate many of these steps.
In the study, the AI system analyzed public text from online platforms and extracted identity-related signals, such as personal preferences, demographic clues, writing style, and details inadvertently revealed in posts. The AI then searched for similar profiles online and assessed whether these clues matched known individuals.
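The loop described above – extract signals from an anonymous account, then score known profiles against them – can be sketched in miniature. This is a simplified stand-in, not the study's method: a crude token-overlap (Jaccard) score replaces the LLM's judgment, and all profile names and posts are hypothetical.

```python
# Minimal sketch of the signal-extraction and candidate-matching loop.
# A Jaccard similarity over distinctive tokens stands in for the LLM's
# match assessment; all data below are invented for illustration.

def extract_signals(posts):
    """Collect a crude 'fingerprint': the set of distinctive lowercase tokens."""
    tokens = set()
    for post in posts:
        tokens.update(w for w in post.lower().split() if len(w) > 4)
    return tokens

def match_score(signals_a, signals_b):
    """Jaccard similarity between two signal sets (0.0 to 1.0)."""
    if not signals_a or not signals_b:
        return 0.0
    return len(signals_a & signals_b) / len(signals_a | signals_b)

def best_candidate(anon_posts, candidate_profiles):
    """Rank known profiles against an anonymous account's posts."""
    anon = extract_signals(anon_posts)
    scored = [(name, match_score(anon, extract_signals(posts)))
              for name, posts in candidate_profiles.items()]
    return max(scored, key=lambda t: t[1])

anon = ["I keep rewriting my compiler passes in Rust every weekend"]
candidates = {
    "profile_a": ["Spent the weekend rewriting compiler passes in Rust again"],
    "profile_b": ["New sourdough starter photos from this morning"],
}
print(best_candidate(anon, candidates))  # profile_a scores highest
```

In the actual study an LLM would extract far richer signals (demographics, preferences, style) and reason about candidate matches, but the overall shape – fingerprint, search, rank – is the same.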
To test this method, researchers built multiple datasets with pre-determined real identities. In one experiment, the AI system attempted to match users on the Hacker News forum with their LinkedIn profiles, even when obvious identifying information such as names or usernames had been removed.
Another dataset involved linking Reddit accounts using pseudonyms that were active in various communities. Meanwhile, another experiment split a person's posting history into two separate profiles to test whether the AI could recognize that both profiles belonged to the same individual.
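The split-profile experiment has a simple construction: one account's posting history is divided into two disjoint pseudo-profiles, and a correct linker should label the pair as belonging to the same author. A minimal sketch, with invented data:

```python
# Sketch of the split-profile evaluation setup: divide one account's
# chronologically ordered posts into two pseudo-profiles. Posts are
# placeholder strings for illustration.

def split_history(posts):
    """Split a post history into two disjoint halves (two pseudo-profiles)."""
    mid = len(posts) // 2
    return posts[:mid], posts[mid:]

history = [f"post_{i}" for i in range(10)]
profile_1, profile_2 = split_history(history)
# A correct linking system should judge profile_1 and profile_2
# to be the same individual; this pair is a ground-truth positive.
print(len(profile_1), len(profile_2))  # 5 5
```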
The results showed that systems based on large language models significantly outperformed traditional deanonymization techniques. In some cases, the models achieved recall rates of up to 68% with an accuracy of around 90%, meaning the AI could correctly identify many accounts while maintaining a relatively low error rate. In the same experiments, traditional methods achieved almost no meaningful results.
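The two figures of merit above can be computed from counts of correct and incorrect matches. The counts below are illustrative, not the study's actual data; precision is used here as one way to read the "around 90% accuracy" figure:

```python
# Recall and precision from illustrative (not the study's) match counts.

def recall(tp, fn):
    """Fraction of true accounts the system actually identified."""
    return tp / (tp + fn)

def precision(tp, fp):
    """Fraction of the system's claimed matches that were correct."""
    return tp / (tp + fp)

# Hypothetical run: 68 of 100 true accounts found, with 8 wrong matches.
tp, fn, fp = 68, 32, 8
print(f"recall={recall(tp, fn):.2f}, precision={precision(tp, fp):.2f}")
# recall=0.68, precision=0.89
```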
According to the researchers, these results show that AI can replicate tasks that previously required hours of work from human investigators. An AI system could automatically extract identity-related characteristics from text, search thousands of potential profiles, and infer which candidate is most likely the correct one.
This development is noteworthy because anonymity has long been considered a fundamental protection for many internet users. Pseudonymous accounts are widely used by journalists, whistleblowers, social activists, and individuals who want to discuss sensitive topics without revealing their true identities.
The research suggests this protective layer – sometimes described as privacy through obscurity – may be weakening as AI systems become increasingly adept at connecting digital footprints across multiple platforms. If automated tools can perform this task quickly and at low cost, the barrier to identifying anonymous users could be drastically reduced.
Researchers estimate that the cost of identifying an online account using their experimental system could be as low as $1 to $4 per profile, meaning large-scale investigations could be conducted at a relatively low cost.
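A back-of-the-envelope calculation shows why that per-profile range matters at scale. The campaign size below is an arbitrary example, not a figure from the study:

```python
# Rough cost bounds for a hypothetical campaign at the reported
# $1-$4 per-profile range.

def campaign_cost(profiles, cost_low=1.0, cost_high=4.0):
    """Return (low, high) total cost for a given number of profiles."""
    return profiles * cost_low, profiles * cost_high

low, high = campaign_cost(10_000)
print(f"10,000 profiles: ${low:,.0f} to ${high:,.0f}")
# 10,000 profiles: $10,000 to $40,000
```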
However, the authors also note that the study was conducted in a controlled environment and used only publicly available data. This work has not yet undergone peer review, and some technical details have been withheld to reduce the risk of misuse.
Nevertheless, the research findings quickly sparked debate among privacy and technology experts. Many argued that users may need to reconsider the amount of personal information they share online, even in seemingly anonymous spaces.
Looking ahead, the researchers believe further investigation is needed into both the risks of AI-powered deanonymization and possible safeguards against it. Potential solutions could include better privacy protection tools, stronger security mechanisms from online platforms, or AI systems designed to automatically anonymize sensitive data before it is made public.
As artificial intelligence grows increasingly powerful in analyzing massive volumes of online content, this research presents a new challenge: how to balance the exploratory power of AI with the need to protect individual privacy in the digital age.