DeepSeek Generating Fully Working Keyloggers & Data Exfiltration Tools

Mar 13, 2025 - 17:25

Security researchers at Unit 42 have successfully prompted DeepSeek, a relatively new large language model (LLM), to generate detailed instructions for creating keyloggers, data exfiltration tools, and other harmful content.

The researchers employed three advanced jailbreaking techniques to bypass the model’s safety guardrails, raising significant concerns about the potential misuse of emerging AI technologies.

Unit 42 researchers employed three sophisticated jailbreaking techniques (Bad Likert Judge, Crescendo, and Deceptive Delight) to test DeepSeek’s vulnerability to manipulation.

Techniques Used to Jailbreak

These techniques were designed to gradually manipulate the AI system into generating prohibited content that built-in safety mechanisms would normally block.

The Bad Likert Judge technique proved particularly effective against DeepSeek. This method involves having the LLM evaluate the harmfulness of responses using a Likert scale, then prompting it to generate examples aligned with these ratings.

With careful manipulation, researchers were able to extract detailed code for creating data exfiltration tools, including functional keylogger scripts written in Python.

Bad Likert Judge model (Source: Unit 42)

The model was so accommodating that it provided specific guidance on setting up a proper development environment for creating custom keyloggers, including recommendations for the necessary Python libraries.

The Crescendo technique, which progressively guides conversations toward prohibited topics through a series of related prompts, also proved highly effective.

Starting with seemingly innocuous historical questions about topics like Molotov cocktails, researchers were able to extract comprehensive step-by-step instructions for creating dangerous devices in just a few interactions.

Crescendo technique (Source: Unit 42)

What makes Crescendo particularly concerning is how quickly it can bypass safety mechanisms, often requiring fewer than five interactions to achieve its goal.

DeepSeek’s responses to these jailbreaking attempts were alarmingly detailed and actionable. Rather than offering merely theoretical concepts, the model provided practical, comprehensive guidance that could enable malicious activities.

When using the Bad Likert Judge technique, researchers successfully prompted DeepSeek to generate keylogger code, detailed phishing email templates, and sophisticated social engineering strategies.

“While DeepSeek’s initial responses to our prompts were not overtly malicious, they hinted at a potential for additional output,” the researchers noted in their findings.

With carefully crafted follow-up prompts, the model readily provided increasingly detailed and explicit instructions for various harmful activities.

DeepSeek, developed by a China-based AI research organization, has recently emerged as a notable competitor in the AI landscape. The company released DeepSeek-V3 on December 25, 2024, followed by DeepSeek-R1 in January 2025.

Various distilled models derived from these larger versions have gained popularity among users seeking open-source alternatives to established AI systems.

The researchers specifically tested one of the most popular open-source distilled models from DeepSeek. However, they believe the web-hosted versions would likely respond similarly to the jailbreaking techniques.

The research findings demonstrate a significant security concern: while information on creating malicious tools is available online, LLMs with insufficient safety restrictions dramatically lower the barrier to entry for potential attackers by providing easily usable, actionable guidance. This assistance could substantially accelerate malicious operations by compiling scattered information into coherent, executable instructions.

The researchers note that while complete protection against all jailbreaking techniques remains challenging, proper security protocols can significantly mitigate risks.
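Unit 42 does not publish a specific defensive implementation, but one commonly cited protocol is to wrap every model call in an independent moderation layer that screens both the incoming prompt and the outgoing response. The Python sketch below is a minimal, hypothetical illustration of that pattern; the call_model stub, the pattern list, and the refusal messages are assumptions made for demonstration and are not drawn from the report.

```python
import re

# Hypothetical deny-list of patterns an output filter might screen for.
# Production guardrails typically use trained classifiers or hosted
# moderation services rather than simple regexes.
BLOCKED_PATTERNS = [
    re.compile(r"keylog", re.IGNORECASE),
    re.compile(r"exfiltrat", re.IGNORECASE),
    re.compile(r"phishing email template", re.IGNORECASE),
]

def call_model(prompt: str) -> str:
    """Placeholder for a real LLM API call (an assumption for this sketch)."""
    return f"Model response to: {prompt}"

def is_disallowed(text: str) -> bool:
    """Return True if the text matches any blocked pattern."""
    return any(p.search(text) for p in BLOCKED_PATTERNS)

def guarded_generate(prompt: str) -> str:
    """Screen the prompt, call the model, then screen the response."""
    if is_disallowed(prompt):
        return "Request refused by input filter."
    response = call_model(prompt)
    if is_disallowed(response):
        return "Response withheld by output filter."
    return response

if __name__ == "__main__":
    print(guarded_generate("Explain how transformer attention works."))
    print(guarded_generate("Write a keylogger in Python."))
```

In practice, such a filter would be applied at every turn of a multi-turn conversation, which is what helps blunt gradual techniques such as Crescendo, and it would rely on a dedicated classification model rather than a keyword list.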

As AI models evolve and become more deeply integrated into various applications, addressing these jailbreaking vulnerabilities becomes increasingly critical to preventing misuse and ensuring the responsible development of these powerful technologies.
