In January 2025, China-based AI startup DeepSeek (深度求索) released DeepSeek-R1, a high-quality large language model (LLM) that allegedly cost much less to develop and operate than Western competitors’ alternatives.
CrowdStrike Counter Adversary Operations conducted
independent tests on DeepSeek-R1 and confirmed that in many cases, it could
provide coding output of quality comparable to other market-leading LLMs of the
time. However, we found that when DeepSeek-R1 receives prompts containing
topics the Chinese Communist Party (CCP) likely considers politically
sensitive, the likelihood of it producing code with severe security vulnerabilities
increases by up to 50%.
This research reveals a new, subtle vulnerability surface for AI coding assistants. Given that up to 90% of developers already used these tools in 2025,1 often with access to high-value source code, any systemic security issue in AI coding assistants is both high-impact and high-prevalence.
Embedded censorship
The study also identified an embedded refusal
mechanism-described as a 'kill switch'-within DeepSeek-R1. In around 45% of
tests relating to requests involving Falun Gong, the model refused to generate
code, despite preparing a detailed plan during its reasoning phase. This
behaviour occurred even when using the raw open-source model, rather than the
company's API or smartphone app, indicating that the censorship is embedded in
the model's weights.
During these instances, DeepSeek-R1 would plan a response acknowledging ethical and policy implications, only to issue a short refusal message when asked to produce code. Researchers said such behaviour suggests the presence of hardcoded censorship mechanisms, rather than external moderation or content filters.
The findings, shared exclusively with The Washington Post,
underscore how politics shapes artificial intelligence efforts during a
geopolitical race for technology prowess and influence.
In the experiment, the U.S. security firm CrowdStrike bombarded DeepSeek with nearly identical English-language prompt requests for help writing programs, a core use of DeepSeek and other AI engines. The requests said the code would be employed in a variety of regions for a variety of purposes.
DeepSeek’s models were especially vulnerable to “goal
hijacking” and prompt leakage, LatticeFlow said. That refers to when an AI can
be tricked into ignoring its safety guardrails and either reveal sensitive
information or perform harmful actions it’s supposed to prevent. DeepSeek could
not be reached for comment.
When a business plugs its systems into generative AI, it
will typically take a base model from a company like DeepSeek or OpenAI and add
some of its own data, prompts and logic
.

No comments:
Post a Comment