SecurityBrief Canada - Technology news for CISOs & cybersecurity decision-makers

Kaspersky warns AI password generators lack true randomness


Kaspersky is advising against the use of artificial intelligence tools such as ChatGPT, Llama, or DeepSeek for password generation due to identified security weaknesses.

The call to caution follows an internal analysis of passwords produced by popular large language models, raising concerns about the actual randomness and strength of passwords generated in this way. According to Kaspersky, AI-generated passwords can contain patterns or predictable elements that could make them more vulnerable to cyberattacks.

Testing of 1,000 passwords generated by models including ChatGPT, Llama, and DeepSeek found that a significant proportion failed to meet Kaspersky's standards. Specifically, 88% of passwords from DeepSeek and 87% from Llama were deemed insufficiently robust. Even passwords from ChatGPT, which performed somewhat better, were found wanting in 33% of cases.

Kaspersky highlights the widespread issue of password reuse, especially as the number of online services requiring unique passwords continues to grow. The risk is that a compromise of a single password may allow access to multiple services, exacerbated by people commonly using names, dictionary words, or simple number patterns.

To address the difficulty of creating and managing complex, unique passwords for each service, some users have looked to AI as a solution. The apparent convenience of asking a language model to generate a 'secure password' is clear; users hope to receive strings that appear random and are not based on obvious words. However, Kaspersky warns that these appearances can be deceptive.

Alexey Antonov, Data Science Team Lead at Kaspersky, described the findings from his evaluation of password generation by ChatGPT, Llama, and DeepSeek. "All of the models are aware that a good password consists of at least 12 characters, including uppercase and lowercase letters, numbers and symbols. They report this when generating passwords," says Antonov.

"DeepSeek and Llama sometimes generated passwords consisting of dictionary words, in which instead of some letters there are numbers of similar shape: S@d0w12, M@n@go3, B@n@n@7 (DeepSeek), K5yB0a8dS8, S1mP1eL1on (Llama). Both of these models like to generate the password 'password': P@ssw0rd, P@ssw0rd!23 (DeepSeek), P@ssw0rd1, P@ssw0rdV (Llama). Needless to say, such passwords are not safe," adds Antonov.

He notes that the practice of substituting numbers for letters is widely known and does not greatly increase security against brute-force attacks. ChatGPT avoided dictionary words in its output, producing passwords such as qLUx@^9Wp#YZ, LU#@^9WpYqxZ, and YLU@x#Wp9q^Z, which look random at first glance. However, further examination found repeated use of certain characters, particularly the number 9 and the letters x, p, l, and L.
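The weakness of such substitutions can be illustrated with a short sketch (names and the substitution table are illustrative, not from Kaspersky's analysis): reversing common "leetspeak" replacements recovers the underlying dictionary word, which is exactly what dictionary-attack tooling does.

```python
# Illustrative sketch: undoing common leetspeak substitutions shows why
# passwords like P@ssw0rd add little protection against dictionary attacks.
# The mapping below is a hypothetical example of such a table.
LEET = str.maketrans("@01358$", "aolesbs")

def deleet(pw: str) -> str:
    """Lowercase the password and reverse common symbol-for-letter swaps."""
    return pw.lower().translate(LEET)

for pw in ["P@ssw0rd", "S@d0w12", "B@n@n@7"]:
    print(pw, "->", deleet(pw))
```

An attacker's wordlist only needs the base word "password"; tools then apply these substitutions automatically, so the transformed variants cost almost nothing extra to try.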

Antonov reports, "Shown below is a histogram of all the symbols in 1000 generated passwords for ChatGPT – it is clear to see that almost all passwords out of 1000 contain the symbols x, p, l, L .... This doesn't look like random letters at all."

He goes on to say, "For Llama, the situation is slightly better: Llama likes the # symbol, the letters p, l, L. DeepSeek shows similar tendencies. An ideal random generator would not prefer any letter. All symbols must appear approximately the same number of times."
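The kind of frequency analysis Antonov describes can be reproduced with a few lines of Python. The sample passwords below are the ChatGPT-style examples quoted earlier in the article, not Kaspersky's full 1,000-password dataset:

```python
from collections import Counter

def char_histogram(passwords):
    """Count how often each character appears across all passwords.
    A cryptographically random generator should produce roughly
    uniform counts; visible spikes indicate bias."""
    counts = Counter()
    for pw in passwords:
        counts.update(pw)
    return counts

# Toy sample: the three ChatGPT-style passwords quoted above.
samples = ["qLUx@^9Wp#YZ", "LU#@^9WpYqxZ", "YLU@x#Wp9q^Z"]
hist = char_histogram(samples)
print(hist.most_common(5))  # characters such as L, 9, x, p dominate
```

Run over a large sample, a flat histogram suggests uniform randomness, while the spikes Kaspersky observed (9, x, p, l, L) reveal a biased generator.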

A further weakness was how often the models omitted special characters or digits entirely, a key element of strong password creation. "Also, the algorithms often neglected to insert a special character or digits into the password: 26% of passwords for ChatGPT, 32% for Llama and 29% for DeepSeek. While DeepSeek and Llama sometimes generated passwords shorter than 12 characters," Antonov remarked.
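Checks like these are mechanical to automate. The function below is a minimal sketch of a policy check mirroring the criteria the article cites (at least 12 characters, with uppercase, lowercase, digits, and symbols); it is not Kaspersky's evaluation code.

```python
import string

def meets_policy(pw: str) -> bool:
    """Hypothetical strength policy based on the criteria in the article:
    >= 12 characters, with upper, lower, digit, and special character."""
    return (len(pw) >= 12
            and any(c.isupper() for c in pw)
            and any(c.islower() for c in pw)
            and any(c.isdigit() for c in pw)
            and any(c in string.punctuation for c in pw))

print(meets_policy("qLUx@^9Wp#YZ"))  # True: all classes present, 12 chars
print(meets_policy("S@d0w12"))       # False: too short
```

Counting how many generated passwords fail such a check is how the per-model omission rates quoted above would be measured.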

Such preferences and omissions introduce predictable patterns, raising the risk that cybercriminals can refine their brute-force techniques, focusing first on the combinations and characters most likely to appear in AI-generated passwords.

In 2024, Antonov applied a machine learning algorithm to assess password strength and found that nearly 60% of passwords could be cracked within an hour using modern GPUs or cloud-based tools. When this algorithm was used on the samples generated by the AI systems, the rate of weak passwords was notably high among all three models.

"The problem is that LLMs don't create true randomness. Instead, they mimic patterns from existing data, making their outputs predictable to attackers who understand how these models work," notes Antonov.

Kaspersky recommends avoiding AI tools for password generation in favour of dedicated password management software. Such software, according to the company, uses cryptographically secure generators, produces passwords free of observable patterns, and stores these credentials securely in an encrypted vault requiring a single master password.
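For contrast, a cryptographically secure generator of the kind password managers rely on is a few lines of standard-library Python. This is a minimal sketch using the `secrets` module, not the implementation of any particular product:

```python
import secrets
import string

def generate_password(length: int = 16) -> str:
    """Generate a password using a cryptographically secure RNG,
    the approach dedicated password managers take instead of an LLM."""
    alphabet = string.ascii_letters + string.digits + string.punctuation
    while True:
        pw = "".join(secrets.choice(alphabet) for _ in range(length))
        # Re-draw until every character class is represented.
        if (any(c.islower() for c in pw)
                and any(c.isupper() for c in pw)
                and any(c.isdigit() for c in pw)
                and any(c in string.punctuation for c in pw)):
            return pw

print(generate_password())
```

Because `secrets` draws from the operating system's CSPRNG, the character histogram of its output is flat: no symbol is preferred, which is exactly the property the LLM-generated samples lacked.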

Password managers are also noted for their additional features such as auto-fill functionality, synchronisation across devices, and breach monitoring which alerts users to leaked credentials found in data breaches. These tools help users maintain secure, unique passwords for each account without the cognitive burden of remembering them all.

Kaspersky's analysis highlights the limitations of large language models in generating truly secure passwords. The predictability in the generated passwords is seen as a potential vulnerability, leading the company to recommend password managers as a more secure alternative for password creation and storage.
