
OpenClaw adds NVIDIA SkillSpector to ClawHub checks

Wed, 3rd Jun 2026

OpenClaw has added NVIDIA Skill Cards and NVIDIA SkillSpector to ClawHub's skill review process, introducing two new checks for screening software skills before publication.

Every skill published through ClawHub will now include a Skill Card describing what the skill does, who published it and where it came from, OpenClaw said. ClawHub verifies the cards instead of relying on the publisher's own description.

The changes follow earlier attempts by malicious actors to upload skills containing known malware to ClawHub after the registry launched. OpenClaw had already been using VirusTotal to identify those files and automatically ban the publishers behind them.

OpenClaw said conventional malware scanning does not adequately address what it called agentic risk, where a skill's stated purpose differs from what its code actually does. Examples include a tool presented as a log summariser that instead sends data elsewhere, or a skill that points an agent to a command-line tool capable of erasing production systems if used incorrectly.

Three scanners

Under the updated process, each new skill version goes through a pre-publication verification stage. An OpenAI Codex agent receives output from three scanners: OpenClaw's static analysis, VirusTotal and NVIDIA SkillSpector.

ClawScan then evaluates those findings alongside provenance, metadata and moderation history before issuing a final verdict of Clean, Suspicious or Malicious. It also generates the Skill Card attached to the skill.

SkillSpector is designed to identify issues that may not trigger a traditional malware alert. It uses static checks and AI-assisted semantic analysis to look for hidden instructions, risky code paths, dependency issues, excessive permissions and mismatches between a skill's declared purpose and its behaviour.

Findings from SkillSpector appear in ClawHub as advisories and do not by themselves block publication. The final decision remains with ClawScan, which weighs those results against the other signals.
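The flow described above can be pictured with a small sketch. It is illustrative only: the class names, field names and the placeholder judge are assumptions made for this example and are not taken from ClawHub or ClawScan, whose judge is a model rather than a fixed rule. The sketch shows only the shape of the process: three scanner outputs plus provenance, metadata and moderation history go to a single judge, and SkillSpector findings are surfaced as advisories rather than setting the verdict directly.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable


class Verdict(Enum):
    # The three final verdicts named in the article.
    CLEAN = "Clean"
    SUSPICIOUS = "Suspicious"
    MALICIOUS = "Malicious"


@dataclass
class Evidence:
    # Hypothetical field names; one list of findings per scanner.
    static_findings: list[str] = field(default_factory=list)
    virustotal_findings: list[str] = field(default_factory=list)
    skillspector_findings: list[str] = field(default_factory=list)
    provenance: str = ""
    metadata: dict[str, str] = field(default_factory=dict)
    moderation_history: list[str] = field(default_factory=list)


@dataclass
class Review:
    verdict: Verdict
    advisories: list[str]


def review(evidence: Evidence, judge: Callable[[Evidence], Verdict]) -> Review:
    # The judge sees every signal together. SkillSpector findings are copied
    # out as advisories and never decide the verdict on their own here.
    return Review(
        verdict=judge(evidence),
        advisories=list(evidence.skillspector_findings),
    )


def placeholder_judge(evidence: Evidence) -> Verdict:
    # Stand-in rule for demonstration only. This is NOT ClawScan's logic.
    if evidence.virustotal_findings:
        return Verdict.MALICIOUS
    if evidence.static_findings or evidence.skillspector_findings:
        return Verdict.SUSPICIOUS
    return Verdict.CLEAN
```

Passing the judge in as a function keeps the evidence bundle separate from the decision, so a model-backed judge could replace the placeholder without changing the rest of the sketch.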

OpenClaw initially expected substantial overlap among the three scanners, but the data showed otherwise. No pair of scanners agreed on more than 10.4% of their combined positive findings, while just 468 skills, or 0.69%, were flagged by all three at the same time.

It said 81.9% of positive findings came from a single scanner. That pattern suggests each tool detects a different kind of risk rather than exposing weaknesses in any one scanner.
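For readers who want to reproduce overlap figures of this kind on their own data, the following is a minimal sketch. It assumes that "agreed on X% of their combined positive findings" means the share of skills flagged by both scanners out of those flagged by either one (intersection over union), and that the single-scanner share is counted over skills flagged by at least one scanner. OpenClaw has not confirmed these definitions, and the function names and toy skill IDs are invented for illustration.

```python
from itertools import combinations


def pairwise_overlap(flags):
    """Map each scanner pair to |both flagged| / |either flagged|.

    `flags` maps a scanner name to the set of skill IDs it flagged.
    """
    result = {}
    for a, b in combinations(sorted(flags), 2):
        union = flags[a] | flags[b]
        result[(a, b)] = len(flags[a] & flags[b]) / len(union) if union else 0.0
    return result


def single_scanner_share(flags):
    """Share of flagged skills that exactly one scanner flagged."""
    counts = {}
    for skill_ids in flags.values():
        for skill_id in skill_ids:
            counts[skill_id] = counts.get(skill_id, 0) + 1
    if not counts:
        return 0.0
    return sum(1 for c in counts.values() if c == 1) / len(counts)


def flagged_by_all(flags):
    """Skill IDs flagged by every scanner."""
    return set.intersection(*flags.values()) if flags else set()


# Toy data, not drawn from the ClawHub dataset.
toy_flags = {
    "static": {"a", "b"},
    "virustotal": {"b", "c"},
    "skillspector": {"b", "d", "e", "f"},
}
```

In the toy data only skill "b" is flagged by all three scanners, and five of the six flagged skills are flagged by one scanner alone, the same low-overlap shape OpenClaw reported at much larger scale.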

Scan results

Across 67,453 latest public skill versions in its dataset, SkillSpector returned positive results on 32,856 rows, or 48.71%. That compared with 5,225 rows, or 7.75%, for VirusTotal and 4,434 rows, or 6.57%, for static analysis.

Within 25,504 rows labelled suspicious by ClawScan, SkillSpector was positive on 19,209 rows, or 75.3%. Among 206 malicious rows, VirusTotal was positive on 150 rows, or 72.8%, while SkillSpector was positive on 14 rows, or 6.8%.
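The percentages above follow directly from the row counts OpenClaw reported. A short check, using only figures quoted in this article (the variable and function names are our own):

```python
TOTAL_SKILLS = 67_453      # latest public skill versions in the dataset
SUSPICIOUS_ROWS = 25_504   # rows ClawScan labelled suspicious
MALICIOUS_ROWS = 206       # rows ClawScan labelled malicious


def pct(part, whole, digits=2):
    """Return part/whole as a percentage rounded to `digits` places."""
    return round(100 * part / whole, digits)


# Positive rate of each scanner across the whole dataset.
overall = {
    "SkillSpector": pct(32_856, TOTAL_SKILLS),
    "VirusTotal": pct(5_225, TOTAL_SKILLS),
    "static analysis": pct(4_434, TOTAL_SKILLS),
}

# Positive rates within ClawScan's suspicious and malicious rows.
skillspector_in_suspicious = pct(19_209, SUSPICIOUS_ROWS, 1)
virustotal_in_malicious = pct(150, MALICIOUS_ROWS, 1)
skillspector_in_malicious = pct(14, MALICIOUS_ROWS, 1)

# Skills flagged by all three scanners at once.
all_three = pct(468, TOTAL_SKILLS)
```

Each value matches the corresponding figure in the article to the stated precision.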

Those figures point to a divide between broad risk indicators and stronger evidence of outright malicious code. OpenClaw said one skill produced 173 findings from SkillSpector but was still marked suspicious rather than malicious by ClawScan.

That distinction, OpenClaw said, shows why it uses a model acting as a judge to assess several forms of evidence together. The challenge, in its view, is separating software that presents a large risk surface from software that is intentionally harmful.

OpenClaw is also releasing a public dataset containing security scan outcomes for 67,453 latest public skill versions. The dataset was previously kept within ClawHub and is intended to help outside researchers study risks in agent skill ecosystems.

ClawHub, which OpenClaw described as one of the more widely used skill registries, runs the full ClawScan suite on thousands of publish events each day. According to OpenClaw, the process consumes millions of tokens using OpenAI GPT-5.5.

"Our assumption was that the results from these three scanners would mostly overlap. Instead, they barely overlap at all," OpenClaw said.

"No pair agrees on more than 10.4% of its combined positives. Only 468 skills, or 0.69%, are flagged by all three scanners at once. 81.9% of positive findings come from a single scanner alone," it said.

"Rather than keep this corpus of scan outcomes to ourselves, we're excited to open-source it for the broader security community to help us improve," OpenClaw said.


Image: Agustin Rivera and Jacob Tomlinson