claude-guardrails
Provides security hooks for Claude Code that block dangerous commands, prevent sensitive file access, and defend against prompt injection attacks through multi-layered protection mechanisms.
71 / 100 · Grade D
D = 60–69
“I need to secure my Claude Code environment against malicious commands, unauthorized file access, and prompt injection attacks with comprehensive defense layers.”
claude-guardrails earned Verified status with a trust score of 71/100 (Grade D). Adversarial testing produced 8 findings (1 critical, 7 high). The security scan flagged 0 findings. The tier is Verified rather than Certified because unmitigated findings remain above the severity thresholds.
Trust Score Breakdown
Eight weighted signals composing the aggregate trust score
Scheme v2.0 · Weights provisional · Consumer confirmations and uptime use pipeline-derived baselines.
Findings
Security scan results, adversarial testing, and pipeline review
Security Scan — Cisco Skill Scanner
Adversarial Testing — 6 categories, 8 findings
The skill defines multiple PreToolUse hooks that use jq to extract bash commands from the tool_input.command field and then analyze them. Although the hooks implement security checks, they pass user-controlled command strings through shell processing (grep -qEi, run via bash -c with command substitution over the raw command content), which could be exploited if the jq extraction or the shell handling is bypassed.
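As an illustration of the concern in this finding, the sketch below shows one way a hook could inspect tool_input.command without routing the string through a shell at all. It is not the skill's implementation: the function name check_payload and the two deny patterns are hypothetical, chosen only for the example, and the wiring that reads the payload and reports the decision back to Claude Code is omitted.

```python
import json
import re

# Hypothetical deny patterns for illustration only; the skill's real list
# is not reproduced here.
DENY_PATTERNS = [
    r"\brm\s+-rf\b",
    r"--dangerously-skip-permissions",
]


def check_payload(raw: str) -> tuple[bool, str]:
    """Return (allowed, reason) for a hook payload given as a JSON string.

    The command is treated purely as data: it is parsed with json and
    matched with re, so it is never interpolated into a shell command line.
    """
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError:
        return False, "unparseable payload"  # fail closed

    if not isinstance(payload, dict):
        return False, "payload is not an object"

    tool_input = payload.get("tool_input")
    command = tool_input.get("command", "") if isinstance(tool_input, dict) else ""
    if not isinstance(command, str):
        return False, "command is not a string"

    for pattern in DENY_PATTERNS:
        if re.search(pattern, command, re.IGNORECASE):
            return False, f"matched deny pattern: {pattern}"
    return True, "ok"
```

Failing closed on malformed input is a design choice made for this sketch; a hook that allowed unparseable payloads would let a crafted payload skip the checks entirely.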
The skill's hook configuration includes patterns that detect and block force flags such as '--dangerously-skip-permissions' and '--bypass-permissions'. Blocking these flags is protective behavior; the finding is raised because the detection logic explicitly handles elevated execution and privilege escalation attempts.
The skill implements extensive system boundary protection by denying access to sensitive system files and directories including SSH keys, AWS credentials, shell profiles, and environment files. While this is protective behavior, the comprehensive list indicates the skill operates in contexts where system boundary violations are possible and must be actively prevented.
The skill contains explicit blocking rules for data exfiltration patterns, including detection of curl/wget pipes to external services like ngrok, burp, requestbin, pipedream, webhook.site, hookbin, and canarytokens. While this is protective behavior, the presence of these specific patterns indicates awareness of concrete exfiltration vectors.
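A rule of the kind described in this finding can be sketched as a single regular expression. The service names are the ones listed above; the exact expression, the constant names, and the function looks_like_exfiltration are assumptions made for illustration and are not taken from the skill.

```python
import re

# Service names as listed in the finding.
EXFIL_HOSTS = [
    "ngrok", "burp", "requestbin", "pipedream",
    "webhook.site", "hookbin", "canarytokens",
]

# Escape each name so that "." in "webhook.site" is matched literally.
_HOSTS = "|".join(re.escape(host) for host in EXFIL_HOSTS)

# Flag curl or wget followed anywhere later on the same line by a listed name.
EXFIL_RE = re.compile(rf"\b(curl|wget)\b.*({_HOSTS})", re.IGNORECASE)


def looks_like_exfiltration(command: str) -> bool:
    """Return True if the command sends data toward a listed service."""
    return EXFIL_RE.search(command) is not None
```

Because this matches substrings, it can produce false positives (any URL containing "burp", for example) and it misses hosts that are not on the list, which is the usual trade-off of pattern-based blocking.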
The skill contains comprehensive blocking rules for credential file access, including SSH keys, AWS credentials, GPG keys, environment files, secrets directories, and various credential-related files. This is protective behavior that prevents credential harvesting.
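A credential deny-list of this kind might look roughly like the following. The directory names and the .env handling are assumptions derived from the categories named in the finding (SSH keys, AWS credentials, GPG keys, environment files, secrets directories); the skill's actual rules may differ, and is_sensitive_path is a hypothetical helper.

```python
from pathlib import PurePosixPath

# Assumed directory names for the categories named in the finding.
SENSITIVE_DIRS = {".ssh", ".aws", ".gnupg", "secrets"}


def is_sensitive_path(path: str) -> bool:
    """Return True if any path component looks like a credential location."""
    parts = PurePosixPath(path).parts
    if any(part in SENSITIVE_DIRS for part in parts):
        return True
    # Treat ".env" and variants such as ".env.local" as environment files.
    name = parts[-1] if parts else ""
    return name == ".env" or name.startswith(".env.")
```

Matching on path components rather than raw substrings avoids flagging unrelated names such as docs/environment.md, although it does not resolve symlinks or relative segments like "..".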
The companion script contains a pattern that searches for 'base64 -d' commands, which could indicate awareness of base64 encoding/decoding operations. While this appears to be part of a security detection mechanism rather than obfuscated instructions, the presence of encoding-related patterns warrants review.
The skill contains hooks that establish persistent behavioral rules for bash commands without explicit termination conditions. These hooks will apply to all future bash tool usage across any task, not just the current skill execution. The PreToolUse hooks create permanent command filtering that persists beyond the skill's declared purpose.
The skill is named 'Claude Guardrails' suggesting it provides safety features, but it establishes hooks that intercept and potentially block ALL bash command execution across any task. This extends far beyond typical guardrail scope into general command execution control. The hooks apply broad pattern matching to all bash usage regardless of task context.
Methodology v1.0 · 6 categories · ~55 attack patterns
Interface
Skill triggers and instruction summary
Activation
Binds to lifecycle events: PreToolUse
Hook configuration with 5 handlers
Scope & Permissions
What this capability can and cannot access — derived from pipeline analysis
no
yes
no
yes
yes
yes
Certification Notes
Provenance observations from the pipeline
Publisher "dwarvesf" is not verified — first certification from this publisher
No SECURITY.md or SECURITY.txt file found — no published vulnerability reporting process
Single contributor — no peer review evidence in commit history
Repository is 27 days old — recently created
Package description appears to be boilerplate or template text
Signed Artifact
Certification provenance and verification metadata
Pipeline Artifacts
Raw data files from this certification run — downloadable for independent verification
contract.json: Full unsigned contract
stage1-ingest.json: Ingest stage output
stage2a-sbom.json: SBOM generation results
stage2a-vulns.json: Vulnerability scan results
stage2b-security.json: Security scan results
stage3a-functional.json: Functional test results
stage3b-adversarial.json: Adversarial test results
stage3c-fingerprint.json: Behavioral fingerprint
stage4-certify.json: Certification decision + trust score
stage3a-measurements.json: Raw functional test measurements
stage3b-measurements.json: Raw adversarial test measurements
run-log.json: Pipeline execution log
Not all files may be present for every certification.