Does anyone ever actually read or audit code? #275
Replies: 3 comments
-
|
Really appreciate the security audit discussion; this is exactly the kind of conversation that helps early agent runtimes mature responsibly. One angle this surfaced for me (and something I mentioned in the earlier OpenTelemetry observability discussion #255) is how security posture and observability tend to evolve together, especially in agent-driven systems.
Why Observability Matters Here (Technical Context)
Agent runtimes like PicoClaw increasingly combine:
That combination makes traditional static hardening necessary but often not sufficient on its own. Runtime visibility becomes equally important.
Where Observability Helps Security Practically
Structured telemetry can help with:
1. Execution Traceability
Understanding:
This is particularly useful for diagnosing prompt injection or tool misuse scenarios.
2. Resource & Behavior Monitoring
Especially on edge deployments:
These often surface issues earlier than logs alone.
3. Incident Investigation
Distributed tracing makes it easier to:
Lightweight Approach (Aligned With PicoClaw Philosophy)
Given PicoClaw's focus on minimal footprint, I wouldn't advocate heavy observability stacks. Something like optional OpenTelemetry instrumentation could provide:
This keeps the runtime lightweight while improving operational confidence.
Not a Replacement for Security Fixes
To be clear: observability complements security hardening; it doesn't replace it. But in agent ecosystems specifically, visibility often:
That's why many modern AI infra stacks are converging toward built-in telemetry hooks.
Curious About Maintainer Direction
Would love thoughts from maintainers or contributors:
Happy to contribute experimentation if it helps, keeping things lightweight and aligned with PicoClaw's design goals.
Closing Thought
Agent runtimes are starting to resemble early cloud-native systems: security, observability, and operational tooling tend to mature together. Getting that balance right early can make a big difference in adoption and reliability. Appreciate the ongoing discussion and excited to see how PicoClaw evolves.
-
|
A PR to address observability is in: "feat: add opt-in OpenTelemetry observability + Grafana/Prometheus/Loki demo stack" #382
-
|
I agree with that! Thanks, I'll invest some time. I really appreciate it!
-
I saw this going viral on Twitter for no particular reason other than the narrative. Immediate alarm bells. Lo and behold, what latest monster has emerged from the lairs of Shenzhen? PicoClaw is its name.
I'll attach the full research report here:
PicoClaw Security Audit
Greyforge Labs Independent Review
Date: February 16, 2026
Status: Draft for internal review and refinement
Prepared by: Greyforge Labs
Executive Summary
Greyforge Labs initiated a full security audit of the PicoClaw codebase after observing what appeared to be unusual algorithmic amplification around the product and related hardware bundles. We treated that signal as a risk indicator, not as proof of wrongdoing.
Using local system audit tools, static code review, and controlled proof-of-concept testing, we identified multiple high-impact vulnerabilities and backdoor-like control paths that could be abused in real deployments.
The short version:
At this time, we did not find conclusive evidence of a covert, intentionally hidden command-and-control implant.
However, we did find behavior that can function as a practical backdoor surface when deployed without strict controls.
Why This Audit Was Started
Greyforge Labs was alerted to PicoClaw through what appeared to be manipulated algorithmic discovery patterns. Given known historical risks in fast-moving hardware/software ecosystems and supply-chain trust gaps, we ran a full defensive audit before any production use decision.
This report is intentionally evidence-first:
Scope and Method
Code Scope
sipeed/picoclaw (main branch source tarball).
cmd/ and pkg/.
Audit Method
Local Tooling Used
rg, sed, find, and manual source tracing.
Important Constraints
(gosec, semgrep, govulncheck, etc.) were not available in-session.
Findings at a Glance
0644)
0.0.0.0)
Detailed Narrative (Human-Readable)
1) Workspace lock is not actually locked
PicoClaw intends to restrict file operations to a workspace. In practice, the validation logic uses a simple string prefix check. That means a path that only looks similar to the workspace prefix can still pass.
Example concept:
Workspace root: /opt/workspace
Path that still passes the prefix check: /opt/workspace-evil/secret.txt
This is a core containment failure. If the model can call file tools, this can lead to unauthorized file reads/writes.
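The prefix problem is easy to reproduce in isolation. The sketch below is in Python for brevity and is not PicoClaw's code; `naive_is_inside` and `is_inside` are hypothetical names. It contrasts a raw string-prefix test with a comparison on resolved path components, which is one common way to close this class of bypass.

```python
import os


def naive_is_inside(workspace: str, candidate: str) -> bool:
    """Flawed check (hypothetical): a raw string-prefix test."""
    return candidate.startswith(workspace)


def is_inside(workspace: str, candidate: str) -> bool:
    """Safer check (hypothetical): resolve both paths, compare components."""
    # realpath collapses ".." segments and follows symlinks where they exist.
    root = os.path.realpath(workspace)
    target = os.path.realpath(candidate)
    # commonpath works on whole path components, so "/opt/workspace-evil"
    # is not treated as living under "/opt/workspace".
    return os.path.commonpath([root, target]) == root
```

With the paths from the example, `naive_is_inside("/opt/workspace", "/opt/workspace-evil/secret.txt")` accepts the path, while `is_inside` rejects it and also rejects `..` traversal out of the workspace.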
2) Message-to-shell pipeline can become remote code execution
PicoClaw routes channel messages into the agent. The agent has an exec tool enabled. The shell tool executes commands through sh -c (or PowerShell on Windows).
If an external channel is enabled and the allowlist is weak (or empty), attacker-controlled text can reach a tool-capable model loop.
This is not a theoretical edge case. It is a known unsafe architecture pattern unless hardened by strict policy controls.
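As an illustration of what "strict policy controls" could look like, here is a minimal deny-by-default gate, sketched in Python rather than taken from PicoClaw. The function name `authorize_exec` and both allowlist parameters are assumptions made for this example. An empty allowlist denies everyone, and the result is an argv list meant to be run without a shell.

```python
import shlex


def authorize_exec(sender, command_line, allowed_senders, allowed_commands):
    """Return an argv list if the request passes policy, else raise.

    Hypothetical policy gate: unknown senders and unlisted programs are
    rejected, so an empty allowlist means "deny all", not "allow all".
    """
    if sender not in allowed_senders:
        raise PermissionError(f"sender {sender!r} is not allowlisted")
    try:
        argv = shlex.split(command_line)
    except ValueError as exc:  # e.g. an unbalanced quote
        raise PermissionError(f"unparseable command: {exc}") from exc
    if not argv or argv[0] not in allowed_commands:
        raise PermissionError("command is not allowlisted")
    # The caller should execute argv directly (no "sh -c"), so shell
    # metacharacters such as ";" or "$()" are never interpreted.
    return argv
```

Passing the returned list to something like `subprocess.run(argv)` instead of a shell string means an input such as `ls; rm -rf /` is rejected outright, because its first token is `ls;` rather than an allowlisted program.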
3) Some channel modes behave like open ingress
The MaixCam channel opens a TCP listener. In default configurations, parts of the system bind to 0.0.0.0, and some paths have no meaningful authentication boundary.
That creates a "whoever can reach this port can inject events" condition.
If event handlers feed directly into agent logic, this is operationally equivalent to exposing an untrusted command inlet.
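To make the difference concrete, the following Python sketch shows a listener that binds to loopback unless told otherwise, plus a constant-time shared-token check. It is a generic illustration, not the MaixCam channel's implementation; `open_listener` and `token_ok` are hypothetical names, and a shared token is only one possible authentication boundary.

```python
import hmac
import socket


def open_listener(port: int, host: str = "127.0.0.1") -> socket.socket:
    """Open a TCP listener that defaults to loopback instead of 0.0.0.0."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    sock.listen()
    return sock


def token_ok(presented: str, expected: str) -> bool:
    """Compare a presented token to the configured one in constant time."""
    if not expected:
        # No configured secret means nobody is authenticated.
        return False
    return hmac.compare_digest(presented.encode(), expected.encode())
```

Exposing the listener on other interfaces then becomes an explicit choice by the operator, and events that arrive without a valid token can be dropped before they reach agent logic.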
4) Web fetch tool can query internal network targets
web_fetch validates only that a URL uses HTTP/HTTPS. It does not deny localhost/private metadata targets.
In agentic use, this can be abused for SSRF-style probing and data retrieval from internal services.
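A destination check of the kind this finding calls for can be sketched with the Python standard library. This is not web_fetch's code; `is_blocked_ip` and `is_fetch_allowed` are hypothetical names, and the set of blocked ranges is an assumption that a real deployment would tune.

```python
import ipaddress
import socket
from urllib.parse import urlsplit


def is_blocked_ip(ip_text: str) -> bool:
    """True for loopback, private, link-local and other non-public addresses."""
    ip = ipaddress.ip_address(ip_text)
    return (ip.is_loopback or ip.is_private or ip.is_link_local
            or ip.is_multicast or ip.is_reserved or ip.is_unspecified)


def is_fetch_allowed(url: str) -> bool:
    """Allow only http(s) URLs whose host resolves to public addresses."""
    try:
        parts = urlsplit(url)
    except ValueError:
        return False
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False
    try:
        infos = socket.getaddrinfo(parts.hostname, None)
    except socket.gaierror:
        return False
    # Every resolved address must be public; one internal record is enough
    # to reject the request.
    return all(not is_blocked_ip(info[4][0]) for info in infos)
```

The cloud metadata address 169.254.169.254 is link-local, so it is rejected here. Note that a check like this only helps if the subsequent request connects to the address that was validated; otherwise a second DNS lookup can return a different target.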
5) Security defaults are too permissive for production
Several defaults favor convenience over hardening:
These are not always immediate vulnerabilities on their own, but they materially increase breach probability when combined.
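File mode is one default the report touches on elsewhere (0644 in the findings, 0600 in the remediation list). As a small illustration, assuming a POSIX system, sensitive files can be created owner-only from the start. The helper name `write_private` is hypothetical and the sketch is in Python, not PicoClaw's own code.

```python
import os


def write_private(path: str, data: bytes) -> None:
    """Write data to path so that only the owner can read or write it."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC
    # The mode argument applies when the file is created (minus the umask).
    fd = os.open(path, flags, 0o600)
    with os.fdopen(fd, "wb") as handle:
        # Also tighten a file that already existed with looser permissions.
        os.fchmod(handle.fileno(), 0o600)
        handle.write(data)
```

Creating the file with the restrictive mode, rather than writing it as 0644 and fixing it afterwards, avoids a window in which other local users can read it.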
Did We Confirm Intentional Backdoors?
Honest assessment
This distinction matters:
From a defender's standpoint, both are dangerous in production.
From an attribution standpoint, only the first can be alleged as intent, and current evidence does not support that claim.
Real-World Exploit Chain (Plausible)
One realistic chain:
exec and/or file tools.
This chain is serious enough to block production deployment until hardening is complete.
Remediation Priority (Action Plan)
Immediate (0-24h)
Disable the exec tool by default in all externally reachable channel modes.
High Priority (24-72h)
web_fetch:
0600 where appropriate).
Structural (1-2 weeks)
Business and Deployment Guidance
For any SaaS or subscription model, this software should be considered unsafe-by-default until a hardened profile is enforced.
Minimum deployment posture:
Final Verdict (Current Draft)
PicoClaw in its reviewed state is not ready for high-trust production deployment without hardening.
The major risks are not subtle:
Even absent proof of intentional sabotage, these issues are sufficient to classify the platform as high-risk until remediated.
Researcher Appendix (Technical Evidence)
Controlled Local PoC Notes
We reproduced path bypass behavior in local simulation:
These checks validate exploitability of the current authorization pattern under realistic file system behavior.
Limitations and Responsible Disclosure Note