Welcome to Cloudberry Engineering
a notebook on building, breaking and securing systems.
by Gianluca Brindisi
  • Scaling Vulnerability Management with AI
  • Since there are so many to choose from, I built my own sandbox for local coding agents. I use it within my homebrew agent orchestrator running Ralph loops.

    The sandbox is this, and builds on the mental models I sketched here.

    What stands out compared to competitors:

    • Focus is user experience: it’s an abstraction on top of a container, but it’s simpler to setup with a high level config file that I sarcastically baptized Agentfile.
    • The networking boundary talks back to the agent, to avoid hallucinatory loops where the agent tries really hard to reach a remote destination that is blocked.
    # / #agents #sandbox
  • On Sandboxing Agents
  • Agent Sandboxes
  • We are moving towards a place where ticketing systems will become an important component to protect, akin to CI/CD.

    Tickets are a new source of untrusted input we need to account for when threat modeling against prompt injections.

    Ghostty only allows maintainers to create issues, seems to me they figured out a cheap and pragmatic security policy by accident.

    # / #agents
  • Claude Code Sandbox
  • How I Think About Agentic Risks
  • Coding Agents Security Theater
  • Finding vulnerabilities in modern web apps using Claude Code and OpenAI Codex. Super interesting to see some benchmarks.

    Traditional rule based detection can’t find complex vulnerabilities and even potentially detectable issues might go unnoticed as false negatives. This helps answer the question whether LLM could be integrated to cover this blind spot.

    They could! But the problem is the noise:

    AI Coding Agents Find Real Vulnerabilities: Claude Code found 46 vulnerabilities (14% true positive rate – TPR, 86% false positive rate – FPR) and Codex reported 21 vulnerabilities (18% TPR, 82% FPR). About 20 of these are high severity vulnerabilities.

    # / #llm
  • The nx Breach