
Perplexity Open-Sources Bumblebee: A Read-Only Supply-Chain Scanner for Developer Endpoints

Attackers increasingly target the packages, editor extensions, and AI tool configs on developer machines, not just production systems. Perplexity has open-sourced an internal tool it uses to address this problem.

Perplexity released Bumblebee on GitHub. The tool is a read-only inventory collector for macOS and Linux developer endpoints. It is written entirely in Go and carries zero non-stdlib dependencies. Perplexity already uses it internally to protect the developer systems behind its search product, Comet browser, and Computer agent.

Problem that Bumblebee Solves

If you’re a software engineer or data scientist, you likely have dozens of packages installed locally. You have editor extensions, browser add-ons, and possibly MCP (Model Context Protocol) configs on your machine. When a new vulnerability surfaces, your security team faces one urgent question: which developer machines are exposed right now?

Existing tools don’t fully answer this. SBOMs (Software Bills of Materials) and vulnerability scanners cover build artifacts and repositories. EDR (Endpoint Detection and Response) products monitor what processes ran or touched the network. Neither checks local developer state: lockfiles, package metadata, extension manifests, and AI tool configs scattered across a laptop’s filesystem.

Bumblebee fills that gap. When an advisory names a package, extension, or version, it answers which machines show a match in their on-disk metadata right now. The ecosystem scope was also deliberate: the covered ecosystems map to recent active supply-chain campaigns, including the Mini Shai-Hulud series, which hit npm, PyPI, RubyGems, Go modules, and Composer packages across companies including TanStack, SAP, and Zapier.

How Bumblebee Works

Bumblebee is a one-shot scanner. Each invocation performs a single scan and exits. Cadence is the operator’s responsibility, whether through cron, launchd, systemd, or MDM fleet tooling. It outputs structured records as NDJSON (newline-delimited JSON), one record per line, with diagnostics going to stderr.

The tool supports three scan profiles. The baseline profile scans common global and user package roots, language toolchains, editor extensions, browser extensions, and MCP configs. The project profile targets configured development directories such as ~/code or ~/src. The deep profile sweeps operator-supplied roots, typically a bare home directory during an active incident.

Internally, Perplexity uses Bumblebee within a five-step workflow. A threat signal arrives from public disclosures or third-party intel feeds. Perplexity Computer then drafts a catalog update, entering the signal as a structured entry with ecosystem, package name, and version, and opens a GitHub PR with source links. A human developer reviews and merges the PR. Bumblebee then runs on endpoints with the updated catalog, and findings are shared with the security team.

Image source: https://www.perplexity.ai/hub/blog/perplexity-is-open-sourcing-bumblebee

What Bumblebee Scans

Bumblebee covers four surface areas that existing tools typically handle separately.

For language package managers, it reads from npm, pnpm, Yarn, Bun, PyPI, Go modules, RubyGems, and Composer. It reads lockfiles and installed package metadata directly, from sources like package-lock.json, pnpm-lock.yaml, go.sum, and *.dist-info/METADATA. Note that bun.lockb, Bun’s binary lockfile format, is not parsed in v0.1; only the text bun.lock format is supported.
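To illustrate what reading a lockfile directly can look like, here is a minimal Go sketch that extracts package names and versions from the `packages` map of a package-lock.json file (lockfileVersion 2 or 3). It is not Bumblebee’s parser; the `Pkg` type and `parsePackageLock` function are hypothetical names, and workspace entries outside node_modules are simply skipped.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
	"strings"
)

// lockfile models only the part of package-lock.json this sketch needs.
type lockfile struct {
	Packages map[string]struct {
		Version string `json:"version"`
	} `json:"packages"`
}

// Pkg is a hypothetical inventory entry.
type Pkg struct {
	Name    string
	Version string
}

// parsePackageLock lists installed packages without invoking npm.
func parsePackageLock(data []byte) ([]Pkg, error) {
	var lf lockfile
	if err := json.Unmarshal(data, &lf); err != nil {
		return nil, err
	}
	const marker = "node_modules/"
	var out []Pkg
	for path, p := range lf.Packages {
		// The "" key is the root project; entries without a version are skipped.
		if path == "" || p.Version == "" {
			continue
		}
		// The package name follows the last "node_modules/" segment,
		// which also handles nested and scoped packages.
		i := strings.LastIndex(path, marker)
		if i < 0 {
			continue
		}
		out = append(out, Pkg{Name: path[i+len(marker):], Version: p.Version})
	}
	sort.Slice(out, func(a, b int) bool {
		if out[a].Name != out[b].Name {
			return out[a].Name < out[b].Name
		}
		return out[a].Version < out[b].Version
	})
	return out, nil
}

func main() {
	// Hypothetical lockfile content.
	sample := []byte(`{
  "lockfileVersion": 3,
  "packages": {
    "": {"name": "demo", "version": "0.0.1"},
    "node_modules/left-pad": {"version": "1.3.0"},
    "node_modules/@scope/util": {"version": "2.0.0"}
  }
}`)
	pkgs, err := parsePackageLock(sample)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, p := range pkgs {
		fmt.Printf("%s@%s\n", p.Name, p.Version)
	}
}
```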

For AI agent configs, Bumblebee reads MCP JSON host configuration files: mcp.json, .mcp.json, claude_desktop_config.json, mcp_config.json, mcp_settings.json, cline_mcp_settings.json, and ~/.gemini/settings.json for Gemini CLI. Non-JSON MCP configs such as Codex config.toml and Continue YAML are not parsed in v0.1. It parses these files for server inventory but does not emit environment values or environment key names found in env blocks.

For editor extensions, it reads manifests from VS Code, Cursor, Windsurf, and VSCodium. For browser extensions, it covers Chromium-family browsers — Chrome, Comet, Edge, Brave, and Arc — plus Firefox.

Why Read-Only

npm packages can carry postinstall scripts that execute automatically on npm install. A scanner that invokes npm to check exposure has already triggered the attack it was looking for. Bumblebee avoids this entirely by never running install scripts or lifecycle hooks, never invoking npm, pnpm, bun, or pip, never reading application source files, and performing no process or network monitoring. It is not an EDR.

Output and Exposure Catalog

Each package record includes the hostname, OS, architecture, ecosystem, package name, version, source file, and a confidence field. Confidence is high when exact identity and version came from canonical metadata, medium when identity is reliable but version or source is partial, and low when only a config path or spec reference is found.
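The three confidence levels can be read as a simple decision rule. The Go sketch below is one possible interpretation of the description above; the `Evidence` fields and the exact conditions are assumptions, and Bumblebee’s real logic may weigh its inputs differently.

```go
package main

import "fmt"

// Evidence is a hypothetical summary of what the scanner found for one item.
type Evidence struct {
	Canonical  bool // data came from canonical metadata (e.g. a lockfile)
	HasName    bool // package identity is reliable
	HasVersion bool // an exact version was found
}

// confidence maps evidence to "high", "medium", or "low".
func confidence(e Evidence) string {
	switch {
	case e.Canonical && e.HasName && e.HasVersion:
		return "high" // exact identity and version from canonical metadata
	case e.HasName:
		return "medium" // identity is reliable, version or source is partial
	default:
		return "low" // only a config path or spec reference
	}
}

func main() {
	cases := []Evidence{
		{Canonical: true, HasName: true, HasVersion: true},
		{Canonical: false, HasName: true, HasVersion: false},
		{},
	}
	for _, c := range cases {
		fmt.Printf("%+v -> %s\n", c, confidence(c))
	}
}
```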

Security teams supply their own exposure catalogs: simple JSON files specifying ecosystem, package name, and affected versions. When Bumblebee finds a match, it emits a finding record including severity, catalog ID, and evidence. Each finding is fully traceable back to the catalog entry that triggered it. The repo also includes a threat_intel/ directory with maintained exposure catalogs built from public supply-chain campaign reporting.
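Catalog matching can be sketched as an exact lookup on ecosystem, package name, and version. The Go example below is illustrative only: the catalog JSON keys (`id`, `ecosystem`, `name`, `versions`, `severity`), the type names, and the sample entry are all assumptions rather than Bumblebee’s actual catalog format.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// CatalogEntry is a hypothetical exposure catalog entry.
type CatalogEntry struct {
	ID        string   `json:"id"`
	Ecosystem string   `json:"ecosystem"`
	Name      string   `json:"name"`
	Versions  []string `json:"versions"`
	Severity  string   `json:"severity"`
}

// Record is a hypothetical inventory record from a scan.
type Record struct {
	Ecosystem, Name, Version, SourceFile string
}

// Finding ties a matched record back to the catalog entry that triggered it.
type Finding struct {
	CatalogID, Severity, Evidence string
}

// match returns one finding per record whose exact version is listed.
func match(catalog []CatalogEntry, recs []Record) []Finding {
	type key struct{ eco, name, ver string }
	index := make(map[key]CatalogEntry)
	for _, e := range catalog {
		for _, v := range e.Versions {
			index[key{e.Ecosystem, e.Name, v}] = e
		}
	}
	var out []Finding
	for _, r := range recs {
		if e, ok := index[key{r.Ecosystem, r.Name, r.Version}]; ok {
			out = append(out, Finding{
				CatalogID: e.ID,
				Severity:  e.Severity,
				Evidence:  fmt.Sprintf("%s@%s in %s", r.Name, r.Version, r.SourceFile),
			})
		}
	}
	return out
}

func main() {
	// Hypothetical catalog and scan records.
	raw := []byte(`[{"id":"EXAMPLE-001","ecosystem":"npm","name":"bad-pkg","versions":["1.2.3"],"severity":"critical"}]`)
	var catalog []CatalogEntry
	if err := json.Unmarshal(raw, &catalog); err != nil {
		fmt.Println("error:", err)
		return
	}
	recs := []Record{
		{"npm", "bad-pkg", "1.2.3", "/home/dev/app/package-lock.json"},
		{"npm", "bad-pkg", "1.2.4", "/home/dev/other/package-lock.json"},
	}
	for _, f := range match(catalog, recs) {
		fmt.Printf("%s [%s] %s\n", f.CatalogID, f.Severity, f.Evidence)
	}
}
```

Exact-version matching keeps the sketch short; a catalog that expresses affected versions as ranges would need a version comparison step instead of a map lookup.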

Getting Started

Bumblebee requires Go 1.25 or later. Install with:

go install github.com/perplexityai/bumblebee/cmd/bumblebee@latest

After installation, bumblebee selftest verifies that the binary works correctly against embedded fixtures. The tool is licensed under the Apache License 2.0. The current release is v0.1.1.

Key Takeaways

  • Bumblebee is Perplexity’s open-sourced, read-only developer endpoint scanner for supply-chain exposure checks.
  • It covers npm, pnpm, Yarn, Bun, PyPI, Go modules, RubyGems, Composer, MCP configs, editor extensions, and browser extensions.
  • Three scan profiles (baseline, project, and deep) support routine inventory and active incident response.
  • The tool never executes install scripts or invokes package managers, preventing scan-triggered attacks.
  • Built in Go with zero non-stdlib dependencies; available now on GitHub under Apache 2.0.



The post Perplexity Open-Sources Bumblebee: A Read-Only Supply-Chain Scanner for Developer Endpoints appeared first on MarkTechPost.
