XBOW is a fully autonomous AI-driven penetration tester that reached the top spot on HackerOne’s US leaderboard by operating without human input in real black-box bug bounty programs.
Initial benchmarking used public CTF challenges and a custom simulator to measure progress, followed by zero-day discovery in real open-source projects under white-box conditions.
To tackle diverse production environments, XBOW was dogfooded on HackerOne like an external researcher, leading to rapid rank advancement.
Scaling discovery involved parsing program scopes with LLMs, prioritizing high-value targets through a scoring system, and deduplicating domains via SimHash and imagehash techniques.
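The SimHash deduplication step can be illustrated with a minimal sketch. This is not XBOW's implementation: the 64-bit fingerprint size, the word-level tokenization, the MD5 token hash, the `max_distance` threshold, and the function names `simhash`, `hamming`, and `dedupe` are all assumptions chosen for illustration. The idea is that near-identical pages produce fingerprints that differ in only a few bits, so only one domain per cluster needs to be tested.

```python
import hashlib
import re

BITS = 64  # fingerprint width (assumed; not stated in the source)


def _token_hash(token: str) -> int:
    """Map a token to a stable 64-bit integer using the first 8 bytes of MD5."""
    digest = hashlib.md5(token.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def simhash(text: str) -> int:
    """Compute a 64-bit SimHash fingerprint over lowercase word tokens."""
    weights = [0] * BITS
    for token in re.findall(r"\w+", text.lower()):
        h = _token_hash(token)
        for bit in range(BITS):
            # Each token votes +1 or -1 on every bit position.
            weights[bit] += 1 if (h >> bit) & 1 else -1
    fingerprint = 0
    for bit, weight in enumerate(weights):
        if weight > 0:
            fingerprint |= 1 << bit
    return fingerprint


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


def dedupe(pages: dict[str, str], max_distance: int = 3) -> list[str]:
    """Keep the first domain of each near-duplicate cluster (hypothetical helper)."""
    kept: list[tuple[str, int]] = []
    for domain, body in pages.items():
        fp = simhash(body)
        # Keep the domain only if it is far from every fingerprint kept so far.
        if all(hamming(fp, seen) > max_distance for _, seen in kept):
            kept.append((domain, fp))
    return [domain for domain, _ in kept]
```

For example, `dedupe({"a.example": page, "b.example": page})` returns `["a.example"]` when both domains serve the same body. A screenshot-based check with imagehash would follow the same pattern, comparing perceptual image hashes instead of text fingerprints.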
Automated validators, combining LLMs and custom programmatic checks (e.g., headless browser tests), ensured precision and minimized false positives.
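One way to picture the programmatic side of such validation is a gate that reports a finding only when every check registered for its vulnerability class passes. The sketch below is an assumption-laden illustration, not XBOW's code: the `Finding` type, the evidence keys (`payload`, `response_body`, `dialog_fired`), the two check functions, and the `VALIDATORS` registry are hypothetical, and the headless browser run is represented only by a boolean it would have recorded.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Finding:
    """A candidate vulnerability plus the evidence collected for it (hypothetical shape)."""
    vuln_class: str
    url: str
    evidence: Dict[str, object] = field(default_factory=dict)


Validator = Callable[[Finding], bool]


def payload_reflected(finding: Finding) -> bool:
    """Check that the injected payload appears verbatim in the response body."""
    payload = finding.evidence.get("payload")
    body = finding.evidence.get("response_body")
    return (
        isinstance(payload, str)
        and isinstance(body, str)
        and payload != ""
        and payload in body
    )


def browser_dialog_fired(finding: Finding) -> bool:
    """Check the flag a headless browser run is assumed to have recorded."""
    return finding.evidence.get("dialog_fired") is True


# Registry of checks per vulnerability class (illustrative, XSS only).
VALIDATORS: Dict[str, List[Validator]] = {
    "xss": [payload_reflected, browser_dialog_fired],
}


def is_confirmed(finding: Finding) -> bool:
    """Fail closed: unknown classes and any failing check reject the finding."""
    checks = VALIDATORS.get(finding.vuln_class)
    if not checks:
        return False
    return all(check(finding) for check in checks)
```

Requiring all checks to pass, and rejecting classes with no registered validator, trades some recall for precision, which matches the stated goal of minimizing false positives.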
XBOW submitted 1,060 fully automated vulnerability reports, hundreds of which were confirmed by program owners; the findings spanned critical, high, medium, and low severity.
Notable discovery: a previously unknown vulnerability in Palo Alto’s GlobalProtect VPN affecting over 2,000 hosts.
Transparency measures: an upcoming blog series will detail key technical findings in depth.