AI Harness Engineer
Published May 24 · 0 applicants
The role in plain words
An AI Harness Engineer will work on building agent architectures, creating automated research systems, and developing complex backend systems. The role covers improving both the AI systems that customers use and the meta-system in which AI automatically runs experiments on AI. The work includes special projects in close collaboration with the CTO and solving complex problems for customers.
- 7+ years of backend engineering
- Ability to understand and communicate tradeoffs
- Extensive experience using coding agents like Claude Code to do tasks formerly too large to tackle in a reasonable timeframe
- Hands-on experience building with LLM APIs and tool-design
- Python and Kubernetes experience
- Familiarity with Prometheus / DataDog / Elastic / other observability stacks
- Open source contributions to agent frameworks, MCP servers, or eval tooling
- You've reverse-engineered Claude Code (or read the leaked source) to figure out how a specific subsystem works
Extracted from the job description · updates automatically
Who it's for
The role suits engineers with 7+ years of backend development experience and the ability to understand and communicate technical tradeoffs. Extensive experience using coding agents such as Claude Code is required. The role is a poorer fit for anyone uncomfortable with rapid change and with re-validating earlier assumptions.
Full job description
Original posting · saved for reference
AI Harness Engineer (Backend Developer)
What uses fewer tokens: giving an agent a dedicated MCP server, or putting it in a sandbox with bash and a whitelisted set of network endpoints? What are the tradeoffs beyond token cost?
We're looking for engineers to answer questions like this. You should have hands-on experience in challenging technical domains, such as designing distributed systems, database internals, low-level performance work, Linux kernel drivers, reverse engineering binaries, or anything else which is deeply technical with little room for error.
You will apply your skills to building agentic architectures, creating auto-research harnesses, shipping complex backend systems, and more.
Why this role is different
We operate at a velocity that makes some people uncomfortable.
Unlike traditional backend work, best practices can change overnight when Anthropic releases a new model. You need to be comfortable with rapid change and regularly invalidating previous assumptions.
To do this sustainably, we've invested heavily in automated agent harnesses for benchmarking and improving agents. A big part of this role is not only improving the agent that our customers use, but also improving the meta-version of it: the harness where AI can run experiments on AI, automatically.
We’re a fast-growing startup, so you will do things beyond a traditional backend role. You’ll work closely with our CTO on special projects, and you’ll likely meet some of our customers to troubleshoot interesting edge cases in agent behaviour.
Example features you could work on
Agent to Agent communication
Agent orchestrator/supervisor
Agent memory
Optimizing the cost of high-frequency agents (agents that wake up every X minutes)
Webhook-triggered agents
Evals Infrastructure
Auto-research loops
Grouping alerts into unique incidents - without invoking an LLM on each alert
What we expect you already know - MUST
7+ years of backend engineering
Ability to understand and communicate tradeoffs
Extensive experience using coding agents like Claude Code to do tasks formerly too large to tackle in a reasonable timeframe
Nice to have
Hands-on experience building with LLM APIs and tool-design
Python and Kubernetes experience
Familiarity with Prometheus / DataDog / Elastic / other observability stacks
Open source contributions to agent frameworks, MCP servers, or eval tooling
You've reverse-engineered Claude Code (or read the leaked source) to figure out how a specific subsystem works
If you're a strong engineer and this role excites you, please reach out even if you don't check every box.
About us
We build SRE Agents to help companies reduce cloud downtime, by catching new errors early, finding their root causes, and automatically fixing them. We have a wide range of customers today, including several household names whose products you use.
Questions about the role
- The posting did not state a salary. We show a salary only when the employer publishes it.