The role in plain words
The role focuses on in-depth research into large language models (LLMs) to improve the security of AI agents. You will design careful experiments on activations and interpretable features to uncover the mechanisms behind attacks such as jailbreaks and indirect prompt injection. You will then translate these insights into signals that can be used to detect and analyze model responses.
- Deep learning expertise with a track record of non-trivial research (industry or academia) in LLMs or other domains (e.g., CV, speech)
- Strong experimental design and scientific writing
- Comfort pre-registering hypotheses, testing causal claims, proposing novel directions in a fast-changing field
- PhD or equivalent research experience in the industry (5+ years in a leading research team)
- Familiarity with AI frameworks (e.g., HuggingFace Transformers, LangChain, scikit-learn, PyTorch)
Extracted from the job description · updated automatically
Who it's for
The role suits researchers with deep learning expertise and a proven track record of significant research in LLMs or other domains. It requires strong experimental design and scientific writing skills, along with a PhD or equivalent industry research experience (5+ years in a leading research team).
Full job description
Original posting · kept for reference
About Us: Zenity is the first and only holistic platform built to secure and govern AI Agents from buildtime to runtime. We help organizations defend against security threats, meet compliance, and drive business productivity. Trusted by many of the world's F500 companies, Zenity provides centralized visibility, vulnerability assessments, and governance by continuously scanning business-led development environments. We recently raised $38 million in Series B funding, solidifying our position as a leader in the industry and enabling us to accelerate our mission of securing AI Agents everywhere.
About the Job: This is a research-first role focused on deeply understanding LLM internals to improve the security of AI agents. You'll design careful experiments on activations and interpretable features (e.g., probing, attribution and ablation/patching, representation-geometry analyses) to uncover the mechanisms behind jailbreaks, indirect prompt injection, and other attacks. You'll then translate those insights into signals that can be used to detect and analyze a model response. LLM interpretability at scale is a rapidly growing field, with several major publications in recent months and significant opportunities for innovation.
- Investigate model internals, including activation/feature analysis, unsupervised clustering, and discovery of directions in latent space. This may also require training specific model parts to improve interpretability metrics.
- Design security-grounded evaluations: curate datasets for different attack types, and evaluate white-box (model internals) methods against black-box (input/output only) baselines.
- Publish and share: produce Zenity Labs posts and open artifacts; when the work is strong, aim for tier-1 ML venues (NeurIPS, ICML, etc.) and security forums. Release code and/or trained models when the novelty is relevant to the community.
- Build tools: Several open-source libraries exist (such as Anthropic's attribution graphs infrastructure), but research in the field moves quickly, so you will need to build and adapt tools to your own research directions. This also includes building agents that automate research work and distill knowledge from the experiments you design.
Requirements:
- Deep learning expertise with a track record of non-trivial research (industry or academia) in LLMs or other domains (e.g., CV, speech). We care that you've changed models or methods in meaningful ways (architecture/training/eval), not just used them.
- Strong experimental design and scientific writing; comfort pre-registering hypotheses, testing causal claims, and proposing novel directions in a fast-changing field.
- PhD or equivalent research experience in the industry (5+ years in a leading research team). A publication record or a portfolio of high-impact open artifacts will make you stand out.
- Familiarity with AI frameworks (e.g., HuggingFace Transformers, LangChain, scikit-learn, PyTorch). Experience with a production-grade codebase with several contributors is a bonus.
- Experience in data analysis: visualization, exploration, cleanup.
- Knowledge of GenAI tools such as LLM orchestration and integration packages, agents, and RAG systems is a bonus.
Questions about the job
- The posting did not state a salary. We show a salary only when the employer publishes it.