Lead SRE, DevOps Group

BigPanda · Tel Aviv-Yafo, Tel Aviv District, Israel · Hybrid · Full-time · Level: Lead

Posted 19 days ago · 0 applicants

No salary was specified for this job

Willbi Insight

The role in plain words

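In this role, you will lead the reliability, scalability, and performance of the BigPanda platform. You will be responsible for resolving incidents, defining service objectives, and implementing solutions that prevent future problems. You will work with teams across the organization to embed reliability in every layer of the system.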

Required
  • 5+ years of experience as an SRE (or similar role) in a high-scale production environment, with hands-on ownership across the full stack - infrastructure and application layers
  • Experience in designing, building, and operating cloud-native systems on AWS
  • Hands-on experience maintaining Node.js or JVM-based applications running with MongoDB, Elasticsearch, and Kafka
  • Strong coding skills and a software engineering mindset
  • Experience with infrastructure-as-code and modern container orchestration platforms
Nice to have

    Extracted from the job description · updated automatically

    Who it's for

    The role suits experienced SREs with 5+ years of experience in large-scale production environments, in-depth knowledge of AWS, MongoDB, Elasticsearch, and Kafka, and strong coding skills. It is not suitable for anyone seeking a role without responsibility for ongoing operations and maintenance.

    Full job description

    Original posting · saved for reference

    Location requirements:

    This role requires working out of the Tel Aviv office three days per week.

    About The Role

    As an SRE Lead at BigPanda, you will play a critical role in ensuring the reliability, scalability, and performance of the platform that powers our customers’ operations. You’ll operate at the intersection of software engineering and production operations, taking full ownership of the systems you build and run.

    This role is not just about responding to incidents — it’s about fundamentally improving how our platform behaves under real-world conditions. You will drive reliability initiatives end-to-end: defining measurable service goals, shaping engineering priorities through error budgets, and implementing solutions that prevent issues before they occur.

    You’ll work closely with teams across the organization, embedding reliability and observability into every layer of the stack. At the same time, you’ll leverage automation, modern infrastructure practices, and emerging AI capabilities to continuously evolve how we operate and scale.

    What You Will Do

    Develop deep product knowledge across our platform, understanding its internals, failure modes, and operational behavior well enough to own incident resolution end-to-end.

    Define and track SLAs/SLOs/SLIs across critical platform services, and use error budgets to drive engineering decisions.

    Own production reliability - including on-call rotations, incident response, and post-mortems - with a focus on minimizing MTTR and preventing recurrence through systemic fixes, not just firefighting.

    Work hand-in-hand with engineering teams across the stack - infrastructure, application, and business layers - to embed reliability requirements everywhere.

    What Skills And Experience You’ll Bring To BigPanda

    5+ years of experience as an SRE (or similar role) in a high-scale production environment, with hands-on ownership across the full stack - infrastructure and application layers.

    Business-level reliability experience is a strong advantage.

    Experience in designing, building, and operating cloud-native systems on AWS.

    Hands-on experience maintaining Node.js or JVM-based applications running with the following: MongoDB, Elasticsearch, Kafka.

    Strong coding skills and a software engineering mindset - you build your own tools rather than waiting for someone else to.

    Experience with infrastructure-as-code and modern container orchestration platforms.

    Practical experience building or integrating AI-driven solutions (e.g., LLMs, agents, or AI-powered operational tooling).

    A true owner - you take responsibility for systems end-to-end and proactively drive improvements without waiting for direction.

    A problem solver who adapts readily to changing business needs.

    About Us

    BigPanda is a fast-growing, values-driven, global company that enables Tech Ops teams to keep the digital economy running. BigPanda’s AI-driven IT operations (aka AIOps) platform transforms IT data into insight and action. By eliminating IT noise, automating incident management, and keeping our customers’ digital services up and running around the clock, we become a mission-critical part of our customers’ IT operations.

    With BigPanda, some of the world’s largest enterprises including Hulu, Cisco, United, Abbott, Marriott, Expedia and many others are able to reduce costs and increase efficiencies, accelerate business velocity, and deliver extraordinary customer experiences.

    BigPanda is backed by top-tier investors including Sequoia, Mayfield, Battery, Insight Partners, Advent International, and Greenfield Partners.

    We have an awesome team of motivated, knowledgeable, fun-loving, and friendly Pandas. We provide comprehensive health coverage, parental leave, competitive cash and equity compensation, and a supportive, collaborative, and innovative environment to empower you to do the best work of your career.

    Our Benefits

    Competitive equity

    Hybrid work schedule

    Company funded health insurance

    6 weeks fully paid Parental Leave

    Critical Family Medical Leave

    Financial planning services

    Employee learning & development budget

    Values-based recognition (quarterly and annually)

    Social community & ERG programs

    FreeFit gym package

    Work-life harmony

    Dog friendly office


    Questions about the job

    • The posting did not specify a salary. We show a salary only when the employer publishes it.