Senior Software Research Architect, AI Networking
Posted May 21 · 0 applicants
The role in plain words
In this role, you will be a senior software architect focused on research and development of end-to-end networking solutions for large-scale distributed AI training and inference. You will design and optimize systems that drive generative AI workloads, working at the intersection of software and hardware on advanced GPU clusters. Your work will include analyzing existing deployments, developing prototypes, and recommending architectural improvements.
- M.Sc. or PhD (preferred) in Computer Science, Electrical/Computer Engineering, or related field—or B.Sc. with research experience and publications
- 5+ years of relevant experience
- Deep expertise in networking and communication internals (NCCL, RDMA, congestion control, routing)
- Strong software engineering skills in C++ and/or Python
- Excellent system-level design and problem-solving abilities
- Proven passion for solving sophisticated technical problems and delivering impactful solutions
- Record of publications in top-tier conferences
- Experience in designing and building large-scale AI training clusters
- Post-PhD research experience
- Practical understanding of deep learning systems, GPU acceleration, and AI model execution flows
Who it suits
The role suits candidates with an M.Sc. or PhD in Computer Science, Electrical/Computer Engineering, or a related field, with 5+ years of relevant experience and deep expertise in networking and communication internals. It is ideal for those looking to lead research and development, publish patents, and present at leading conferences.
Full job description
NVIDIA has been redefining computer graphics, PC gaming, and accelerated computing for more than 25 years. It’s a unique legacy of innovation that’s fueled by great technology and amazing people. Today, we’re tapping into the unlimited potential of AI to define the next era of computing: an era in which our GPU acts as the brains of computers, robots, and self-driving cars that can understand the world. Doing what’s never been done before takes vision, innovation, and the world’s best talent. Being an NVIDIAN means being part of a diverse and supportive environment that encourages everyone to perform at their peak. Come join the team and discover how you can have a lasting influence on the world.
NVIDIA is looking for a Senior Software Architect: a creative, forward-thinking, and practical researcher to improve the framework for large-scale LLM training and inference. As part of our dynamic E2E Architecture group, you will design and optimize systems driving generative AI workloads, working at the intersection of software and hardware on some of the most advanced GPU clusters worldwide. You will define how AI models are deployed and scaled in production using the NVIDIA Spectrum-X Networking Platform, influencing decisions from inter-node communication and compute scheduling to system-level optimization. This is an opportunity to collaborate with best-in-class engineers and researchers and shape the future of generative AI. Your work will make a lasting impact by bringing generative AI technologies to real-world applications and improving global computing capabilities.
What You’ll Be Doing
Lead research and development of end-to-end networking solutions for distributed AI training and inference at scale, with a focus on job completion time, failure resiliency, telemetry, scheduling, and placement.
Analyze current deployments, develop prototypes, and recommend architectural improvements.
Stay abreast of the latest research; become the team’s authority in emerging networking techniques and technologies.
Design, simulate, and validate new systems using NSX, a novel, scalable network simulator.
Develop and test prototypes on large-scale GPU clusters (e.g., Israel-1).
Collaborate across hardware, firmware, and software teams to translate ideas into real networking product features.
Publish patents and present research at leading conferences.
What We Need To See
M.Sc. or PhD (preferred) in Computer Science, Electrical/Computer Engineering, or related field—or B.Sc. with research experience and publications.
5+ years of relevant experience.
Deep expertise in networking and communication internals (NCCL, RDMA, congestion control, routing).
Strong software engineering skills in C++ and/or Python.
Excellent system-level design and problem-solving abilities.
Outstanding communication and collaboration skills across technical domains.
Ways To Stand Out From The Crowd
Proven passion for solving sophisticated technical problems and delivering impactful solutions.
Record of publications in top-tier conferences.
Experience in designing and building large-scale AI training clusters.
Post-PhD research experience.
Practical understanding of deep learning systems, GPU acceleration, and AI model execution flows.
JR2007773
Questions about the job
- The posting did not state a salary. We show a salary only when the employer publishes it.