Senior Software Engineer, Data Center Workloads – Infrastructure
Posted 20 days ago · 0 applicants
The role in plain terms
In this role, you will develop and run software tools and automation to characterize power consumption, performance, and drive behavior on NVIDIA rack-scale systems. You will run AI workloads across the system to analyze and optimize system-level behavior.
- B.Sc. or M.Sc. in Computer Science, Electrical Engineering, or a related field
- 5+ years of software engineering experience
- Strong programming skills in Python and at least one system-level language such as C/C++
- Experience developing automation and test infrastructure for complex hardware/software systems
- Experience with NVIDIA platforms, GPU systems, or rack-scale AI infrastructure
- Background in power, thermal, performance, or storage/drive characterization
- Experience with workload automation, cluster orchestration, or lab infrastructure
- Familiarity with AI benchmarks, training/inference workloads, and system stress methodologies
- Experience in post-silicon validation, production testing, or system bring-up
Extracted from the job description · updated automatically
Who this role suits
The role suits software engineers with 5+ years of experience developing automation and infrastructure for complex hardware/software systems and a good understanding of system architecture. It is a weaker fit for candidates without hands-on experience running, debugging, or optimizing AI or HPC workloads.
Full job description
Original posting · saved for reference
At NVIDIA, we are pioneers in innovation, transforming computer graphics, PC gaming, and accelerated computing for over 25 years. Our team is driven by powerful technology and outstanding people who expand the limits of what’s achievable. Now, we are unlocking the potential of AI to usher in the next era of computing.
As part of our engineering organization, you will play a key hands-on role in developing and executing software-driven characterization workflows on NVIDIA rack-scale systems. This role is focused on running AI workloads across the full stack to analyze, characterize, and optimize power, performance, and drive behavior at system level. This is an opportunity to work at the intersection of software, infrastructure, silicon, and large-scale AI platforms, with direct impact on next-generation NVIDIA systems.
What You’ll Be Doing
- Develop and run software tools, automation, and workloads to characterize power, performance, and drive behavior across NVIDIA rack-scale systems.
- Execute AI and system-level workloads to stress and evaluate behavior across the stack, including GPUs, CPUs, networking, storage, firmware, drivers, and system software.
- Build automated frameworks for data collection, telemetry, validation, correlation, and analysis of characterization results.
- Investigate system behavior under different workloads and operating conditions to identify bottlenecks, anomalies, and optimization opportunities.
- Work closely with hardware, firmware, driver, system software, performance, and validation teams to define characterization methodologies and debug cross-stack issues.
- Support bring-up, validation, and readiness activities for new rack-scale platforms and AI infrastructure.
- Create clear documentation, test flows, and repeatable processes to improve coverage, efficiency, and reproducibility.
What We Need To See
- B.Sc. or M.Sc. in Computer Science, Electrical Engineering, or a related field.
- 5+ years of software engineering experience, preferably in system software, infrastructure, validation, or performance-focused environments.
- Strong programming skills in Python and at least one system-level language such as C/C++.
- Experience developing automation and test infrastructure for complex hardware/software systems.
- Hands-on experience running, debugging, or optimizing AI, HPC, or large-scale system workloads.
- Good understanding of system-level architecture, including interactions across hardware, firmware, drivers, operating systems, and application layers.
- Experience working in Linux environments and with scripting, telemetry, logging, and data analysis tools.
- Strong debugging and problem-solving skills, with the ability to work across multiple engineering disciplines.
- Good communication skills and the ability to drive technical work in a fast-paced, cross-functional environment.
Ways To Stand Out From The Crowd
- Experience with NVIDIA platforms, GPU systems, or rack-scale AI infrastructure.
- Background in power, thermal, performance, or storage/drive characterization.
- Experience with workload automation, cluster orchestration, or lab infrastructure.
- Familiarity with AI benchmarks, training/inference workloads, and system stress methodologies.
- Experience in post-silicon validation, production testing, or system bring-up.
JR2017132
Questions about the job
- The posting did not state a salary. We show a salary only when the employer publishes it.
- B.Sc. or M.Sc. in Computer Science, Electrical Engineering, or a related field; 5+ years of software engineering experience; strong programming skills in Python and at least one system-level language such as C/C++; experience developing automation and test infrastructure for complex hardware/software systems.