Lead/Staff AI Runtime Engineer (Ukraine)
Capgemini Engineering
At Capgemini Engineering, the world leader in engineering services, we bring together a global team of engineers, scientists, and architects to help the world’s most innovative companies unleash their potential. From autonomous cars to life-saving robots, our digital and software technology experts think outside the box as they provide unique R&D and engineering services across all industries. Join us for a career full of opportunities. Where you can make a difference. Where no two days are the same.
Your Client
Our client is at the forefront of revolutionizing AI computing by re-engineering infrastructure at the system level. Its architecture, combined with sophisticated software intelligence, abstraction, and an orchestration layer, enables developers to leverage a diverse array of compute resources, achieving efficient and reliable computing at a fraction of the cost. Founded by industry veterans from Nvidia, Apple, Tesla, Intel, and Zoox, it's shaping the future of AI.
As the Lead/Staff AI Runtime Engineer, you’ll play a pivotal role in the design, development, and optimization of the core runtime infrastructure powering distributed training and deployment of large AI models. This is a hands-on leadership role - ideal for a systems-minded software engineer who thrives at the intersection of AI workloads, runtimes, and performance-critical infrastructure.
Your Role
- Own the core runtime architecture supporting AI training and inference at scale.
- Design resilient and elastic runtime features (for example, dynamic node scaling and job recovery) within the custom PyTorch-based stack.
- Optimize distributed training reliability, orchestration, and job-level fault tolerance.
- Profile and enhance low-level system performance across training and inference pipelines.
- Improve packaging, deployment, and integration of customer models in production environments.
- Design and maintain libraries and services that support the full model lifecycle: training, checkpointing, fault recovery, packaging, and deployment.
- Implement observability hooks, diagnostics, and resilience mechanisms for deep-learning workloads.
- Champion best practices in CI/CD, testing, and software quality across the AI Runtime stack.
- Work cross-functionally with Research, Infrastructure, and Product teams to align runtime development with customer and platform needs.
- Guide technical discussions, mentor junior engineers, and help scale the AI Runtime team’s capabilities.
Your Profile
- 8+ years of experience in systems or software engineering, with deep exposure to AI runtime, distributed systems, or compiler/runtime interaction.
- Experience in delivering PaaS services.
- Proven experience optimizing and scaling deep-learning runtimes (such as PyTorch, TensorFlow, or JAX) for large-scale training or inference.
- Strong programming skills in Python and C++; experience with Go or Rust is a plus.
- Familiarity with distributed training frameworks, low-level performance tuning, and resource orchestration.
- Experience working with multi-GPU, multi-node, or cloud-native AI workloads.
- Solid understanding of containerized workloads, job scheduling, and failure recovery in production environments.
Nice to Have
- Contributions to PyTorch internals or open-source deep learning infrastructure projects.
- Experience with Intel OpenVINO.
- Familiarity with LLM training pipelines, checkpointing, or elastic training orchestration.
- Experience with Kubernetes, Ray, TorchElastic, or custom AI job orchestrators.
- Background in systems research, compilers, or runtime architecture for high-performance computing (HPC) or machine learning.
- Start-up experience.
- Ability to travel to the EU.
What We Offer
- We care about all our employees and want them to feel as comfortable as possible. That's why we offer them health insurance from the first days, regardless of the probationary period.
- A gift from the company: Christmas holidays from 25 December to 31 December.
- Cooperation with the Superhumans Center and Veteran Hub. Capgemini Engineering supported the launch of the psychological rehabilitation department at Superhumans. Our team also donated over UAH 500,000 toward prosthetics for three Ukrainian defenders. Currently, we support psychological counseling provided by Veteran Hub, and with the Hub's assistance we have implemented an internal policy that makes the company friendly to military personnel and veterans.