Description
The Role
Nebius is hiring a Senior Software Engineer to design, build, and own the backend systems that power metrics and monitoring for large-scale infrastructure, and to develop a comprehensive infrastructure maintenance platform. This role requires strong production experience, sound system design judgment, and the ability to operate and improve critical services.
Your responsibilities will include:
- Design and build services and agents that provide deep visibility into large-scale server fleets and data center engineering systems
- Evolve metrics, aggregation, and alerting pipelines, with a focus on signal quality and reliability
- Design and operate maintenance and remediation systems that enable safe, predictable fleet-wide changes and keep infrastructure healthy
- Investigate production incidents hands-on, including on-host Linux debugging, and drive root-cause fixes
- Collaborate closely with hardware, networking, and data center operations teams to improve reliability
What we expect you to have:
- 5+ years of professional software engineering experience
- Strong production experience with Python and Go, or the ability to ramp up quickly
- Solid Linux fundamentals and comfort debugging live systems
- Ability to write reliable, maintainable code and dig into complex, ambiguous problems
- Experience building and operating production systems at scale
It will be an added bonus if you have:
- Ubuntu experience, including internal tooling and packaging workflows (e.g., building Debian packages)
- CCNA (Cisco Certified Network Associate) or equivalent networking experience
Key employee benefits:
- Health insurance: 100% company-paid medical, dental, and vision coverage for employees and families.
- 401(k) plan: up to 4% company match with immediate vesting.
- Parental leave: 20 weeks paid for primary caregivers, 12 weeks for secondary caregivers.
- Remote work reimbursement: up to $85/month for mobile and internet.
- Disability & life insurance: company-paid short-term disability, long-term disability, and life insurance coverage.
Compensation
- We offer competitive salaries, ranging from $130k to $170k base, plus quarterly performance bonuses.
Join Nebius Today!
Company
Nebius provides an AI-focused cloud platform for scalable GPU clusters, from a single GPU to thousands of NVIDIA GPUs, with pre-configured drivers, InfiniBand networking, and orchestrators such as Kubernetes or Slurm. It offers fully managed services (MLflow, PostgreSQL, Apache Spark), cloud-native tooling (Terraform, API, CLI), ready-to-go solutions, and expert support. Nebius also runs data centers, takes part in AI research collaborations and the open-source AI ecosystem (for example, vLLM and CRISPR-GPT), and is an NVIDIA Reference Platform Cloud Partner.