Nebius

Data Center Manager

Added 3 months ago

Description

The Role

The Data Center Manager owns end‑to‑end reliability, safety, capacity, and performance for one of our flagship U.S. sites. You’ll lead a high‑performing, multi‑disciplinary operations team and partner tightly with Design, Build, Network, Security, Capacity Planning, and the DC orgs to deliver world‑class availability and cost efficiency.

Your responsibilities will include:

  • Lead day-to-day data center operations in a 24/7 mission-critical environment
  • Manage and develop a team of 15–20 Data Center Technicians
  • Oversee installation, break/fix activities, and field change orders
  • Ensure timely delivery of tasks aligned with KPIs and operational milestones
  • Monitor infrastructure performance; drive troubleshooting, incident response, and root cause analysis
  • Own incident management, including resolution and post-incident reviews
  • Plan and execute capacity expansion (rack, block, and site growth)
  • Maintain physical security, access controls, and compliance with standards
  • Partner cross-functionally (Engineering, Build, Site Selection, Operations)
  • Manage vendors and contractors to deliver high-quality, cost-effective solutions
  • Drive continuous improvement across processes, efficiency, and reliability
  • Support hiring and team scaling efforts

We expect you to have:

  • 5+ years of experience in data center operations; 2+ years in a leadership role
  • Strong knowledge of servers, storage, networking, and data center infrastructure
  • Experience with power systems (UPS, backup), cooling, and physical infrastructure
  • Proficiency in Linux environments
  • Experience with incident management and operational processes in high-availability environments
  • Strong project management experience (budgeting, vendor management, resource planning)
  • Understanding of security, compliance, and disaster recovery best practices
  • Ability to work cross-functionally and drive execution across teams
  • Strong leadership, communication, and problem-solving skills
  • Ability to lift up to 50 lbs and support on-site operational needs
  • Willingness to participate in a 24/7 on-call rotation
  • Bachelor’s degree in IT, Computer Science, or related field (or equivalent experience)

It would be an added bonus if you have:

  • Familiarity with ITIL / ITSM processes
  • Experience with GPU clusters, HPC, or cloud infrastructure
  • Understanding of data center network traffic patterns (east-west and north-south)
  • Experience with data center management and monitoring tools

Key employee benefits:

  • Health insurance: 100% company-paid medical, dental, and vision coverage for employees and families.
  • 401(k) plan: up to 4% company match with immediate vesting.
  • Parental leave: 20 weeks paid for primary caregivers, 12 weeks for secondary caregivers.
  • Remote work reimbursement: up to $85/month for mobile and internet.
  • Disability & life insurance: company-paid short-term, long-term and life insurance coverage.

Compensation

We offer competitive salaries, ranging from $140k to $180k base, plus quarterly performance bonuses.

Join Nebius Today!

Company

Nebius provides an AI-focused cloud platform enabling scalable GPU clusters (from a single GPU to thousands of NVIDIA GPUs) with pre-configured drivers, InfiniBand networking, and orchestrators like Kubernetes or Slurm. It offers fully managed services (MLflow, PostgreSQL, Apache Spark), cloud-native tooling (Terraform, API, CLI), ready-to-go solutions, and expert support. Nebius also runs data centers, is active in AI research collaborations and the open-source AI ecosystem (for example, vLLM and CRISPR-GPT), and partners with NVIDIA as a Reference Platform Cloud Partner.
