CloudRaft

Observability Specialist

About CloudRaft:

CloudRaft is a premier consulting company specializing in AI, cloud-native solutions, observability, and platform engineering. We partner with global startups and enterprises to solve their complex problems. Learn more about us at www.cloudraft.io.

The Opportunity:

We're seeking a passionate engineer with expertise in open source observability to join our dynamic team. This is your chance to work as a founding engineer and be part of building a rocket ship! The ideal candidate has 4-8 years of strong experience with a range of open source and commercial observability products.

Location: Remote

What We're Looking For:

  • Observability Expertise:

    • Expert in implementing and integrating observability solutions in services using products like Vector, Fluentd, OpenTelemetry, Prometheus, Loki, VictoriaMetrics, Thanos, Tempo, OpenSearch, and Grafana
    • Deep understanding of service reliability, KPIs, and metrics
    • Ability to optimize large-scale telemetry pipelines and backends
    • Strong in distributed computing fundamentals, storage technologies, and time series databases
    • Contribution to open source projects is a plus
  • Kubernetes and Cloud Experience:

    • Professional experience running Kubernetes in on-premises and cloud environments
    • Hands-on production experience in designing and managing Kubernetes clusters
    • Good understanding of on-premises and cloud platforms
  • Programming Experience:

    • Programming knowledge in languages such as Go or Rust
    • Able to write libraries and contribute to open source
    • Ready to start immediately and make an impact from day one

Required:

    • 4-8 years of experience in SRE, particularly in implementing Observability at scale
    • Strong troubleshooting skills for resolving system issues in production environments
    • Implementation experience with SRE concepts such as SLIs and SLOs
    • Ability to represent the organization, collaborate with, and coach customer teams
    • Passion for sharing knowledge through technical writing and speaking at community events and conferences

Qualifications:

- Bachelor's degree in Computer Science, IT, or a related field

- Expert understanding of Prometheus, OpenTelemetry, Datadog, Grafana, alerting, and incident management systems

- Programming skills in any modern programming language

- Experience with Infrastructure as Code

- Excellent problem-solving and communication skills

- Product mindset and customer empathy are a big plus

Benefits:

- Competitive salary

- Premium health insurance and various health & wellness benefits

- Opportunity to work on cutting-edge technologies

- Collaborative and supportive work environment

- Chance to make a real impact on the company's success

If you're ready to take on this exciting challenge and help shape the future of observability and open source, we want to hear from you!