Description
We’re hiring our first engineer fully dedicated to the infrastructure foundation of Sieve. This is a high-ownership role for someone who thinks deeply about:
throughput and system stability
monitoring and incident response
security and least-privilege design
reducing operational burden for the entire engineering team
You’ll work directly with our CTO and our founding engineers to build the core tooling that powers all of engineering.
You’re the kind of engineer who is always anticipating failure modes, eliminating operational risk, and designing systems that don’t break.
If something goes down, you take it personally, and you thrive in that level of responsibility.
What You’ll Do
Work with engineering to design and validate the infrastructure powering PB-scale workloads
Build and maintain Terraform-managed multi-cloud deployments
Improve cloud and data security (SSO, IAM, least privilege, auditability)
Own incident response and harden systems against failure
Develop CI/CD systems that minimize user error and maximize safety
Build monitoring + alerting platforms (Prometheus, OpenTelemetry, VictoriaMetrics)
Wrap internal reliability tooling with simple UIs for engineers
Requirements
3+ years building internal infrastructure at scale
Experience on-call for Sev 0 / Sev 1 production incidents (L3 preferred)
Strong cloud experience (GCP, AWS, Oracle, Cloudflare, etc.)
Deep Infrastructure-as-Code experience (Terraform preferred)
Familiarity with Argo, Helm, Kustomize, or similar deployment tools
Experience operating observability systems (Prometheus, OTel, VictoriaMetrics)
Backend fundamentals in Python, Go, Rust, or C++
Strong networking + security intuition, including SSO implementation
High ownership mindset over critical systems
In-person at our SF HQ
Bonus
Experience building lightweight internal tooling (APIs, dashboards, Svelte)
Familiarity with object storage systems (“buckets”)
Active GitHub or portfolio projects
Benefits
401k + Full Health Insurance
Breakfast, lunch, and dinner covered, plus your choice of snacks
Uber rides home covered
Company
Sieve is a video data research lab that curates and licenses large-scale video datasets for AI training. It records original video and aggregates video from multiple sources, filters for quality, indexes billions of videos with detectors and embeddings, and annotates them with dense labels and pairings. Customers include leading AI labs, Fortune 100 companies, and fast-growing AI startups. The company offers ready-to-use and custom datasets, free data samples, and purchase-based access with SLA-backed delivery via S3-compatible transfer, with an emphasis on compliance and scalable, secure delivery.