42dot

ML Platform Engineer (Autonomous Driving)

We are looking for the best

At 42dot, our AD ML Platform Engineers build the core data platform and the ML training / evaluation platform for cutting-edge autonomous driving algorithms. We develop the distributed systems behind a scalable data platform for large-scale datasets (millions of scenes), as well as high-performance data serving SDKs for ML model training / evaluation. The platforms we deliver substantially improve the efficiency of the ML model development lifecycle, including training, evaluation, deployment, and monitoring in the cloud environment.

Responsibilities

  • Develop a highly scalable, reliable data platform to manage, visualize, search, and serve large-scale datasets for ML model training, fine-tuning, and validation.

  • Develop an advanced autonomous driving data SDK, including scene data search, dataset preparation, and dataset loading.

  • Build the data lakehouse for autonomous driving scene datasets, including sensor data, calibration data, and annotation data.

  • Dig into performance bottlenecks all along the data processing pipelines, from data processing latency and data search latency to Test Procedure (TP) coverage.

  • Bootstrap and maintain infrastructure for data platform components: data processing pipelines, databases, the data lakehouse, and data serving.

  • Collaborate with cross-functional teams, including ML Algorithm, ML Application, and Cloud Infra, to align the ML platforms with the overall autonomous driving system architecture.

Qualifications

  • Bachelor's degree or higher in Computer Science, Engineering, Robotics, or a similar technical field.

  • Minimum of 5 years of experience in Data Engineering or ML Platform roles.

  • Proficiency in Python and solid experience in Python SDK development.

  • Solid working experience with databases (e.g., MongoDB, PostgreSQL).

  • Hands-on experience orchestrating data pipeline jobs with Databricks Workflows or Apache Airflow, as well as integrating data pipelines with machine learning models.

  • Extensive experience with data technologies and architectures such as data warehouses (e.g., Hive) or lakehouses (e.g., Delta Lake).

  • Experience with Apache Spark or other big data computing engines.

Preferred Qualifications

  • Experience with autonomous vehicle sensor data (e.g., LiDAR, camera, radar).

  • Experience with the ML model training lifecycle (e.g., data preparation, model training / validation / deployment).

  • Understanding of modern AI frameworks (e.g., PyTorch, TensorFlow).

  • Understanding of data governance principles and data privacy regulations, and experience implementing security measures to protect data.

Interview Process

  • Resume Screening - Coding Test - Virtual Interview (approximately 1 hour) - Onsite or Virtual Interview (approximately 3 hours) - Final Offer

  • Please note that the interview process may vary depending on the position and is subject to change based on scheduling and other circumstances.

  • Interview schedules and results will be communicated individually via the email address provided in your application.

Additional Information

  • Please upload all required documents in PDF format.

  • Veterans and applicants eligible for employment protection will receive preferential consideration in accordance with applicable laws and regulations.

  • In compliance with the Act on Employment Promotion and Vocational Rehabilitation for Persons with Disabilities, registered individuals with disabilities will receive preferential consideration.

  • 42dot does not accept unsolicited resumes from search firms. We will not pay any fees for resumes submitted without prior agreement.

  • A 3-month probationary period may apply.

※ Please make sure to review the information below before applying.