Inetum

Junior Data Engineer

Added 2 months ago

Description

We are looking for a Junior Data Engineer to join us and contribute to the development of modern, real‑time data processing capabilities. You will help transition existing data and ML workflows from batch processing to scalable streaming solutions. The role involves hands‑on engineering, close collaboration with Data Scientists, and operational responsibility for production data pipelines.

Technology Environment

  • Modern real‑time data streaming technologies used for ML model inference
  • Distributed data processing frameworks supporting scalable, low‑latency pipelines
  • Containerized workloads orchestrated in cloud‑native environments
  • Monitoring and observability tools for ensuring reliability and performance of data pipelines
  • Python‑based ecosystem supporting ML model integration and lifecycle management

Key Responsibilities

  • Transform batch inference workflows into streaming pipelines.
  • Define streaming semantics to replace batch windows, including micro‑batching, windowing, and state management.
  • Design Kafka topic structures, partitioning strategies, and consumer group patterns for prediction workloads.
  • Implement checkpointing, backpressure handling, and delivery‑guarantee strategies (at‑least‑once / exactly‑once).
  • Package and version ML model artifacts for streaming jobs, supporting safe rollouts and rollbacks.
  • Tune performance for throughput and latency, including batching strategies and resource allocation.
  • Deploy and operate streaming jobs with monitoring and alerting (lag, throughput, error rates).
  • Integrate streaming outputs into downstream ETL/BI systems.
  • Collaborate with Data Scientists on CI/CD for streaming models and monitor model performance/drift.

Team & Collaboration

  • You will work in a distributed delivery model closely aligned with the central AI/BI team in Germany.
  • Daily collaboration through MS Teams, Jira, and Confluence.
  • Agile methodologies (Scrum/Kanban) in cross‑functional squads.

Qualifications

  • Practical experience with Kafka (producers/consumers, topic design, partitions, retention).
  • Experience with Spark Structured Streaming or similar streaming frameworks.
  • Familiarity with migrating batch inference to streaming architectures.
  • Experience running containerized workloads in Kubernetes.
  • Strong Python skills and understanding of common ML libraries.
  • English and Polish at B1 level or higher.

Nice to have:

  • Basic monitoring/logging experience (ELK, metrics) and performance tuning.
  • Experience with Kafka Streams.
  • Familiarity with feature stores or retraining orchestration.

Additional Information

This position offers a hybrid work model. Office locations: Warszawa, Poznań, Lublin.

The position includes participation in on‑call duty.

Company

Inetum helps public and private organizations navigate digital transformation with consulting, technology, and solutions across industries, supported by a global delivery model and AI focus.
