Data Engineer - Apache Airflow - Remote.
- Company
- Randstad (Schweiz) AG
- Location
- Lausanne
- Publication date
- 07.10.2024
- Reference
- 4659463
Description
- Lausanne, Vaud
- Contract
Job Details
Here at Randstad Digital, we are looking for a data engineer with experience deploying Apache Airflow and moving data through a complex environment that includes Snowflake.
Key Responsibilities:
Design, deploy and maintain the data ingestion and pipeline orchestration system using Apache Airflow.
Develop and maintain connectors to integrate various in-house solutions, including APIs, file-based systems, and SQL databases.
Oversee containerized environments using Docker or Podman, with an emphasis on security, speed, and robustness.
Implement and manage efficient data transformations using Python-based tools.
Ensure reliability and performance by debugging and optimizing complex systems.
Essential Skills:
Apache Airflow: Extensive experience with data processing orchestrators, with a focus on Apache Airflow.
Containers: Proficiency in container management (Docker, Podman), with experience in Kubernetes or OpenShift as a plus.
Python: Strong coding abilities, particularly in Python, and experience with data processing libraries like pandas, polars, duckdb, and great-expectations.
Linux Environment: Experience managing Linux server environments (RHEL experience is a plus).
Snowflake (Preferred): Hands-on experience with Snowflake or similar SQL-based databases.
Data Warehousing & Data Architecture: Strong understanding of data architecture, data warehousing, and modern data formats such as Parquet, Avro, and Arrow.
Preferred Qualifications:
Experience in software engineering, with at least 2 years focused on data processing.
Familiarity with HPC/parallel processing.
Experience with TDD (Test-Driven Development) and DQC (Data Quality Checks).
Previous work experience in a scientific IT environment is a plus.
Skills & Attributes:
Ability to work independently on assigned tasks, while collaborating effectively on system design.
Strong problem-solving skills with a willingness to debug and optimize complex systems.
Documentation is key: proactive in creating and maintaining documentation as an offline communication tool.