What You’ll Do
- As a Data Engineer, you will play a critical role in designing, developing, and implementing data pipelines and data integration solutions using Spark, Scala, Python, Airflow and Google Cloud Platform (GCP).
- You will be responsible for building scalable and efficient data processing systems, optimizing data workflows, and ensuring data quality and integrity.
- Monitor and troubleshoot data pipelines to ensure data availability and reliability.
- Conduct performance tuning and optimization of data processing systems for improved efficiency and scalability.
- Work with stakeholders including the Executive, Product, Data and Design teams to assist with data-related technical issues and support their data infrastructure needs.
- Work closely with data scientists and analysts to provide them with the data sets and tools they need for analysis and reporting.
- Create data tools that help the analytics team build and optimize our product into an innovative industry leader.
- Stay up to date with the latest industry trends and technologies in data engineering and apply them to enhance our data infrastructure.
What You’ll Bring
- Proven working experience as a Data Engineer with a minimum of 5 years in the field.
- Strong programming skills in Scala and experience with Spark for data processing and analytics.
- Familiarity with Google Cloud Platform (GCP) services such as BigQuery, GCS, and Dataproc.
- Experience developing near real-time ingestion pipelines using Kafka and Spark Structured Streaming.
- Experience with data modeling, data warehousing, and ETL processes.
- Understanding of data warehousing concepts and best practices.
- Strong knowledge of SQL and NoSQL systems.
- Proficiency in version control systems, particularly Git.
- Proficiency in working with large-scale data sets and distributed computing frameworks.
- Familiarity with CI/CD pipelines and tools such as Jenkins or GitLab CI.
- Familiarity with workflow schedulers such as Airflow.
- Strong problem-solving and analytical skills.
- Familiarity with BI and visualization tools such as Tableau or Looker.
- A background in Generative Artificial Intelligence (Gen AI) is desirable but not essential.