Data Engineer

Data Engineer Job Description Template

Our company is looking for a Data Engineer to join our team.

Responsibilities:

  • Be an integral and trusted member of the data engineering team;
  • Research and stay abreast of key technical developments and trends;
  • Write and implement hands-on code that moves data into our data systems from both internal and external sources;
  • Coach and mentor junior data engineers to become more effective individual contributors;
  • Conduct skill-based internal training sessions;
  • Develop sustainable algorithms for data storage/integration, data quality checks, data accessibility and analytics;
  • Investigate and research the quality and integrity of data from source systems;
  • Ensure data pipelines are robust, fast, secure and scalable;
  • Lead the management of data collection, organize the models, and forecast future needs;
  • Work with Software Engineers and Machine Learning Engineers to call out risks and performance bottlenecks;
  • Work with the client and/or offshore team to create algorithms relating to data storage, data quality checks and data accessibility;
  • Align requirements for different work streams and tasks assigned by the client;
  • Develop and maintain scalable platforms for tracking business intelligence, built for reliability and redundancy;
  • Own data quality and pipeline uptime, and plan for failure;
  • Manage client expectations.

Requirements:

  • Python;
  • Platforms – Unix, Windows, Linux, Hadoop (Hortonworks);
  • Data warehousing solutions;
  • Relevant certification is advantageous;
  • Proficiency in MS Excel and MS PowerPoint;
  • Fluency in Python and experience containerizing code for deployment;
  • Experience using PHP, VB6, R, or Python;
  • Advanced SQL and strong knowledge of relational databases are a must;
  • Data warehousing concepts;
  • Understanding of the importance of picking the right data store for the job (columnar, logging, OLAP, OLTP, etc.);
  • Data mining;
  • RESTful services;
  • Statistical analysis;
  • Willingness to be on standby on a rotating basis;
  • Docker & Kubernetes.