System1 is looking for engineers with production data experience to join our Data Engineering team. This team is the horizontal layer that supports business intelligence, optimization, machine learning, and both external and internal reporting. We process and report on hundreds of millions of events and user attributes per day, gathered from an extremely heterogeneous set of data streams. Our bread and butter is Python and PostgreSQL, but we also use a range of other technologies, including AWS Lambda, SNS, SQS, Redis, and DynamoDB for caching and mapping, as well as Redshift, Kinesis, and Spark for large dataset munging and ad hoc analysis.
The Role You Will Have:
- Design and develop data processing infrastructure and reporting pipelines
- Collaborate with the data-science team to productionize ML pipelines
- Coordinate data models with other engineering teams
- Work with the DevOps team to increase our monitoring and alerting coverage as needed
- Identify scale bottlenecks and successfully overcome them
- Contribute to technical specification documents for data architecture projects
- Contribute to ongoing maintenance of existing infrastructure and investigate issues and failures
What You Will Bring:
- 5+ years of full-time Python development experience
- 3+ years of PostgreSQL or Redshift experience. An understanding of relational data models is a must
- 3+ years of experience working with Spark or other big data architectures (Hadoop, Apache Storm, etc.) in high-volume environments (running big data solutions in AWS is a plus)
- Extensive experience building and managing ETL pipelines on cloud-based platforms from inception to production rollout
- Experience ensuring a high degree of reliability and data integrity for mission-critical systems
- Experience with large migration projects is a plus
- Excellent communication skills and a collaborative mindset
via developer jobs - Stack Overflow