Data is at the core of our business, providing insights into the effectiveness of our products and enabling the technology that powers them. We build and operate the platform used by the rest of the company for streaming and batch computation and to train ML models. We're building an ecosystem where consumers and producers of data can depend on each other safely. We strive to build high-quality systems we can be proud to open source, and to create an amazing experience for our users and ourselves. We regard culture and trust highly and are looking forward to welcoming your contribution to the team.
If you're passionate about building large-scale data processing systems, and you are motivated to make an impact by creating a robust and scalable data platform used by every team, come join us. You will jump into an early-stage team that builds the data transport, collection and orchestration layers. You will help shape the vision and architecture of WeWork's next-generation data infrastructure, making it easy for developers to build data-driven products and features. You will be responsible for developing a reliable infrastructure that scales with the company's incredible growth. Your efforts will give teams such as Analytics, Data Science, Sales, Revenue, Product, Growth and many others access to business and user behavior insights drawn from huge amounts of WeWork data, and will empower them to depend on each other reliably. You will be part of an experienced engineering team and work with passionate leaders on challenging distributed systems problems.
Responsibilities
- Building and operating large-scale data infrastructure in production (performance, reliability, monitoring)
- Designing, implementing and debugging distributed systems
- Thinking through long-term impacts of key design decisions and handling failure scenarios
- Building self-service platforms to power WeWork's technology
Requirements
- Experience with one or more of the following languages, and with functional programming in general: Scala, Haskell, Java, JavaScript
- Experience with one or more of the following technologies:
  - Distributed logging systems: Kafka, Pulsar, Kinesis, ...
  - Stream processing: Flink, Spark, Storm, Beam, ...
  - Batch processing: Spark, Hadoop, ...
  - IDL: Avro, Protobuf or Thrift
  - MPP databases: Redshift, Vertica, ...
  - Query execution (columnar storage, push downs): Hive, Presto, Parquet, ...
  - Workflow management: Airflow, Oozie, Azkaban, ...
  - Cloud storage: S3, GCS, ...
- Understanding of distributed systems concepts and principles (consistency and availability, liveness and safety, durability, reliability, fault-tolerance, consensus algorithms)
- Eager to learn new things and passionate about technology
Bonus points
- Experience with contributing to open source software
- Experience with the following: Cassandra, DynamoDB, RocksDB/LevelDB, Graphite, StatsD, CollectD
Critical Competencies for Success
- You focus on the team over individual achievements.
- You build software incrementally and make consistent progress.
- You love to learn, mentor and teach others.
- You are empathetic and build the long-lasting relationships characteristic of highly efficient teams.
- You keep up-to-date with the latest developments in the field.
We are an equal opportunity employer and value diversity in our company. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.