Job Description / General Message
- Data engineering at SmartNews plays a key role in accelerating service and business development. We put great effort into building a highly efficient, highly flexible data service for analytical and operational purposes.
- To serve internal users on the analytics and product development teams, data engineers provide high-level user interfaces that simplify accessing, integrating and consolidating various data sets, and build the platforms that run tasks processing terabytes of data per day.
- To achieve the best cost-performance, we are always looking for better ways to meet our SLA/SLO requirements, and we eagerly adopt advanced technologies from the software engineering, database and, especially, big data open-source communities.
Responsibilities
- Engineer, build and implement data models and pipelines for data processing or management, and investigate new algorithms to make data ingestion and integration more efficient.
- Work closely with multiple analytics, product-dev and business-ops teams to understand how to improve data model designs, and optimize query models at the algorithmic/logical level.
- Own and maintain key data management portfolios such as data schemas, metadata, and the semantics of and relationships between data sets.
- Define, develop and manage the data lineage or provenance systems and the underlying hardware and software architectures, to build durable and scalable data integration services.
- Diagnose and resolve internal customers' complex technical challenges in data usage and modeling. Help other teams tune performance and improve stability using elegant and systematic methods.
Qualifications
- BS/MS degree in Computer Science, Software Engineering or equivalent practical experience
- Strong DB design and programming skills, with the deep understanding of query algebras, data models, data structures and algorithms required to build consolidated and flexible solutions
- Familiar with relational and multidimensional data models; proficient at writing and tuning SQL or other query languages
- Rich experience with one or more programming languages such as Java, Scala, C++ or Python; familiar with agile development and test management
- Understand the basic concepts of, and have solid experience with, parallel/distributed query engines such as Hive, Pig and Presto
- Some knowledge of managing database models, dependencies and relationships; DBA-like experience is especially valued
- Deep understanding of modern DBMS technologies and ecosystems
- Good understanding of and experience with analytical DBMSs (OLAP DBMS and Data Cube); able to develop data processing programs with them in a batch or streaming manner
- Familiar with modern data stores, either RDBMS or NoSQL (such as HBase, DynamoDB/Cassandra or Druid); experience developing applications or function extensions for such data stores
- Able to implement and tune complex, heavy-lifting data flows (ETLs or pipelines); familiar with some of the relevant tooling
- Capable of designing systems with good modularity and extensibility
- Rich experience with and knowledge of database logical design methods and tooling such as ER or MD
- Able to draft user-understandable blueprints and precise, detailed designs in both intuitive and formal ways
Preferred Qualifications
- Experience designing and developing large-scale data processing systems, especially massive log processing.
- Good knowledge of data integration and data warehouse design and development, such as integration patterns, high-level design, and detailed logical/physical design.
- Good knowledge of and experience with DBMS and query optimization techniques; understanding of indexing, query rewriting, parallel execution and storage engines.
- Experience working with data scientists/analysts and an understanding of the pain points of data analysis, such as data integrity, consistency, quality and cleansing.
- Contributions to the open-source community, especially to projects with large user groups.
Benefits and Perks
- Voluntary Trip - Working remotely twice a year
- SmartKitchen - Healthy lunch on a daily basis for free
- ChikyuCoffee - Delicious coffee provided by our Barista every day
- Event space - Free use for any kind of meetup
- Foreign language development support
- Various social insurance benefits included
- Transportation coverage (Maximum 50,000 yen)
via developer jobs - Stack Overflow