Supports the CenturyLink organization in the development and implementation of support services for multi-tenant big-data-as-a-service products. Responsibilities include developing provisioning and support automation in an agile environment.
Job Description
- Working with product developers to build, test, and automate the provisioning of various configurations of the Hadoop ecosystem.
- Cluster management and maintenance using a variety of tools including Cloudera Manager, Nagios, Ganglia and Graphite.
- Administer and troubleshoot clusters, performing problem isolation and correcting problems as they are discovered (a minimal health-check sketch follows this list).
- Performance tuning of Hadoop clusters, ecosystem components, and jobs, including the management and review of Hadoop log files.
- Hadoop security management and auditing.
- Working with end-to-end teams to guarantee high availability of the Hadoop clusters.
- Deploy configurations of the Cloudera Distribution of Hadoop from both the command line and Cloudera Manager.
- Support the integration of third-party data movement and visualization products into the ecosystem.
- Work in a hybrid infrastructure environment.
- Manage development priorities, projects, resources, issues and risks effectively.
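Below is a minimal Python health-check sketch of the kind implied by the monitoring and troubleshooting items above. It assumes the `hdfs` and `yarn` client commands are available on a gateway host and that any Kerberos or credential setup is already in place; the output parsing is version-dependent and intended only as an illustration.

```python
#!/usr/bin/env python
"""Minimal cluster health-check sketch (illustrative, not a supported tool).

Assumes the `hdfs` and `yarn` client commands are on PATH and that any
Kerberos/credential setup has already been done. Output formats vary by
Hadoop version, so the parsing below is intentionally simple.
"""
import subprocess


def run(cmd):
    """Run a command and return its stdout as text, raising on failure."""
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout


def dead_datanodes():
    """Count dead DataNodes reported by `hdfs dfsadmin -report`."""
    for line in run(["hdfs", "dfsadmin", "-report"]).splitlines():
        if line.strip().startswith("Dead datanodes"):
            # e.g. "Dead datanodes (2):"
            return int(line.split("(")[1].split(")")[0])
    return 0


def non_running_nodemanagers():
    """Count NodeManagers that are not in the RUNNING state."""
    bad = 0
    for line in run(["yarn", "node", "-list", "-all"]).splitlines():
        fields = line.split()
        # Data rows start with a host:port Node-Id column.
        if fields and ":" in fields[0] and "RUNNING" not in line:
            bad += 1
    return bad


if __name__ == "__main__":
    print("dead datanodes:", dead_datanodes())
    print("non-running nodemanagers:", non_running_nodemanagers())
```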
Qualifications
- General operational expertise, including good troubleshooting skills and an understanding of system capacity, bottlenecks, and the basics of memory, CPU, OS, storage, and networking.
- The most essential requirements: the ability to deploy a Hadoop cluster, add and remove nodes, keep track of jobs, monitor critical parts of the cluster, configure NameNode high availability, and schedule, configure, and take backups.
- Good knowledge of Linux, as Hadoop runs on Linux.
- Firm grasp of UNIX/Linux fundamentals in relation to UNIX scripting and administration. Strong experience with CentOS or RHEL.
- Experience administering production Big Data systems based on Hadoop (Apache, Hortonworks, or Cloudera), including related technologies such as HBase, Hive, and Impala.
- Experience in automation technologies, preferably Ansible.
- Expert in Linux shell scripting. Python or Scala scripting experience preferred.
- Experience loading disparate data sets and pre-processing them using Hive and Pig (see the Hive sketch after this list).
- Experience designing, building, installing, configuring, and supporting Hadoop.
- Experience translating complex functional and technical requirements into detailed designs.
- Ability to analyze vast data stores and uncover insights.
- Experience maintaining security and data privacy and creating scalable, high-performance web services for data tracking.
- Experience with high-speed querying and with managing and deploying HBase; able to test prototypes, oversee handover to operational teams, and propose best practices and standards.
- Good knowledge of back-end programming, specifically Java, JavaScript, Node.js, and OOAD.
- Experience writing high-performance, reliable, and maintainable code, including MapReduce jobs and Pig Latin scripts (a Hadoop Streaming sketch follows this list).
- Good knowledge of database structures, theories, principles, and practices.
- Hands-on experience with HiveQL.
- Familiarity with data loading tools such as Flume and Sqoop.
- Knowledge of workflow schedulers such as Oozie.
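As context for the Hive and data-loading items above, here is a small sketch of a pre-processing step driven from Python. The HiveServer2 JDBC URL and the table names are hypothetical, and the HiveQL simply illustrates staging raw data into a partitioned ORC table.

```python
#!/usr/bin/env python
"""Sketch of a Hive pre-processing step submitted from Python.

The HiveServer2 URL and the table names below are hypothetical; the
HiveQL only illustrates staging raw data into a partitioned ORC table.
Assumes `beeline` is on PATH.
"""
import subprocess
import tempfile

HIVE_JDBC_URL = "jdbc:hive2://hiveserver.example.com:10000/default"  # assumption

HQL = """
CREATE TABLE IF NOT EXISTS web_logs_clean (
  ip STRING, ts STRING, url STRING, status INT
)
PARTITIONED BY (dt STRING)
STORED AS ORC;

INSERT OVERWRITE TABLE web_logs_clean PARTITION (dt='2016-01-01')
SELECT ip, ts, url, status
FROM web_logs_raw
WHERE status IS NOT NULL;
"""


def run_hive(hql):
    """Write the statements to a temp script and submit it via beeline -f."""
    with tempfile.NamedTemporaryFile("w", suffix=".hql", delete=False) as f:
        f.write(hql)
        path = f.name
    subprocess.run(["beeline", "-u", HIVE_JDBC_URL, "-f", path], check=True)


if __name__ == "__main__":
    run_hive(HQL)
```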
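And a word-count style Hadoop Streaming job in Python, as a minimal example of the MapReduce scripting the role asks for. The streaming jar path and HDFS input/output paths in the comment are assumptions.

```python
#!/usr/bin/env python
"""Word-count style Hadoop Streaming job (minimal sketch).

Run the same file as both mapper and reducer, for example:

  hadoop jar /path/to/hadoop-streaming.jar \
      -files wordcount.py \
      -mapper "python wordcount.py map" \
      -reducer "python wordcount.py reduce" \
      -input /data/in -output /data/out

The streaming jar path and the HDFS input/output paths are assumptions.
"""
import sys


def mapper():
    # Emit "word<TAB>1" for every token read from stdin.
    for line in sys.stdin:
        for word in line.split():
            print("%s\t%d" % (word, 1))


def reducer():
    # Streaming sorts by key, so all counts for a word arrive contiguously.
    current, total = None, 0
    for line in sys.stdin:
        word, count = line.rstrip("\n").split("\t")
        if word != current:
            if current is not None:
                print("%s\t%d" % (current, total))
            current, total = word, 0
        total += int(count)
    if current is not None:
        print("%s\t%d" % (current, total))


if __name__ == "__main__":
    mapper() if sys.argv[1] == "map" else reducer()
```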
Education
Bachelor's or equivalent
Master's or equivalent