Qualifications
Basic:
Bachelor's degree or foreign equivalent from an accredited institution required. Three years of progressive experience in the specialty may also be considered in lieu of each year of education.
At least 4 years of experience with Information Technology
Preferred:
At least 4 years of hands-on experience with, and in-depth knowledge of, key AWS services such as S3, EMR, EC2, SNS, Lambda, Glue, and CloudFormation templates (CFT)
At least 4 years of hands-on experience implementing a data lake in AWS, including related MVP services such as S3, EMR, Glue, and Redshift
At least 4 years of hands-on experience provisioning and spinning up AWS EMR clusters (an EMR provisioning sketch follows this list)
A strong understanding of how to interact with AWS programmatically (SDKs such as Boto3, the AWS APIs, the AWS CLI, AWS CloudFormation)
At least 4 years of hands-on experience setting up security policies for data lake MVP services and data in S3
At least 4 years of hands-on experience creating and maintaining IAM policies, bucket policies, and ACLs is a plus (a bucket-policy sketch follows this list)
At least 4 years of hands-on experience creating monitors, alarms, and notifications using CloudWatch Logs and cron-scheduled rules (a monitoring sketch follows this list)
Good hands-on Python experience. Python is usable for practically anything; here we are looking specifically for application development and Extract/Transform/Load (ETL) experience using Python
Hands-on knowledge of Spark Core (RDDs, DataFrames) and Spark SQL; PySpark knowledge will be a huge plus (a PySpark ETL sketch follows this list)
Experience working with AWS cloud technologies such as Glue, EMR, and Lambda, SDKs such as Boto3, and AWS Data Pipeline
Experience/knowledge of Bash/shell scripting will be a plus.
Has built ETL processes that ingest, copy, and structurally transform data across a wide variety of formats such as CSV, TSV, XML, and JSON
Experience writing shell scripts (Bash), Python, and PowerShell for setting up baselines
Knowledge of the Hadoop ecosystem, including YARN and Hadoop Common
Expertise in newer technologies such as Apache Spark and Scala programming
Experience working with big-data file formats such as Parquet (columnar) and Avro (row-oriented)
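For the EMR provisioning item, here is a minimal Boto3 sketch that spins up a transient EMR cluster. The cluster name, release label, log bucket, instance sizing, and region are illustrative assumptions, not values from this posting:

    import boto3

    emr = boto3.client("emr", region_name="us-east-1")  # assumed region

    response = emr.run_job_flow(
        Name="data-lake-etl",                      # hypothetical cluster name
        ReleaseLabel="emr-6.9.0",                  # assumed EMR release
        Applications=[{"Name": "Spark"}, {"Name": "Hadoop"}],
        LogUri="s3://example-logs/emr/",           # hypothetical log bucket
        Instances={
            "MasterInstanceType": "m5.xlarge",
            "SlaveInstanceType": "m5.xlarge",
            "InstanceCount": 3,                    # 1 master + 2 core nodes
            "KeepJobFlowAliveWhenNoSteps": False,  # terminate when steps finish
        },
        JobFlowRole="EMR_EC2_DefaultRole",         # default EMR instance profile
        ServiceRole="EMR_DefaultRole",             # default EMR service role
        VisibilityToAllUsers=True,
    )
    print("Started cluster:", response["JobFlowId"])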
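For the S3 security items, a minimal sketch, assuming a hypothetical bucket name, that attaches a common hardening bucket policy (deny non-TLS access) with Boto3:

    import json
    import boto3

    bucket = "example-data-lake-raw"  # hypothetical bucket name

    # Deny any request that does not arrive over TLS.
    policy = {
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "DenyInsecureTransport",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:*",
            "Resource": [
                f"arn:aws:s3:::{bucket}",
                f"arn:aws:s3:::{bucket}/*",
            ],
            "Condition": {"Bool": {"aws:SecureTransport": "false"}},
        }],
    }

    boto3.client("s3").put_bucket_policy(Bucket=bucket, Policy=json.dumps(policy))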
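For the monitoring item, a minimal sketch of a cron-scheduled CloudWatch Events rule plus a metric alarm that notifies an SNS topic. The rule name, alarm name, Lambda function name, and topic ARN are placeholders:

    import boto3

    # Scheduled rule: fire every day at 02:00 UTC (AWS six-field cron syntax).
    boto3.client("events").put_rule(
        Name="nightly-etl-trigger",              # hypothetical rule name
        ScheduleExpression="cron(0 2 * * ? *)",
        State="ENABLED",
    )

    # Alarm: notify an SNS topic if the ETL Lambda reports any errors.
    boto3.client("cloudwatch").put_metric_alarm(
        AlarmName="etl-lambda-errors",           # hypothetical alarm name
        Namespace="AWS/Lambda",
        MetricName="Errors",
        Dimensions=[{"Name": "FunctionName", "Value": "etl-loader"}],  # assumed function
        Statistic="Sum",
        Period=300,                              # 5-minute window
        EvaluationPeriods=1,
        Threshold=1,
        ComparisonOperator="GreaterThanOrEqualToThreshold",
        AlarmActions=["arn:aws:sns:us-east-1:123456789012:etl-alerts"],  # placeholder ARN
    )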
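And for the Python/PySpark ETL items, a minimal sketch that reads CSV from a raw zone, aggregates with Spark SQL and the DataFrame API, and writes columnar Parquet to a curated zone. The S3 paths and column names are hypothetical:

    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("csv-to-parquet").getOrCreate()

    # Read raw CSV with a header row; the input path is hypothetical.
    orders = (spark.read
              .option("header", True)
              .option("inferSchema", True)
              .csv("s3://example-data-lake-raw/orders/"))

    # Aggregate via Spark SQL, then add a load timestamp with the DataFrame API.
    orders.createOrReplaceTempView("orders")
    daily = (spark.sql(
                 "SELECT order_date, SUM(amount) AS total_amount "
                 "FROM orders GROUP BY order_date")   # assumed columns
             .withColumn("load_ts", F.current_timestamp()))

    # Write columnar Parquet to the curated zone (path is hypothetical).
    daily.write.mode("overwrite").parquet("s3://example-data-lake-curated/daily_orders/")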