In this role, you will build fully automated infrastructure using technologies such as Docker, Kubernetes, Ansible, and Terraform. The ideal candidate approaches technology operations as a systems engineering discipline, obtains and analyzes data to identify trouble spots and optimization opportunities, and applies software development practices to improve the reliability of the platform and services. To succeed in this role, you need to be passionate about making constant incremental improvements to systems, laser-focused on availability and performance, and driven to automate all the things.
What You Will Do:
- Support development initiatives to build highly automated, tuned, and reliable systems and services. Contribute to design and implementation decisions, development, and ongoing refactoring.
- Implement tools that analyze and monitor performance and availability; use your findings to make informed decisions on how to improve existing systems and processes
- Bring SRE best practices in-house (post-mortems, trend analysis, availability standards, etc.) and help set the tone around service operations and reliability
- Develop and deliver timely reports on service metrics including but not limited to availability, capacity, performance, and latency across production systems
Who You Are:
- You are passionate about making better software and continuously improving the development, integration, and deployment processes
- You enjoy new technological challenges and are motivated to find creative solutions to solve them
- You are a highly motivated self-starter who thrives in a bottom-up, fast-paced, highly technical environment
- You know how to design, implement, and iterate on CI/CD tooling and techniques to improve the ability to deliver software and services quickly and reliably
- Expertise in incident and problem management including timely problem identification, successful resolution, and root-cause analysis
- Strong verbal and written communication skills to communicate technology concepts and practices
- Strong expertise in monitoring tools (AppDynamics/App Insights/Sumo Logic/etc.)
- Experience with configuration management tools (Ansible, Chef, Puppet, etc.)
- Strong working knowledge of containers, container orchestration, and the AWS environment and tools
via developer jobs - Stack Overflow