POSITION: Senior Performance Engineer
DEPARTMENT: Information Technology
REPORTING SUPERVISOR: Senior Manager, Systems Management & Reliability
DIRECT REPORTS: No
FLSA: Exempt
EMPLOYMENT STATUS: Direct Hire
TRAVEL REQUIREMENTS: 10% Overnight Travel Required within the U.S.
SCHEDULE: Must be willing to work in an on-call rotation and be flexible to work various shifts
ABOUT THE TEAM
We are a dedicated and enthusiastic IT team, responsible for supporting the business by providing the infrastructure required to meet companywide goals. IT works closely with all departments at GreatCall to help deliver new products and services, both for our existing customers and for potential new markets. The Systems Management & Reliability team manages communications, performance expectations, and SLA uptime expectations, and provides 1st & 2nd level incident triage and support of our production application infrastructure. The team implements our enterprise monitoring strategy, develops and maintains monitoring tools, and provides expertise to our software & product development teams.
ABOUT THE JOB
As a Senior Performance Engineer, you will focus on performance and capacity projects on our Systems Management and Reliability team. You will be responsible for ensuring the end-to-end application performance and capacity of our consumer and commercial devices and solutions. Delivering highly accessible, responsive, and scalable top-quality application and service performance will be key to success in this position.
RESPONSIBILITIES
- Create and define GreatCall's enterprise monitoring strategy and capability roadmap by leading the effort for enterprise monitoring solutions
- Develop and leverage automated methods for application and infrastructure health and performance monitoring, alerting, and analysis
- Conduct analysis by monitoring deviations from performance standards, including slowdowns in activations, error reports from activation failures, and slowdowns in purchases in the website shopping cart
- Create and maintain a policy-driven alerting methodology that minimizes false positives while ensuring abnormalities are assigned accurate severity levels
- Produce and provide written assessments to Product Owners to support application performance and scalability that meets customer and industry expectations
- Be the leader in incident resolution by conducting analyses of incidents to determine root cause and prevent future occurrences; engage and lead discussions with stakeholders and technical SMEs to determine appropriate actions for resolution and prevention
- Make recommendations to Leadership for software deployment strategies; assist in defining software development standards and efficiencies
- Assist in accurate and timely incident communication, both internally and externally, including notifications to customers
- Identify opportunities through Dynatrace and drill down to perform root cause analysis for insights
- Collaborate with all levels of technology team members (VP and C-level included) to support their strategies
- Design alert configurations, scorecards, reports, and dashboards to support growing business needs
- Design the code for application performance and enterprise logging. Implement application performance to deliver high-quality product releases
- Conduct trend analyses through use of APM tools to forecast infrastructure growth (compute (CPU/memory), storage, and network)
- Perform load and stress analysis of end-to-end application performance on infrastructure, and capacity planning of APIs and key services
- Other duties as assigned
QUALIFICATIONS
Education: Bachelor's degree required
Certifications: APM, Dynatrace Associate (Preferred), AWS Certified Solutions Architect Associate (Preferred)
Experience:
- Minimum 5 years of experience in performance engineering and capacity planning, with in-depth knowledge of identifying and debugging application/infrastructure performance problems, required
- Minimum 5 years of experience in an information systems operations environment, in systems analysis or development, required
- Minimum 3 years of hands-on expertise using tools such as Dynatrace/AppDynamics/New Relic, Splunk, Elastic (ELK), Nagios, Sensu required
- Dynatrace SaaS/Managed, Dynatrace AppMon, Dynatrace Synthetic, Dynatrace DC-RUM experience is highly desirable
Knowledge/Skills/Abilities:
- Strong hands-on experience in coding & scripting languages such as Java, C#, Perl, Python, Ruby
- Proficient in production monitoring concepts including synthetic, real user, application performance, system, log, distributed tracing, and dashboards
- Debugging, monitoring, and optimization using AppDynamics, Dynatrace, New Relic, LoadRunner, WebLOAD, JMeter, or similar tools
- Knowledge of synthetic monitoring solutions in production that track, trend, and alert on the performance of critical business transactions
- Good troubleshooting and performance tuning experience with AWS components like DynamoDB, Kinesis, Lambda, etc.
- Knowledge and hands-on experience working with distributed caching systems like Hazelcast, Redis, etc.
- Excellent understanding of how a high-transaction system scales across infrastructure such as load balancers, filers, and firewalls
- Hands-on experience with application code instrumentation (Java or C#) and database profiling
- In-depth knowledge of database fundamentals and architecture (MS SQL Server, MySQL, and MongoDB)
- Ability to independently analyze, resolve, and document complex technical problems
Personal Attributes
- Composure
- Ability to learn quickly on the fly
- Time management skills
- Organizational and project management skills and ability to set and manage multiple priorities/projects
- Be a thinker: help find performance bottlenecks, debug to determine root cause (RCA), and provide or recommend a working solution
- Ability to manage conflicts
- Coachable
- Problem-solving skills
- Excellent communication skills, both verbal and written
via developer jobs - Stack Overflow