Site Reliability Engineer at LanceSoft Inc (Charlotte, NC)

Job Description

Job Title: Site Reliability Engineer
Work Location: Charlotte, NC, 28202

Description:

The Site Reliability Engineer is responsible for all application environments from development to production. The ideal candidate should have hands-on experience learning, triaging (both proactively and reactively), and documenting application stacks using monitoring tools (Splunk, AppDynamics, UI-session replay, Sentry, and/or others), and should have expert-level proficiency in at least one area such as content delivery, application development (Java, JavaScript), networking, or infrastructure. They should understand how web traffic moves through all layers of infrastructure, including HTTP, CDNs, load balancers, and firewalls.

The Site Reliability Engineer will partner with application development and API teams to gain an understanding of the application stacks, triage environment issues, design monitoring methods, and provide reporting to executive leadership. The engineer will be a critical part of an SRE team that serves as the single point of contact for our Agile development and product teams regarding all application reliability, performance, and environment issues.

Job Responsibilities

Partner with the Agile development teams to learn and assume responsibility for documentation, logging, and monitoring for various systems
Partner with DevOps on CI/CD improvements using Bitbucket, Jenkins, OpenShift & AWS
Implement monitoring on various online applications using solutions such as Splunk, UI-session replay, AppDynamics, etc., and determine the right toolset to accomplish monitoring goals on net-new application stacks
Strong knowledge of custom alerts and ability to integrate data housed in disparate data sources to create workflow-driven alerting
Administration of web servers (Node.js, NGINX, JBoss, Apache, etc.)
Continuously tune and validate the quality of current tools for network and system monitoring, UI-session replay, and log file parsing, and implement a toolkit that works
Assist in vulnerability scanning and root cause analysis (RCA) proposals for defects in Scrum team backlogs
Participate in routine Agile and Scrum ceremonies

Qualifications

Must have expert-level knowledge of:

Content Delivery Networks (CDN)
Supporting customer-facing web applications
HTTP
Application Performance Monitoring (APM)

Must have some experience with:

Leading triages
Monitoring tools (Splunk, AppDynamics, and/or others)
SQL, Linux, scripting, file manipulation, reporting, and Visio
Big data elements like server logs, user URLs, etc.
CI/CD tools such as Bitbucket, Jenkins, OpenShift, and AWS CI/CD tools

Ability to communicate effectively with various levels of senior management, both Technology and Business
Experience and capability to lead small teams
Ability to work off-hours and/or weekends as needed

Additional Desired Knowledge & Skills:

Experience with complex multi-system environments
Working knowledge of Agile methodologies (Scrum, Kanban, Lean, XP)
Experience supporting hybrid server environments (on-premises, AWS, Azure, etc.)
Good understanding of financial industry operations metrics and reporting practices is a plus
Passion, positive attitude, engagement, and desire to take on challenging assignments as part of a team to make things WORK

For more information, please contact:

Manoj Patidar
Direct: 703-480-4125
Email: manojp@lancesoft.com
via developer jobs - Stack Overflow