Vacancy expired!
Location: Plano, Texas, United States of America
Software Engineer, DevOps

We are looking for an experienced Site Reliability Engineer (SRE) with an operations and/or site reliability engineering background and a passion for providing superior system availability and customer experience. We are looking for candidates who can lead a 24/7 support organization and drive reliability and performance at massive scale by mastering the full depth of the stack. As an SRE, you will have the opportunity to tackle complex problems of scale that are unique to tech companies while applying your expertise in the delivery and support of critical services.

Job Responsibilities:
- Effectively manage troubleshooting and recovery of complex production incidents, ranging from low to critical impact
- Drive incident resolution through a systematic problem-solving approach, coupled with a strong sense of ownership
- Actively participate in the team's Agile stories (project work) to streamline and enhance the team's day-to-day operations
- Create, manage and utilize appropriate technical procedural documentation (runbooks)
- Proactively monitor all of the applications and infrastructure behind Capital One's external and internal customer-facing services, including their availability, latency, performance, and capacity
- Influence resiliency and scalability in production environments in Amazon Web Services (AWS)
- Identify opportunities and develop proactive automated monitoring and alerting solutions by utilizing available tools (Splunk, DataDog, etc.)
- Assist with conducting Root Cause Analysis (RCA) on critical production outages, and develop and implement mitigation strategies
- Utilize production support expertise to influence and support new designs, architectures, standards and methods that maintain stability and availability for large-scale distributed systems
- Proactively identify and implement opportunities for automation of routine maintenance tasks, data gathering and resolution of common issues
- Continuously seek to develop new skills and technical expertise, as well as proactively share knowledge with others
Qualifications:
- Bachelor's Degree
- At least 2 years of experience in technology production support
- AWS Associate level certification (Solutions Architect, SysOps Administrator, or Developer)
- 2+ years of experience with Linux, UNIX, Python, Ruby, Go, JavaScript, or NoSQL
- 2+ years of experience with AWS, Azure or GCP
- 2+ years of experience with web API services
- 2+ years of experience with Splunk, ELK, New Relic, or DataDog monitoring and alerting