Vacancy expired!
RESPONSIBILITIES:
- Own 99.99% uptime for all systems and applications (ecommerce, stores and supply chain apps), and present updates and recommendations
- Quarterback high-severity (Sev) issues as required
- Analyze and recommend best solutions for high availability across all teams
- Set up processes and procedures to quickly identify Production issues and determine which Product Teams to pull in as needed
- Build the required tools, monitoring and automation to proactively detect and self-heal Production anomalies
- Actively collaborate with Operations and Product teams to review, understand and recommend updates to Gap's overall alerting and monitoring systems
- Collaborate with architects in adopting emerging technologies and application solutions
- Collaborate with Product Management (PdM) and External Service Providers to ensure SLAs for applications are being set and met
- Drive blameless postmortems, research and recommend alternative actions for problem resolution
- Help drive the required changes for reliability as the company adopts a hybrid cloud foundation
- Mentor the team and grow talent in this critical area
QUALIFICATIONS:
- Bachelor's degree in Computer Science or similar, and/or related experience
- Deep, system-level understanding of storage, computing, distributed systems, networking
- Experience in Java, J2EE, Spring Boot; microservice architecture; Messaging - MQ, Kafka, RabbitMQ; Database - SQL, NoSQL; Cloud solutions - Pivotal Cloud Foundry (PCF), Azure PaaS; API development
- Experience leading teams overseeing Production system health at high scale
- Highly collaborative, action-oriented, deep understanding of how complex software systems interact
- Proficient understanding of business processes across a multi-brand business
- Expertise in monitoring tools such as Splunk, New Relic, Nagios, Graphite, Grafana, etc.
- Proficient with modern DevOps practices including CI/CD