Vacancy expired!
- Gain deep knowledge of our complex applications.
- Serve as a primary point of contact responsible for the overall health, performance, and capacity of one or more of our technology products.
- Familiar with design principles of monitoring and alerting systems.
- Designing, implementing, and maintaining robust monitoring and alerting to improve performance and reliability.
- Experience with automation, configuration management, and developing infrastructure as code.
- Use engineering best practices: deliver high-quality production code, utilize automated testing, and build reusable components.
- Develop tools to improve our ability to rapidly deploy and effectively monitor custom applications in a large-scale Windows and Linux environment.
- Work closely with development teams to ensure that platforms are designed with "operability" in mind.
- Function well in a fast-paced, rapidly-changing environment.
- Participate in the operations on-call rotation, triaging and addressing production issues.
- B.S. or higher in Computer Science or other technical discipline, or related practical experience.
- Programming skills (Java and shell scripting, or Python, Ruby, Perl, or C).
- 5 or more years of experience in a large-scale Unix/Linux operations role.
- Experience in designing, analyzing, and troubleshooting large-scale distributed systems.
- Debug production issues across services and levels of the stack.
- Experience with one or more orchestration or deployment tools, such as Docker or Ansible.
- Familiarity with Git or other source control systems.
- Experience using tools to create and manage CI (continuous integration) and CD (continuous delivery) pipelines.
- Shell or Python experience, specifically for systems automation.
- Good experience with performance engineering tools such as Selenium, JMeter, and LoadRunner.
- Working knowledge of the TCP/IP stack, internet routing and load balancing.
- Experience with monitoring and alerting using technologies like New Relic, SiteScope, Netcool, Dynatrace, ExtraHop, Moogsoft, Prometheus, Sensu, Nagios, and Splunk.
- Optional: Experience designing, implementing, and deploying with Docker, Kubernetes, or serverless platforms (Functions or Lambda).
- Strong interpersonal communication skills (including listening, speaking, and writing) and ability to work well in a diverse, team-focused environment with other SREs, Engineers, Product Managers, etc.
- Creative thinker and strong problem solver with meticulous attention to detail.
- Highly organized, creative, motivated, and passionate about achieving results.
- Strong experience with AWS (design, SDKs, best practices) – good to have.
- AWS certifications – good to have.