Vacancy expired!
- AC Wellness Network LLC is a subsidiary of Client and is the dedicated management services organization (MSO) for AC Wellness, a dedicated independent medical group that serves Client Wellness Centers in Santa Clara Valley.
- Our mission is to deliver the world's best healthcare experiences for employees.
- We're looking for a problem solver: someone who takes initiative and will establish our new site reliability engineering position as a top-performing role within the MSO.
- The SRE professional will be involved in new projects from concept onward: crafting the infrastructure, toolset, and processes needed to deliver them, coordinating their implementation, monitoring the performance of the working system, and adjusting it when necessary.
- This role also involves creating training materials and training our staff to follow new guidelines and procedures.
- This person will need to have a detailed problem-solving approach, coupled with a strong sense of ownership and drive.
- This role will keep the IT team focused on what matters to our clinicians and patients: making sure the platforms and services they rely on are available when they want to use them.
- This person is responsible for the availability, performance, efficiency, change management, monitoring, emergency response, capacity planning, backup, and disaster recovery needed to keep our technical ecosystem reliable.
- This is a new position and an outstanding opportunity to be a key pillar of Client's healthcare future.
- Taking a holistic view of system health to provide primary operational support for multiple distributed software applications and infrastructure layers.
- Handling incidents within the ACWN technical ecosystem.
- Automating manual tasks such as the provisioning of users in production and test environments.
- Collaborating with our service desk, vendors and engineering to get ahead of customer needs and innovate to continually improve.
- Participating in the evaluation of infrastructure tools.
- Working with AC Wellness engineering and vendors to ensure delivery of the non-functional requirements of availability, performance, security, compliance and maintainability.
- Developing tools that improve production monitoring, telemetry, visualization, alerting, observability, workflows, and reporting.
- Defining new designs, architectures, standards, and methods for the systems in our healthcare ecosystem.
- Engaging in service capacity planning and demand forecasting.
- 3 years in a DevOps or Site Reliability Engineer role.
- Firm grasp of at least one modern programming or scripting language, such as Go, Python, Bash, or PHP.
- Experience with cloud computing services: AWS or Google Cloud Platform.
- Hands-on experience with infrastructure tooling, for example Terraform, CloudFormation, or Ansible.
- Experience with at least one container tool: Kubernetes, Mesosphere, Docker, or Amazon ECS.
- Experience with distributed storage technologies like NFS, HDFS, Ceph, S3.
- Basic understanding of Atlassian products, particularly Jira and Confluence.