Job Summary:
We are seeking a motivated and detail-oriented Junior Site Reliability Engineer (SRE) to join our team. This position blends SRE principles with DevOps practices, making the team a key player in enabling seamless application delivery, scalability, and reliability. You will support both Azure Cloud environments and co-located data centers, collaborating across teams to ensure efficient development, deployment, and operation of critical systems.
This is a hybrid environment where you’ll work with cutting-edge tools like Kubernetes, Terraform, Ansible, Helm, and SOPS, while also working with some legacy systems on Windows platforms. You’ll contribute to automating pipelines, improving monitoring, and driving operational excellence.
Duties/Responsibilities:
Cloud and Cloud Native Environment Management
Oversee and maintain Kubernetes clusters using AKS in Azure and RKE2 in co-located facilities.
Collaborate with teams to design and implement scalable, secure, and cost-efficient cloud solutions.
DevOps Pipeline Ownership
Design, maintain, and improve CI/CD pipelines in Azure DevOps, enabling streamlined application deployments and updates.
Automate repetitive tasks and enhance deployment efficiency using Ansible, Terraform, and custom scripts.
Infrastructure as Code (IaC)
Develop, test, and maintain Terraform and Ansible configurations for infrastructure and application management.
Manage Helm charts to standardize and optimize Kubernetes application deployments.
Security and Compliance
Leverage SOPS for secure management of secrets and sensitive data.
Enforce compliance with organizational and industry security standards.
Monitoring and Incident Response
Implement and refine monitoring, logging, and alerting solutions for cloud and co-located environments using tools like Nagios, PagerDuty, Prometheus, and Grafana.
Actively participate in troubleshooting and resolving incidents to minimize downtime and impact.
Collaboration Across Teams
Partner with Development, Networking, and Infrastructure teams to ensure robust and efficient application delivery.
Advocate for best practices in automation, performance optimization, and reliability.
Legacy System Management
Provide operational support for legacy Windows systems as needed.
Assist in modernization efforts to migrate or re-architect legacy applications.
Skills/Abilities:
Experience managing Kubernetes clusters in both cloud and on-premises environments.
Exposure to Helm, SOPS, or similar tools.
Knowledge of monitoring tools like Prometheus, Grafana, or similar.
Basic Windows administration skills for supporting legacy systems.
Basic Linux administration skills for supporting Kubernetes, DevOps agents, and containers.
Familiarity with Azure-specific services and hybrid cloud architecture.
Analytical and problem-solving mindset.
Strong communication and collaboration skills to work across multidisciplinary teams.
Ability to adapt quickly to new technologies and challenges.
A proactive approach to identifying and addressing operational improvements.
Education and Experience:
Bachelor’s degree in Computer Science, Information Technology, or a related field, or equivalent experience.
Willingness to participate in an on-call rotation.
Familiarity with Kubernetes-based environments (AKS in Azure and/or RKE2).
Experience with CI/CD pipelines and tools (Azure DevOps preferred).
Basic knowledge of Infrastructure as Code (Terraform, Ansible).
Understanding of containerized application deployment and orchestration.
Proficiency with Git for version control.
Strong eagerness to learn and grow in a DevOps and SRE environment.
Physical Requirements (With or without reasonable accommodation):
Sitting: Over 70%
Standing: 15-40%
Fine Motor Movements: Over 70%
All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, or veteran status.