<p><strong>Position: Site Reliability Engineer</strong></p>
<p><strong>Job Summary</strong></p>
<p>As a Site Reliability Engineer, you will play a critical role in ensuring the availability and performance of our customer-facing platform. You will work closely with DevOps, DBA, and Development teams to provision and maintain infrastructure, deploy and monitor our applications, and automate workflows. Your contributions will have a direct impact on customer satisfaction and overall experience.</p>
<p> </p>
<p><strong>Responsibilities and Deliverables</strong></p>
<ul>
<li>Manage, monitor, and maintain highly available systems (Windows and Linux).</li>
<li>Analyze metrics and trends to ensure rapid scalability.</li>
<li>Address routine service requests while identifying ways to automate and simplify.</li>
<li>Create infrastructure as code using Terraform, ARM templates, or CloudFormation.</li>
<li>Maintain data backups and disaster recovery plans.</li>
<li>Design and deploy CI/CD pipelines using GitHub Actions, Octopus, Ansible, Jenkins, or Azure DevOps.</li>
<li>Adhere to security best practices through all stages of the software development lifecycle.</li>
<li>Follow and champion ITIL best practices and standards.</li>
<li>Become a resource for emerging and existing cloud technologies, with a focus on AWS.</li>
</ul>
<p><strong>Organizational Alignment</strong></p>
<ul>
<li>Reports to the Senior SRE Manager.</li>
<li>This role involves close collaboration with DevOps, DBA, and security teams.</li>
</ul>
<p><strong>Technical Proficiencies</strong></p>
<ul>
<li>Hands-on experience with AWS is a must-have.</li>
<li>Proficiency in analyzing application, IIS, system, and security logs, as well as CloudTrail events.</li>
<li>Practical experience with CI/CD tools such as GitHub Actions, Jenkins, or Octopus.</li>
<li>Experience with observability tools such as New Relic, Application Insights, AppDynamics, or Datadog.</li>
<li>Experience maintaining and administering Windows, Linux, and Kubernetes.</li>
<li>Experience in automation using scripting languages such as Bash, PowerShell, or Python.</li>
<li>Configuration management experience using Ansible, Terraform, Azure Automation runbooks, or similar.</li>
<li>Experience with SQL Server database maintenance and administration is preferred.</li>
<li>Good understanding of networking (VNet, subnets, Private Link, VNet peering).</li>
<li>Familiarity with cloud concepts including certificates, OAuth, Azure AD, ASE, ASP, AKS, Azure Apps, Load Balancers, Application Gateway, Firewall, API Management, SQL Server, and databases on Azure.</li>
</ul>
<p><strong>Experience</strong></p>
<ul>
<li>5+ years of experience in an SRE or system administration role.</li>
<li>Demonstrated ability to build and support high-availability Windows/Linux servers, with emphasis on the WISA stack (Windows/IIS/SQL Server/ASP.NET).</li>
<li>3+ years of experience with CI/CD tools.</li>
<li>3+ years of experience working with cloud technologies, including AWS and Azure.</li>
<li>1+ years of experience working with container technologies, including Docker and Kubernetes.</li>
<li>Comfortable using Scrum, Kanban, or Lean methodologies.</li>
</ul>
<p> </p>
<p><strong>Education</strong></p>
<ul>
<li>Bachelor’s degree or college diploma in Computer Science or Information Systems, or equivalent experience.</li>
</ul>