Lead Site Reliability Engineer

EPAM Systems

  • Kazakhstan
  • Permanent position
  • Full-time
  • 1 month ago
We are seeking a highly skilled Lead Site Reliability Engineer to join our team. The ideal candidate will have a strong background in software engineering and systems engineering, with a focus on reliability and scalability in cloud environments, specifically Azure.

Unlock the potential of remote work in Kazakhstan, giving you the flexibility to work from home or access our offices in Astana, Almaty or Karaganda.

Responsibilities
  • Design, implement, and maintain highly available and scalable systems across multi-region Azure cloud architectures
  • Ensure disaster recovery plans are in place and tested regularly
  • Configure and enhance monitoring and alerting processes using Prometheus, Grafana, Alertmanager, and OpsGenie
  • Develop dashboards to visualize system performance and reliability metrics
  • Utilize Terraform for infrastructure provisioning and management
  • Implement best practices for continuous deployment and infrastructure changes
  • Work closely with the development team to support ongoing development efforts
  • Communicate with the customer's DevOps team to elaborate on requirements and collaborate on implementations
  • Enhance release management and CI/CD processes using Jenkins
  • Improve system security based on recommendations from the security team
  • Write and test runbooks to streamline operational tasks and incident response
  • Manage and optimize services running on Kubernetes, Docker/Linux environments
  • Handle data persistence using Cosmos DB (Mongo API & SQL API) and MS SQL Server
  • Work with messaging systems like RabbitMQ, Kafka, and EventHub
  • Utilize Azure Networking for secure and efficient communication

Requirements
  • 5+ years experience as a DevOps or SRE engineer
  • Proven experience with multi-region Azure cloud architectures
  • Proficiency in Kubernetes and containerization technologies
  • Strong knowledge of Cosmos DB (both Mongo API & SQL API) and MS SQL Server
  • Familiarity with monitoring tools like Prometheus, Grafana, Alertmanager, OpsGenie
  • Experience with .NET Core and ASP.NET Core applications
  • Competency in Docker and Linux environments
  • Expertise in Terraform for infrastructure as code
  • Experience with CI/CD tools
  • Solid understanding of Azure Networking concepts
  • Excellent communication skills, both verbal and written
  • Strong self-motivation and ability to self-manage tasks and projects

Nice to have
  • Experience with Azure IoT Hub and EventHub

We offer/Benefits
We connect like-minded people:
  • Delivering innovative solutions to industry leaders, making a global impact
  • Enjoyable working environment, whether it is the vibrant office or the comfort of your own home
  • Opportunity to work abroad for up to two months per year
  • Relocation opportunities within our offices in 55+ countries
  • Corporate and social events
We invest in your growth:
  • Leadership development, career advising, soft skills and well-being programs
  • Certifications, including GCP, Azure and AWS
  • Unlimited access to LinkedIn Learning and Get Abstract
  • Free English classes with certified teachers
  • Discounts in local language schools, including online courses for the Kazakh language
We cover it all:
  • Participation in the Employee Stock Purchase Plan
  • Monetary bonuses for engaging in the referral program
  • Medical & family care package
  • Six trust days per year (sick leave without a medical certificate)
  • Coverage of psychology sessions of your choice
  • Benefits package (sports activities, a variety of stores and services)
EPAM is a team of technologists and innovators united by a passion for technology. In Kazakhstan, we operate across all cities with offices in Astana, Almaty, and Karaganda and work with the world's leading companies from different industries. In 2023, EPAM received the Export Excellence Award at the esteemed Digital Bridge Awards, showcasing our commitment to excellence and innovation.
