Senior SRE Engineer - Belmont, United States - RingCentral

    RingCentral Belmont, United States

    2 weeks ago

    Description

    RingCentral's cloud-based communications platform connects more than 2 million users worldwide.

    Are you looking for an opportunity where your skills and passion make a difference and where your voice will be heard?

    We're the #1 global cloud-based communications provider, growing at more than 30% annually, and we're looking for team members with an entrepreneurial spark. We build a highly available cloud-based contact center that combines a full range of telephony features and many digital channels for communicating with customers (WhatsApp, Facebook, Twitter, email, SMS, etc.) into one service. We develop a modern and reliable product that helps companies be closer to their customers and respond to their requests as quickly and efficiently as possible.

    You will be a part of the team responsible for running our product and its cloud infrastructure. You will contribute to the product and infrastructure focusing on availability, maintainability, and scalability. You will apply the best practices of site reliability engineering, operational discipline, and automation.

    You should be motivated, organized, excited about technology and SaaS products, a thorough critical thinker, and relentless about code quality, scalability, latency, and platform stability. Our culture is motivational, constructive, and positive. We value teamwork, camaraderie, and collaboration. If you're up for a fun challenge, we want to hear from you.

    Technology Stack: AWS, Kubernetes (EKS), Aurora RDS (PostgreSQL/MySQL), Kafka, Argo CD, Prometheus, Jenkins, GitLab CI, Terraform, Ansible, Python, Java, Ruby.

    Responsibilities:

    Design, plan, and implement an HA and cost-effective cloud infrastructure with an IaC approach

    Develop, scale, and maintain automated CI/CD processes using a GitOps approach

    Increase service automation to improve maintainability, scalability, and engineering productivity

    Plan system capacity and develop tooling for on-demand product scaling

    Troubleshoot and resolve software and technical issues, participate in incident resolution, and perform root cause analysis

    Participate in the on-call rotation

    Plan disaster recovery procedures and develop automation for fast and reliable service restoration

    Implement security and compliance requirements

    Work with development and architecture teams to improve service observability and performance, eliminate logging and monitoring blind spots, and suggest architectural and process improvements

    Evaluate and adopt new cloud-native technologies

    Qualifications:

    5+ years of technical experience in the same or a similar role supporting large-scale, high-load cloud-based production systems

    Experience developing and supporting public cloud infrastructure

    Hands-on experience running HA applications and developing CI/CD processes on Kubernetes

    Proven programming skills in Python, Go, or a similar language

    Good knowledge of the Linux environment, TCP/IP, network routing, and DNS

    Familiarity with SRE principles, DevOps practices, and the modern cloud-native landscape

    Accuracy, attention to detail, and the ability to follow processes

    Good communication skills

    Experience with contact center or VoIP solutions is a HUGE plus

    Ability to read and troubleshoot Java code when needed is a plus

    Experience with SQL/NoSQL databases, or a willingness to develop skills in this area, is a plus