Senior Site Reliability Engineer - Chicago, United States - R2 Global

    Description


    As a Senior Site Reliability Engineer, you will be responsible for designing, implementing, and maintaining highly available, secure, and scalable cloud-based systems.

    You will work closely with cross-functional teams to support our cloud transformation initiatives, leveraging AWS services and a suite of monitoring and observability tools including Splunk, Datadog, Grafana, and Prometheus.


    Key Responsibilities:


    - Design, build, and maintain scalable and reliable infrastructure on AWS cloud.
    - Optimize monitoring, alerting, and logging solutions using Splunk, Datadog, Grafana, and Prometheus to ensure proactive identification and resolution of issues.
    - Work with development and operations teams to automate deployment, configuration, and testing processes.
    - Conduct regular performance analysis and capacity planning to support business growth and ensure optimal resource utilization.
    - Take part in incident response and post-mortem activities to identify root causes and implement preventative measures.
    - Stay abreast of emerging technologies and best practices in cloud infrastructure, SRE, and DevOps methodologies.


    Qualifications:


    - Years of experience in a Site Reliability Engineer or similar role.
    - Expertise in designing, deploying, and managing cloud-based infrastructure on AWS.
    - Scripting and automation using Python, Bash, or similar languages.
    - Experience with monitoring and observability tools such as Splunk, Datadog, Grafana, and Prometheus.
    - Understanding of networking, security, and scalability principles in a cloud environment.
    - Troubleshooting and problem-solving skills, with a focus on root cause analysis.
    - Experience in the financial industry or other regulated environments is a plus.

    This is a hybrid position based out of Chicago or Dallas.