Site Reliability Engineer - Phoenix, United States - TEK Connexion

    Description
    Site Reliability Engineer - Phoenix, AZ

    This role is a great opportunity to join an organization that is just building out a full site reliability engineering function.

    That allows for growth, development, and long-term mobility.

    The position focuses more on the visibility and monitoring efforts of the SRE function, including orchestration and automation.


    Mandatory Skills:
    3-5 years' experience with most of the following or similar tools: Dynatrace, Datadog, AppDynamics, Entuity, ThousandEyes, SolarWinds, Splunk, LogScale, Grafana, Prometheus, Amazon CloudWatch, MS Azure

    3-5 years' understanding of programming languages (such as Python, Java, Go) and cloud platforms (such as AWS, GCP, Azure); proficient in Ansible or other automation/orchestration technologies.

    Experience establishing proactive monitoring, leveraging telemetry data to detect anomalies, identify potential issues before they impact users, and enable faster incident response.

    Extensive understanding of the complexities native to modern distributed systems, being well-versed in the challenges posed by microservices architectures, cloud-native environments, and hybrid infrastructure setups.


    Position Title:
    Site Reliability Engineer

    Position Location:
    Hybrid - 3 days in office, 2 days remote
    Phoenix, AZ

    Role and key responsibilities:
    Convert business requirements into technical deliverables
    Establish proactive monitoring, leveraging telemetry data to detect anomalies
    Develop products and views that enable Site Reliability Engineering
    Develop event-triggered orchestration and automation
    Develop user documentation and release notes
