Site Reliability Engineer - Phoenix, United States - Tek Doors

    4 weeks ago

    Job Description

    Site Reliability Engineer
    Phoenix, AZ
    24+ months
    Onsite (Hybrid)
    Contract, W2
    9+ years of experience

    Must-Have:
    Experience with Observability/Monitoring technologies like Splunk, SignalFx, Splunk On-Call, Rigor, and Azure Monitoring
    Experience with one or more Cloud Platforms (Azure, GCP, AWS)
    Experience with Container technologies: Kubernetes, Docker, AKS
    Experience setting up monitoring for infrastructure, applications, and databases
    3+ years of systems support analysis experience demonstrated through work or military experience
    2+ years of experience with one or more Agile tools used for tracking user stories or backlogs, such as JIRA
    Excellent verbal, written, and interpersonal communication skills


    Desired Qualifications:
    Ability to interact with all levels of an organization, including management
    Strong team or technical leadership experience
    Strong verbal, written, and interpersonal communication skills
    3+ years of experience with Cloud technologies
    Incident Management System experience
    Configuration Management Tools experience
    Experience with Agile Scrum (Daily Standup, Sprint Planning, and Sprint Retrospective meetings) and Kanban


    Responsibilities:
    Introduce enterprise capabilities, tools, and innovation to improve availability in a multi-cloud ecosystem by evolving observability, monitoring, logging, and CI/CD integration (performance, smoke, regression, functional, chaos, and environment propagation through automatic deployments)
    Introduce continuous improvement, standardization/automation, and capabilities to conduct destructive and resiliency testing
    Consistent track record of troubleshooting and resolving issues in live production environments and implementing strategies to eliminate them
    Driven approach to continually improving service levels
    Build and manage systems, infrastructure, and applications through automation
    Deploy, support, and monitor new and existing services, platforms, and application stacks
    Engage in improving the whole lifecycle of services from inception through deployment, operations, and refinement
    Provide hands-on technical expertise during service-impacting events
    Collaborate with other engineers on code reviews, internal infrastructure improvements, and process enhancements
    Use scalability testing to measure, tune, and optimize system performance
    Automate key SRE metrics and IT Service Operations processes, including customer impact, % availability of critical business flows, SLO/SLI adherence, and error budget
    Automate the incident process for IT Service Operations through data integration with unified communications and alerting/notification systems
    Participate in periodic 24x7 on-call duties

    Share support responsibilities for critical applications and customer journeys onboarded to SRE, including remediating issues through Agile, conducting blameless postmortems and root cause analysis, and introducing continuous improvements that solve problems once and for all, with the goal of no repeats

    Tekdoors Inc. is a leading staffing and IT consulting firm with a global presence and 16+ years of IT consulting experience. Headquartered in Arizona, we are rated among the top 10 emerging IT consulting companies of 2024.

    Our mission is to provide top-quality IT and talent solutions to businesses of all sizes, helping them achieve their goals and stay ahead of the competition.

    With offices and operations worldwide, we have the expertise and resources to deliver customized solutions that meet the unique needs of our clients.

    Our team of experienced recruiters and consultants has a deep understanding of the IT industry and its ever-changing demands.
