Site Reliability Engineer Cloud Engineer - Berkeley Heights, United States - Veterans Sourcing Group

    Veterans Sourcing Group Berkeley Heights, United States

    3 weeks ago

    Description

    Job Description

    JOB TITLE:
    Site Reliability Engineer Cloud Engineer





    LOCATION:
    Berkeley Heights, NJ


    Main Skills:
    AWS tech stack, Elastic Kubernetes Service (EKS), networking, load balancing, SQL databases, IAM security, Terraform


    In this role, you will help build the technology responsible for our core services, with a focus on the Amazon Web Services (AWS) cloud ecosystem.

    Your work will influence the success of companies across the world.

    Members of our Technology team are experts in the field, working to evaluate and improve today's systems while building tomorrow's.

    Additionally, you will use your analytical and troubleshooting skills to participate in projects, maintain support continuity, and take part in a rotating on-call escalation schedule.

    You will also lead the detection and troubleshooting of issues that affect the delivery of services for industry-leading Digital Consumer Experience products.

    Teamwork and creativity are essential to this role.

    As a go-to point of escalation, you will be challenged to resolve cardholder-impacting issues in a fast-paced environment.


    Essential Role Responsibilities:

    • Provide hands-on support for existing environments, including the following tasks: software installation, patch installation, upgrades, query writing, configuration, security, system monitoring and tuning, disaster recovery planning, and release deployments.
    • Manage and upgrade Amazon EKS clusters.
    • Implement tools and automation for build, configuration management, continuous integration (CI), deployment, and application monitoring.
    • Automate and evolve infrastructure, deployment strategies, and testing to support a quick turnaround of deployments.
    • Maintain the Infrastructure as Code (IaC) that provisions, configures, and scales infrastructure in cloud environments.
    • Work closely with Engineering to ensure all relevant KPIs are implemented within the monitoring framework.
    • Participate in all Production Support activities during incidents and outages. Serve as a hands-on technical resource capable of resolving technical issues in lower and upper environments and making recommendations for performance and capacity improvements.
    • Participate in capacity planning, stability tuning, provisioning, performance, and scaling of the application infrastructure.
    • Bring a desire to resolve issues in a 24x7 environment quickly and without impacting service.


    Basic Qualifications for Consideration:

    • Bachelor's degree required; relevant, equivalent work experience may be substituted for the degree requirement

    • Experience with the AWS tech stack:
    o Elastic Kubernetes Service (EKS) / ElastiCache / KMS
    o Networking: VPC / Subnets / Route Tables / Security Groups / Flow Logs
    o Compute: EC2 / Auto Scaling / Load Balancers
    o Database: SQL / RDS / DynamoDB / PostgreSQL
    o Other: IAM / S3 / CloudWatch / CloudFront


    • Working experience with Infrastructure as Code tools such as Terraform or AWS CloudFormation
    • Experience working with third-party vendors
    • Able to work effectively, both independently and as a member of a cross-functional team
    • Demonstrate a desire to automate as much as possible
    • Able to participate in an on-call rotation

    PREFERRED QUALIFICATIONS:

    • Containerization technologies: Docker or Podman
    • Continuous Integration / Continuous Delivery tools: AWS CodePipeline, Azure DevOps
    • Linux system administration – ability to manage and troubleshoot Linux systems and services, bash scripting
    • Networking: solid understanding of routing and networking concepts
    • Experience working in an Agile development environment
    • Web servers: Nginx or Apache configurations and reverse proxies
    • Collaboration platforms such as JIRA, Confluence, Wiki, ServiceNow