    Senior Principal Engineer Site Reliability - Austin, United States - Hispanic Technology Executive Council

    Description
    Senior Principal Engineer Site Reliability
    Dell Technologies customers rely on our products and services to drive progress. So, we take the service we provide extremely seriously. Service Delivery is all about making sure our technical solutions help clients fulfil their priorities, challenges and initiatives. As trusted advisors, we build in-depth knowledge of what each client wants to achieve. Then we make sure the services delivered by Dell Technologies deliver on all our promises.

    We also work closely with Sales and Global Services colleagues to develop strategic account growth plans, and to identify and pursue sales opportunities.

    Join us to do the best work of your career and make a profound social impact as a Senior Principal Engineer - Site Reliability Engineering on our Service Delivery Team in Austin, Texas.
    What you'll achieve

    The Senior Principal Engineer - Site Reliability Engineering, supporting Artificial Intelligence/Machine Learning/High Performance Compute Solutions in Service Delivery, will be responsible for providing the primary management, administration, support, and ongoing maintenance of customer platforms within a 24x7x365 datacenter environment.

    This is a technical leadership role.

    The ideal candidate will play a crucial role in managing and supporting complex solutions and platforms for our prestigious Fortune 100 clients.

    In this role, you will be expected to work in a positive and collaborative fashion with fellow team members, senior engineering/architect staff, vendors, and customers.

    The Senior Principal Engineer will assist with process maturation, development, and technical standards creation, and will drive operational excellence through consistent delivery and best practices.


    You will:
    Serve as the top technical expert in deploying, upgrading, and troubleshooting Artificial Intelligence/Machine Learning/High Performance Compute Solutions platforms
    Manage and maintain container platform (Kubernetes, OpenShift) infrastructure, including installation, configuration, and upgrades, and optimize the system performance, capacity, and availability of the environment
    Act in the capacity of an SRE / DevOps expert

    Take the first step towards your dream career

    Every Dell Technologies team member brings something unique to the table.

    Here's what we are looking for with this role:

    Essential Requirements

    Hands-on experience working in an infrastructure managed services environment, supporting complex engineered solutions in production with Artificial Intelligence/Machine Learning/High Performance Compute Systems and Platforms and Converged/Hyper-Converged infrastructure, along with fluency in AI/ML pipelines, Nvidia GPU optimization, InfiniBand networking, Machine Learning operating systems such as , Compute Orchestration Platform such as runai etc
    Expert-level knowledge of cluster provisioning and resource schedulers
    Programming experience with Python, Go, Ruby, shell scripts, and PowerShell, along with hands-on experience with ELK, Prometheus, Grafana, Ansible, Git, or similar technologies
    Expertise in Kubernetes, OpenShift, Docker, Container Networking, and Cloud Native Platforms/Applications
    Strong networking fundamentals and a Converged Infra (CI)/Hyper-Converged Infra (HCI) management certification, along with hands-on experience with Azure Kubernetes Service (AKS), Amazon EKS, Google Kubernetes Engine (GKE), and Rancher
    Desirable Requirements
    BE or MS in Computer Science or Computer Engineering; an acceptable combination of equivalent industry experience will also be considered
    Certified Kubernetes / OpenShift Admin, NSX-T Certification
    Who we are
    We believe that each of us has the power to make an impact. That's why we put our team members at the center of everything we do.

    If you're looking for an opportunity to grow your career with some of the best minds and most advanced tech in the industry, we're looking for you.


    Dell Technologies is a unique family of businesses that helps individuals and organizations transform how they work, live and play.

    Join us to build a future that works for everyone because Progress Takes All of Us.

    Application closing date: 03/22/2024


    Dell Technologies is committed to the principle of equal employment opportunity for all employees and to providing employees with a work environment free of discrimination and harassment.

    Read the full Equal Employment Opportunity Policy.

    Job ID: R241321

    Dell's Flexible & Hybrid Work Culture

    At Dell Technologies, we believe our best work is done when flexibility is offered.


    We know that freedom and flexibility are crucial to all our employees, no matter where they are located. Our flexible and hybrid work style gives team members the freedom to ideate, be innovative, and drive results their way.

    To learn more about our work culture, please visit our page.



  • Oracle Austin, United States

    Solve complex problems related to infrastructure cloud services and build automation to prevent problem recurrence. Design, write, and deploy software to improve the availability, scalability, and efficiency of Oracle products and services. Design and develop designs, architectur ...


  • Apple Inc. Austin, United States

    Imagine what you could do here. At Apple, great ideas have a way of becoming great products, services, and customer experiences very quickly. Bring passion and dedication to your job and there's no telling what you could accomplish. Join the Apple Service Engineering team as a Sit ...


  • Apple Austin, United States

    Site Reliability Engineer - Ad Platforms · Austin, Texas, United States · Software and Services · At Apple, we work every day to build products that enrich people's lives. Our Advertising Platforms group makes it possible for people around the world to easily access informative a ...


  • Apple Austin, United States

    Summary · Posted: May 1, 2024 · Weekly Hours: 40 · Role Number: · At Apple, we work every day to build products that enrich people's lives. Our Advertising Platforms group makes it possible for people around the world to easily access informative and imaginative content on t ...


  • Pinnacle Group, Inc. Austin, United States

    Responsibilities · We are looking for an operations engineer to join the Crypto Services SRE team. The Crypto Services SRE team is responsible for systems and services that support a vast number of both Apple's internal services as well as services that users directly use. As an ...


  • Apple Austin, United States

    Site Reliability Engineer · Austin,Texas,United States · Software and Services · The Apple Information Apps Engineering teams power some of the most widely used Apple applications, such as Apple News, Stocks, Weather, and Books. We do this at a massive, global scale. We meet o ...


  • Virtu Financial Austin, United States

    Virtu is a leading financial firm that leverages cutting edge technology to deliver liquidity to the global markets and innovative, transparent trading solutions to our clients. As a market maker, Virtu provides deep liquidity that helps to create more efficient markets around th ...


  • Trellix Austin, United States

    Job Title: · Site Reliability Engineer - Fed Ramp · Role Overview: · We are seeking a Site Reliability Engineer who will improve and maintain software development, test and live infrastructure and services. You will articulate and have experience with Linux and other *NIX- der ...


  • Thales Austin, United States

    Location: Austin, United States of America · Thales people architect identity management and data protection solutions at the heart of digital security. Business and governments rely on us to bring trust to the billions of digital interactions they have with people. Our technolog ...


  • Apple Austin, United States

    Summary · Posted: May 23, 2024 · Weekly Hours: 40 · Role Number: · At Apple, we work every day to build products that enrich people's lives. Our Advertising Platforms group makes it possible for people around the world to easily access informative and imaginative content on ...


  • Apixio Austin, United States

    Who We Are At the intersection of health plans and providers, Apixio and ClaimLogiq are creating a leading Connected Care platform to minimize reimbursement inaccuracies and high-quality patient care so they can thrive as the industry moves toward value-based reimbursement models ...


  • Pinnacle Group Austin, United States

    Responsibilities · We are looking for an operations engineer to join the Crypto Services SRE team. The Crypto Services SRE team is responsible for systems and services that support a vast number of both Apple's internal services as well as services that users directly use. As an ...


  • Apple Austin, United States

    Site Reliability Engineer (SRE) Manager - Apple Cloud Services · Austin,Texas,United States · Software and Services · Imagine what you could do here. At Apple, great ideas have a way of becoming great products, services, and customer experiences very quickly. Bring passion and ...


  • SureCo Inc Austin, United States

    Job Type · Full-time · Description · Job Title: Site Reliability Engineer (SRE) · Location: Remote (comfortable working in the Pacific Time Zone) · SureCo is changing how people in the US take care of their health - in 2020, new regulations went into effect, allowing employe ...


  • Zenoss Austin, United States

    Zenoss is seeking an experienced Site Reliability Engineer (SRE) to join a team of Engineers and Architects creating breakthrough ITOps and AIOps Platform. We are seeking individuals with experience and knowledge in building software and tools to support our Ops and Support teams ...