-
Site Reliability Engineering
2 weeks ago
MetroStar Washington, United States · As a Site Reliability Engineering (SRE) Lead, you'll deliver mission-critical services that empower end users. As the ideal candidate, you'll use your extensive experience designing and implementing end-to-end continuous delivery pipelines and experience in AI/ML. You will also u ...
-
Site Reliability Engineer
2 weeks ago
Mount Indie Washington, United States · Job Description · As a Site Reliability Engineer (SRE), you'll continuously drive improvements in observability, performance, and reliability, with the goal of making an impact across the federal government. This role requires a current TS/SCI that has been obtained wit ...
-
Site Reliability Engineer III
3 days ago
GM Financial Arlington, United States · Overview: · This is a Hybrid Opportunity at our Arlington, TX office (3 days remote, 2 days onsite) · Why GMF Technology? · GM Financial is set to change the auto finance industry and is leading the path of embarking on tech modernization; we have a startup mindset, and preserv ...
-
Expert Site Reliability Engineer
2 weeks ago
Allscripts Washington, United States · Welcome to Veradigm, where our Mission is transforming health, insightfully. Join the Veradigm team and help solve many of today's healthcare challenges being addressed by biopharma, health plans, healthcare providers, health technology partners, and the patients they serve. At V ...
-
Expert Site Reliability Engineer
4 weeks ago
Allscripts Washington, United States · Welcome to Veradigm. Our Mission is to be the most trusted provider of innovative solutions that empower all stakeholders across the healthcare continuum to deliver world-class outcomes. Our Vision is a Connected Community of Health that spans continents and borders. With the larg ...
-
Site Reliability Engineer
3 days ago
Booz Allen Hamilton Alexandria, United States · Job Number: R · Site Reliability Engineer · The Opportunity: · Do you love finding ways to make systems more efficient? Do you find it impossible to simply maintain when you could improve? Engineering to make a system more resilient and efficient frees up time and money to bui ...
-
Site Reliability Engineer
1 week ago
Innovative Computer Solutions Group, Inc Alexandria, United States · Job Description · Site Reliability Engineer (SRE) mandatory skills/qualifications: · Must be a US Citizen · Must possess minimum 3+ years of actual experience in the industry in an SRE role · Must possess minimum 10+ years of software engineer experienc ...
-
Site Reliability Engineer
2 weeks ago
Parsons Oman Alexandria, United States · We harness the power of innovation so that you can change the world and help our customers solve their most complex challenges · In a world of possibilities, pursue one with endless opportunities. Imagine Next · When it comes to what you want in your career, if you can imagine it, y ...
-
REMOTE - Site Reliability Engineer
2 weeks ago
Harbor Compliance Washington, United States · Job Description · Site Reliability Engineer - Full-time Remote · Advance Your Career with Cutting-Edge Infrastructure at Harbor Compliance · Location: Full-time Remote (Excluding CA, CO, MT, NY) · About Harbor Compliance: · Harbor Compliance is committed to simplif ...
-
Department of Corrections Executive Leadership Washington, United States · Introduction · The Department of Corrections (DOC) is focused on public safety through the custody and supervision of those in our care. Corrections employees have the opportunity to positively impact the lives of others through careers in a variety of fields. Using cutting-edge ...
-
Site Reliability Engineer
4 weeks ago
ARCADIS Farragut, United States · Job Description · Arcadis is the world's leading company delivering sustainable design, engineering, and consultancy solutions for natural and built assets. · We are more than 36,000 people, in over 70 countries, dedicated to improving quality of life. Everyone has an important ...
-
Site Reliability Engineer III
3 weeks ago
General Motors Financial Company, Inc. Arlington, United States · About this role: The Site Reliability Engineering (SRE) team provides leadership, direction, and accountability for building and running large-scale software systems. As a Site Reliability Engineer, you will identify and deliver automation solutions Reliability Engineer, Liabilit ...
-
Site Reliability Engineer
1 week ago
Azimuth Corporation Springfield, United States · Job Description · Azimuth Corporation is seeking a Site Reliability Engineer, in support of a government customer in Springfield, VA. The ideal candidate will create capabilities (pipelines, containers, auditing/monitoring, HA, SLO/SLA policy docs) and maintain exi ...
-
Site Reliability Engineer
1 day ago
Azimuth Corporation Springfield, United States · Job Description · Azimuth Corporation is seeking a Site Reliability Engineer, in support of a government customer in Springfield, VA. The ideal candidate will create capabilities (pipelines, containers, auditing/monitoring, HA, SLO/SLA policy docs) and maintain exi ...
-
Site Reliability Engineer II
1 week ago
General Motors Financial Company, Inc. Arlington, United States · About this role: The Site Reliability Engineering (SRE) team provides leadership, direction, and accountability for building and running large-scale software systems. As a Site Reliability Engineer, you will identify and deliver automation solutions Reliability Engineer, Liabilit ...
-
Senior Site Reliability Engineer
4 days ago
Automox Arlington, United States · Are you ready to own something big? Automox is turning IT admins into IT heroes by replacing traditional tools with our award-winning cloud-native endpoint management platform. Our product works autonomously and so do our teams. We value a 'one team' mentality where everyone's un ...
-
Site Reliability Engineer
2 weeks ago
Halvik Vienna, United States · Job Description · Halvik is a highly successful company that puts people first, and we are looking for someone just like you. We are committed to delivering smarter IT-driven solutions bolstered by quality and innovation to help our customers succeed. Come be a par ...
-
Site Reliability Engineer
1 week ago
Halvik Vienna, United States · Halvik is a highly successful company that puts people first, and we are looking for someone just like you. We are committed to delivering smarter IT-driven solutions bolstered by quality and innovation to help our customers succeed. Come be a part of something truly special · S ...
-
Site Reliability Engineer
6 days ago
Booz Allen Hamilton Chantilly, United States · Full time · Site Reliability Engineer · The Opportunity: · Do you love finding ways to make systems more efficient? Do you find it impossible to simply maintain when you could improve? Engineering to make a system more resilient and efficient frees up time and money to build more capabilities. ...
-
Site Reliability Engineer
10 hours ago
Oracle Reston, United States · Work with Site Reliability Engineering (SRE) team on the shared full stack ownership of a collection of services and/or technology areas. Understand the end-to-end configuration, technical dependencies, and overall behavioral characteristics of production services. Responsible fo ...
Senior Principal Engineer Site Reliability - Arlington, United States - Dell
Description
Senior Principal Engineer Site Reliability
Dell Technologies customers rely on our products and services to drive progress. So, we take the service we provide extremely seriously. Service Delivery is all about making sure our technical solutions help clients fulfil their priorities, challenges and initiatives. As trusted advisors, we build in-depth knowledge of what each client wants to achieve. Then we make sure the services delivered by Dell Technologies deliver on all our promises.
We also work closely with Sales and Global Services colleagues to develop strategic account growth plans, and to identify and pursue sales opportunities.
Join us to do the best work of your career and make a profound social impact as a Senior Principal Engineer - Site Reliability Engineering on our Service Delivery Team in Austin, Texas.
What you'll achieve
The Senior Principal Engineer - Site Reliability Engineering, supporting Artificial Intelligence/Machine Learning/High Performance Compute Solutions in Service Delivery, will be responsible for the primary management, administration, support, and ongoing maintenance of customer platforms within a 24x7x365 datacenter environment.
This is a technical leadership role. The ideal candidate will play a crucial role in managing and supporting complex solutions and platforms for our prestigious Fortune 100 clients.
The role will be expected to work in a positive and collaborative fashion with fellow team members, senior engineering/architect staff, vendors, and customers.
The Senior Principal Engineer will assist with process maturation, development, and technical standards creation, and will drive operational excellence through consistent delivery and best practices.
You will:
Serve as the top technical expert in deploying, upgrading, and troubleshooting Artificial Intelligence/Machine Learning/High Performance Compute Solutions platforms
Manage and maintain container platform (Kubernetes, OpenShift) infrastructure, including installation, configuration, and upgrades, and optimize the performance, capacity, and availability of the environment
Act in the capacity of an SRE / DevOps expert
Take the first step towards your dream career
Every Dell Technologies team member brings something unique to the table.
Here's what we are looking for with this role:
Essential Requirements
Hands-on experience working in an infrastructure managed services environment, supporting complex engineered solutions in production with Artificial Intelligence/Machine Learning/High Performance Compute systems and platforms and Converged/Hyper-Converged infrastructure, along with fluency in AI/ML pipelines, Nvidia GPU optimization, InfiniBand networking, machine learning operating systems, and compute orchestration platforms such as Run:ai
Expert-level knowledge of cluster provisioning and resource schedulers
Programming experience with Python, Go, Ruby, shell scripts, and PowerShell, along with hands-on experience with ELK, Prometheus, Grafana, Ansible, Git, or similar technologies
Expertise in Kubernetes, OpenShift, Docker, container networking, and cloud-native platforms/applications
Strong networking fundamentals, a Converged Infrastructure (CI)/Hyper-Converged Infrastructure (HCI) management certification, and hands-on experience with Azure Kubernetes Service (AKS), Amazon EKS, Google Kubernetes Engine (GKE), and Rancher
Desirable Requirements
BE or MS in Computer Science or Computer Engineering; an acceptable combination of equivalent industry experience will be considered
Certified Kubernetes/OpenShift Administrator, NSX-T certification
Who we are
We believe that each of us has the power to make an impact. That's why we put our team members at the center of everything we do.
If you're looking for an opportunity to grow your career with some of the best minds and most advanced tech in the industry, we're looking for you.
Dell Technologies is a unique family of businesses that helps individuals and organizations transform how they work, live and play.
Application closing date: 03/22/2024
Dell Technologies is committed to the principle of equal employment opportunity for all employees and to providing employees with a work environment free of discrimination and harassment.