- Responsible for reliability and support of the container platform on-prem and in external clouds (Azure / AWS / Google).
- Monitor and troubleshoot container platform (OpenShift) and Azure (AKS) performance, connectivity, and security issues.
- Perform deep dives into systemic and latent reliability issues; handle incident management and problem management.
- Identify, analyze, and resolve infrastructure vulnerabilities and application deployment issues.
- Perform blameless RCA and partner with engineering and operations teams across the organization to roll out fixes.
- Onboard applications and provide troubleshooting support throughout their lifecycle on the container platform.
- Identify and drive opportunities to improve automation, reduce toil, and improve operational excellence.
- Partner with risk and compliance teams to bring visibility, implement the right controls, and remediate vulnerabilities.
- Ensure resiliency during implementation and identify and fix resiliency problems by collaborating with engineering teams.
- Be a key stakeholder in the design of cloud services and work with architecture, engineering, and product teams.
Requirements:
- BS / MS degree in Computer Science or a related technical field involving systems, or equivalent practical experience.
- 5+ years of hands-on experience supporting Kubernetes / OpenShift / AKS / EKS container platforms.
- Experience with Python, Ansible, Golang, and shell scripting
- Kubernetes / OpenShift / Terraform certifications are a plus
- Strong experience in major services related to Compute, Storage, Network and Security
- Experience with monitoring tools like Prometheus and Dynatrace, as well as cloud native tools like Azure Monitor and Log Analytics
- Strong understanding of, and experience working with, complex IAM infrastructure, including Active Directory, Azure AD Connect, Azure AD, and Ping Identity or other SSO solutions.
- Advanced knowledge of Linux OS, DNS, DHCP, Kerberos and Windows Authentication
- Experience with CI/CD tools (Git / Jenkins) and the GitOps model
- Excellent understanding of Linux / Windows operating systems administration
- Experience in Container security and vulnerability remediation.
- Systematic problem-solving approach, sense of ownership and drive
- Ability to juggle competing priorities and adapt to changes in project scope.
- Excellent interpersonal, organizational and communication (written, verbal, and presentation) skills are a must.
- Proven ability to work independently with minimal supervision and as part of a team with direct responsibilities.
- Experience in OpenShift and CSP Kubernetes services such as AKS and EKS
- Experience in Terraform, ArgoCD, Tekton, and Knative technologies.
- Experience in agile deployment methodologies (GitOps)
- Knowledge of various container runtimes
- Familiarity with the operator deployment pattern.
- Experience working in a highly available multi-datacenter environment
- Experience working with monitoring tools such as Prometheus, Splunk, Dynatrace, Sysdig, or similar tools.
- Understanding of cost management, inventory management, and the FinOps model
location: Plano, Texas
job type: Contract
salary: $ per hour
work hours: 8am to 5pm
education: Bachelors
responsibilities:
- Onboard applications and provide troubleshooting support throughout their lifecycle on the container platform.
- Identify and drive opportunities to improve automation, reduce toil, and improve operational excellence.
qualifications:
- Experience level: Experienced
- Minimum 5 years of experience
- Education: Bachelors
skills:
- Reliability
- OpenShift (5 years of experience is required)
- Kubernetes (5 years of experience is required)
- Python (5 years of experience is required)
- Ansible (5 years of experience is required)
- CI/CD (5 years of experience is required)
- Azure (5 years of experience is required)
Equal Opportunity Employer: Race, Color, Religion, Sex, Sexual Orientation, Gender Identity, National Origin, Age, Genetic Information, Disability, Protected Veteran Status, or any other legally protected group status.
At Randstad Digital, we welcome people of all abilities and want to ensure that our hiring and interview process meets the needs of all applicants. If you require a reasonable accommodation to make your application or interview experience a great one, please contact
Pay offered to a successful candidate will be based on several factors including the candidate's education, work experience, work location, specific job duties, certifications, etc. In addition, Randstad Digital offers a comprehensive benefits package, including health, an incentive and recognition program, and 401K contribution (all benefits are based on eligibility).
Applications accepted on an ongoing basis until filled.
Container Platform Reliability Engineer - Plano, United States - Randstad USA
Description
job summary: I cannot work corp-to-corp (C2C) or with vendors. Direct W2 only.
Please send resume to:
Job Description: