Service Reliability Engineer - Santa Clara, United States - Software Technology Inc

Software Technology Inc Santa Clara, United States

3 weeks ago

Description

Job Description

Position: Service Reliability Engineer / Sr. DevOps Engineer

Location: Santa Clara, CA

Duration: 1 Year+

OK with any visa; no OPT, please
Local consultants only

The customer will not provide a letter for H1B candidates. Please check with the candidate and employers before submitting the resume. Face-to-face is mandatory, so please submit local candidates only.

Responsibilities:
Development and Operations (DevOps) subject matter expert for 24x7 SaaS operation

Work hand-in-hand with micro-service software developers, architects, and field integration resources to architect and deliver Ericsson's next generation TV platforms.

Contribute to the development of new tools and automation that ensure the service can be optimized and tuned with minimal human intervention.

Accountable for working upstream with micro service developers on monitoring, tools and architecture to deliver security, reliability, manageability and availability at scale
Point of escalation/decision maker on response level of incidents
Participate in the Core SRE on-call roster and respond with command and control incident management during High Pri Events while maintaining internal and external SLAs
Act as Technical Duty Officer who leads resolution effort of the most complex service problems from network layer to the application at scale
Drive Problem Management/Retrospectives ("post mortems")
Contribute strongly to and maintain our knowledge base
Analyze trends and make recommendations in the areas of monitoring, incident and change management, cloud orchestration and support.
Contribute to the future growth of the team by conducting candidate screenings and assessments
Accountable for deploying services to production environments

Technologies:
Experience with Docker and SaltStack, Kubernetes orchestration tools, etc.
Knowledge of MongoDB, Cassandra databases, Kafka, IIS Servers on Azure/AWS/Openstack
Azure, Openstack and AWS concepts and APIs
Experience designing, setting up, maintaining and refining (noise reduction, auditing) monitoring tools such as Prometheus, Prometheus exporters, Kibana, Grafana, Alertmanager, etc.
Demonstrable experience in one or more languages: PowerShell, Python, Bash, C#, .NET
Strong knowledge of TCP/IP networking, DNS, VPNs, HTTP, load-balancers (such as NGINX), highly available microservice architecture, CDNs
Team Foundation Server/Visual Studio, Atlassian suite (Jira, Confluence), Git
Analysis of network, performance and application issues using tcpdump, Fiddler and Wireshark.

Qualifications:
Bachelor's Degree in CS, MIS, or equivalent experience
5+ years of relevant experience with Windows/Unix systems fundamentals, monitoring, cloud services, networking, storage, database, and application knowledge
Solid communication skills, both written and verbal
Able to effectively tailor messaging to different audiences: external customers, leadership, technical SMEs, or Tier-1
Previous experience in customer facing roles during high stress situations
Demonstrated skills as an influencer within a previous organization
In-depth knowledge of IT concepts, strategies, and methodologies; Agile knowledge a plus
In-depth knowledge of business operations, objectives, and strategies.
Familiarity with Containers (e.g. Docker, RKT) and IaaS (e.g. AWS, Azure, Openstack).


