- Write automation code for provisioning and operating infrastructure at massive scale
- Design, build and operate Cloud infrastructure to enable reliable and rapid deployment of microservices with effective monitoring and resilient operations
- Work with development teams to make sure the applications are production ready, scalable and reliable from the ground up
- Identify and drive opportunities to improve automation for code deployment, management, and visibility of application services
- Develop tools and frameworks to automate operational tasks and the deployment of machines, services, and applications
- Establish end-to-end monitoring and alerting on all critical components of the application
- Participate in the on-call rotation supporting the platform and/or the production application
- Direct root cause analysis of critical business and production issues
- Develop and mentor other SREs on standard methodology, from infrastructure orchestration to troubleshooting application services in production
- Represent SRE in design reviews and work cross-functionally with Engineering teams on operational readiness
- Expertise in configuration management with frameworks such as Terraform, Ansible, and Helm
- Strong Linux administration, internals, and network troubleshooting
- Experience in DevOps, Site Reliability, or infrastructure engineering
- Expertise in Google Cloud Platform (GCP) and its related services
- Proficiency with a programming language like Python and shell scripting to automate tasks
- Strong experience with CI/CD pipelines, GitHub, Jenkins, and Artifactory
- Ability to diagnose and troubleshoot complex distributed systems handling high-volume transactions
- Strong fundamentals in HTTP including HTTP headers and web servers
- BS or MS in Computer Science or a related field, or equivalent professional or military experience, required
- Excellent problem solving, critical thinking, communication, and teamwork skills
- Excellent written and verbal communication, able to collaborate and rally support
- Self-disciplined, self-managed, and self-motivated, with a strong sense of ownership, urgency, and drive
- Passion for automation and monitoring instrumentation as code
- Excellent interpersonal skills and the ability to work well in a team
- Passion for learning, understanding, and dissecting new technology stacks quickly and independently
- Experience building and managing large relational database clusters (MySQL, Percona, etc.) is a plus
Sr Principal Site Reliability Engineer - Santa Clara, United States - Palo Alto Networks
Description
Company Description
Our Mission
At Palo Alto Networks everything starts and ends with our mission:
Being the cybersecurity partner of choice, protecting our digital way of life.
Our vision is a world where each day is safer and more secure than the one before. We are a company built on the foundation of challenging and disrupting the way things are done, and we're looking for innovators who are as committed to shaping the future of cybersecurity as we are.
Our Approach to Work
We lead with flexibility and choice in all of our people programs. We have disrupted the traditional view that all employees have the same needs and wants. We offer personalization and give our employees the opportunity to choose what works best for them as often as possible, from wellbeing support to growth and development, and beyond.
At Palo Alto Networks, we believe in the power of collaboration and value in-person interactions. This is why our employees generally work from the office three days per week, leaving two days for choice and flexibility to work where you feel most effective. This setup fosters casual conversations, problem-solving, and trusted relationships. While details may evolve, our goal is to create an environment where innovation thrives, with office-based teams coming together three days a week to collaborate and thrive.
Job Description
Your Career
Palo Alto Networks has been rapidly moving towards the future where cloud-based applications are increasingly common. As a Site Reliability Engineer, you will develop the frameworks and pathways to help move our internal applications to microservices. You will be a critical link between engineering and the Infrastructure Platform, building Infrastructure as Code and working in partnership with the App developers to deploy the applications in GCP, AWS and data centers across the globe.
As a member of the SRE team, you will work on producing mission-critical platforms, tools, and processes that will ensure the highest levels of availability and reliability of all our applications. We need creative and innovative problem solvers who can partner with our Application development teams to make their services more usable. Our SRE team is furnished with a standout opportunity to build tools, frameworks, and cloud platforms that will support our company's growth over the next decade. If you are a self-starter and jump on new ideas to make the platform more stable, secure and feature-rich, this is your new career.
Your Impact
Your Experience
The Team
Our engineering team is at the core of our products and connected directly to the mission of preventing cyberattacks. We are continually innovating — challenging the way we, and the industry, think about cybersecurity. Our engineers don't shy away from building products to solve problems no one has pursued before.
We define the industry instead of waiting for directions. We need individuals who feel comfortable in ambiguity, excited by the prospect of a challenge, and motivated by the unknown risks facing our everyday lives that are only mitigated by a secure digital environment.
Our Commitment
We're trailblazers that dream big, take risks, and challenge cybersecurity's status quo. It's simple: we can't accomplish our mission without diverse teams innovating, together.
We are committed to providing reasonable accommodations for all qualified individuals with a disability. If you require assistance or accommodation due to a disability or special need, please contact us at
Palo Alto Networks is an equal opportunity employer. We celebrate diversity in our workplace, and all qualified applicants will receive consideration for employment without regard to age, ancestry, color, family or medical care leave, gender identity or expression, genetic information, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran status, race, religion, sex (including pregnancy), sexual orientation, or other legally protected characteristics.
All your information will be kept confidential according to EEO guidelines.
The compensation offered for this position will depend on qualifications, experience, and work location. For candidates who receive an offer at the posted level, the starting base salary (for non-sales roles) or base salary + commission target (for sales/commissioned roles) is expected to be between $170,000/yr and $275,000/yr. The offered compensation may also include restricted stock units and a bonus. A description of our employee benefits may be found here.