
    Site Reliability Engineer 1 - Cupertino, United States - Juniper Networks

    Job Description


    Juniper is seeking a full-time SRE to join our talented team and support high-quality technology solutions that revolutionize wireless and wired networks, powered by Artificial Intelligence in the cloud.

    Juniper provides services through SaaS applications to several enterprises, including Fortune 100 and Fortune 500 customers. You will be responsible for maintaining and improving the company's production environment for rapid scaling and outstanding performance, and for keeping cloud uptime and reliability high. Your primary responsibilities will be incident management and release management across cloud instances in various regions.

    Responsibilities
    Maintain system availability, health, and service levels (SLAs, SLOs) of the large-scale cloud infrastructure running in AWS and GCP.

    Support infrastructure components, data streaming frameworks and databases, such as Kubernetes, Flink, Storm, Spark, Kafka, Cassandra, Elasticsearch, Redis, Postgres, ArangoDB, and many others.

    Monitor, troubleshoot, and analyze failures, and support software engineers in debugging production issues across microservices and distributed platforms. Work with the development team to resolve the issues found.
    Join the on-call rotation and resolve issues in a 24x7 multi-cloud (AWS/GCP) environment.
    Monitor metrics and performance of applications and cloud infrastructure.
    Handle the entire incident management lifecycle: reporting, analyzing, and handling incidents through closure, and writing RCAs.
    Write and update runbooks for knowledge-driven automated processes and bots.
    Perform capacity planning based on performance, usage, and utilization stats.
    Follow SRE best practices and procedures.

    Required Skills
    Bachelor's degree in computer science or computer engineering or equivalent.

    1+ years of hands-on experience with AWS or GCP, EC2 (GCE), IAM, S3 (GCS), Docker, Kubernetes pods, Jenkins, Prometheus, CloudWatch (Stackdriver), Linux, Ansible.

    1+ years of experience deploying code and infrastructure in AWS or GCP using continuous integration/continuous delivery (CI/CD) tools in production environments.

    1+ years of experience administering distributed computation and streaming frameworks such as Kafka, Cassandra, Elasticsearch, Flink, Storm, and Spark, and cloud services such as EMR, Dataproc, ElastiCache, AWS RDS, GCP SQL, or similar.

    1+ years of automation using Python, Golang, and/or Rust, plus shell scripting.
    1+ years of experience developing metrics to monitor the health of infrastructure and applications.
    Good understanding of Terraform, CloudFormation, or another IaC tool is preferred.

    Nice to have
    Any open-source development experience.
    AIOps/GenAI experience.
    Automation using workflow services and tools such as GitHub Actions, Google Workflows, Jenkins, GitLab, Slack, and Confluence/Jira.
    Microservices release operations experience.


    Minimum Salary: $88,800.00
    Maximum Salary: $127,650.00


    The pay range for this position is expected to be between $88,800.00 and $127,650.00/year; however, the base pay offered may vary depending on multiple individualized factors, including market location, job-related knowledge, skills, and experience.

    The total compensation package for this position also includes medical benefits, 401(k) eligibility, vacation, sick time, and parental leave. Additional details of participation in these benefit plans will be provided if an employee receives an offer of employment.


    If hired, the employee will be in an "at-will" position, and the Company reserves the right to modify base salary (as well as any other payment or compensation program) at any time, including for reasons related to individual performance, Company or individual department/team performance, and market factors.

    Juniper's pay range data is provided in accordance with local state pay transparency regulations. Juniper may post different minimum wage ranges for permanent residency petitions pursuant to US Department of Labor requirements.

  • Apple

    Reliability Engineer

    23 hours ago


    Apple Cupertino, United States

    Summary · Posted: Apr 13, 2024 · Weekly Hours: 40 · Role Number: · Do you ever wonder what goes into making Apple products an amazing user experience? Apple's innovative reliability team is responsible for ensuring that our products exceed our customers' expectations for r ...

  • Apple

    Reliability Engineer

    13 hours ago


    Apple Cupertino, United States

    Summary · Posted: Apr 13, 2024 · Weekly Hours: 40 · Role Number: · Do you ever wonder what goes into making Apple products an amazing user experience? Apple's innovative reliability team is responsible for ensuring that our products exceed our customers' expectations for rob ...


  • Apple Cupertino, United States

    Reliability Engineer · Cupertino, California, United States · Hardware · Do you ever wonder what goes into making Apple products an amazing user experience? Apple's innovative reliability team is responsible for ensuring that our products exceed our customers' expectations for rob ...


  • Lawrence Harvey Sunnyvale, United States

    Site Reliability Engineer · Status: Full Time · Compensation: 120k to 145k · Hybrid Requirements: 3 days in office, 2 days remote · Lawrence Harvey has partnered with a leading Chinese fintech startup that is committed to democratizing payment services and empowering people and ...


  • Advantis Global is now INSPYR Solutions Sunnyvale, United States

    ABOUT THIS FEATURED OPPORTUNITY · The QoS Infrastructure Tools Team is responsible for building and maintaining tools that are essential for Site Reliability Engineers (SREs) and engineers across the organization. The team primarily develops applications using Golang for backend ...


  • Apple Cupertino, United States

    Site Reliability Engineer - Redis · Cupertino, California, United States · Software and Services · The Apple Service Engineering - Redis SRE team is looking for Site Reliability Engineers with experience in developing processes, tools, and automation for managing distributed sys ...


  • Apple Cupertino, United States

    Summary · Posted: Sep 6, 2023 · Role Number: · Do you love crafting sophisticated solutions to highly complex challenges? Do you intrinsically see the importance in every detail? As part of our Silicon Technologies group, you'll help design and manufacture our next-generation ...


  • Natron Energy Santa Clara, United States

    Natron is seeking a Reliability Engineer to support the development and test of our high-power battery systems for data center UPS and EV charging applications. The occupant of this position will work with the Product Engineering, Reliability, Technology, and Operations teams to ...

  • COMTECH TELECOMMUNICATIONS

    Reliability Engineer

    3 weeks ago


    COMTECH TELECOMMUNICATIONS Santa Clara, United States

    Job Description · Comtech Telecommunications Corp. has an opportunity in Santa Clara, CA for a Reliability/Failure Analysis Engineer. In this important role, you will collaborate with a diverse team of technical professionals and interact with outside customers, pr ...

  • Comtech Telecom

    Reliability Engineer

    4 weeks ago


    Comtech Telecom Santa Clara, United States

    Comtech Telecommunications Corp. has an opportunity in Santa Clara, CA for a Reliability/Failure Analysis Engineer. In this important role, you will collaborate with a diverse team of technical professionals and interact with outside customers, providing solutions to a variety of ...


  • Apple Sunnyvale, United States

    Summary · Posted: Apr 19, 2024 · Weekly Hours: 40 · Role Number: · Imagine what you could do here. At Apple, new ideas have a way of becoming extraordinary products, services, and customer experiences very quickly. Bring passion and dedication to your job and there's no tell ...


  • Apple Sunnyvale, United States

    Operations Reliability Engineer · Sunnyvale, California, United States · Operations and Supply Chain · Imagine what you could do here. At Apple, new ideas have a way of becoming extraordinary products, services, and customer experiences very quickly. Bring passion and dedication ...


  • Amiseq Inc. Sunnyvale, United States

    Site Reliability Engineer · Sunnyvale, CA - Hybrid · 6-12 Months W2 Contract · Job Description: · Hands-on development of n-tier applications using RESTful Services, Java/J2EE, JavaScript, Python, NoSQL. · Working knowledge of one or more cloud technologies such as AZ ...


  • Tech Mahindra Sunnyvale, United States

    Proficiency with the architecture, deployment, performance tuning, and troubleshooting of large-scale distributed systems on AWS · Understanding of SRE principles including monitoring, alerting, error budgets, fault analysis, and automation · Make sure to apply quickly in order to ...


  • Tech Mahindra Sunnyvale, United States

    Proficiency with the architecture, deployment, performance tuning, and troubleshooting of large-scale distributed systems on AWS · Understanding of SRE principles including monitoring, alerting, error budgets, fault analysis, and automation · Skilled at writing clean, high-performan ...


  • Apple Cupertino, United States

    Senior Site Reliability Engineer - Apple Services Engineering (ASE) · Cupertino, California, United States · Software and Services · Do you love engineering and running systems and infrastructure that will delight millions of customers? Imagine what you could do here. At Apple, ...


  • Apple Cupertino, United States

    Summary · Posted: Jun 4, 2024 · Weekly Hours: 40 · Role Number: · Do you love engineering and running systems and infrastructure that will delight millions of customers? Imagine what you could do here. At Apple, new ideas have a way of becoming extraordinary products, servic ...


  • Apple Cupertino, United States

    Senior Site Reliability Engineer - Apple Services Engineering (ASE) · Santa Clara Valley (Cupertino), California, United States · Software and Services · Do you love engineering and running systems and infrastructure that will delight millions of customers? Imagine what you coul ...


  • Liebherr Group Mountain View, United States

    Safety & Reliability Engineer for Electronic Systems (m/f/d) Lindenberg | Job ID 70278 · Organization · Liebherr-Aerospace Lindenberg GmbH · Country · Germany · Entry level · Experienced professionals · Creating something fascinating: Your tasks · Development, verification ...


  • Celestial AI Santa Clara, United States

    About Celestial AI · As the industry strives to meet the demands of AI workloads, bottlenecks in data transfers between processors and memory have hindered progress. The Photonic Fabric-based Memory Fabric provides an optically scalable solution to the 'Memory Wall' problem, ...