Senior Site Reliability Engineer - Los Angeles - Mango

    Mango
    Mango Los Angeles

    1 week ago

    $100,000 - $150,000 (USD) per year
    About Mango, Inc.
    Mango is a new type of microscope for rapid bioburden testing.
    Description
    We are seeking a Senior Site Reliability Engineer to own and evolve the infrastructure that supports our on-premise instruments, data systems, and machine learning pipelines. This role combines systems-level engineering with software craftsmanship, requiring deep understanding of how compute, storage, and networking layers interact under real workloads.
    You will be the go-to expert for diagnosing performance issues in our on-prem system, from kernel-level I/O bottlenecks to distributed service latency. You will also build robust automation that keeps our systems consistent and observable.
    Key Responsibilities
    Infrastructure Design & Reliability
    Design, deploy, and maintain our on-premise and hybrid infrastructure, which includes Dell PowerEdge and PowerVault servers, prosumer NAS units, and high-throughput data processing clusters. Implement fault-tolerant systems with reproducible deployments and clear observability.
    Performance & Systems Analysis
    Investigate complex performance issues across hardware, OS, and software boundaries. You will use Linux tooling in addition to in-house application-level metrics to uncover root causes in filesystems, caching layers, or I/O scheduling.
    Automation & Tooling
    Build automation for system provisioning, configuration management, and software deployment using Python, Go, Ansible, or similar frameworks. Develop lightweight services and tools that make reliability visible and maintainable.
    Collaboration
    Work closely with our software and hardware teams to co-design systems that meet the needs of high-resolution imaging and ML inference workloads. Translate hardware realities into software reliability guarantees.
    Observability & Incident Response
    Develop and maintain monitoring, alerting, and logging systems to ensure early detection of issues. Lead incident response and post-mortem efforts with a focus on learning and prevention.
    Documentation & Communication
    Produce clear documentation and communicate findings effectively to the broader team - from network topology diagrams to kernel tuning rationales.
    General Qualifications

    • Deep understanding of Linux systems and performance (I/O schedulers, RAID, caching, NUMA, kernel parameters).
    • Hands-on experience designing and managing on-premise servers, storage arrays, or HPC clusters.
    • Comfort with automation and software development (Python, Go, Bash, or similar).
    • Strong diagnostic and analytical skills: ability to decompose performance problems across multiple layers.
    • Proven track record of improving system reliability, throughput, and maintainability in a fast-paced environment.
    • Excellent written and verbal communication skills for cross-disciplinary collaboration.
    • Self-driven, curious, and motivated by understanding systems deeply rather than just maintaining them.
    Bonus Qualities (Not Required)
    • 5-10 years of relevant industry experience in systems engineering, SRE, or infrastructure software roles.
    • Experience tuning Linux filesystems (ext4, btrfs) and software RAID (mdadm).
    • Familiarity with containerization and orchestration (Docker, Compose, Kubernetes).
    • Knowledge of networking fundamentals (VLANs, bonding, LACP, 10 GbE/40 GbE).
    • Experience supporting data-heavy scientific or ML workloads.
    • Demonstrated technical leadership - mentoring others in debugging, reliability, or performance analysis.
  • Work in company

    Reliability Engineer

    Must have: · Failure Analysis & Corrective Action Management (FRACAS) · Stakeholder Communication & Cross-Functional Coordination · Rail/AERO Rolling Stock Reliability Expertise · Job Description: · Be a part of the Reliability Growth Team dedicated to supporting the NGHST program ...

    Philadelphia $85,000 - $145,000 (USD) per year Contract

    19 hours ago

  • Work in company

    Reliability Engineer

    We're hiring a Reliability Engineer to lead and mature the reliability strategy at one of East Tennessee's fastest-growing plastics manufacturers. · Equipment Reliability Strategy – Guide reliability and maintainability initiatives across new and existing equipment. · Maintenance ...

    Morristown $85,000 - $145,000 (USD) per year

    1 week ago

  • Work in company

    Site Reliability Engineer

    A growing data-driven organization is seeking a Staff Site Reliability Engineer to join its engineering team. This role partners closely with application engineering, data, and analytics teams to design, manage, and scale cloud infrastructure across a multi-product environ ...

    Los Angeles $115,000 - $185,000 (USD) per year

    1 week ago

  • Work in company

    Site Reliability Engineer

    We're building a software platform that empowers today's commercial contractors. From service management to project execution, we're reimagining how our customers operate. · ...

    Los Angeles $120,000 - $150,000 (USD) Full time

    2 weeks ago

  • Work in company

    Site Reliability Engineer

    Zachary Piper Solutions is seeking an experienced Site Reliability Engineer (SRE) to support the deployment and sustainment of systems across classified, air-gapped, and government cloud environments. This role blends operations, security, and reliability engineering, and is well ...

    Los Angeles $140,000 - $180,000 (USD)

    2 days ago

  • Work in company

    Senior Reliability Engineer

    We are looking for a Senior Reliability Engineer to develop and manage the plant equipment maintenance strategy. The ideal candidate will have 10+ years of engineering experience and 5+ years of maintenance or reliability engineering experience. · Monitor equipment data using T ...

    Los Angeles $95,500 - $126,700 (USD) Full time

    2 weeks ago

  • Work in company

    Senior Reliability Engineer

    We anticipate the application window for this opening will close on 16 Jan 2026. · At Medtronic you can begin a life-long career of exploration and innovation while helping champion healthcare access and equity for all. · You'll lead with purpose, breaking down barriers to innovation in ...

    Los Angeles $106,400 - $159,600 (USD)

    1 month ago

  • Work in company

    Site Reliability Engineer

    Job summary · Support the deployment and sustainment of systems across classified environments · Deploy and maintain software in air-gapped and customer-owned cloud or on-prem environments · Authenticate infrastructure configurations in AWS C2E and other classified enviro ...

    Los Angeles, CA

    2 weeks ago

  • Work in company

    Instrument Reliability Engineer

    An exciting career awaits you · At MPC, we're committed to being a great place to work – one that welcomes new ideas, encourages diverse perspectives, develops our people, and fosters a collaborative team environment. · Instrument Reliability Engineer · Job Summary · The Marathon ...

    Los Angeles $106,900 - $184,300 (USD) Full time

    1 week ago

  • Work in company

    Senior Reliability Engineer

    Job Summary · As a Senior Reliability Engineer in Medtronic's Diabetes business, you will lead product verification and reliability test planning/designing/testing methods/equipment for infusion pump systems. · Responsibilities · Negotiates within the business to improve overall ...

    Los Angeles $106,400 - $159,600 (USD)

    2 weeks ago

  • Work in company

    Test & Reliability Engineer

    We're looking for a scrappy, hands-on Test & Reliability Engineer to own testing from prototype to production. · ...

    Los Angeles Metropolitan Area

    1 month ago

  • Work in company

    Instrument Reliability Engineer

    An exciting career awaits you · At MPC, we're committed to being a great place to work – one that welcomes new ideas, encourages diverse perspectives, develops our people, and fosters a collaborative team environment. · Instrument Reliability Engineer · Job Summary · The Marathon ...

    Los Angeles Metropolitan Area $90,000 - $155,000 (USD) per year

    1 week ago

  • Work in company

    Instrument Reliability Engineer

    An exciting career awaits you · At MPC, we're committed to being a great place to work – one that welcomes new ideas, encourages diverse perspectives, develops our people, and fosters a collaborative team environment. · Instrument Reliability Engineer · Job Summary · The Marathon ...

    Los Angeles $106,900 - $184,300 (USD)

    1 week ago

  • Work in company

    Senior Reliability Engineer

    We anticipate the application window for this opening will close on 23 Feb 2026. · At Medtronic you can begin a life-long career of exploration and innovation while helping champion healthcare access and equity for all. · ...

    Los Angeles, CA

    2 weeks ago

  • Work in company

    Site Reliability Engineer

    · , the premier online service for consumers to locate, contact and verify people and businesses. Over the past couple of decades the Company has quietly become one of the largest owners of public records data in the country, distributing its products over a vast network of webs ...

    California $115,000 - $185,000 (USD) per year

    4 days ago

  • Work in company

    Site Reliability Engineer

    · About the Role · We are seeking a highly skilled Site Reliability Engineer (SRE) to join our small but high-impact infrastructure team. This role is ideal for someone who thrives in fast-paced environments, enjoys wearing multiple hats, and can take full ownership of projects ...

    California $115,000 - $185,000 (USD) per year

    4 days ago

  • Work in company

    Site Reliability Engineer

    We're looking for a passionate and experienced Site Reliability Engineer to join our team and play a crucial role in ensuring our cloud platform's security, reliability, and scalability. · Assist in implementing and operating microservices on Kubernetes cloud-based platforms. · Coll ...

    Irvine $115,000 - $185,000 (USD) per year Full time

    1 week ago

  • Work in company

    Fleet Reliability Engineer

    Northwood is building a global network of next-generation ground stations, and we're looking for a Fleet Reliability Engineer who is equal parts technical expert, field operator, and builder. · Upgrade, troubleshoot, and maintain a growing network of antennas distributed across ...

    Los Angeles, CA

    2 weeks ago

  • Work in company

    Test & Reliability Engineer

    About Vital Lyfe · Vital Lyfe is a tech company redefining water autonomy through innovation — creating a new category of personal water-making technology built to scale where infrastructure can't. · Mission · We're looking for a scrappy, hands-on Test & Reliability Engineer to o ...

    Los Angeles $70,000 - $135,000 (USD) per year

    1 week ago

  • Work in company

    Database Reliability Engineer

    WHAT YOU'LL DO · We are looking for a skilled and motivated Database Reliability Engineer to join our growing team. In this role, you will support the design, implementation, and day-to-day operations of our database infrastructure across cloud platforms including AWS and Google ...

    Los Angeles $130,000 - $150,000 (USD) Full time

    1 day ago

  • Work in company

    Site Reliability Engineer

    We are seeking a talented Site Reliability Engineer (SRE) with a strong networking background to join the Fabric team. This role is pivotal in building and maintaining the robust infrastructure necessary for secure and efficient communication between our services. · Participate i ...

    Los Angeles, CA

    1 month ago
