Jobs

    Site Reliability Engineer - Washington, United States - Palantir Technologies

    Palantir Technologies Washington, United States

    3 weeks ago

    Description

    A World-Changing Company

    Palantir builds the world's leading software for data-driven decisions and operations. By bringing the right data to the people who need it, our platforms empower our partners to develop lifesaving drugs, forecast supply chain disruptions, locate missing children, and more.

    The Role

    We're looking for Site Reliability Engineers who can help us build, operate, and maintain high-performance, scalable, and reliable services for our production infrastructure, across both cloud & on-prem environments. Site Reliability Engineers combine engineering experience and an innate drive to improve existing systems and processes with the creativity to develop novel solutions to evolving challenges. Our team strives to automate processes wherever possible, using whichever tools are best for the job. You'll be the expert on the environments in which you operate infrastructure, helping partner teams build & configure their software to operate reliably within them.

    We strongly believe that engineering teams should be responsible for operating their services in production. In this role, you'll work closely with engineers to advocate for and participate in sensible, scalable systems design, and you'll share responsibility with them for diagnosing, resolving, and preventing production issues.

    Core Responsibilities

    • Maintain availability of cloud & physical Linux servers that power the Palantir platform in air-gapped production environments.
    • Design, deploy, and operate infrastructure to support customer & product requirements via modern orchestration & monitoring platforms.
    • Collaborate closely with product teams on requirements & SLOs for deploying software into air-gapped environments.
    • Identify, troubleshoot, and solve network & systems issues.
    • Write scripts to automate away routine operational tasks.

    What We Value

    • Active US Security clearance, or eligibility and willingness to obtain a US Security clearance.
    • Confidence in troubleshooting complex systems issues independently using stack traces and observability & systems tools.
    • Comfort with managing large-scale production systems and technologies, including configuration management, load balancing, monitoring & alerting infrastructure, and container orchestration.
    • Demonstrated ability to continuously learn and work independently, making decisions with minimal supervision while working in secure facilities.
    • Experience with containers (Docker/Podman) and orchestration (OpenShift/Kubernetes) at scale is a plus.
    • Preferred Certifications: DoD 8570 IAT Level II or greater (CISSP, Sec+), Unix/Linux Computing Environment (e.g., Linux+, RHCE).
    • Proficiency with scripting in Python or Go is a plus.

    What We Require

    • 5+ years of experience with Linux system administration (RHEL or equivalent preferred).
    • Experience with cloud-based hosting platforms like AWS, Azure, or GCP and/or experience with hardware-based environments.
    • Familiarity with monitoring systems using tools like Prometheus and writing health checks.

    Life at Palantir

    We want every Palantirian to achieve their best outcomes. That's why we celebrate individuals' strengths, skills, and interests, from your first interview to your long-term growth, rather than rely on traditional career ladders. Paying attention to the needs of our community enables us to optimize our opportunities to grow and helps ensure many pathways to success at Palantir. Promoting health and well-being across all areas of Palantirians' lives is just one of the ways we're investing in our community. Learn more at Life at Palantir and note that our offerings may vary by region.

    In keeping with Palantir's values and culture, we believe employees are "better together" and that in-person work affords the opportunity for more creative outcomes. Therefore, we encourage employees to work from our offices to foster connectivity and innovation. Many teams do offer hybrid options (WFH a day or two a week), allowing our employees to strike the right trade-off for their personal productivity. Based on business need, there are a few roles that allow for "Remote" work on an exceptional basis. If you are applying for one of these roles, you must work from the state in which you are employed. If the posting is specified as Onsite, you are required to work from an office.

    Palantir is committed to promoting a culture of diversity, equity, and inclusion and is proud to be an Equal Employment Opportunity and Affirmative Action employer. We believe that all Palantirians share the responsibility of upholding our commitment to these values and encourage candidates from a wide range of backgrounds, perspectives, and lived experiences to join us in solving the world's hardest problems. Palantir does not discriminate based upon race, religion, color, national origin, gender (including pregnancy, childbirth, or related medical conditions), sexual orientation, gender identity, gender expression, age, status as a protected veteran, status as an individual with a disability, or other applicable legally protected characteristics. Palantir is committed to working with and providing reasonable accommodations to qualified individuals with physical and mental disabilities. Please see the United States Department of Labor's EEO poster, EEO poster supplement and Pay Transparency Notice for additional information.

    Palantir is committed to making the job application process accessible to everyone. If you are living with a disability (visible or not visible) and need to request a reasonable accommodation for any part of the application or hiring process, please reach out and let us know how we can help.



  • Saint-Gobain Washington, United States

    Consistent with CertainTeed Gypsum Vision, Mission, Values and Objectives, the Reliability Engineer identifies and quantifies Line 1 and Line 2 root cause failure(s), and drives permanent solutions to address systemic or chronic mechanical deficiencies to world class levels of sa ...

  • KMS Solutions

    Reliability Engineer

    2 weeks ago


    KMS Solutions Washington, United States

    Reliability Engineer · KMS Solutions, LLC is a technical management/solutions company that specializes in engineering, analysis, and cyber security. Founded in 2005, KMS is a certified small business with over a decade and a half of exp ...


  • Alta It Services Washington, United States

    Site Reliability Engineering (SRE) Lead · 100% Remote · US Citizenship required per government contract Must be able to obtain a DHS Public Trust clearance As a Site Reliability Engineering (SRE) Lead, you'll deliver mission-critical services that empower end users. As the ideal ...


  • Alldus Washington, United States

    Our client is a Series A startup within the Generative AI space and they are hiring a Site Reliability Engineer to join the team. Backed by one of the leading venture capital firms in the industry, this is an exciting opportunity to join a SaaS company that is revolutionizing th ...


  • Louis Dreyfus Company B.V. Washington, United States

    Port Allen, LA, United States of America · Job Reference · JR0073330 · Professional Areas · Industry · Function · Operations, Engineering and Maintenance · Louis Dreyfus Company is a leading merchant and processor of agricultural goods. Our activities span the entire value cha ...


  • Mission Box Solutions Washington, United States

    As a Site Reliability Engineer (SRE), you will play a vital role in continuously driving improvements in observability, performance, and reliability, aiming to make a substantial impact across the federal government. Our client firmly believes that exceptional technology services ...


  • System One Washington, United States

    Site Reliability Engineer · Work Location: 3 days onsite DC - JBAB, 2 days remote · Clearance: Active TS/SCI with ability to clear PSD · As a Site Reliability Engineer (SRE), you'll continuously drive improvements in observability, performance, and reliability, with the goal t ...


  • Varada Consulting Washington, United States

    Site Reliability Engineer · Job Location-Washington, DC; Hybrid · Overview: · Varada Consulting, LLC is seeking a full-time highly skilled and experienced Site Reliability Engineer (SRE) to join our team. As an SRE, you will be responsible for ensuring the reliability, scalabi ...


  • Palantir Technologies Washington, United States

    A World-Changing Company · Palantir builds the world's leading software for data-driven decisions and operations. By bringing the right data to the people who need it, our platforms empower our partners to develop lifesaving drugs, forecast supply chain disruptions, locate missin ...


  • MailerLite Washington, United States

    MailerLite is one of the fastest-growing email marketing services. We help more than 1 million businesses around the world to keep in touch with their customers. Today, we are a team of more than 170 dreamers, adventurers, and world travelers from 46+ different countries. Join us ...


  • Guidehouse Washington, United States

    Job Family · IT Architecture/Cloud (Digital) · Travel Required · Up to 10% · Clearance Required · Ability to Obtain Public Trust · What You Will Do · Site Reliability Engineer (SRE) is the subject matter expert for infrastructure and cloud operations issues. The SRE will d ...


  • MailerLite Washington, United States

    About MailerLite · MailerLite is one of the fastest-growing email marketing services, assisting over 1 million businesses worldwide in staying connected with their customers. Our diverse team of more than 170 individuals from 46+ countries embodies a culture of diversity, collabo ...


  • Evolver Federal Washington, United States

    Evolver Federal is seeking a Site Reliability Engineer. This is a senior engineering and technical role that is focused on influencing, shaping, and managing the systems and processes that are relied upon for building and deploying the GovInfo application and constituent parts. ...


  • ManTech International Corporation Washington, United States Full time

    Secure our Nation, Ignite your Future · Become an integral part of a diverse team while working at an Industry Leading Organization, where our employees come first. At ManTech International, you'll help protect our national security while working on innovative projects that offe ...


  • Specialized Group Washington, United States

    My client is developing a sales enablement tool that automatically captures recorded sales calls on Zoom, transcribes the conversation, and analyzes the content. · Requirements · 2+ years of experience in building and operating infrastructure in a cloud environment · Nice to h ...

  • KMS Solutions

    Reliability Engineer

    2 weeks ago


    KMS Solutions Washington, United States

    Reliability Engineer · KMS Solutions, LLC is a technical management/solutions company that specializes in engineering, analysis, and cyber security. Founded in 2005, KMS is a certified small business with over a decade and a half of experience supporting the Department of Defens ...


  • Red Frog Solutions Washington, United States

    Site Reliability Engineer - SRE - (TS/SCI) · Full Time Perm · Washington D.C. (Hybrid - 3 days onsite, 2 days remote) · $180K - $200K Salary Plus Competitive Benefits · As a Site Reliability Engineer (SRE), you will play a vital role in continuously driving improvements in ...


  • Harbor Compliance Washington, United States

    Site Reliability Engineer - Full-time Remote · Advance Your Career with Cutting-Edge Infrastructure at Harbor Compliance · About Harbor Compliance: · Harbor Compliance is committed to simplifying the regulatory challenges of businesses and nonprofits through innovative technology ...


  • Arcadia Washington, United States

    Who We Are · Arcadia is the technology company empowering energy innovators and consumers to fight the climate crisis. Our software and APIs are revolutionizing an industry held back by outdated systems and institutions by creating unprecedented access to the data and clean energ ...


  • Mount Indie Washington, United States

    Mount Indie is searching for a Lead Site Reliability Engineer (SRE) to work remotely, focusing on delivering mission-critical services that empower end users. The role will involve designing and implementing end-to-end CI/CD pipelines using AI/ML tooling. · Responsibiliti ...