    Site Reliability Engineer V - Cabin John, United States - ID

    ID
    ID Cabin John, United States

    3 weeks ago

    Description
    Company is a high-growth enterprise software company that simplifies how people prove and share their identity online.

    The company empowers people to control their data through a portable and trusted login, which means they don't need to create a new password when visiting sites that have the button.

    The company's digital identity network has over 117 million registered members and is used by fourteen federal agencies, agencies in 30 states, and over 600 corporations for secure identity proofing. Its technology meets the federal standards for consumer authentication set by the Commerce Department and is approved as a NIST IAL2 / AAL2 credential service provider by the Kantara Initiative.

    In addition to helping people control their credentials and data, the company's No Identity Left Behind initiative strives to expand digital access and inclusion for all people.

    The company offers multiple pathways to identity verification: online self-serve, live video chat agents, and in person.

    The company is passionate about building a robust identity network that does not compromise access for the traditionally underserved. It has received numerous awards, including Deloitte's 2023 Technology Fast 500, Washington Business Journal's Fastest Growing Companies, Entrepreneur Magazine's 100 Brilliant Companies, and Wall Street Journal's Startup of the Year finalist.

    In recent quarters, the company announced it raised $132 million in Series D funding, led by Viking Global Investors with participation from CapitalG, Morgan Stanley Counterpoint, FTV Capital, PSP Growth, Auctus Investment Group, Moonshots Capital, and Scout Ventures.

    The most recent round brings the total investment in the company to over $275 million since its founding in 2010.

    The Site Reliability Engineer V (SRE) will combine software and systems engineering to build and run distributed, fault-tolerant systems at scale.

    SREs ensure our services have the appropriate reliability and uptime to protect and promote our customers' experience.

    Note that candidates must be located in the Washington DC or San Francisco Bay area as this role requires an onsite presence.

    Responsibilities
    • Design, build, implement, and maintain platform tooling that improves reliability across the entire product surface area, to improve the availability, scalability, latency, and efficiency of services
    • Manage end-to-end distributed systems availability and ensure high-performance of applications
    • Build automation solutions to prevent problem recurrence
    • Build visibility into SLIs, SLOs, SLAs, and dependency metrics to manage operational burden and systems reporting
    • Design, build, implement, and maintain observability ecosystem to provide visibility across the platform services and applications
    • Proactively identify risks and develop engineering processes and/or tooling to reduce availability risk
    • Evangelize best practices and mentor service owners on reliability, resiliency, and scalability for new and existing services and/or features
    • Participate in an on-call rotation and hold retroactive root cause analysis meetings, focusing on identifying remediations and product resiliency opportunities

    Minimum Qualifications
    • At least 7 years of experience working in medium or large scale production systems
    • The ability to take a systematic approach to analyzing, troubleshooting, and diagnosing system problems to identify, locate, resolve, and repair problems
    • Experience in software development or systems engineering with code
    • Experience designing for scale and automation-forward ecosystems and solutions
    • Possess a breadth of engineering skills with an interest in service reliability, automation, monitoring, and capacity planning
    • Understanding of modern application architecture (e.g. microservices, EDA)
    • Experience with APM services and solutions (e.g. OpenTelemetry, Honeycomb, New Relic, Dynatrace, AppDynamics, Datadog)
    • Experience with time-series observability solutions (e.g. InfluxDB, Prometheus, Grafana)
    • Experience with scaled indexed logging solutions (e.g. Splunk, Elasticsearch, OpenSearch)
    • Experience running and operating Ruby on Rails applications and infrastructure
    • Deep knowledge with major cloud services providers and solutions (Amazon Web Services, Google Cloud Platform, Microsoft Azure)
    • Previous experience working within site reliability engineering culture (e.g. improving reliability through systems engineering automation, chaos testing, synthetics, and process improvement)
    • Experience designing, building, implementing, and operating distributed systems and cloud infrastructure at scale
    • Experience with container computing and container orchestration (e.g. proprietary systems such as Google Kubernetes Engine (GKE), multi-cloud solutions such as Kubernetes, or Nomad)
    • Experience with configuration management systems (e.g. Ansible, Puppet, Chef, SaltStack, Consul)
    • Experience with virtual networking (e.g. cloud networking, service mesh, SDN)
    • Experience in security automation (e.g. cloud proprietary solutions such as Google Secret Manager or Vault)
    • Experience with infrastructure-as-code (e.g. Terraform)
    • Strong written communication skills
    • Ability to work in an asynchronous environment
    • Experience in supporting a 24/7 operational infrastructure including on-call rotations

    Preferred Qualifications
    • Must have an obsession for building quality products
    • Ability to thrive when there are changing priorities and shifting of gears
    • Strong oral and written communication skills
    • Must be a team player with a strong, self-managing work ethic
    • Must be a self-starter with a passion for platform engineering, learning and continuous improvement

    Day to Day Life
    • Ensure observability tooling and integrations are providing telemetry and logging statistics across the entirety of systems and applications
    • Enable the Engineering organization the ability to identify and triage operational issues, empowering teams to own and operate autonomously
    • Contribute to defining and executing on the Observability Roadmap in maintaining and modernizing cloud-native observability within the organization
    • Integrate telemetry and logging frameworks to the cloud platform
    • Evaluate new and existing observability technologies to ensure capabilities are inclusive of black box solutions (e.g. COTS) as well as Engineering-created software
    • Manage distributed system and application scaling activity directly (as applicable) as well as in an advisory capacity on behalf of Engineering development teams

    Vision:
    To be the world's leading digital identity network empowering people to control their own information and to prove their credentials across all channels: online, call center, and in-person

    Mission:
    To make the world a more trusted place by delivering the highest level of security with the least amount of friction at the lowest possible cost

    People:
    We have an audacious mission. We aim to fix the identity layer of the internet.

    Billions of people will live better lives with more trust and convenience thanks to the company. We are like Special Forces.

    We take on the most difficult challenges with amazing teammates.
    We believe that an in-office culture fosters professional growth and development, mentorship, collaboration, and accelerated innovation. This position will be in-office based at one of our locations in either McLean, VA or Sunnyvale, CA.

    Working in an office together allows our culture to thrive and our team members to establish real connections with their coworkers and the opportunity for lifelong friendships.

    Our work is critical to protecting online identity and we're confident that working together is how we'll change the world.

    The annual base salary listed below for this role is based on experience, skills, education, relevant training and geographic location.

    Company bonus, incentive for sales roles, equity, and benefits are available depending on the role. The company offers comprehensive medical, dental, vision, health savings account, flexible spending accounts (medical, limited purpose, dependent care, commuter benefit accounts), basic and voluntary life and AD&D insurance, 401(k) with company match, parental leave, ability to participate in unlimited paid time off subject to the terms and conditions of the PTO policy, including 8 company-wide holidays, short and long-term disability insurance, accident and critical illness insurance, referral bonus policy, employee assistance program, pet insurance, travel assistant program, wellbeing and childcare discounts, benefit advocates, and a learning and development benefit.

    The above represents the anticipated total rewards package for this job requisition.

    Final offers may vary from the amount listed based on qualifications, professional experiences, skills, education, relevant training, geographic location, and other job related factors.

    Pay Range: $185,645 to $210,000

    The company maintains a work environment free from discrimination, where employees are treated with dignity and respect. All employees share in the responsibility for fulfilling our commitment to equal employment opportunity.

    The company does not discriminate against any employee or applicant on the basis of age, ancestry, color, family or medical care leave, gender identity or expression, genetic information, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran status, race, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable laws, regulations and ordinances.

    The company adheres to these principles in all aspects of employment, including recruitment, hiring, training, compensation, promotion, benefits, social and recreational programs, and discipline.

    In addition, the company's policy is to provide reasonable accommodation to qualified employees who have protected disabilities to the extent required by applicable laws, regulations and ordinances where a particular employee works.

    Upon request we will provide you with more information about such accommodations.

    Please review our Privacy Policy, including our CCPA policy. If you provide the company with any personally identifiable information, you confirm that you have read and agree to be bound by the terms and conditions set out in our Privacy Policy.

    The company participates in E-Verify.



  • Teaching Strategies, LLC Bethesda, United States

    Job Description · Job DescriptionBe a Part of our Team · Join a working family that is dedicated to the mission of the work we do · Teaching Strategies is an innovative edtech organization focused on connecting teachers, children, and families. As front runners in the early child ...

  • AES Corporation

    Reliability Engineer

    3 weeks ago


    AES Corporation Arlington, United States Full time

    AES's mission is to improve lives by accelerating a safer and greener energy future. We are a global, agile, cohesive organization with an employee engagement level akin to a startup company. AES businesses throughout the world are often recognized as great places to work. Our pe ...


  • Humana Bethesda, United States

    Humana · Senior Site Reliability Engineer · Austin, Texas · Become a part of our caring community and help us put health first · The Senior Site Reliability Engineer maintains, integrates and analyzes software applications within the organization. The Senior S ...


  • Marriott Bethesda, United States

    Job Description · JOB SUMMARY · Lead role in the Monitoring and Performance Management function at Marriott. Performs detailed performance analysis of applications and infrastructure in support of incident and problem investigation and application release management. Develops so ...


  • Marriott International Bethesda, United States

    Job Description · JOB SUMMARY · Lead role in the Monitoring and Performance Management function at Marriott. Performs detailed performance analysis of applications and infrastructure in support of incident and problem investigation and application release management. Develops sol ...


  • Marriott International Bethesda, United States

    Job Number · Job Category Information Technology · Location Marriott International HQ, 7750 Wisconsin Avenue, Bethesda, Maryland, United States · Schedule Full-Time · Located Remotely? Y · Relocation? N · Position Type Management · JOB SUMMARY · Lead role in the Moni ...


  • Teaching Strategies Bethesda, United States

    Be a Part of our Team · Join a working family that is dedicated to the mission of the work we do · Teaching Strategies is an innovative edtech organization focused on connecting teachers, children, and families. As front runners in the early childhood education market, we build ...


  • ID McLean, VA, United States

    Company is a high-growth enterprise software company that simplifies how people prove and share their identity online. The company empowers people to control their data through a portable and trusted login, which means they don't need to create a new password when visiting sites ...


  • Marriott Bethesda, United States

    Job Description · JOB SUMMARY · Lead role in the Monitoring and Performance Management function at Marriott. Performs detailed performance analysis of applications and infrastructure in support of incident and problem investigation and application release management. Develops solutions ...


  • Marriott Bethesda, United States

    Job Number · Job Category Information Technology · Location Marriott International HQ, 7750 Wisconsin Avenue, Bethesda, Maryland, United States · Schedule Full-Time · Located Remotely? Y · Relocation? N · Position Type Management · JOB SUMMARY · Lead role in t ...


  • Marriott Bethesda, United States

    Job Number · Job Category Information Technology · Location Marriott International HQ, 7750 Wisconsin Avenue, Bethesda, Maryland, United States · Schedule Full-Time · Located Remotely? Y · Relocation? N · Position Type Management · JOB SUMMARY · Lead role in the Monitoring and Performance M ...


  • Marriott International Bethesda, MD, United States

    Job Number · Job Category Information Technology · Location Marriott International HQ, 7750 Wisconsin Avenue, Bethesda, Maryland, United States · Schedule Full-Time · Located Remotely? Y · Relocation? N · Position Type Management · JOB SUMMARY · Lead role in the Monitoring and Performance M ...

  • Saint-Gobain

    Reliability Engineer

    3 weeks ago


    Saint-Gobain Washington DC, United States

    Consistent with CertainTeed Gypsum Vision, Mission, Values and Objectives, the Reliability Engineer identifies and quantifies Line 1 and Line 2 root cause failure(s), and drives permanent solutions to address systemic or chronic mechanical deficiencies to world class levels of sa ...

