    SRE/DevOps Engineer - New York, United States - Versana

    Versana
    Versana New York, United States

    2 weeks ago

    Description
    About Us: Versana is an industry-backed fintech on a mission to make the syndicated loan market better. By digitally capturing agent banks' data on a real-time basis, Versana provides unprecedented transparency into loan level details and portfolio positions, bringing efficiency and velocity to the entire market. Through our platform, participants can rest assured they are accessing the loan market's most credible source of deal information.
    About You: Versana is seeking a motivated SRE/DevOps Engineer with strong observability experience to join our growing Platform Engineering squad. The squad's goal is to manage public cloud, improve DevOps practices, and monitor Versana's real-time syndicated loan data platform. The ideal candidate will have a deep understanding of cloud-native applications, distributed computing, CI/CD implementation, and observability tools and practices.

    Key Responsibilities

    • Design, implement, and maintain observability and event management tools
    • Monitor system performance, create incident response plans, and implement observability practices to gain insights into system behavior
    • Implement and monitor service-level objectives (SLOs) and indicators
    • Improve system reliability and resiliency
    • Conduct post-incident reviews and implement necessary changes to prevent system failures
    • Assist teams in implementing observability tools and leveraging available telemetry data to troubleshoot and resolve incidents and problems
    • Leverage observability and event management to improve key incident management metrics, such as mean time to detect and mean time to restore services
    • Continually optimize systems and workflows by improving architecture, infrastructure, automation, CI/CD, and observability
    • Collaborate with developers to ensure applications are designed with DevOps best practices in mind
    • Participate in weekend support for cloud infrastructure upgrades and/or releases

    Must Haves

    • 5+ years of experience as a Site Reliability Engineer or similar role
    • 3+ years of experience in at least one coding language such as Java, JavaScript, Python, GoLang, or .NET
    • 3+ years of work experience with public cloud (Azure, AWS or GCP)
    • 3+ years of direct experience with observability tools like Datadog, Elasticsearch, Grafana Labs, etc.
    • 3+ years of experience with containerization and orchestration technologies like Docker and Kubernetes
    • 2+ years of experience in development and management of CI/CD pipelines (e.g., Azure DevOps, GitLab CI/CD, GitHub Actions, Jenkins, etc.)
    • 2+ years of experience with Infrastructure-as-Code tools like Terraform, Azure Bicep, CloudFormation, etc.
    • 1+ years of experience with site reliability tools like Gremlin, Chaos Mesh, or similar
    • Proven track record leveraging core observability concepts, end-user monitoring, and infrastructure monitoring with SaaS solutions
    • Experience with messaging services like Kafka or Azure Event Hubs
    • Good understanding of the Linux operating system
    • Ability to partner with multi-functional teams and pivot quickly
    • Strong communication, analytical, and problem-solving skills
    • Curiosity and motivation to learn

    Nice to Haves

    • Certifications in cloud technologies
    • Experience with Azure cloud or Azure DevOps
    • Experience with Datadog or similar modern observability tools

    Equal Opportunity Employer

    We are committed to providing equal employment opportunities to all employees and applicants for employment and prohibit discrimination and harassment of any type without regard to race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state or local laws.
    This policy applies to all terms and conditions of employment, including recruiting, hiring, placement, promotion, termination, layoff, recall, transfer, leaves of absence, compensation and training.

  • Open Systems Technologies

    SRE/DevOps Engineer

    2 weeks ago


    Open Systems Technologies New York, United States

    A financial firm is looking for an SRE/DevOps Engineer to join their team in New York, NY. · Compensation: $150-200k · Responsibilities · Design, implement, and manage AWS cloud infrastructure using Terraform and CloudFormation · Develop and maintain CI/CD pipelines using GitL ...

  • Open Systems Technologies

    SRE/DevOps Engineer

    3 weeks ago


    Open Systems Technologies New York, United States

    A financial firm is looking for an SRE/DevOps Engineer to join their team in New York, NY. · Compensation: $150-200k · Responsibilities · Design, implement, and manage AWS cloud infrastructure using Terraform and CloudFormation · Develop and maintain CI/CD pipelines using GitLab for seamless ...

  • Tandym Group

    DevOps/SRE

    3 weeks ago


    Tandym Group New York, United States

    An asset manager in New York City is actively seeking a self-motivated and hardworking professional to join their staff as their new DevOps / SRE. · Responsibilities · The DevOps / SRE will: · Build and maintain the infrastructure that supports the firm's trading systems · Col ...

  • Rogo

    DevOps/SRE Engineer

    1 week ago


    Rogo New York, United States

    Why Rogo? · Rogo will be the biggest Financial Services Artificial Intelligence company in the world. We're creating a category-defining AI company built on top of foundational AI models like GPT-4. Exceptional early users: high-paying contracts with the world's largest investme ...

  • Unknown Planner

    DevOps SRE

    2 weeks ago


    Unknown Planner New York, United States

    OUR IMPACT · We are Compliance Engineering, a global team of more than 300 engineers and scientists who work on the most complex, mission-critical problems. · We: · - build and operate a suite of platforms and applications that prevent, detect, and mitigate regulatory and reputational ris ...

  • HexaQuEST Global, Inc.

    SRE/ Devops Engineer

    2 weeks ago


    HexaQuEST Global, Inc. Jersey City, United States

    7 - 12 Years DevOps/SRE experience required in a tech forward environment, having demonstrated progressively innovative contributions in system architecture. · Experience building deployment automation within AWS-hosted services and infrastructure. · Experience with 2 or more scr ...

  • HexaQuEST Global Inc.

    SRE/ Devops Engineer

    2 weeks ago


    HexaQuEST Global Inc. Jersey City, United States

    7 - 12 Years DevOps/SRE experience required in a tech forward environment, having demonstrated progressively innovative contributions in system architecture. · Experience building deployment automation within AWS-hosted services and infrastructure. · Experience with 2 or more scr ...

  • Diverse Lynx

    DevOps SRE

    2 weeks ago


    Diverse Lynx Jersey City, United States

    Job Title: DevOps SRE · Location: Jersey City, NJ (Onsite) · Duration: Full-Time · Skill: DevOps · Minimum Experience: 6 - 8 Years · Job Description: 7+ Yrs. of experience in SRE, CI/CD DevOps tools, integration & process. · Exposure to AWS or Google or Azure Cloud platform (Kuber ...

  • Benchmark IT LLC

    Sr. DevOps/SRE

    3 weeks ago


    Benchmark IT LLC New York, United States

    Ensure you read the information regarding this opportunity thoroughly before making an application. · Sr. DevOps/SRE – Hybrid: Our direct client, a fast-growing FinTech firm in New York City, is looking for a Senior DevOps/SRE Engineer to develop and maintain the productio ...

  • Benchmark IT LLC

    Sr. DevOps/SRE

    3 weeks ago


    Benchmark IT LLC New York, United States

    Ensure you read the information regarding this opportunity thoroughly before making an application. · Sr. DevOps/SRE – Hybrid: Our direct client, a fast-growing FinTech firm in New York City, is looking for a Senior DevOps/SRE Engineer to develop and maintain the production a ...

  • Diverse Lynx

    Azure Devops/SRE

    2 weeks ago


    Diverse Lynx New York, United States

    Role Name: Azure DevOps/SRE · Location: Remote · Type: Contract · JD: · .NET/C# · Windows & IIS · Jenkins & PowerShell · APM tools - prefer Datadog, but equivalents like Dynatrace, New Relic, AppDynamics · Diverse Lynx LLC is an Equal Employment Opportunity employer. All qualifi ...

  • HERE

    Senior SRE/DevOps

    2 weeks ago


    HERE New York, United States

    Who You Are You are a well-rounded engineer with varied experience to pull from - you know some of our stack very well: Node, Google Cloud Platform, Firebase, Janus Gateway, and AWS. · You care deeply about the user. You want to blow their mind. · You love working with and comm ...

  • HERE

    Senior SRE/DevOps

    3 weeks ago


    HERE New York, United States

    Who You Are You are a well-rounded engineer with varied experience to pull from - you know some of our stack very well: Node, Google Cloud Platform, Firebase, Janus Gateway, and AWS. · You care deeply about the user. You want to blow their mind. · You love working with and comm ...


  • Benchmark IT LLC New York, United States

    Sr. DevOps/SRE Hybrid: Our direct client, a fast-growing FinTech firm in New York City, is looking for a Senior DevOps/SRE Engineer to develop and maintain the production and development environments for a multi-party application. This role will utilize strong DevOps principles a ...

  • Zelis

    SRE DevOps Engineer

    3 weeks ago


    Zelis Convent Station, United States

    At Zelis, we're passionate about building software that solves problems. We count on our Site Reliability Engineers (SREs) to empower our users with a rich feature set, high availability, and stellar performance level to pursue their missions. We are searching for someone who bri ...

  • LE018 Zelis Healthcare, LLC

    SRE DevOps Engineer

    3 weeks ago


    LE018 Zelis Healthcare, LLC Convent Station, United States

    At Zelis, we're passionate about building software that solves problems. We count on our Site Reliability Engineers (SREs) to empower our users with a rich feature set, high availability, and stellar performance level to pursue their missions. We are searching for someone who bri ...


  • Deutsche Telekom Ag New York, United States

    As a Senior DevOps Engineer for Cloud Application Operation (CAO) (m/f/d), you will have the following exciting areas of responsibility and tasks: · Creating technical CAO solution concepts and implementing them in customer projects · Setting up and configuring CAO environments ...

  • HAN IT Staffing

    SRE devops

    1 week ago


    HAN IT Staffing Iselin, United States

    Role: SRE Engineer · Location: Plano, TX (Day 1 Onsite) · Duration: Contract · Note: Looking for local candidates. · JD: · - Should be a strong SRE with experience in Java, AWS / DevOps / deployment strategy and monitoring tools. Candidates should have more hands-on experience with D ...


  • Deutsche Telekom AG New York, United States

    As a Senior DevOps Engineer for Cloud Application Operation (CAO), you will have the following exciting areas of responsibility and tasks: · Creating technical CAO solution concepts and implementing them in customer projects · Setting up and configuring CAO environments and Bera ...

  • Zelis

    DevOps SRE

    3 weeks ago


    Zelis Convent Station, United States

    Job Description · Essential Duties and Functions · Gather and analyze metrics from both operating systems as well as applications to assist in performance tuning and fault finding. · Run the production environment by monitoring availability and taking a holistic view of system ...