
    Senior Site Reliability Engineer - Remote, United States - DFIN

    DFIN
    DFIN Remote, United States

    2 weeks ago

    Full time
    Description

    Donnelley Financial Solutions (DFIN) is a leader in risk and compliance solutions, providing insightful technology, industry expertise and data insights to clients across the globe. We're here to help you make smarter decisions with insightful technology, industry expertise and data insights at every stage of your business and investment lifecycles. As markets fluctuate, regulations evolve and technology advances, we're there. And through it all, we deliver confidence with the right solutions in moments that matter.



    Summary:


    We are looking for technical team members at all levels who want to push themselves to deliver best-in-market SaaS solutions. We offer a challenging environment where you will have to grow, adapt and use your skills consistently. Our customers rely on us in the moments that matter. Engineering delivers on that promise.

    The Senior Site Reliability Engineer is responsible for ensuring our SaaS products are fast, stable and optimized for our customers. SREs at DFIN take on availability, performance, change management, monitoring and response, and are guardians of non-functional requirements.

    You have either an infrastructure background with a programmatic, automated mindset or a software engineering background with infrastructure experience. The SRE goal is to build automated systems that reduce or eliminate manual work to keep our products up and running and performing optimally. We are looking for someone who thrives on collaboration within the team and across other groups and can operate independently to deliver solutions.



    Responsibilities:

    • Champion and implement a culture of SRE to maintain a high-quality platform infrastructure
    • Champion and implement application and infrastructure monitoring and alerting to prevent client-impacting issues by ensuring system availability, performance and scalability to maintain SLOs and SLAs
    • Optimize application performance at scale
    • Automate everything including system operational runbooks
    • Define and support continuous integration and deployment pipelines (CI/CD) aligned to branching and quality assurance strategies
    • Dive deep into technology and stay on the forefront of the latest tools, technologies, and strategies; help evaluate, prototype, and integrate them into work processes
    • Perform with broad independence and deliver on project milestones and tasks on schedule while communicating progress regularly
    • Build strong relationships with SRE team members and software engineering teams to hold each other accountable for quality expectations
    • Learn continuously and apply lessons learned
    • Evangelize best practices, eliminate bottlenecks, and improve process


    Qualifications:

    • 5+ years experience writing software in any modern software language such as C# (.NET) or Java
    • 5+ years experience creating automated deployments with tools such as Azure DevOps Pipelines, Ansible or Jenkins, or with scripting languages, to manage infrastructure, software builds and deployments in a continuous integration (CI) / continuous delivery (CD) environment
    • 5+ years experience writing scripts in PowerShell or Python/Bash to automate system operations as runbooks for Windows and Linux environments.
    • 5+ years experience implementing production performance, availability, and scalability monitoring and alerting best practices using a tool such as New Relic, Dynatrace, DataDog or AppDynamics
    • 5+ years experience as a global admin of Azure including cloud cost management
    • 5+ years of experience supporting public client facing revenue generating systems
    • Strong DevOps focus and experience building and deploying Infrastructure as Code with Terraform or similar technology
    • Experience monitoring and preventing issues with databases and database queries (SQL, Cosmos) using tools like Solarwinds Database Performance Analyzer, Idera SQL Diagnostic Manager, or Redgate SQL Monitor
    • Experience planning, coordinating, developing and executing all stages of test scripts
    • Experience securing Windows or Linux systems in a 24x7 production environment
    • Experience with containerization and managing Kubernetes clusters
    • Experience with common networking, firewall and load balancing protocols
    • BS in Computer Science or equivalent work experience.


    It is the policy of Donnelley Financial Solutions to select, place and manage all its employees without discrimination based on race, color, national origin, gender, age, religion, actual or perceived disability, veteran's status, actual or perceived sexual orientation, genetic information or any other protected status.

    If you are a qualified individual with a disability or a disabled veteran, you have the right to request a reasonable accommodation if you are unable or limited in your ability to use or access as a result of your disability. You can request a reasonable accommodation by sending an email to #BI-Remote



  • Aurora Labs Remote, United States Full time

    About Us · Aurora Labs is the development company behind Aurora—the EVM blockchain that runs on the NEAR Protocol. We are also the developers of, and integration partner behind, Aurora Cloud—a suite of products that allow Web2 companies to capture the value of Web3. · We invite ...


  • Coinbase Remote, United States

    We're a group of hard-working overachievers who are deeply focused on building the future of finance and Web3 for our users across the globe, whether they're trading, storing, staking or using crypto. Know those people who always lead the group project? We're a remote-first compa ...


  • Podium Remote, United States Full time

    · At Podium, our mission is to help local businesses win. Our lead conversion platform, powered by AI and integrations, helps local businesses convert leads faster, communicate easier, and make more sales. Every day, thousands of local businesses utilize our review management, c ...


  • Roadie Remote, United States Full time

    Roadie, a UPS Company, is a logistics management and crowdsourced delivery platform. Founded in 2014, Roadie offers businesses fast, flexible and asset-light logistics solutions for last-mile delivery. Roadie enables local delivery to more than 95% of U.S. households by providing ...


  • OPENLANE Remote, United States Full time

    Who We Are: · At OPENLANE we make wholesale easy so our customers can be more successful. · We're a technology company building the world's most advanced-and uncomplicated-digital marketplace for used vehicles. · We're a data company helping customers buy and sell smarter with cl ...


  • Symbotic Remote, United States

    Who we are · With its A.I.-powered robotic technology platform, Symbotic is changing the way consumer goods move through the supply chain. Intelligent software orchestrates advanced robots in a high-density, end-to-end system - reinventing warehouse automation for increased effic ...


  • SS&C Technologies Holdings Remote, United States Full time

    Job Description · Senior Site Reliability Engineer · Locations: Jacksonville, FL | Hybrid or Florida | Georgia | Texas | Remote · Get to Know the Team: · SS&C Advent Software is looking for a motivated and experienced Site Reliability Engineer to help with improving the architect ...


  • Laserfiche Remote, United States

    Job Description · Site Reliability Engineers (SREs) at Laserfiche are responsible for keeping our Laserfiche Cloud systems online and performant for our customers. They react quickly to reported issues within the systems, promote and implement proactive monitoring ...


  • Brooksource Remote, United States

    Contract to Hire · Remote (EST Time Zone) · Our Fortune 15 health care client is seeking a Site Reliability Engineer (SRE) to assist them as they fully transition to the cloud. You will play a critical role in ensuring the reliability, scalability, and performance of their sy ...


  • Sojern Remote, United States Full time

    Position Summary: · Sojern is looking for a Senior Site Reliability Engineer in the US to collaborate with Software Engineering teams located primarily in the Pacific Time Zone. An ideal candidate would have extensive experience building cloud infrastructure on Google Cloud with ...


  • Zocdoc Remote, United States Full time

    · Our Mission · Healthcare should work for patients, but it doesn't. In their time of need, they call down outdated insurance directories. Then wait on hold. Then wait weeks for the privilege of a visit. Then wait in a room solely designed for waiting. Then wait for a surprise b ...


  • Edge & Node Remote, United States Full time

    Edge & Node stands as the revolutionary vanguard of web3, a vision of a world powered by individual autonomy, shared self-sovereignty and limitless collaboration. Established by trailblazers behind The Graph, we're on a mission to make The Graph the internet's unbreakable foundat ...


  • Sunrun Remote, United States Full time

    Everything we do at Sunrun is driven by a determination to transform the way we power our lives. We know that starts at the individual employee level. We strive to foster an environment you can thrive in through our commitment to diversity, inclusion and belonging. · Objective: · ...


  • Spekit Remote, United States Full time

    Headquartered out of Denver, CO, we're a small but mighty team on a mission to completely reinvent the future of learning at work. · Introducing Spekit: the new way to learn in today's remote, digital workplace. Say goodbye to distracted zoom training sessions and lengthy LMS co ...


  • Lumin Digital Remote, United States Full time

    Our Site Reliability Engineers (SRE) are good developers with an operations mindset. They enjoy reducing or completely eliminating manual tasks, are excellent problem solvers, and know automation is the key to operating a large-scale system. · SREs make sure that our application ...


  • Modern Health Remote, United States Full time

    · Modern Health · Modern Health is a mental health benefits platform for employers. We are the first global mental health solution to offer employees access to one-on-one, group, and self-serve digital resources for their emotional, professional, social, financial, and physical ...


  • commercetools Remote, United States Full time

    commercetools - we are: · Engaged: We didn't become the fastest growing, highest ever valued SaaS software company in digital commerce with nearly 100% year-over-year growth by sitting on the sidelines. · Inspired: We continually explore what's possible. As the founder of the hea ...


  • DFIN Remote, United States Full time

    Donnelley Financial Solutions (DFIN) is a leader in risk and compliance solutions, providing insightful technology, industry expertise and data insights to clients across the globe. We're here to help you make smarter decisions with insightful technology, industry expertise and d ...


  • Cisco ThousandEyes Remote, United States Full time

    Who We Are · The name ThousandEyes was born from two big ideas: the power to see things not ordinarily possible and the ability to collect insights from a multitude of vantage points. As the world continues its digital transformation and relies more on cloud services and the Inte ...


  • Oscar Health Remote, United States Full time

    Hi, we're Oscar. We're hiring a Senior Site Reliability Engineer II, Infrastructure Metal to join our Engineering team. · Oscar is the first health insurance company built around a full stack technology platform and a focus on serving our members. We started Oscar in 2012 to crea ...