Site Reliability Engineer - Remote, United States - Aurora Labs
3 weeks ago
Description
About Us
Aurora Labs is the development company behind Aurora—the EVM blockchain that runs on the NEAR Protocol. We are also the developers of, and integration partner behind, Aurora Cloud—a suite of products that allow Web2 companies to capture the value of Web3.
We invite you to be a part of our team of smart, professional, result-oriented and fun individuals. Join us to help ensure that our background processes run smoothly while we are striving to become the best in the industry.
About the team
Our infrastructure team is responsible for building and supporting critical systems required for running and accessing NEAR and Aurora networks. That includes everything on the path of RPC requests before they hit the blockchain and block production and event delivery once transactions are executed.
Load balancing, caching, queueing, transaction simulation and block production are handled by services written and maintained by the infrastructure team. These services operate at large scale and process terabytes of data. The platform is based on open-source software, such as Kubernetes, NATS, JetStream, Blockscout, Grafana, Postgres and Near-core, alongside a few internally developed services.
All internally developed services are written in Go and implement core pieces of functionality such as Mempool management, NEAR chunk distribution, transaction pre-processing and simulation.
About the position
This role is split between two responsibilities: site reliability (80%) and software engineering (20%).
Reliability Engineering includes:
- Ensuring high availability and failure tolerance of our infrastructure.
- Automating configuration and maintenance of software components such as K8s, NATS, InfluxDB, Postgres and Cloudflare, using e.g. Ansible, Terraform, Helm and Kubernetes operators.
- Design and implementation of cloud-agnostic solutions without exclusively relying on specific cloud vendors.
- Validator and RPC nodes management automation.
- Optimizing the latency and throughput of the pub-sub infrastructure.
- Incident management, monitoring, distributed tracing and recovery automation.
Software Engineering projects include:
- Sidecars that implement cloud-agnostic infrastructure abstractions for developers.
- CLI tools for pubsub and streaming infrastructure operations.
- Time series processing engine for our transaction simulation engine.
- Indexers and blockchain event aggregation pipelines for monitoring purposes.
About you
You are a reliability engineer with experience of creating and maintaining backend systems. You are familiar with the entire Linux stack and can easily find a bottleneck in a distributed system. You have developed CLI tools and backend services before and are comfortable applying your software development skills to automate your daily operations or to create a microservice on the request path of the end users.
Key Qualifications
- Strong emphasis on SRE as an engineering subject area, with proficiency in Golang.
- Successful track-record and proven experience as a backend internet services software developer.
- Knowledge of SDLC, including continuous integration and testing methodologies.
- Understanding of base internet infrastructure services, including DNS, HTTP, server virtualization and server monitoring, in critical, large-scale distributed systems.
- Understanding of SRE principles, including monitoring, alerting, error budgets, fault analysis, and other common reliability engineering concepts, with a keen eye for opportunities to eliminate toil through code and process improvements.
- Excellent verbal and written communication skills in English.
Desired skills
- Experience with development within Kubernetes ecosystem, including operator framework, controllers and CRDs.
- Experience with streaming and pubsub systems such as NATS, Apache Kafka, Apache Pulsar.
- Hardware bootstrap and associated security.
- Structured or unstructured storage and caching.
- Automating operations processes via services and tools.
- Configuration management and fleet orchestration via Puppet, Chef, Ansible, or others.
- Cloud Services (AWS S3/EC2/CloudFront or equivalent).
Join our dedicated team of blockchain industry professionals.
Please apply today. We're standing by for your resume.