- Support production systems and help triage issues during live sporting events
- Monitor the system and respond to incidents to maintain system SLOs/SLAs; review and follow up on production incidents
- Write and review code, develop documentation, and debug problems, live, on complex distributed systems
- Optimize and facilitate incident response, conduct root cause analysis and blameless retrospectives
- Work closely with technical teams to implement, optimize, maintain, scale, and debug workloads on Kubernetes, using CI/CD, automation tools, and scripting languages to deliver tools and software that improve the reliability and scalability of services
- 3+ years of experience in an SRE-leaning DevOps role or a full SRE role
- 3+ years building CI/CD pipelines with GitHub Actions, GitLab CI/CD, or similar
- Extensive experience with Kubernetes
- Experience managing customer-facing systems in a 24/7 environment, including escalations
- Experience with triage and escalation policies/protocols
- Strong communication and documentation skills
- Comfortable with scripting languages like Bash, Python, or similar
- Networking and routing experience
- Terraform in AWS to support global-scale services
- Improving observability in an engineering organization
- Past experience with PagerDuty or similar tools
Site Reliability Engineer - San Francisco, United States - Swish Analytics
Description
Swish Analytics is a sports analytics, betting and fantasy startup building the next generation of predictive sports analytics data products. We believe that oddsmaking is a challenge rooted in engineering, mathematics, and sports betting expertise; not intuition. We're looking for team-oriented individuals with an authentic passion for accurate and predictive real-time data who can execute in a fast-paced, creative, and continually-evolving environment without sacrificing technical excellence. Our challenges are unique, so we hope you are comfortable in uncharted territory and passionate about building systems to support products across a variety of industries and enterprise clients.
About the team
The Swish Analytics DevSecOps and Infrastructure team is looking for an experienced Site Reliability Engineer who will support our enterprise infrastructure during non-US hours. In addition to providing support, you will help optimize incident response and observability, and work with technical teams to improve overall workload resiliency.
Responsibilities
Swish Analytics is an Equal Opportunity Employer. All candidates who meet the qualifications will be considered without regard to race, color, religion, sex, national origin, age, disability, sexual orientation, pregnancy status, genetic, military, veteran status, marital status, or any other characteristic protected by law. The position responsibilities are not limited to the responsibilities outlined above and are subject to change. At the employer's discretion, this position may require successful completion of background and reference checks.