Site Reliability Engineer - San Francisco, United States - Talkdesk

Talkdesk San Francisco, United States

3 weeks ago

Description

At Talkdesk, we are courageous innovators focused on helping organizations around the world create better customer experiences. Our AI-powered cloud contact center solutions optimize our customers' most critical customer service processes. We are recognized as a Contact Center as a Service (CCaaS) leader by influential research organizations including Gartner.

With $498 million in total funding, a valuation of more than $10 billion, and a ranking of #8 on the Forbes Cloud 100 list, now is the time to be part of the Talkdesk legacy and help accelerate our success in a new decade of transformational growth.

We champion an inclusive and diverse culture representative of the communities in which we live and serve. We also give back to our community by volunteering our time, supporting non-profits, and minimizing our global footprint.

Our Engineering team follows a micro-service architecture approach to build the next generation of Talkdesk, with vertical teams responsible for all the decisions under their services.

Through our Agile Coaches, we promote agile and collaborative practices. We are huge fans of Scrum and pair programming, and we won't let a single line of code reach production without peer code review.

We strongly believe that the only true authority stems from knowledge, not from position, and we always treat others with respect, deference, and patience.

We are looking for Site Reliability Engineers (SREs) who can help us design, build, and maintain high-performance, scalable, and reliable services.

As Talkdesk provides a Contact Center service, we play a critical role in our customers' business operations and therefore need to provide a highly available and fault-tolerant service.

As an SRE at Talkdesk, you will build, run, and maintain components that serve as the infrastructure foundation for the rest of Talkdesk, with the goal of minimizing manual intervention while ensuring high availability and reliability of those components.

You will also partner with other product engineering teams to help make their services more performant, scalable, observable and reliable.

We believe in a DevOps philosophy in which every engineering team at Talkdesk is responsible for the software it builds and deploys. SREs play a critical role in ensuring that teams have the tools, practices, and expertise to make that happen in a blame-free culture.

Responsibilities:
Design, build, harden, and maintain the core infrastructure used by all of Talkdesk's engineering teams
Automate every aspect of our infrastructure to minimize human intervention
Help keep existing base infrastructure running smoothly
Develop effective tooling, alerts, and responses that identify and address reliability risks
Drive and promote protocols on production readiness and operational excellence
Participate in on-call rotation alongside other engineering teams (opt-in)
Partner with product engineering teams to debug production outages and carry out action items to improve reliability of those systems
Participate in design reviews and production reviews for new features, products, or pieces of infrastructure
Plan for growth of Talkdesk's infrastructure

Requirements:
An understanding of the importance of observability, and good intuition about what to measure and how
Know your way around a Linux/Unix system
Experience with Terraform and Packer
Ability to identify time-consuming and error-prone manual tasks and then build tooling to automate them
Understand large-scale complex systems from a reliability perspective
Ability to identify root causes of instability in a large-scale distributed system, across stacks
Hold yourself and those around you to higher standards when working with production
Bring a developer mindset and apply it to infrastructure
You value simplicity

Nice to haves / Pluses:
Experience with cloud platforms such as AWS, Google Cloud, or Microsoft Azure
Experience with technologies such as Docker, Consul, Vault, Jenkins, Concourse, Prometheus, Nexus
Experience with PaaS-like solutions such as Heroku, Kubernetes, Docker Swarm, Mesos, or OpenStack
Experience with messaging systems such as RabbitMQ or Kafka
Operational knowledge of various data stores such as MongoDB, Postgres, Redis, Cassandra, Elasticsearch
Experience with configuration management software such as Ansible or Chef
Experience with a programming language such as Ruby, Elixir/Erlang, Go, or any JVM-based language
Experience with designing and operating IP networks

The Talkdesk story hinges on empathy and acceptance.

It is the shared goal among all Talkdeskers to empower a new kind of customer hero through our innovative software solution, and we firmly believe that the best path to success for our mission is inclusivity, diversity, and genuine acceptance.

To that end, we will hire, promote, work alongside, cheer for, bond with, and warmly welcome into the Talkdesk family all persons without regard to ethnic and racial identity, indigenous heritage, national origin, religion, gender, gender identity, gender expression, sexual orientation, age, disability, marital status, veteran status, genetic information, or any other legally protected status.

