- 8+ years of experience building and running high-volume, customer-facing services in highly dynamic environments, in Software Engineering, Site Reliability, or related roles.
- BS degree in Computer Science or a related field.
- Proficient in at least one programming language (Go, Python, TypeScript) with experience in a software engineering environment.
- Experienced in automation, infrastructure as code, and building reusable patterns.
- Passionate about improving the reliability and performance of critical services through the use of monitoring, metrics, incident management, and proactive engineering. Has helped investigate and remediate critical issues in production services and infrastructure.
- Expert in observability tools such as Datadog, with hands-on experience instrumenting services for monitoring, logging, metrics collection, and tracing.
- Knowledgeable in performance and load testing tools and methods to simulate production workloads.
- Acts with urgency, ownership, and with a mindset of continuous improvement.
- Able to participate in an on-call rotation to ensure issues are resolved as quickly as possible and prevented from recurring.
- Experienced in GitOps practices and technologies (CI/CD, Infrastructure as Code, etc.).
- 5+ years of experience in AWS and its major services such as EC2, ECS, S3, SQS, EKS, and Lambda.
- Writes clear and concise documentation.
- Interested in mentoring or teaching others.
- 2+ years of experience in GCP and its common services such as GKE, Artifact Registry, and Google VPC.
- Extensive experience scaling and managing relational databases and NoSQL technologies.
- Experienced with data storage technologies such as RDS, DynamoDB, and Elasticsearch.
- Receive a great compensation package including salary plus performance bonus earning potential, paid annually.
- Enjoy flexible PTO and time off policies allowing you to take the time you need to be your whole self.
- Appreciate the generous medical, dental, vision, STD, LTD, and life insurance options for you and your family.
- Take advantage of our health savings account (HSA) program plus health care and dependent care FSA programs.
- Love that we offer an employer match on our 401(k) plan.
- Receive an employer-paid commuter benefit (for eligible employees).
- Appreciate the generous support program for new parents.
- Obtain pet insurance; some of our offices are also pet friendly.
- Courage. We believe that when we overcome fear, we enable our best selves.
- Curiosity. We are curious, which is the gateway to empathy, inclusion, and understanding.
- Service. We serve our community with humility, enabling joy and belonging for others.
- Kaizen. We have a growth mindset committed to constant forward progress.
Senior Software Engineer, Embedded Cloud Reliability - San Francisco, CA, United States - Crunchyroll
Description
About Crunchyroll
WE HELP EVERYONE BELONG. IT'S OUR PURPOSE. Founded by fans, Crunchyroll delivers the art and culture of anime to a passionate community. We super-serve over 100 million anime and manga fans across 200+ countries and territories, and help them connect with the stories and characters they crave. Whether that experience is online or in-person, streaming video, theatrical, games, merchandise, events and more, it's powered by the anime content we all love. Join our team, and help us shape the future of anime.
Who We Are
We're a cast of characters working to shine a spotlight on anime. Crunchyroll is an international business focused on creating both online and offline experiences for fans through content (licensed, co-produced, originals, distribution), merchandise, events, gaming, news, and more. Visit our About Us pages for more information about our collection of brands.
About The Team
At Crunchyroll, our platforms and infrastructure form the foundation on which our services are built and directly influence our customer experience and the velocity of our engineers. The Cloud Reliability team at Crunchyroll embeds with our development teams and partners with our core platform teams to deliver the critical cloud infrastructure that enables our services. You will report to our Senior Manager, and this role can be fully remote.
About You