- Manage day-to-day operations of data services and realtime/batch data pipelines, including SLA management, system deployment, performance tuning, and troubleshooting
- Create tools and automation to improve system administration and operation efficiency
- Participate in regular on-call duties
- Engage in and improve the whole lifecycle of services from inception and design, throughout development, capacity planning, and launch reviews, to deployment, operation, and refinement
- Design and implement software platforms and monitoring frameworks for efficient, automated, and intelligent service-oriented architecture (SOA) governance
- Scale systems sustainably through mechanisms such as automation; evolve systems by pushing for changes that improve reliability, efficiency, and velocity
- Practice sustainable user support, incident response, and blameless postmortems
Qualifications
- Bachelor's degree in Computer Science, with at least 3 years of related experience
- Demonstrated independent thinking capabilities and troubleshooting skills
- Experience programming in at least one of the following languages: Python, Go, C, C++, Java, or Rust
- Familiar with backend systems such as MySQL/Redis/Nginx/Kafka/Kubernetes/Docker and big data technologies such as Hadoop/Spark/Flink/Hive/OLAP/ClickHouse, etc.
- Familiar with Unix/Linux system internals, networking, and distributed systems
- Good communication and coordination skills
- Experience in Trust & Safety is a plus
TikTok is committed to creating an inclusive space where employees are valued for their skills, experiences, and unique perspectives. Our platform connects people from across the globe and so does our workplace. At TikTok, our mission is to inspire creativity and bring joy. To achieve that goal, we are committed to celebrating our diverse voices and to creating an environment that reflects the many communities we reach. We believe individuals shouldn't be disadvantaged because of their background or identity, but instead should be considered based on their strengths and experience. We are passionate about this and hope you are too.
Site Reliability Engineer - Mountain View, CA, United States - TikTok
Description
TikTok is the leading destination for short-form mobile video. Our mission is to inspire creativity and bring joy. TikTok has global offices including Los Angeles, New York, London, Paris, Berlin, Dubai, Mumbai, Singapore, Jakarta, Seoul and Tokyo.
Our Trust and Safety engineering team is fast growing and responsible for building machine learning models and systems to identify and defend against internet abuse and fraud on our platform.
Our mission is to protect billions of users and publishers across the globe every day.
We embrace state-of-the-art machine learning technologies and scale them to detect abuse and improve our trust and safety systems using the tremendous amount of data generated on the platform.
With the continuous efforts from our team, TikTok is able to provide the best user experience and bring joy to everyone in the world.
In our team, you'll have the opportunity to manage the complex challenges of scale, while using expertise in coding, algorithms, complexity analysis, and large-scale system design.
We embrace a culture of diversity, intellectual curiosity, openness, and problem-solving. We encourage close collaboration while promoting self-direction.
Responsibilities - What You'll Do