    Senior Site Reliability Engineer - San Jose, United States - Hireio, Inc.

    Job Description

    About the company

    The company is the leading destination for short-form mobile video and the largest unicorn startup. Its app has surpassed 1.3 billion mobile downloads in the United States and 2 billion worldwide. With 1.5 billion monthly active users worldwide, it ranks as one of the most popular social entertainment apps.

    About the team

    Our data infrastructure Site Reliability Engineering (SRE) team is a pioneer in innovation. We seamlessly merge software development and infrastructure operations to design, build, and manage large-scale, highly distributed systems. We take pride in overseeing one of the industry's most extensive cloud infrastructures. As software development evolves, building systems from a mix of components has become the new standard. In this era, SRE takes a central role. This role demands the ability to design, develop, and operate these components, transforming them into cloud-managed, scalable, and reliable elements. Our professionals play a critical role as connectors, ensuring the seamless integration of these diverse components to deliver high-performing systems. Our dynamic SRE field is about actively shaping the future of technology, not just keeping pace with it. We contribute significantly to the next chapter of data infrastructure. We're currently building teams around the world. Join us today and embark on this transformative journey.

    Responsibilities:

    Participate in and enhance the complete service lifecycle, from inception and design, through development, capacity planning, launch reviews, deployment, operation, and refinement.

    Design and implement software platforms and monitoring frameworks to govern service-oriented architecture (SOA) efficiently, automatically, and intelligently.

    Develop and manage components of cloud-managed data infrastructure, encompassing technologies such as Kubernetes, Redis, MySQL, Flink, and more.

    Establish sustainable mechanisms for scaling systems, such as automation, to drive enhancements in reliability, efficiency, and velocity.

    Provide sustainable user support, manage incident responses, and conduct blameless postmortems as part of our ongoing efforts to improve our systems.

    Requirements


    • Bachelor's degree in Computer Science or a related technical field with 5+ years of experience


    • Experience programming in one of the following languages: C, C++, Java, Python, Go, or Rust


    • Familiar with Unix/Linux system internals, networking, and distributed systems


    • [Preferred] Experience in MySQL, Redis, Nginx, Kubernetes, Docker, OpenStack, Hadoop, Spark, Flink, etc.


    • [Preferred] Experience in designing and analyzing large-scale distributed systems


    • [Preferred] Strong skills in problem solving and communication


    • [Preferred] Bilingual in Mandarin and English

    Benefits

    Our company benefits are designed to convey company culture and values, to create an efficient and inspiring work environment, and to support our employees in giving their best in both work and life. We offer the following benefits to eligible employees:

    We cover 100% of the premium for employee medical insurance and approximately 75% of the premium for dependents, and we offer a Health Savings Account (HSA) with a company match. We also offer Dental, Vision, Short/Long Term Disability, Basic Life, Voluntary Life, and AD&D insurance plans, in addition to Flexible Spending Account (FSA) options such as Health Care, Limited Purpose, and Dependent Care.

    Our time off and leave plans include 10 paid holidays per year, 17 days of Paid Personal Time Off (PPTO) (prorated upon hire and increased by tenure), 10 paid sick days per year, 12 weeks of paid Parental leave, and 8 weeks of paid Supplemental Disability.

    We also provide generous benefits such as mental and emotional health benefits through our EAP and Lyra, a 401K company match, and gym and cellphone service reimbursements. The Company reserves the right to modify or change these benefits programs at any time, with or without notice.



  • Advanced Micro Devices, Inc. San Jose, United States

    Overview: · WHAT YOU DO AT AMD CHANGES EVERYTHING · We care deeply about transforming lives with AMD technology to enrich our industry, our communities, and the world. Our mission is to build great products that accelerate next-generation computing experiences the building bloc ...


  • TEKsystems San Jose, United States Contract

    Description: · Adobe is looking for an experienced Site Reliability Engineer to join the internal tooling team to support, configure, integrate, upgrade, and automate the use of enterprise tools used across their large Engineering organization. Role will be focused on user interact ...


  • HCLTech San Jose, United States

    About HCLTech: · HCLTech is a global technology company, home to 221,000+ people across 60 countries, delivering industry-leading capabilities centered around digital, engineering and cloud, powered by a broad portfolio of technology services and products. We work with clients ac ...


  • Natron Energy Santa Clara, United States

    Natron is seeking a Reliability Engineer to support the development and test of our high-power battery systems for data center UPS and EV charging applications. The occupant of this position will work with the Product Engineering, Reliability, Technology, and Operations teams to ...

  • Comtech Telecom

    Reliability Engineer

    3 weeks ago


    Comtech Telecom Santa Clara, United States

    Comtech Telecommunications Corp. has an opportunity in Santa Clara, CA for a Reliability/Failure Analysis Engineer. In this important role, you will collaborate with a diverse team of technical professionals and interact with outside customers, providing solutions to a variety of ...

  • COMTECH TELECOMMUNICATIONS

    Reliability Engineer

    3 weeks ago


    COMTECH TELECOMMUNICATIONS Santa Clara, United States

    Comtech Telecommunications Corp. has an opportunity in Santa Clara, CA for a Reliability/Failure Analysis Engineer. In this important role, you will collaborate with a diverse team of technical professionals and interact with outside customers, pr ...


  • MRINetwork Jobs San Jose, United States

    We are working with a company operating in the best of both worlds – an innovative start-up inside of a $6 billion parent company building the next generation of solar. They have developed an industry-leading building-integrated solar technol ...


  • Diverse Lynx San Jose, United States

    Semiconductor Reliability Senior Engineer · 5+ years of experience in IC reliability engineering with hands-on experience in 1 or more related areas such as Product Engineering, Test Engineering, Failure Analysis. · Good understanding of Semiconductor, manufacturing process (Fab, Assembl ...


  • Intel San Jose, United States

    Job Details: · Job Description: · Microelectronic Quality Reliability Engineers provide project management, product, process design/development and sustaining support for integrated circuit or semiconductor assemblies, various other electronic components, sub systems and/or com ...


  • Antora Energy San Jose, United States

    At Antora, we're on a mission to stop climate change. And we can't do that unless we tackle the 30% of global emissions that come from industry. · Antora is unlocking zero-emissions industrial energy, cheaper than fossil fuels. Antora's thermal ba ...


  • Myriad Consulting Inc San Jose, United States

    This role is also open to junior (3+ yoe) candidates and SRE leads (7+ yoe). · The Site Reliability Engineering (SRE) team combines software and systems engineering to build and run large-scale, massively distributed, and fault-tolerant systems. In our team, you'll have the opportunity ...


  • Analog Devices San Jose, United States

    Come join Analog Devices (ADI) - a place where Innovation meets Impact. For more than 55 years, Analog Devices has been inventing new breakthrough technologies that transform lives. At ADI you will work alongside the brightest minds to collaborate on solving complex problems that ...


  • The Dignify Solutions LLC San Jose, United States

    AWS Infra SRE/DevOps engineer with proven work experience ensuring reliability, availability and performance of cloud infra and platform. · - Specialist on Cisco Cloud run-on for infrastructure management, who can install, run, and maintain software like docker, and containers. ...


  • Cisco San Jose, United States

    The successful applicant will be performing work in FedRAMP environments, and therefore, must be a U.S. Person (i.e. U.S. citizen, U.S. national, lawful permanent resident, asylee, or refugee). This position may also perform work that the U.S. government has specified can only be ...


  • IBM San Jose, United States

    Automation: Develop and maintain automation tools and scripts to streamline deployment, monitoring, and management of the infrastructure and applications. · Monitoring and Alerting: Set up and maintain monitoring and alerting systems to proactively identify and resolve issues b ...


  • Zoom San Jose, United States

    ** Sponsorship is not available for this position ** · What you can expect · As a senior level Product Resilience SRE, you will define, scope, plan, and schedule Disaster Recovery Testing at Zoom. You will document any gaps identified by our testing, and drive technical solutions ...


  • ByteDance San Jose, United States

    【For Pay Transparency】Compensation Description (annually) The base salary range for this position in the selected city is $ $410000 annually. · Compensation may vary outside of this range depending on a number of factors, including a candidate's qualifications, skills, competenci ...


  • Celestial AI Santa Clara, United States

    About Celestial AI · As the industry strives to meet the demands of the AI workloads, bottlenecks in data transfers between processors and memory have hindered progress. The Photonic Fabric based Memory Fabric provides an optically scalable solution to the 'Memory Wall' problem, ...