    Staff Site Reliability Engineer - Seattle, United States - Coupang

    Description

    About the Company:
    At Coupang, we are building the future of e-commerce. Born out of an obsession to make shopping, eating,
    and living easier than ever, we're collectively disrupting the multi-billion-dollar e-commerce industry from
    the ground up. We exist to wow our customers. We know we're doing the right thing when we hear our
    customers say, "How did we ever live without Coupang?" We are one of the fastest-growing e-commerce
    companies and have established an unparalleled reputation as a dominant and reliable force in South
    Korean commerce.

    We are proud to have the best of both worlds: a startup culture with the resources of a large global
    public company. This fuels us to continue our growth and launch new services at the speed we have
    maintained since our inception. We are all entrepreneurial, surrounded by opportunities to drive new
    initiatives and innovations. At our core, we are bold and ambitious people who like to get our hands
    dirty and make a hands-on impact. At Coupang, you will see yourself, your colleagues, your team, and
    the company grow every day.
    Our mission to build the future of commerce is real. We push the boundaries of what's possible to solve
    problems and break traditional tradeoffs. Join Coupang now to create an epic experience in this
    always-on, high-tech, and hyper-connected world.


    About the Role:
    The Site Reliability Engineer (SRE) at Coupang is a mission-critical role that combines software and
    systems engineering to build, run, and scale our complex, large-scale e-commerce systems. As part of the
    Site Reliability Engineering team, you will be responsible for ensuring that all our customer-facing
    services are healthy, monitored, automated, and designed to scale. As an SRE organization, we take pride
    in treating operations as an engineering problem, with an automation-first approach. You will use your
    background to build best-in-class infrastructure automation for areas such as observability, incident
    management, disaster recovery, load testing, capacity engineering, and more. In this role, you will work
    closely with our product development teams from the early design stage through to resolving production
    incidents, maintaining the SLI/SLA bar for production services, and influencing those teams with SRE
    principles and best practices. If you take pride in complete ownership, have a passion for solving complex
    technical challenges in large-scale distributed systems, and have the demeanor to work and communicate
    effectively across team boundaries, this is the role for you.


    Key Responsibilities:
    Serve as the primary point of responsibility for the reliability, health, and performance of all Coupang customer-facing services.

    Gain deep knowledge of Coupang application workflows and dependencies.

    Spearhead and conceptualize revolutionary designs in critical service architecture.

    Conduct comprehensive architecture reviews and lead re-architecting initiatives to set industry-leading benchmarks in performance, reliability, and availability.

    Lead and drive large-scale technical initiatives across multiple engineering teams.

    Drive collaboration effectively across organizational boundaries and build strong stakeholder relationships to achieve broad organizational objectives.

    Identify and implement scalable solutions for complex technical problems. Be the change driver.

    Be self-motivated to navigate the ambiguity of large initiatives and find solutions to accomplish the goal.

    Be the SRE champion/lead, working with the rest of the technical leaders across Coupang to define and drive the engineering roadmap.

    Contribute to hiring and building a world-class team. Mentor and coach junior engineers on the team.

    Communicate effectively with people at all levels of the organization.

    Essential Qualifications:

    10+ years of industry experience building and operating large-scale distributed systems.

    Deep UNIX/Linux systems knowledge and administration background.

    Strong programming skills in one or more of: Python, Java, Golang, C++.

    Strong problem-solving and analytical skills spanning systems, network (TCP/IP), and code, with a focus on data-driven decision-making.

    Proficient with cloud-based infrastructure, including AWS, Azure, or Google Cloud Platform.

    Strong understanding of DevOps and SRE practices, including continuous integration, continuous delivery, and infrastructure as code (IaC).

    Proficient with containerization and orchestration technologies, such as Docker and Kubernetes.

    Knowledge of the observability ecosystem, including metrics, logging, tracing, and tools such as Prometheus, Grafana, Elastic Stack, Datadog, or New Relic.

    Excellent communication and collaboration skills, with the ability to work with teams across distinct functions and technical domains.


    Preferred Qualifications:
    Master's degree in Computer Science, Engineering, or a related technical field.

    Prior experience working with large-scale web-based Java architectures and JVM configuration.

    Professional certifications in cloud platforms, monitoring tools, or related technologies.

    Previous experience working on a large-scale ecommerce platform.



  • Marsh McLennan Companies Seattle, United States Full time

    Description: · Our not-so-secret sauce. · Award-winning, inclusive, Top Workplace culture doesn't happen overnight. It's a result of hard work by extraordinary people. More than 9, of the industry's brightest talent drive our efforts to deliver purposeful work and meaningful im ...


  • Blue Origin Seattle, United States

    At Blue Origin, we envision millions of people living and working in space for the benefit of Earth. We're working to develop reusable, safe, and low-cost space vehicles and systems within a culture of safety, collaboration, and inclusion. Join our diverse team of problem solvers ...


  • Sogeti Seattle, United States

    Site Reliability Engineer · FTE with benefits · Our team is looking to add experienced Site Reliability / DevOps Engineer to our team. Experienced with Python and Shell Scripting. Should have extensive experience with Azure or AWS (Azure preferred) Experience with Monitoring ...


  • Boeing Seattle, United States

    Reliability and Maintainability Engineer (Associate, Mid-Level & Senior) · Company: · The Boeing Company · Job ID: · Date Posted: · Location: · USA - Everett, WA, USA - Seattle, WA · Job Description Qualifications: · Boeing Commercial Airplanes · (BCA) is seeking · Associate, ...


  • Georgia IT Inc Seattle, United States

    Site Reliability Engineer (SRE) / DevOps Engineering · Seattle, WA · Contract · Responsibilities · 8-10+ years of Site Reliability / DevOps Engineering. · Looking for an experienced SRE with some data engineering background or experience. · Any experience with Databricks a big p ...


  • Apple Seattle, United States

    Site Reliability Engineer · Seattle, Washington, United States · Software and Services · The Apple Services Engineering (ASE) team is one of the most exciting examples of Apple's long-held passion for combining art and technology. These are the people who power the App Store, App ...


  • INSPYR Solutions Seattle, United States

    Title: Site Reliability Engineer · Location: Seattle, WA (Hybrid 2-3 days on-site) · Duration: 1+ year contract, (Possibility of conversion) · Compensation: $85-$95.40/hour · Work Requirements: US Citizen, GC Holders or Authorized to Work in the U.S. · Skillset / Experience: · Yo ...


  • Boeing Seattle, United States

    Reliability and Maintainability Engineer (Associate, Mid-Level & Senior) · Company: · The Boeing Company · Job ID: · Date Posted: · Location: · USA - Everett, WA, USA - Seattle, WA · Job Description Qualifications: · Boeing Commercial Airplanes · (BCA) is seeking · Associate, Mid-Level ...


  • TikTok Seattle, United States

    Responsibilities · About TikTok U.S. Data Security · TikTok is the leading destination for short-form mobile video. Our mission is to inspire creativity and bring joy. U.S. Data Security ("USDS") is a subsidiary of TikTok in the U.S. This new, security-first division was created ...


  • Anduril Industries Seattle, United States

    Anduril Industries is a defense technology company with a mission to transform U.S. and allied military capabilities with advanced technology. By bringing the expertise, technology, and business model of the 21st century's most innovative companies to the defense industry, Anduri ...


  • Blue Origin Seattle, United States Full time

    At Blue Origin, we envision millions of people living and working in space for the benefit of Earth. We're working to develop reusable, safe, and low-cost space vehicles and systems within a culture of safety, collaboration, and inclusion. Join our diverse team of problem solvers ...


  • Adobe Seattle, United States

    Our Company · Changing the world through digital experiences is what Adobe's all about. We give everyone, from emerging artists to global brands, everything they need to design and deliver exceptional digital experiences. We're passionate about empowering people to create beautiful ...


  • Saint-Gobain S.A. Seattle, United States

    Consistent with CertainTeed Gypsum Vision, Mission, Values and Objectives, the Reliability Engineer identifies and quantifies Line 1 and Line 2 root cause failure(s), and drives permanent solutions to address systemic or chronic mechanical deficienci Reliability Engineer, Liabili ...


  • Saint-Gobain Seattle, United States

    Why do we need you? · POSITION SUMMARY · Consistent with CertainTeed Gypsum Vision, Mission, Values and Objectives, the Reliability Engineer identifies and quantifies Line 1 and Line 2 root cause failure(s), and drives permanent solutions to address systemic or chr ...


  • Hireio, Inc. Seattle, United States

    Job Description · 1. Engage in and improve the whole lifecycle of Ads systems, from system design consulting through to launch reviews, deployment, operation and refinement. · 2. Build availability of services deployed across multiple data centers globally. · 3. D ...


  • Sogeti Seattle, United States

    Lead Site Reliability Engineer · Seattle, WA · FTE/ Direct hiring with benefits · No Remote - Onsite and Hybrid position from WA location only · Qualification & Skills · 8+ years of experience in Site Reliability Engineering or related field · Develop, maintain and configure ...


  • Apple Seattle, United States

    Senior Site Reliability Engineer · Seattle, Washington, United States · Software and Services · The Apple Services Engineering team is one of the most exciting examples of Apple's long-held passion for combining art and technology. Join Apple Services Engineering Cloud Service Infras ...