    Staff Software Engineer, Reliability - Philadelphia, United States - Salesforce

    Description

    Reference #: JR251400

    To get the best candidate experience, please consider applying for a maximum of 3 roles within 12 months to ensure you are not duplicating efforts.

    Job Category: Software Engineering

    Job Details

    About Salesforce
    We're Salesforce, the Customer Company, inspiring the future of business with AI + Data + CRM.

    Leading with our core values, we help companies across every industry blaze new trails and connect with customers in a whole new way.

    And, we empower you to be a Trailblazer, too - driving your performance and career growth, charting new paths, and improving the state of the world.

    If you believe in business as the greatest platform for change and in companies doing well and doing good - you've come to the right place.


    About the team
    The Reliability and Incident Automation team builds tools and products that underpin Reliability, Service Ownership and Incident Management at Slack.


    We seek diverse perspectives and strategies with a focus on how to keep Slack reliable, empower service owners and learn from incidents.

    We collaborate with product and infrastructure engineering teams to continuously improve shared technology and processes, and maintain incident management as a foundational skill set of all engineering teams at Slack.

    Slack has a positive, diverse, and supportive culture. We want people who are curious, inventive, and inspired to do their best work every single day. In our work together we aim to be smart, humble, hardworking and, above all, collaborative. If this sounds like a good fit for you, please apply and connect with our team.


    What you will be doing:
    Lead engineering development on internal products and tools with a focus on prototyping and iteration for high velocity. Engage with teams and users to build features that have a delightful user experience and make their lives better.


    Build tooling and services that handle failure gracefully, without interrupting incident response, in an environment that requires rock-solid reliability. These tools interact with a variety of external systems, such as Observability, Monitoring, Alerting and Ticketing, to provide real-time information to incident responders.

    Provide mentorship and guide the team forward through technical expertise.


    Facilitate and participate in incident investigations and reviews (aka postmortems) for major incidents at Slack and drive program improvements for Incident Analysis and Review across Slack Engineering.


    Run training and workshops to teach Incident Responders and Commanders across Slack about the principles of incident management and the tactical ways in which we perform incident response.

    Be a peer and mentor to engineers who are new to on-call work and various roles in incident response.

    Be a service owner for the software and tooling we write and develop. You will participate in an on-call rotation, assist with triage, address production issues, and respond to incidents. Participate as an Incident Commander at Slack.


    What you should have:
    You have 7+ years of experience in Reliability, Incident Management and/or operating distributed systems at scale.

    You have experience with functional or imperative programming languages - e.g., PHP, Python, Ruby, or Go.

    You write understandable, testable code with an eye towards maintainability.

    You are a strong communicator with a positive attitude and empathy. Explaining complex technical concepts to designers, support, and other engineers is no problem for you.

    You possess strong computer science fundamentals: data structures, algorithms, programming languages, distributed systems, and information retrieval.

    Strong UX and design sensibilities, and a desire to sweat the small stuff.

    Self-awareness and a desire to continually improve.

    Experience with large scale distributed systems and cloud-based environments.

    You enjoy helping onboard new team members, mentoring, and teaching others.

    You have a Bachelor's degree in Computer Science, Engineering or related field, or equivalent training, fellowship, or work experience.


    Bonus points:
    You are passionate about Site Reliability Engineering (SRE), Resilience Engineering and Learning from Incidents

    Experience building tools or applications with Python and Go

    Curiosity for gaining valuable insights via analytics and metrics

    You have experience in responding to and coordinating incidents in previous roles


    Accommodations
    If you require assistance due to a disability when applying for open positions, please submit a request via this .

    Posting Statement
    At Salesforce we believe that the business of business is to improve the state of our world. Each of us has a responsibility to drive Equality in our communities and workplaces.

    We are committed to creating a workforce that reflects society through inclusive programs and initiatives such as equal pay, employee resource groups, inclusive benefits, and more.

    Learn more about Equality at and explore our company benefits at .

    Salesforce is an Equal Employment Opportunity and Affirmative Action Employer.

    Qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender perception or identity, national origin, age, marital status, protected veteran status, or disability status.

    Salesforce does not accept unsolicited headhunter and agency resumes. Salesforce will not pay any third-party agency or company that does not have a signed agreement with Salesforce.

    Salesforce welcomes all.

    For Colorado-based roles, the base salary hiring range for this position is $185,800 to $269,500.

    Compensation offered will be determined by factors such as location, level, job-related knowledge, skills, and experience. Certain roles may be eligible for incentive compensation, equity, and benefits.

    More details about our company benefits can be found at the following link:

    and are Equal Employment Opportunity and Affirmative Action Employers.

    Headhunters and recruitment agencies may not submit resumes/CVs through this Web site or directly to managers. and do not accept unsolicited headhunter and agency resumes. and will not pay fees to any third-party agency or company that does not have a signed agreement with or


  • SRI Telecom Philadelphia, United States

    Cloud DevOps Reliability Engineer - 5G · Location: Philadelphia, PA · Salary: $59.66-$68.89/hr. · Company Information: · In the exploding business of modern telecommunications, SRI is at the forefront of this expansion by helping companies build, integrate & staff their cutting- ...


  • Forhyre Philadelphia, United States

    Job Description · Forhyre is looking for engineers who can bring unique perspectives and innovative ideas to all areas of development and are interested in continuing to improve our platform through the ever-changing technology landscape. · To be successful in thi ...


  • AdvanSix Philadelphia, United States

    AdvanSix plays a critical role in global supply chains, innovating and delivering essential products for our customers in a wide variety of end markets and applications that touch people's lives, such as building and construction, fertilizers, plastics, solvents, packaging, paint ...


  • Comcast Corporation Philadelphia, PA, United States

    FreeWheel, a Comcast company, provides comprehensive ad platforms for publishers, advertisers, and media buyers. Powered by premium video content, robust data, and advanced technology, we're making it easier for buyers and sellers to transact across all screens, data types, and s ...


  • Susquehanna International Group Philadelphia, United States

    Overview · SIG is looking for a Site Reliability Engineer with a focus on Virtualization and related Automation skills to join our team. This is a great opportunity to leverage your VMware and scripting skills to enhance our growing environment with automation and tooling. · Yo ...


  • SRI Telecom Philadelphia, United States

    Job Description · Cloud DevOps Reliability Engineer - 5G · Location: Philadelphia, PA · Salary: $59.66-$68.89/hr. · Company Information: · In the exploding business of modern telecommunications, SRI is at the forefront of this expansion by helping companies build, ...


  • Lockheed Martin Moorestown, United States

    Description: This role is for a Facilities Reliability Lead Specialist whose primary responsibilities will be to implement programs that institute asset reliability, reduce operational risk and drive lower costs within the campus, infrastructure, systems and equipment that supp ...


  • Allscripts Healthcare, LLC Philadelphia, United States

    Veradigm Provider Veradigm offers provider practices a suite of easy-to-use healthcare provider solutions that help streamline clinical and financial workflows. We then deliver actionable insights to drive improved outcomes, reduce patients out-of-p Reliability Engineer, Liabilit ...


  • Veradigm Philadelphia, United States

    Welcome to Veradigm Our Mission is to be the most trusted provider of innovative solutions that empower all stakeholders across the healthcare continuum to deliver world-class outcomes. Our Vision is a Connected Community of Health that spans continents and borders. With the larg ...


  • Comcast Philadelphia, United States

    FreeWheel, a Comcast company, provides comprehensive ad platforms for publishers, advertisers, and media buyers. Powered by premium video content, robust data, and advanced technology, we're making it easier for buyers and sellers to transact across all screens, data types, and sa ...

  • Kearfott

    Reliability Engineer

    23 hours ago


    Kearfott Trenton, United States

    Founded in 1918, Kearfott Corporation, a global Aerospace and Defense supplier for over 100 years, is a leader in the design and manufacture of precision motion control products and inertial navigation components. Kearfott has a very long history of innovation and excellence, and ...


  • Comcast Corporation Philadelphia, PA, United States

    FreeWheel, a Comcast company, provides comprehensive ad platforms for publishers, advertisers, and media buyers. Powered by premium video content, robust data, and advanced technology, we're making it easier for buyers and sellers to transact across all screens, data types, and s ...

  • Lockheed Martin Corporation

    Reliability Engineer

    2 hours ago


    Lockheed Martin Corporation Medford, United States

    Job ID: 667052BR · Date posted: May 28, 2024 · Description: This role is for a Facilities Reliability Lead Specialist whose primary responsibilities will be to implement programs that institute asset reliability, reduce operational risk and drive lower costs within the campus, ...


  • Comcast Corporation Philadelphia, PA, United States

    FreeWheel, a Comcast company, provides comprehensive ad platforms for publishers, advertisers, and media buyers. Powered by premium video content, robust data, and advanced technology, we're making it easier for buyers and sellers to transact across all screens, data types, and s ...


  • Comcast Corporation Philadelphia, United States

    FreeWheel, a Comcast company, provides comprehensive ad platforms for publishers, advertisers, and media buyers. Powered by premium video content, robust data, and advanced technology, we're making it easier for buyers and sellers to transact across all screens, data types, and s ...


  • Comcast Philadelphia, United States

    FreeWheel, a Comcast company, provides comprehensive ad platforms for publishers, advertisers, and media buyers. Powered by premium video content, robust data, and advanced technology, we're making it easier for buyers and sellers to transact across all screens, data types, and sa ...


  • Covanta Chester, United States Full time

    Who we are · For more than 40 years, Covanta has been at the forefront of sustainable materials management, providing companies and communities world-class waste and resource solutions. · Through our diverse and scalable full-service capabilities, we're leading the charge to a c ...


  • Tekgence Inc Wilmington, United States

    Job Description · Formal training or certification on site reliability engineering concepts and 3+ years applied experience · Proficient in site reliability culture and principles and familiarity with how to implement site reliability within an application or platform · Proficien ...


  • TE Connectivity Berwyn, United States

    TE Connectivity in Berwyn, PA is looking for a Quality & Reliability Engineer to recommend and implement process improvements and modifications. Support the implementation and training of quality standards and improvement ...


  • Ccube Wilmington, United States

    Job Description · Job Title: SRE Engineer · Location: Wilmington, DE – Only Local · Formal training or certification on site reliability engineering concepts and 3+ years applied experience · Proficient in site reliability culture and principles and ...