    Operational Reliability Engineer - Atlanta, United States - SiriusXM

    Description

    Responsibilities:

    Who We Are:

    SiriusXM and its brands (Pandora, SXM Media, AdsWizz, Simplecast, and SiriusXM Connected Vehicle Services) are leading a new era of audio entertainment and services by delivering the most compelling subscription and ad-supported audio entertainment experience for listeners -- in the car, at home, and anywhere on the go with connected devices. Our vision is to shape the future of audio, where everyone can be effortlessly connected to the voices, stories and music they love wherever they are.

    This is the place where a diverse group of emerging talent and legends alike come to share authentic and purposeful songs, stories, sounds and insights through some of the best programming and technology in the world. Our critically acclaimed, industry-leading audio entertainment encompasses music, sports, comedy, news, talk, live events, and podcasting. No matter their individual role, each of our employees plays a vital part in bringing SiriusXM's vision to life every day.

    SiriusXM

    SiriusXM is the leading audio entertainment company in North America, and the premier programmer and platform for subscription and digital advertising-supported audio products. SiriusXM's platforms collectively reach approximately 150 million listeners, the largest digital audio audience across paid and free tiers in North America, and deliver music, sports, talk, news, comedy, entertainment and podcasts. Pandora, a subsidiary of SiriusXM, is the largest ad-supported audio entertainment streaming service in the U.S. SiriusXM's subsidiaries Simplecast and AdsWizz make it a leader in podcast hosting, production, distribution, analytics and monetization. The Company's advertising sales organization, which operates as SXM Media, leverages its scale, cross-platform sales organization and ad tech capabilities to deliver results for audio creators and advertisers. SiriusXM, through Sirius XM Canada Holdings, Inc., also offers satellite radio and audio entertainment in Canada. In addition to its audio entertainment businesses, SiriusXM offers connected vehicle services to automakers.

    solutions powered by AdsWizz; sonic creative consultancy Studio Resonate; and an extended content network featuring exclusive monetization agreements with Audiochuck, NBCUniversal, SoundCloud, and many more. Reaching more than 150 million listeners each month, SXM Media delivers audiences the tailored brand experiences they crave while putting creators first, making it easy for every marketer to produce, plan, buy and measure across its entire audio universe.

    How You'll Make an Impact:

    SiriusXM is looking for a strong, collaborative team player to join the 24x7 Operational Reliability Engineering (ORE) Team, which supports SiriusXM's streaming infrastructure. We are seeking a highly skilled engineer to ensure the stability, performance, and reliability of our systems. The ideal candidate will have excellent knowledge of and experience with Linux, AWS, Python programming, APM tools such as Datadog, RESTful service calls, and service troubleshooting. Your primary responsibility will be to proactively monitor, optimize, and troubleshoot our systems to maintain operational excellence. Demonstrated problem-solving ability is essential. This is an on-call position, and you will participate in an on-duty rotation. You will be expected to cover after-hours deployments, maintenance windows, and incident response. This is a hands-on individual contributor role. It focuses on streaming infrastructure, which runs primarily in AWS, and extends to broader infrastructure support, including application monitoring, software build machines, and software support servers. The candidate should have a penchant for solving tough technical problems and a dedication to ensuring high availability.

    What You'll Do:

    This is a technically diverse role. The required skills are exciting and manifold.

    Linux Expertise:

    • Maintain and manage Linux-based systems and servers
    • Implement best practices for system configuration, security, and performance

    AWS Proficiency

    • Manage and optimize AWS infrastructure and services
    • Monitor and scale applications on AWS to ensure high availability

    Python Programming

    • Develop and maintain automation scripts and tools using Python
    • Automate repetitive tasks to enhance system reliability and efficiency

    APM (Application Performance Monitoring) Knowledge

    • Utilize Datadog APM tools to monitor and analyze application performance
    • Identify and address performance bottlenecks and issues in real-time

    RESTful Service Calls

    • Collaborate with development teams to ensure the reliability of RESTful APIs
    • Troubleshoot and resolve issues related to service calls and integrations

    Service Troubleshooting

    • Investigate incidents/outages, identifying root causes and implementing preventive measures
    • Work closely with cross-functional teams to resolve issues and improve system reliability

    • Create automated tasks with scripts (Bash, batch, Python)
    • Monitor and understand security events using WAFs, IDS/IPS, and access logs
    • Configure networking (TCP/IP), including firewall ACLs and security
    • Create, manage, and configure virtual machines, clones, and templates
    • Collect, monitor, and analyze system performance data to improve performance
    • Perform custom application maintenance, including debugging, installing new application releases, and patching
    • Develop, document, and maintain procedures for administering, maintaining, and supporting infrastructure

    What You'll Need:

    • Minimum of 5 years of IT/Engineering experience and 1 year in a 24x7 HA support environment
    • An AWS certification (AWS Certified DevOps Engineer, AWS Certified Solutions Architect, or AWS Certified Data Analytics Specialty) is preferred to demonstrate a good AWS knowledge base
    • BS in Computer Science/Engineering, Information Sciences Technology, or equivalent experience
    • Proven experience as a Site Reliability Engineer or DevOps Engineer.
    • Excellent knowledge of Linux systems and administration.
    • Expertise in Python programming for automation and tool development.
    • In-depth understanding of APM tools, particularly Datadog.
    • Thorough knowledge of RESTful service architecture and best practices.
    • Exceptional troubleshooting and problem-solving skills.
    • Strong communication and collaboration skills to work effectively with cross-functional teams.
    • Familiarity with containerization and orchestration tools (e.g., Docker) is a must.

    Bring These Also

    • Work independently and as part of a team, including cross-functional teams
    • Exhibit excellent time management skills, with the ability to prioritize and multi-task, and work under shifting deadlines in a fast-paced environment
    • Pay attention to details and be organized
    • Interface with a multitude of diverse personalities in a professional and consistent manner
    • Identify problems, recommend solutions and perform triage in a team environment
    • This position requires 24x7 availability for support and after hours work in order to support the availability and uptime requirements of the business
    • Must have legal right to work in the U.S.

    Our goal at SiriusXM is to provide and maintain a work environment that fosters mutual respect, professionalism and cooperation. SiriusXM is an equal opportunity employer that does not discriminate on the basis of actual or perceived race, creed, color, religion, national origin, ancestry, alienage or citizenship status, age, disability or handicap, sex, gender identity, marital status, familial status, veteran status, sexual orientation or any other characteristic protected by applicable federal, state or local laws.

    The requirements and duties described above may be modified or waived by the Company in its sole discretion without notice.


  • The Select Group

    Reliability Engineer

    2 weeks ago


    The Select Group Atlanta, United States

    SYSTEMS RELIABILITY ENGINEER · You can get further details about the nature of this opening, and what is expected from applicants, by reading the below. · The Select Group is currently hiring for a Systems Reliability Engineer to join as a resource for one of our clients within ...

  • The Select Group

    Reliability Engineer

    3 weeks ago


    The Select Group Atlanta, United States

    SYSTEMS RELIABILITY ENGINEER · The Select Group is currently hiring for a Systems Reliability Engineer to join as a resource for one of our clients within the Telecommunications Industry that will be hybrid sitting in Atlanta, GA. This Engineer will be responsible for assessing t ...


  • Austin Allen Inc Atlanta, United States

    Reliability Engineer – Manufacturing – South Carolina · Salary $80,000 - $110,000 + Bonus + Fantastic Benefits + Paid Relocation to South Carolina · Actively recruiting talented Reliability Engineers for this growing manufacturing company with locations nationwide · You'll be ...

  • JLL

    Reliability Engineer

    3 weeks ago


    JLL Atlanta, United States

    JLL supports the Whole You, personally and professionally. · Our people at JLL are shaping the future of real estate for a better world by combining world class services, advisory and technology to our clients. We are committed to hiring the best, most talented people in our ind ...

  • STONE Resource Group

    Reliability Engineer

    3 weeks ago


    STONE Resource Group Atlanta, United States

    ***NOTE: This role is unable to do C2C and our client is unable to sponsor at this time. This will also be a hybrid model in Atlanta, GA.*** · Overview · STONE Resource Group is partnered with a leading company in the HVAC Industry looking to expand their current team by add ...

  • U.S. Bank National Association

    Reliability Engineer

    2 weeks ago


    U.S. Bank National Association Atlanta, United States

    Elavon (Elavon is a part of the U.S. Bank family) seeks a full-time Reliability Engineer in Atlanta, GA. The Reliability Engineer supports production applications and proactively looks for ways to automate discoveries, eliminate incidents from recurring and/or reduce the time it ...


  • The Select Group Atlanta, United States

    SYSTEMS RELIABILITY ENGINEER · The Select Group is currently hiring for a Systems Reliability Engineer to join as a resource for one of our clients within the Telecommunications Industry that will be hybrid sitting in Atlanta, GA. This Engineer will be responsible for assessin ...

  • Jones Lang LaSalle IP, Inc.

    Reliability Engineer

    3 weeks ago


    Jones Lang LaSalle IP, Inc. Atlanta, United States

    JLL supports the Whole You, personally and professionally. · Our people at JLL are shaping the future of real estate for a better world by combining world class services, advisory and technology to our clients. We are committed to hiring the best, most talented people in our ind ...

  • Southern Company

    Reliability Engineer

    3 weeks ago


    Southern Company Atlanta, United States

    Job Description · Reliability Engineer - Asset Management · Reliability Engineer - Asset Management Locations: State of Georgia. Not required to report to a specific Operating Headquarters location. · Reliability Engineering - Asset Management will primarily be responsible for ...


  • Jones Lang LaSalle IP, Inc. Atlanta, United States

    JLL supports the Whole You, personally and professionally. · Our people at JLL are shaping the future of real estate for a better world by combining world class services, advisory and technology to our clients. We are committed to hiring the best, most talented people in our ind ...


  • SUCCESS KOREA Atlanta, United States

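    Job title · Reliability Engineer (manager/deputy general manager level) (applications closed) · Company overview · Chemical company · Duties/qualifications · Role · Chemical Engineering, Mechanical Engineering related (chemical engineering or mechanical engineering major, or a related mechanical field) · Experience in process control or maintenance-related work in the chemical industry · Someone able to manage and analyze a range of machinery, rather than perform simple maintenance work · Someone who has performed maintenance-team work ...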


  • Channel Personnel Services Atlanta, United States

    Job Description · Identify and manage asset reliability risks that could adversely affect plant or business operations. · Develops and maintains plant standards (piping, tankage, insulation, etc.) that influence the selection of materials, equipment, and spare par ...

  • Channel Personnel Services

    Reliability Engineer

    2 weeks ago


    Channel Personnel Services Atlanta, United States

    Job Description · Identify and manage asset reliability risks that could adversely affect plant or business operations. · Develops and maintains plant standards (piping, tankage, insulation, etc.) that influence the selection of materials, equipment, and spare par ...


  • PSC Biotech Atlanta, United States

    Job Description PSC Biotech provides the life sciences with essential services to ensure that health care products are developed, manufactured, and distributed to the highest standards, in compliance with all applicable regulatory requirements. Our goal is to skyrocket our client ...

  • STONE Resource Group

    Reliability Engineer

    3 weeks ago


    STONE Resource Group Sandy Springs, United States

    ***NOTE: This role is unable to do C2C and our client is unable to sponsor at this time. This will also be a hybrid model in Atlanta, GA.*** · Overview · STONE Resource Group is partnered with a leading company in the HVAC Industry looking to expand their current team by addin ...


  • Diverse Lynx Atlanta, United States

    9406496 · Site Reliability Engineering (SRE) · Duration (Months): 8 · TCS - Atlanta, GA · Competencies: Digital : Site Reliability Engineering (SRE), Test Automation · Experience (Years):4-6 · Essential Skills: Site Reliability Engineering (SRE) · Desirable Skills: Site Rel ...


  • QuEST Global Services Pte. Ltd Atlanta, United States

    Quest Global is an organization at the forefront of innovation and one of the world's fastest growing engineering services firms with deep domain knowledge and recognized expertise in the top OEMs across seven industries. We are a twenty-five-year-old company on a journey to beco ...


  • The Coca-Cola Company Atlanta, United States

    Location(s): · United States of America · City/Cities: · Atlanta · Travel Required: · 00% - 25% · Relocation Provided: · Job Posting End Date: · May 3, 2024 · Shift: · First Shift (United States of America) · Job Description Summary: · We are seeking an experienced l ...


  • Honeywell Atlanta, United States

    As a Site Reliability Engineer here at Honeywell, you will play a critical role in ensuring the reliability, availability, and performance of our systems and applications. You will work closely with cross-functional teams to identify and resolve issu Reliability Engineer, Liabili ...


  • Al Nahiya Group Atlanta, United States

    Job Description Knowledge of risk and reliability management systems Experience in failure mode and effect analysis, with a solid understanding of failure mechanisms Experience in troubleshooting, analyzing and resolving engineering problems independently Effective communication ...