- Ensure High Availability: Implement and maintain resilient cloud architectures, monitor system performance, and proactively identify and resolve potential bottlenecks or points of failure.
- Incident Management: Play an active role in production on-call, responding swiftly to troubleshoot and resolve production issues. Minimize service disruptions and downtime by conducting thorough triaging and debugging of product or system issues. Continuously optimize the on-call process for sustainability and efficiency.
- Automation and Tooling: Develop and maintain automation scripts, tools, and processes to streamline system deployment, monitoring, and management tasks. Your contributions will be vital in efficiently scaling cloud operations.
- Performance Optimization: Optimize cloud infrastructure and applications for performance, scalability, and cost-effectiveness.
- Security and Compliance: Collaborate with security engineers to implement best practices and ensure compliance with security standards and policies.
- Monitoring and Alerting: Design and configure advanced monitoring systems to gain insights into system behavior, set up alerts, and respond proactively to potential issues. Create and maintain comprehensive dashboards and playbooks for production on-call.
- Software Development Consultation: Engage actively in the entire software development lifecycle. Participate in system design reviews and provide valuable Site Reliability Engineer (SRE) insights during launch reviews, influencing and enhancing system architecture.
- Bachelor's degree in Computer Science, a related field, or equivalent practical experience.
- 3+ years of professional experience maintaining production systems on cloud-based services and infrastructure.
- 8+ years of software development experience in one or more programming languages, with a primary focus on building on and working with cloud-based services and infrastructure.
- Strong knowledge of cloud platforms such as Google Cloud Platform, AWS, or Azure. We prefer AWS experience but will also consider GCP or Azure.
- Practical experience with containerization technologies, including Docker and Kubernetes.
- Familiarity with Python, Bash scripting, and Ansible.
- Familiarity with infrastructure as code tools like Terraform is essential.
- Solid understanding of databases, networking, security principles, and best practices.
- Proficiency in using monitoring and alerting tools to detect and respond to potential issues effectively.
- AWS certifications (such as Solutions Architect or Security).
- Experience in a regulated industry or the healthcare field.
Staff Site Reliability Engineer - Raleigh, United States - GRAIL, Inc.
Description
GRAIL is seeking a Staff Software Engineer for our Site Reliability Engineering (SRE) team to help us improve the security and reliability of production systems that are critical to our mission to detect cancer early and save lives. You will contribute to the architecture, design, development, and implementation of critical cloud-based infrastructure, services, and applications, and you will be responsible for their secure, healthy, and reliable operation. You are someone who enjoys learning and implementing the best industry technology trends and practices. You foster and contribute to a creative and collaborative culture that delivers results. You embrace ambiguity and enjoy exploring new technologies to deliver robust, scalable solutions.
This is a hybrid role and requires you to be onsite 2 days a week in Menlo Park, CA.
Responsibilities