- Design and manage Java-based microservices, Bash scripts, Redis, and high-availability designs while strictly adhering to Site Reliability Engineering (SRE) principles.
- Thrive in high-pressure environments, working swiftly and reliably to maintain system integrity and meet service level objectives (SLOs), as measured by service level indicators (SLIs).
- Proactively identify and address potential issues before they impact operations, using observability tools such as New Relic and Scalyr/Splunk together with Bash and Python scripts.
- Lead initiatives to enhance current systems and implement innovative solutions in collaboration with a fast-paced, mission-driven team, focusing on the implementation of SRE best practices.
- Conduct thorough root-cause analyses for production incidents and generate high-quality RCA reports, leveraging SRE methodologies to prevent recurrence.
- Apply software engineering principles to rectify operational challenges and optimize system performance, with a specific focus on implementing SRE-driven solutions.
- Ensure the availability, latency, performance, efficiency, and security of our infrastructure, adhering rigorously to SRE principles and best practices.
- Design and maintain robust production monitoring systems to ensure timely detection and resolution of issues, following SRE guidelines for effective monitoring and alerting.
- Utilize a diverse array of tools to troubleshoot performance and stability issues effectively, employing SRE methodologies to identify and mitigate bottlenecks.
- Evaluate and enhance application and environment security measures, integrating SRE-driven security practices into the development and deployment pipelines.
- Provide support for globally distributed, multi-cloud (public and/or private) environments, implementing SRE strategies for resilience and fault tolerance.
- Automate repetitive tasks at scale to streamline operational workflows and enhance efficiency, focusing on the implementation of SRE-driven automation solutions.
- Adhere to change management processes during implementations and utilize version control for application infrastructure, following SRE principles for reliable and auditable change management.
- Foster an SRE mindset throughout the organization, promoting collaboration and shared responsibility for reliability and performance.
- Bachelor's Degree in Computer Science or related field, or foreign equivalent.
- Demonstrated curiosity and self-drive to tackle complex challenges and drive change in a diverse organizational landscape.
- Excellent written and verbal communication skills, with the ability to effectively communicate with engineering management, developers, and leadership.
- Proven ability to adapt to new technologies and learn quickly.
- Minimum of 5 years of experience in Site Reliability Engineering (SRE) or related roles.
- Collaborate within a diverse and global team environment.
- Participate in cross-training with other team members across different regions.
- Rotate in an on-call schedule as required to ensure 24/7 availability and support for critical systems.
Lead Site Reliability Engineer - Stow, United States - BJ's Wholesale Club
Description
Join our team of more than 34,000 team members, supporting our members and communities in our Club Support Center, 235+ clubs and eight distribution centers. BJ's Wholesale Club offers a collaborative and inclusive environment where all team members can learn, grow and be their authentic selves. Together, we're committed to providing outstanding service and convenience to our members, helping them save on the products and services they need for their families and homes.
The Benefits of working at BJ's
- BJ's pays weekly
- Generous time off programs to support busy lifestyles*
  - Vacation, Personal, Holiday, Sick, Bereavement Leave, Jury Duty
- Benefit plans for your changing needs*
  - Three medical plans**, Health Reimbursement Account (HRA), Health Savings Account (HSA), two dental plans, flexible spending
*eligibility requirements vary by position
**medical plans vary by location
As a Lead Site Reliability Engineer, you will be responsible for designing, building, monitoring, and continuously improving our ecommerce platform's infrastructure and processes. Leveraging your expertise in observability tools such as New Relic and Scalyr/Splunk, along with Bash and Python scripting, you will play a pivotal role in ensuring the reliability and performance of our Java microservices-based architecture.
Key Responsibilities:
Qualifications:
Job Conditions: