- Define and implement a strategic reliability vision for the trading portfolio, covering infrastructure, network connectivity, application performance, and throughput
- Lead and oversee a team of SRE engineers, providing technical direction, mentorship, and performance guidance
- Own and evolve the SLA/SLO/SLI framework, including error budgets and service health reporting
- Configure and optimize comprehensive monitoring and alerting systems across infrastructure and applications
- Drive observability best practices using APM and monitoring platforms (e.g., Dynatrace)
- Analyze application and infrastructure performance to isolate fault domains and determine root causes of critical incidents
- Lead major incident management, coordinate resolution efforts, and conduct blameless postmortems
- Participate in the 24x7x365 support rotation and ensure operational excellence across the team
- Identify automation opportunities to improve reliability, scalability, and operational efficiency
- 8+ years of experience in Site Reliability Engineering, DevOps, or Production Engineering
- Proven leadership experience (technical lead or team lead), with ability to oversee and mentor engineers
- Strong hands-on experience with SLA/SLO/SLI definition, governance, and reporting
- Solid experience working in Microsoft Azure environments (IaaS, PaaS, networking, monitoring)
- Hands-on experience with Dynatrace (configuration, alerting, dashboards, performance analysis)
- Experience with observability, monitoring, and APM tools in production environments
- Ability to operate effectively under pressure in time-sensitive, high-impact environments
- Medical, Dental and Vision Insurance (Subsidized)
- Health Savings Account
- Flexible Spending Accounts (Healthcare, Dependent Care, Commuter)
- Short-Term and Long-Term Disability (Company Provided)
- Life and AD&D Insurance (Company Provided)
- Employee Assistance Program
- Unlimited access to LinkedIn learning solutions
- Matched 401(k) Retirement Savings Plan
- Paid Time Off – the employee will be eligible to accrue 15-25 paid days, depending on specific level and tenure with EPAM (accrual eligibility may change over time)
- Paid Holidays – nine (9) total per year
- Legal Plan and Identity Theft Protection
- Accident Insurance
- Employee Discounts
- Pet Insurance
- Employee Stock Purchase Program
- If otherwise eligible, participation in the discretionary annual bonus program
- If otherwise eligible and hired into a qualifying level, participation in the discretionary Long-Term Incentive (LTI) Program
Lead Site Reliability Engineer - New York - EPAM Systems
Description
Join our team as a Lead Site Reliability Engineer to drive system reliability, observability, and performance monitoring for mission-critical digital trading products.
You will lead monitoring initiatives in a high-availability trading environment, ensuring stable connectivity to external partners while proactively identifying opportunities for continuous improvement. At EPAM, you'll work on cutting-edge technologies, solve complex challenges, and shape the future of digital innovation. With access to continuous learning, mentorship, and global projects, your expertise will drive meaningful change.
Responsibilities
Requirements
We offer
For remote work in New York City only.
EPAM is a leading global provider of digital platform engineering and development services. We are committed to having a positive impact on our clients, our employees, and our communities. We embrace a dynamic and inclusive culture. Here you will collaborate with multi-national teams, contribute to a myriad of innovative projects that deliver the most creative and cutting-edge solutions, and have an opportunity to continuously learn and grow. No matter where you are located, you will join a dedicated, creative, and diverse community that will help you discover your fullest potential.
Engineer the Future with a Career at EPAM
This posting includes a good faith range of the salary EPAM would reasonably expect to pay the selected candidate. The range provided reflects base salary only. Individual compensation offers within the range are based on a variety of factors, including, but not limited to: geographic location, experience, credentials, education, training; the demand for the role; and overall business and labor market considerations. Most candidates are hired at a salary within the range disclosed. Salary range: $140,000 - $155,000. In addition, the details highlighted in this job posting above are a general description of all other expected benefits and compensation for the position.
Applications will be accepted on a rolling basis.
It is unlawful in Massachusetts to require or administer a lie detector test as a condition of employment or continued employment. An employer who violates this law shall be subject to criminal penalties and civil liability.
EPAM will not provide new H-1B visa sponsorship for this position. Candidates with existing transferable H-1B status may be considered.
EPAM Systems, Inc. is an equal opportunity employer. We recognize the value of diversity and inclusion in creating success for our customers, business partners, shareholders, employees and communities. We are committed to recruiting, hiring, developing and promoting employees without discrimination. As a global employer, this commitment includes complying with all laws in the countries in which we operate. Nevertheless, we believe equal employment practices should not be limited to what the law requires. Equal opportunity and inclusion are essential to motivate, empower and recognize the best in everyone.
At EPAM, employment actions are based on individual qualifications, without regard to race, color, religion, creed, gender, pregnancy status, sexual orientation, gender identity, gender expression, marital or familial status, national origin, ancestry, genetics, age, disability status, veteran status, citizenship status when otherwise legally able to work, or any other characteristic protected by law.