-
DBRE - Database Reliability Engineer
5 days ago
Carreiras SoftExpert New York, United States · We are a market reference as a leading company in software solutions for integrated management, with more than 40 components dedicated to administration. Our goal is to bring companies compliance, innovation, and digital transformation of processes, ensuring ...
-
Site Reliability Engineer
3 weeks ago
Salling Group A/S New York, United States · We are a branch of the largest Danish retail company. We operate as a business services center, handling the processes that take place in our European companies, such as Bilka, Fotex, Salling, and the Netto discount chain, which is well known in Poland. · ...
-
Reliability Engineer
2 weeks ago
Executive Alliance Pine Brook, United States · Salary : $ $130000 · Essential Functions · Work with IPT to identify reliability critical items and any non-conformances and offer design alternatives. · Generate reliability, maintainability and system safety reports for IPT and customer review. · Perform system safety analy ...
-
Senior Reliability and Safety Engineer
2 weeks ago
Hepco Livingston, United States · Overview: · Our client is seeking to hire a Senior Reliability and Safety Engineer with experience in the aerospace industry to join their team. The successful candidate will be responsible for safety assessment, reliability predictions, and failure mode analysis required for new ...
-
Senior Reliability and Safety Engineer
3 weeks ago
Hepco Livingston, United States · Overview: · Our client is seeking to hire a Senior Reliability and Safety Engineer with experience in the aerospace industry to join their team. The successful candidate will be responsible for safety assessment, reliability predictions, and failure mode analysis required for ne ...
-
Reliability Engineer
17 hours ago
Enser Boonton, United States · Job Description · Reliability Engineer · In this position, you will work in a dynamic people-focused environment where you will work directly with experienced engineers and scientists to assist with reliability analysis. You will also be responsible fo ...
-
Reliability Engineer
3 weeks ago
Enser Corporation Boonton, United States · Job Description · Reliability Engineer · In this position, you will work in a dynamic people-focused environment where you will work directly with experienced engineers and scientists to assist with reliability analysis. You will also be responsible for attendin ...
-
Senior Reliability and Safety Engineer
2 weeks ago
Hepco East Hanover, United States · Company: HEPCO · Overview: · Design and develop products for aerospace and military weapon systems · Responsibilities include system safety assessment, reliability predictions and failure mode analysis for new development programs · Work with IPT to identify safety critical it ...
-
CoreWeave Roseland, United States · CoreWeave is a specialized cloud provider, delivering a massive scale of GPU compute resources on top of the industry's fastest and most flexible infrastructure. CoreWeave builds cloud solutions for compute intensive use cases - VFX and rendering, machine learning and AI, batch p ...
-
Reliability Engineer
2 weeks ago
Sodexo Rahway, United States · Permanent · Unit Description: Sodexo is currently seeking a Reliability Engineer to provide support for our Life Science Client located in Rahway, New Jersey. The Reliability Engineer will play a crucial role in managing and maintaining a substantial backlog of initiatives and strategic chan ...
-
Engineer Reliability
3 weeks ago
JetBlue Airways Queens, United States · Position Summary · The Engineer Reliability reports to the Manager Reliability and is responsible for aircraft, power plant and component trend/performance analysis in support of the Reliability portion of the Continuing Analysis Surveillance System (CASS). · Essential Responsib ...
-
Site Reliability Engineer, Observability
3 weeks ago
CoreWeave Roseland, United States · Job Description · CoreWeave is a specialized cloud provider, delivering a massive scale of GPU compute resources on top of the industry's fastest and most flexible infrastructure. CoreWeave builds cloud solutions for compute intensive use cases - VFX and rende ...
-
Reliability and Safety Engineer
2 weeks ago
Marotta Controls Parsippany, United States · Come grow with Marotta · One of NJ's fastest-growing technology companies, named a New Jersey Top Workplace for 2022 & 2023, and a "Made in New Jersey" Manufacturer of the Year Award Winner. · You will have room to grow and be a part of an exciting team, all within a warm and w ...
-
Reliability Engineer
3 weeks ago
MSD Malaysia Rahway, United States · Locations: NLD - North Brabant - Oss (Vollenhovermeer) · Time type: Full time · Posted on: 5 days ago · Job requisition ID: R293237 · Job Description · Welcome to our team · A competitive salary. · Good bonus plan · Search Firm Representatives Please Read Careful ...
-
Reliability Engineer
1 day ago
Jones Lange Lasalle, Inc. New York, United States · JLL supports the Whole You, personally and professionally. Our people at JLL are shaping the future of real estate for a better world by combining world class services, advisory and technology to our clients. We are committed to hiring the best, most Reliability Engineer, Liabili ...
-
Sr. Engineer Reliability
5 days ago
Dynamics ATS Montville, United States · Sr. Engineer Reliability & Safety · Location: Parsippany, NJ · Salary: $100,000-$130,000 · Type: Direct · Enser is an Engineering Services Company that provides Staffing Support. This position is not internal to Enser. Please No Agencies. · Overview: · Marotta Controls ...
-
Reliability Engineer
3 weeks ago
Mini-Circuits Brooklyn, United States · Mini-Circuits designs, manufactures and distributes integrated circuits, modules, and sub-systems for high-performance radio frequency (RF) and microwave applications. With design, sales and manufacturing locations in over 30 countries, Mini-Circuits' products are used in a ra ...
-
Sr. Engineer Reliability
2 weeks ago
Dynamics ATS Montville, United States · Sr. Engineer Reliability & Safety · Location: Parsippany, NJ · Salary: $100,000-$130,000 · Type: Direct · Enser is an Engineering Services Company that provides Staffing Support. This position is not internal to Enser. Please No Agencies. · Overview: · Marotta Con ...
-
Facility & Reliability Engineer
3 weeks ago
Innova Solutions Summit, United States · Innova Solutions is immediately hiring for a Facility & Reliability Engineer · Position type: Full-time Contract · Duration: 12 Months · Location: Summit, NJ (Onsite) · As a Facility & Reliability Engineer, you will: · The purpose of the Facility and Reliability Engineer is ...
-
Reliability Engineer
3 weeks ago
Madison Approach Yonkers, United States · Job Description · Our client, a manufacturing company in Yonkers, NY, is seeking a Reliability Engineer for their Quality Assurance team. BS in Engineering and a minimum of 10 years' experience providing engineering support to manufacturing in the transportation in ...
Site Reliability Engineer - Roseland, United States - CoreWeave
Description
CoreWeave is a specialized cloud provider, delivering a massive scale of GPU compute resources on top of the industry's fastest and most flexible infrastructure.
CoreWeave builds cloud solutions for compute intensive use cases — VFX and rendering, machine learning and AI, batch processing, and Pixel Streaming — that are up to 35 times faster and 80% less expensive than the large, generalized public clouds.
Learn more at
About the role:
The Cloud Operations Team is the heart of CoreWeave's operational practice. In this role, you'll help define and shape how Site Reliability Engineering (SRE) is implemented at CoreWeave.
The Cloud Operations team defines and implements tooling and processes that enable operational best practices and continual improvement across all engineering teams.
As an 'SRE of SREs,' you'll define and implement system and workflow automation so that service owners can rapidly identify and mitigate availability and performance regressions.
Collaborating across engineering, you'll support service-owning SREs with the 'picks and shovels' they need to excel at running their services.
You will work with a team of 8-10 mixed-specialization engineers and have the opportunity to work on the full gamut of rewarding challenges that come with building the AI Cloud in a communicative, supportive, and high-performing environment.
With a customer-first mindset, establish reliability and quality assessment patterns for all CoreWeave systems.
Improve the performance, security, reliability, and scalability of internal and externally facing services.
Develop dashboards, alerts, automated remediation, and insights into the customer experience using observability tools.
Create and maintain Kubernetes operators, custom controllers, and other tools to intelligently scale our operational capability.
Establish and integrate incident and change management tools and workflows.
Act as Incident Commander for priority incidents and lead postmortems.
Participate in the on-call rotation as needed while we establish and operationalize this new team.
Enable and evangelize reliability engineering across CoreWeave's engineering teams.
Grow, change, invest in your teammates, be invested-in, share your ideas, listen to others, be curious, have fun, and, above all, be yourself.
Wondering if you're a good fit?
We believe in investing in our people, and value candidates who can bring their own diversified experiences to our teams – even if you aren't a 100% skill or experience match.
You have experience operating services in production and are interested in driving engineering practices such as reliability at scale, testing (load, recovery, system, etc.), progressive deployments, error budgets, observability, and fault-tolerant design.
You've done some Linux shell scripting and/or can navigate a *nix-based operating system (with the right cheat sheet, if required).
You are familiar with debugging and administration of Linux and Kubernetes environments.
You're comfortable with the idea of codifying practices into Kubernetes controllers, operators, and other applications using a modern programming language.
You have experience with incident management for your team or an organization.
You're comfortable in open source environments.
You're excited to join a team with diverse perspectives and backgrounds that believe in tackling challenges, growing hand in hand, and winning together.
Why CoreWeave?
At CoreWeave, we work hard, have fun, and move fast. We're in an exciting stage of hyper-growth that you will not want to miss out on.
Our team cares deeply about how we build our product and how we work together, which is represented through our core values:
Be Curious at your Core
Act like an Owner
Empower Employees
Deliver Best In-Class Client Experience
Achieve More Together
We support and encourage an entrepreneurial outlook and independent thinking. We foster an environment that encourages collaboration and provides the opportunity to develop innovative solutions to complex problems. As we get set for takeoff, the growth opportunities within the organization are constantly expanding.
You will be surrounded by some of the best talent in the industry, who will want to learn from you, too.
Come join us!
Benefits
We offer a competitive salary and benefits, including:
Medical, dental and vision insurance - 100% paid for the employee
Life Insurance
Short and long-term disability insurance
Flexible Spending Account
Flexible, full-service childcare support with Kinside
401(k) with a generous employer match
Flexible PTO
Catered lunch each day in our offices
Weekly massages in NJ office
A casual work environment
Work culture focused on innovative disruption
California Consumer Privacy Act - California applicants only
CoreWeave is an equal opportunity employer, committed to diversity and inclusiveness.
We will consider all qualified applicants without regard to race, color, nationality, gender, gender identity or expression, sexual orientation, religion, disability or age.