Hadoop Site Reliability Engineer - Bellevue
2 days ago

Job description
We are seeking a Hadoop Site Reliability Engineer with hands-on experience in Big Data ecosystems and cloud-based Hadoop platforms. The ideal candidate will provide administration, monitoring, and performance tuning for Hadoop, Spark, Hive, HBase, and Kafka workloads while supporting enterprise customers in production environments.
This role requires expertise in Hadoop administration, cloud platforms (AWS, Cloudera, HDInsight), troubleshooting production issues, and building tools to improve system reliability and supportability.
Required Technical / Functional Skills
Expertise in Big Data and Hadoop Ecosystems (Hortonworks, Cloudera)
Hands-on experience with Kafka for streaming data
Experience with HBase administration
Knowledge of Spark, Hive, and Hadoop performance tuning
Familiarity with cloud platforms such as AWS and Azure HDInsight
Strong problem-solving and root-cause analysis skills
Ability to monitor and maintain cluster health for enterprise workloads
Experience building tools to improve system supportability and debuggability
Roles & Responsibilities
Perform Hadoop administration for enterprise workloads (Cloudera, Hortonworks, HDInsight, AWS)
Support the HDInsight product team and resolve end-customer issues via ICMs
Run prototypes for migrating Big Data workloads to new platforms
Troubleshoot production issues, identify root causes, and provide timely mitigation
Tune Spark, Hive, and Hadoop jobs for optimal performance
Build tools and services to improve system debuggability and supportability
Monitor cluster health, ensure reliability, and provide proactive support for top customers
Technologies / Tools
Hadoop Ecosystem (HDFS, YARN, MapReduce)
Spark, Hive
HBase
Kafka
Cloud Platforms: AWS, Cloudera, HDInsight
Performance Tuning & Monitoring Tools