Principal Data Warehouse Performance Engineer - Phoenix, United States - Cloudera


    Job Description:
    Cloudera is looking for an experienced Principal Engineer to play a key role in advancing Cloudera's data warehouse offerings across multiple cloud providers. Since its origin, Cloudera has enabled enterprise organizations to effectively manage and use their data using on-premise infrastructure. We are now building software solutions that enable our customers to leverage cloud infrastructure to meet their growing data needs, thereby accelerating Cloudera's next stage of growth.

    We are building a hybrid data platform with which customers can utilize their on-premise and cloud infrastructure, dynamically adjusting and scaling to their workload needs.

    The data platform, whether hosted by Cloudera or by the customer, supports the latest data management techniques and empowers consumers of data ranging from analysts to data scientists.

    In turn, building this data platform involves the latest technologies and software engineering paradigms.


    Responsibilities:

    • Work with internal development teams and the open source community to proactively drive performance improvements/optimizations across our data warehouse stack.
    • Work with product managers, developers and the field team to understand performance and scale requirements, and develop benchmarks based on these requirements.
    • Develop automation to execute benchmarks, collect and aggregate metrics and profiles, and report results, trends, and regressions.
    • Analyze performance and scalability characteristics to identify bottlenecks in large-scale distributed systems.
    • Perform root cause analysis of performance issues identified by internal testing and from customers and suggest corrective actions.
    • Evaluate performance of competitor systems and provide related guidance to the field team.

    Required background:

    • 8 years of industry experience in performance-related work, ideally on large-scale distributed systems.
    • Understanding of DBMS algorithms and data structure fundamentals.
    • Understanding of hardware trends and full-stack systems performance: CPU, RAM, storage, network, Linux kernel, JVM, distributed systems performance.
    • Understanding of performance analysis tools and techniques.
    • Strong design and coding skills (Java/C++/Golang/Python preferred).
    • Ability to work in a distributed setting with team members spread across multiple geographies.
    • Demonstrated ability to work on large cross-functional projects, including strong written communication skills and a collaborative mindset, as you will be working with many teams inside and outside of Cloudera.
    • B.S. or M.S. in Computer Science or equivalent experience.


    Pluses:

    • Experience with the Hadoop ecosystem (e.g. Hive, Impala, Kudu).
    • Hands-on experience with containerization, Kubernetes, public cloud infrastructure (AWS, Azure and/or GCP) and mesh networks.