Big Data Developer - Houston, United States - Diverse Lynx

    Description
    Title: Big Data Developer
    Location: Houston, TX (Onsite)
    Type: Contract (C2C or W2)

    Job Description:

    As a Big Data Developer within our technology team, you'll be instrumental in building, maintaining, and optimizing big data platforms and solutions. Leveraging the Cloudera suite, you will craft robust, scalable data processing pipelines and contribute to the development of our data-centric applications using cutting-edge technologies.

    Key Responsibilities:
    • Design and develop big data applications using Cloudera suite components such as Kafka, HDFS, HBase, Kudu, ZooKeeper, Hive, and Impala.
    • Write robust, efficient, and maintainable code in Java/Scala/Python, with a strong emphasis on Spark and Flink for real-time data processing.
    • Develop and maintain data pipelines using Apache NiFi, ensuring seamless data collection, ingestion, and distribution.
    • Use the Spring Boot and Flask frameworks to create microservices that interact with Big Data systems.
    • Manage data workflows with Apache Oozie and orchestrate complex data processes with Apache Airflow.
    • Implement solutions for data security and governance using Atlas, Ranger, Ranger KMS, and KTS within the Cloudera ecosystem.
    • Work on cloud-native Big Data technologies, integrating solutions with cloud services for enhanced scalability and performance.
    • Script and automate routine tasks within Linux environments to enhance development and deployment processes.
    • Optimize data retrieval with advanced HQL queries and tune performance for Hive and Impala databases.
    • Employ Kubernetes container orchestration for deployment, scaling, and operations of application containers across clusters of hosts.
    • Ensure the development of high-quality applications by writing test cases and maintaining a continuous integration and deployment pipeline.
    • Contribute to architectural decisions and create documentation outlining design and technical specifications.
    • Proactively troubleshoot and resolve issues in the production environment.
    • Engage with cross-functional teams to translate business requirements into technical implementations.
    • Stay current with industry trends and evaluate new technologies for adoption into existing or new data infrastructure components.
    • Run data engineering jobs on GPUs using Spark, and tune jobs to utilize distributed GPUs.
    • Design and implement security measures in the application development process, leveraging Kerberos authentication to secure communication within the cluster. Work closely with data administrators to ensure that all applications comply with established security protocols and access control measures. Develop scripts and automation tools to streamline the security aspects of the big data applications lifecycle.
    • Implement and maintain encryption standards for securing data in transit using TLS protocols to prevent eavesdropping, tampering, and forgery. Architect solutions that encrypt data at rest to safeguard sensitive information using industry-standard encryption algorithms, and manage encryption keys with a focus on maintaining performance while enforcing data security. Regularly review and audit the codebase for compliance with data protection regulations and organizational policies.
    Technical Qualifications:
    • Advanced programming skills in Java/Scala/Python with a focus on Spark and Flink for large-scale data processing.
    • Strong understanding of the Cloudera suite, including in-depth knowledge of data management and processing services.
    • Proficient in building applications with Spring Boot and Flask, with experience in creating RESTful services.
    • Experience with scripting and automation in a Linux environment, along with expertise in shell scripting.
    • In-depth knowledge of SQL, HQL, and the ability to perform query optimization on big data sets.
    • Proficient in Kubernetes, including deploying applications in OpenShift or other Kubernetes environments.
    • Familiarity with Neo4j or similar graph database technologies, and the ability to integrate them into big data solutions.
    • Experience with Cloudera Data Services such as Cloudera Data Engineering, Cloudera Data Warehouse, and Cloudera Machine Learning is highly desirable.
    • Experience with RAPIDS and GPU-aware scheduling.
    • Knowledge of data modeling, data access, and data storage techniques for big data environments.
    • Proficient in integrating Kerberos authentication within big data applications for secure access and communication with big data services. Capable of scripting and automating Kerberos ticket lifecycle management and renewals, and of troubleshooting common Kerberos issues in a development environment.
    • Skilled in implementing and managing security protocols for data in transit, including setting up and configuring TLS for secure data transfer within big data solutions. Knowledgeable in encryption standards and tools for encrypting data at rest, with an understanding of key management systems and cryptographic practices to ensure data privacy and regulatory compliance.
    Education:
    • Bachelor's degree in Computer Science, Information Technology, or a related field.
    Certifications:
    • Cloudera Certified Professional (CCP) or any relevant Big Data certification is preferred.
    Soft Skills:
    • Strong analytical and problem-solving skills.
    • Excellent verbal and written communication abilities.
    • Collaborative team player with an ability to work in dynamic environments.
    • Self-motivated with a keen interest in technology and continuous learning.
    Diverse Lynx LLC is an Equal Employment Opportunity employer. All qualified applicants will receive due consideration for employment without any discrimination. All applicants will be evaluated solely on the basis of their ability, competence and their proven capability to perform the functions outlined in the corresponding role. We promote and support a diverse workforce across all levels in the company.