
    HPC (AWS) Cluster Engineer - Cupertino, United States - TECHFUJI LLC

    TECHFUJI LLC Cupertino, United States

    1 week ago

    Job Description

    We are looking for a Senior Systems Developer with expertise in AWS, HPC job schedulers (PBS), Python, DevOps, Linux administration, FlexLM, and managing SQL and NoSQL databases on AWS.

    Job Responsibilities

    Designing and implementing the next generation of Amazon services' underlying architecture, inventing new techniques, solving complex scaling challenges, and launching new service features.

    Providing system support for the manageability, operability, and performance of the software platforms, and creating simple processes that help operate and build our system infrastructure.

    Adapting and improving operations management systems and processes to accommodate rapid and increasing growth in systems and traffic.

    Optimizing the performance of our systems by analyzing and deploying new hardware configurations.

    Applying networking and systems skills to build, optimize, and extend rapidly growing Amazon software services.

    Developing custom components to augment or enhance the current systems, or designing and building new applications and components from the ground up, to better align with the goals and vision of the team and its internal customers.

    Some of the key activities include:

    Maintaining HPC environments on AWS

    Maintaining Linux/Windows Remote desktop sessions

    Handling user onboarding, file permissions, and user groups

    Maintaining HPC job scheduler (PBS)

    Maintaining and managing multiple software tools (installing, configuring, updating/upgrading, patching, and fixing security vulnerabilities), such as but not limited to Atlassian tools

    Managing the software licensing service and handling customer and vendor communication for licensing needs

    Triaging issues (defects, risks, vulnerabilities, user requests) and resolving tickets in a timely manner; tickets range from individual user requests to security and compliance issues raised for entities such as operating systems, AWS infrastructure, and software updates

    Basic qualifications

    2+ years of experience programming with at least one modern language such as Python, PowerShell, or C++

    3+ years of experience in automation (building, testing, releasing or monitoring) and building tools for the same.

    5+ years of Linux administration experience, along with networking, storage systems, and hands-on systems engineering experience

    4+ years of experience with AWS cloud services such as, but not limited to, EC2, EFS, EBS, S3, Route53, VPC, VPC Peering, and CloudFormation

    1+ years of experience in HPC systems and job schedulers (such as Altair PBS, Slurm, IBM LSF)

    1+ years of experience managing network license servers (FlexLM-compliant network licenses)

    2+ years of experience managing back-end databases (both SQL and NoSQL) on AWS

    Company Description

    TechFuji was created to help companies navigate the world of technology. Every business needs it, and there are a lot of choices. But not every tech consultancy is the same. We're different, and we want to share why. Explore why you'll appreciate what we offer and how we offer it.