Senior Member of Technical Staff - Santa Clara, United States - Oracle

    Description


    Are you interested in delivering large-scale, high-performance, fault-tolerant solutions? Oracle's Cloud Infrastructure team is building a next-generation Infrastructure-as-a-Service that supports the most demanding mission-critical customer requirements and operates at cloud scale to provide a secure, distributed, multi-tenant cloud environment.


    We're looking for hands-on engineers with a passion for solving difficult problems in distributed systems, virtualized infrastructure, and highly available services.

    Joining Oracle will give you the opportunity to design and build innovative new systems from the ground up and operate services at scale.

    Our engineers have significant technical and business impact while delivering critical enterprise-level features.

    Career Level - IC3


    As a Senior Member of Technical Staff, you will work as part of a highly collaborative team to build new features/tools while operating and growing the current service offering.

    You are an experienced cloud engineer with a proven track record of delivering high-scale, high-impact solutions.

    You understand distributed systems and are able to architect broad systems interactions while being very hands-on, able to dive deep into any part of the stack and lower-level system interactions.

    You are obsessed with the customer, always exceeding expectations. You have excellent communication skills. You can clearly explain complex technical concepts. You value simplicity and scale, work comfortably in a collaborative, agile environment, and are excited to learn.


    This is a leadership role where the candidate will participate in design activities, work with senior architects and product management on service definition, and establish operational best practices across the organization.

    We expect this role to have an impact in new AI/ML service offerings as well as in enhancements to existing storage services.

    Qualifications

    4+ years of experience developing commercial software in a distributed environment.

    BS or MS degree or equivalent experience relevant to functional area.

    Hands-on experience developing services on a public cloud platform (OCI, AWS, Azure, GCP).

    Deep experience with REST APIs and network/routing protocols in multi-AD/AZ and regional data centers.

    Strong knowledge of Java, Python, Go and/or C++.

    Hands-on experience building and operating large scale distributed systems.

    Administration experience in Enterprise Linux (Oracle Linux, Red Hat, or Fedora platforms).


    Working knowledge of automated deployment and configuration management tools such as Chef, Salt, Ansible and/or Puppet, as well as infrastructure-as-code tools such as Terraform.

    Experience building continuous integration/deployment pipelines with robust testing and deployment schedules.

    Proficient with Docker, Kubernetes, GitHub/Bitbucket and monitoring frameworks such as Prometheus / Grafana and Nagios.

    Experience with distributed parallel filesystems such as Lustre, GPFS (IBM Storage Scale) or parallel NFS is a strong plus.

    Familiarity with AI/ML frameworks (TensorFlow/Keras, PyTorch, Scikit-Learn, XGBoost, Caffe) and MLOps is a strong plus.

    Strong knowledge of data structures, algorithms, operating systems, databases, storage and persistence technologies.

    Strong troubleshooting, debugging and performance tuning skills.

    Understanding of key performance indicators and how to dig into them.

    Good understanding of Agile software development principles including using common tools such as JIRA.

    Ability to work independently and engage individuals and teams located across multiple geographies and/or cultures.

    Strong written and verbal communication and presentation skills.