
Avinash Reddy Sankati

Lead Data Engineer

Technology / Internet

Cincinnati, City of Cincinnati, Hamilton

About Avinash Reddy Sankati:

More than 10 years of experience as a Lead Data Engineer, Big Data Engineer, and Python Developer, with a strong focus on designing, developing, and implementing data models for enterprise-level applications and systems.

Extensive experience in the Big Data ecosystem, with a particular focus on data acquisition, ingestion, modeling, analysis, integration, and processing.

Experience in building data pipelines using Azure Data Factory and Azure Databricks; loading data into Azure Data Lake, Azure SQL Database, and Azure SQL Data Warehouse; and controlling and granting database access.

Experience in developing data integration solutions on the Microsoft Azure cloud platform using services such as Azure Synapse Analytics, Azure Blob Storage, and Azure Data Lake Storage, along with DataStage and Informatica PowerCenter.

Familiarity with serverless computing using Azure Functions and event-driven architectures using Azure Event Grid, Azure Event Hub, and Azure Service Bus for building scalable and event-driven data processing workflows.

Used Azure data platform capabilities such as Azure Data Lake, Azure Data Factory, HDInsight, Azure SQL Server, Azure Machine Learning, and Power BI to build large Lambda-architecture systems.

Experience in creating pipelines in ADF using Linked Services, Datasets, and Pipelines to extract, transform, and load data from sources such as Azure SQL, Blob Storage, and Azure SQL Data Warehouse.

Strong understanding of ETL (Extract, Transform, Load) processes and best practices using AWS Glue, AWS Data Pipeline, and other AWS data integration services.

Good knowledge of Cloudera distributions and of Amazon Simple Storage Service (Amazon S3), Redshift, Lambda, Amazon EC2, and Amazon EMR.

Created a custom logging framework for ETL pipeline logging using Append Variable activities in Data Factory, enabling monitoring through Azure Log Analytics to alert the support team on usage and statistics of the daily runs.

Implemented installation and configuration of a multi-node cluster in the cloud using Amazon Web Services (AWS) EC2.

Implemented a logging framework using the ELK stack (Elasticsearch, Logstash, and Kibana) on AWS.

Capable of using AWS utilities such as EMR, S3, and CloudWatch to run and monitor Hadoop and Spark jobs on AWS.

Designed and developed Spark workflows in Scala to pull data from AWS S3 buckets and Snowflake and apply transformations to it.

Good knowledge of the architecture and components of Spark; efficient in working with Spark Core, Spark SQL, and Spark Streaming, with expertise in building PySpark and Spark applications for interactive analysis, batch processing, and stream processing.

Expertise in using Kafka for log aggregation solutions with low-latency processing, distributed data consumption, and widely used Enterprise Integration Patterns (EIPs).

Experience in designing, building, and configuring a virtual data center in the Google Cloud Platform (GCP) to support an enterprise data warehouse, including the creation of a Virtual Private Cloud (VPC), public and private subnets, security groups, route tables, and Google Cloud Load Balancing.

Hands-on experience with GCP: BigQuery, GCS buckets, Cloud Functions, Cloud Dataflow, Pub/Sub, Cloud Shell, gsutil, the bq command-line utility, Dataproc, and Stackdriver.

In-depth experience and good knowledge in using Hadoop ecosystem tools such as HDFS, MapReduce, YARN, Spark, Kafka, Storm, Hive, Impala, Sqoop, HBase, Flume, Oozie, NiFi, and ZooKeeper.

Strong knowledge of databases such as MySQL, MS SQL Server, PostgreSQL, Oracle, and NoSQL databases such as MongoDB, Cassandra, and AWS DynamoDB.

Good experience in applying data modeling techniques and deriving results through SQL and PL/SQL queries.

Experience with containerization and orchestration tools such as Docker, Kubernetes, and Helm, and with infrastructure-as-code tools such as Terraform and Ansible.

Experienced working with JIRA for project management, Git for source code management, Jenkins for continuous integration, and Crucible for code reviews.

Domain knowledge in the Mortgage, Consumer Lending, Healthcare, and ERP industries.


Education

Master's in Artificial Intelligence
