No more applications are being accepted for this job
Site Reliability Engineer - Atlanta, United States - AppLab Systems Inc
Description
Please do not submit without hands-on Prometheus experience. Solid experience is required.
Location: Atlanta / Dallas / Irvine / NJ (Day 1 onsite)
Job Description:
Position
Dynamic engineer with an understanding of application performance management and experience building monitoring and alerting solutions.
Troubleshoot incidents, identify root causes, fix and document problems, and deploy preventative solutions.
Required Experience
5+ years of recent experience working on building automation and monitoring for observability (Prometheus/Grafana/ELK).
5+ years of experience working on support projects, including rotational on-call to address failures.
5+ years of recent experience with Kubernetes, Docker, Helm, and end-to-end support of applications in this environment.
5+ years of recent experience working in AWS and/or GCP.
3+ years of full-stack Python development.
Strong communication skills, with the ability to communicate effectively with team members and management.
Skills Preferred:
MLOps experience
MLE experience