- Analyzes monitoring metrics for performance and fault tolerance.
- Collaborates with developers to enhance services and testing.
- Contributes to system design, platform management, and capacity planning.
- Balances speed of feature development with reliability.
- Assists in restoring normal service during incident response.
- Proficient in debugging and troubleshooting.
- Manages unwanted traffic through investigation and rate limiting.
- Utilizes monitoring for proactive adjustments and alerts.
- Implements continuous improvement for processes and technology.
- Handles other assigned tasks as necessary.
- Bachelor's degree or equivalent experience.
- 5+ years of experience in a technology or software role.
- Proficient in Kubernetes, SRE principles, and cloud services (GCP).
- Experience with Dynatrace, New Relic, or SolarWinds.
- Skilled in microservice architecture and infrastructure troubleshooting.
- Experienced in deploying, monitoring, and supporting enterprise applications.
- Proficient in CI/CD tools and performance optimization.
- Strong mix of software engineering and operational support skills.
- Knowledge of web technologies and tools such as Azure DevOps, Dynatrace, Prometheus, Terraform, Grafana, and Splunk.
Regards,
Sachin
Full-Time SRE - Birmingham, United States - Saxon Global
Description
Title: Site Reliability Engineer
Location: Birmingham, AL (hybrid)
FULL TIME
SUMMARY
The Site Reliability Engineer (SRE) is responsible for enhancing system reliability and resilience through automation. This role combines software and systems engineering to maintain large-scale, fault-tolerant systems, ensuring they remain available and adaptable. The SRE actively monitors system health, supports cloud-based transformations, and innovates to meet customer needs while providing operational support for multiple distributed software applications.
JOB DUTIES