Site Reliability Engineer (SRE)
Focus: Monitoring and ensuring system reliability, performance, and uptime in a SaaS environment.
About the Role:
As a Site Reliability Engineer, you will be responsible for ensuring TurboVets’ platform remains available, reliable, and performant. You will work closely with DevOps and engineering teams to monitor systems, manage incidents, and develop strategies for resilience and scaling. This role is ideal for someone passionate about uptime, reliability, and infrastructure management.
Responsibilities:
- Monitor system performance, detect incidents, and ensure uptime.
- Develop tools and practices to automate reliability and performance checks.
- Collaborate with teams to identify and resolve potential issues proactively.
- Manage incidents, troubleshoot issues, and implement preventive measures.
Qualifications:
- Bachelor’s degree in Computer Science, Engineering, or a related field.
- 3+ years of experience in site reliability or infrastructure engineering.
- Proficient in monitoring tools and incident management practices.
- Knowledge of scripting languages (Python, Bash) and cloud environments.
Who You Are:
- An analytical thinker with a passion for maintaining high system reliability.
- Organized, proactive, and quick to act in high-pressure situations.
- Committed to building a stable, resilient platform that meets user expectations.
Equal Opportunity Statement:
We are an equal-opportunity employer and celebrate diversity, recognizing that diversity of thought and background builds stronger teams. We approach diversity and inclusion seriously and thoughtfully. We do not discriminate on the basis of race, religion, color, ancestry, national origin, caste, sex, sexual orientation, gender, gender identity or expression, age, disability, medical condition, pregnancy, genetic makeup, marital status, or military service.