Job description:
Site Reliability Engineering (SRE) combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. SRE ensures that Google Cloud's services, both our internally critical and our externally visible systems, have reliability and uptime appropriate to customers' needs and a fast rate of improvement. On the SRE team, you’ll have the opportunity to manage the complex challenges of scale that are unique to Google Cloud, while using your expertise in coding, algorithms, complexity analysis, and large-scale system design. SRE's culture of intellectual curiosity, problem solving, and openness is key to its success. We encourage collaboration, self-direction, meaningful projects, and a blame-free environment that supports taking risks and growing.
Requirements:
- Bachelor’s degree in Computer Science, a related field, or equivalent practical experience
- 5 years of experience with software development in one or more programming languages
- 3 years of experience in designing, analyzing, and troubleshooting large-scale distributed systems
- 2 years of experience leading projects and providing technical leadership
- Experience working in computing, distributed systems, storage, or networking
- Expertise in designing, analyzing, and troubleshooting large-scale distributed systems
- Ability to debug and optimize code and to automate routine tasks
- Systematic problem-solving approach, coupled with effective verbal and written communication skills
Perks and benefits:
- Opportunity to work on the challenges of scale unique to Google Cloud
- Environment of collaboration, innovation, and growth
- Intellectually stimulating and open culture
- Support and mentorship for learning and development
- Contribution to optimizing systems, building infrastructure, and automation
- Ownership of the service lifecycle and its ongoing improvement
- Practice of sustainable incident response and blameless postmortems
- Equal opportunity employer with a culture of belonging and inclusivity