Site Reliability Engineer for CloudSimple (Kyiv)
CloudSimple is a US company in the early stages of building a next-generation private-cloud-as-a-service platform, aiming at the scale of established cloud providers such as Amazon AWS or Google Cloud. CloudSimple is led by a team of top Silicon Valley innovators and entrepreneurs with a rich history of founding and scaling successful ventures.
On behalf of CloudSimple, Ciklum is looking for a Site Reliability Engineer to join the Kyiv team on a full-time basis. We are looking for a seasoned systems operations expert to lead initiatives focused on systems infrastructure management within a high-volume, fast-scaling environment.
CloudSimple is building next-generation cloud infrastructure with a specific focus on enterprise workloads.
- You will be responsible for the systems deployment, operations, and monitoring for our infrastructure, including design and development of infrastructure automation.
- You will get your hands dirty troubleshooting infrastructure and architectural challenges using your existing knowledge and toolkits.
- You will drive the reliability and supportability of the cloud service by creating a knowledge base and, working with DevOps, coordinating change management policies, deploying a ticket/incident management system, and setting up service request queue triage and auto-remediation.
- You will use your advanced system architecture and administration skills to collaborate with engineering, product management, test, and automation teams to architect and develop strategic and tactical solutions.
- You will implement, support and maintain Kubernetes clusters
- You will help develop requirements for customer onboarding processes, target environment sizing, and migration automation.
- 7+ years of experience with increasing responsibility in technical support and data center operations roles, including team and process management responsibilities. Experience with cloud data centers is a must.
- Prior successful experience of working in an innovative, fast-paced startup with a high rate of flux. The candidate must demonstrate strong entrepreneurial spirit and vigor.
- Demonstrated proficiency in creating detailed technical design documents, facilitating design reviews, and executing design implementation projects.
- BS/MS degree in Computer Science or equivalent experience
- Deep technical roots in data center technologies:
- Large-scale Linux production environments, preferably as part of a Cloud service provider environment.
- Understanding of data center networking fabric topologies and commonly deployed architectures.
- Virtualization technologies; experience with the VMware product suite (vCenter, ESXi) is required.
- Deep understanding of cluster management systems like Kubernetes and of Docker-based container deployments is required.
- Experience with networking concepts is required: Layer 2/3, load balancers, VPN, network virtualization, BGP, OSPF.
- Understanding of DevOps agility for continuous development and delivery with Chef / Puppet / Ansible.
- Deep understanding of KVM, Microsoft technologies (Hyper-V, Azure Pack, Azure Stack) is a strong plus.
- Reasonable technical understanding of configuration and maintenance of different NAS/SAN and networking systems in a virtualized environment. Understanding of Software Defined Data Center (SDS, SDN) is a strong plus.
- Deep understanding of ITIL processes and systems, such as ITSM (ServiceNow), is a strong plus.
- Professional certifications, in particular VCP, VCP-DCV, and VCP-CMA, are a strong plus.
- Must be an excellent verbal and written communicator
- Open-minded, flexible and thriving in a dynamic, ever-changing environment
- Excellent interpersonal skills
- Strong problem-solving abilities and an engineering mindset
- Self-learning skills
What's in it for you
- Unique working environment where you communicate and work directly with the client
- Variety of knowledge sharing, training, and self-development opportunities