As a Senior Site Reliability Engineer at IOG, you will work with a group of infrastructure and tooling engineers to develop, support, and maintain the product platform for the Smart Contracts tribe.
You will work with the application teams in the tribe to understand their development, deployment, and other infrastructure requirements. You will design and implement new platform functionality that lets them meet their needs without ongoing support or expertise outside their application domain, teach and support them in leveraging the platform’s capabilities, and operate and maintain the infrastructure that makes the platform possible.
- Design platform capabilities with clean domain-specific interfaces to enable application teams to meet their own needs with full understanding and ownership
- Implement platform interfaces on top of standard third-party tools and services
- Document platform interfaces and operation, and educate the application teams on them
- Maintain an understanding of the current and planned state of supported applications, proactively identifying any capabilities needed for them to effectively manage and deliver their own work
- Continually improve platform interfaces and implementation in alignment with evolving knowledge of the requirements of security, efficiency, and maintainability
- Maintain and operate shared cloud infrastructure, including incident resolution
- Participate in the on-call rota
- Build tools and processes to reduce maintenance overhead over time
- Support application teams in resolving ad hoc issues not yet supported by the platform
Desirable technical capabilities include:
- Datacenter Scheduler: Working with schedulers like Kubernetes or Nomad
- Infrastructure as a Service: Working with IaaS “cloud” platforms, ideally AWS
- Software Development: Building maintainable software projects with well-designed interfaces (beyond ad hoc scripting)
- Infrastructure as Code: Working with IaC tools for declarative maintenance of infrastructure, ideally Terraform
- Operations: Maintaining and operating application infrastructure, including debugging outages, managing costs, etc.
- Nix: Using the tools within the Nix ecosystem
- Blockchain infrastructure experience is a plus
On top of these and related technical capabilities, the Smart Contracts Site Reliability Engineering squad is a very high-agency, high-autonomy team: you will be expected to take full ownership of delivering the value delegated to you, and you will be given the resources and authority needed to do so effectively. This requires strong time management, prioritization, awareness of your existing and missing capabilities, and communication and collaboration skills.
Education / Experience
- 3+ years' experience in a similar SRE, DevOps, or Platform Engineering role
- BSc/MSc in a computer-related field or equivalent experience
IOG is a fully distributed organization, but due to team distribution we require someone based in either Ireland or the UK.
- Flexible schedule
- Remote work
- Laptop reimbursement
- New starter package to buy hardware essentials (headphones, monitor, etc.)
- Learning & Development opportunities
- Competitive PTO and Sick Leave plan
UK & Ireland Employees
- Monthly Health Stipend to use towards any wellness or medical coverage or service
At IOG, we value diversity and always treat all employees and job applicants based on merit, qualifications, competence, and talent. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.
To apply for this job, please visit apply.workable.com.