Site Reliability Engineer

Website DNA Client

We are working with a global company that is expanding across the APAC region and growing a Tech Operations team to support its business in Australia.

As a Site Reliability Engineer, you will own the products that are introduced into production, focusing on reliability and performance at scale by understanding the tech stack in detail. On top of that, you will learn something new every day.

As part of the Site Reliability Engineering team, you will work on large-scale system design and troubleshooting, and be fluent in systems programming and/or automation. You will have a desire to tackle the complex problems of scale. You will also be proactive in identifying opportunities for automation.

Responsibilities

  • Perform deep dives into both systemic and latent reliability issues.
  • Troubleshoot issues across the entire stack.
  • Identify and drive opportunities to improve automation.
  • Engage in service capacity planning and demand forecasting, software performance analysis and system tuning.
  • Participate in periodic on-call duties.
  • Represent the SRE team in design reviews and operational readiness exercises for new and existing services.

Essential skills and experience

  • Strong experience in Linux System Administration
  • Experience with container orchestration (Kubernetes)
  • Strong knowledge of one or more of the following configuration management and operations tools: Puppet, Ansible, Terraform, Salt, Elasticsearch, Splunk, etc.
  • Experience with Cloud (ideally Azure)
  • Experience with containerization tools (Docker, Kubernetes, etc.)
  • Production DBA experience administering SQL and NoSQL databases
  • Experience with monitoring and alerting of production systems

If this sounds like you, send your CV and cover letter to matt@dnatalent.com.au
