Systems Reliability Engineer

What do we think we need?

Someone who:

Is a competent developer who can read Go, JavaScript (Node.js and frontend), Solidity (EVM code), and shell scripts within the first few weeks
Has a non-trivial understanding of cloud-native tooling such as Kubernetes, Helm, Elasticsearch, and Prometheus

Has a basic working knowledge of AWS and GCE

Has at least two years' experience with non-trivial, multi-server cloud deployments, ideally with containers
Can continuously improve and refine our processes (deployment, CI, testing) without prompting

Can plan, re-prioritise, and switch contexts with ease, creating just enough documentation so others have visibility

Can help the team avoid mistakes by positively challenging possible missteps

Can pinpoint cross-team reliability issues and take responsibility for ensuring things do not ‘fall between the gaps’
Can interact with different teams with empathy to understand why reliability or correctness issues exist and how to help resolve them

Promotes collective responsibility and breaks down siloed thinking by making things their problem

It is particularly important that you can take part in a forceful yet respectful technical argument with colleagues, change your mind about something, and change others’ minds.

You can apply for the job here:

