Observability Manager

  • Infrastructure Engineering
  • Dallas

Do you want to tackle the biggest questions in finance with near-infinite compute power at your fingertips?

G-Research is a leading quantitative research and technology firm, with offices in London and Dallas.

We are proud to employ some of the best people in their field and to nurture their talent in a dynamic, flexible and highly stimulating culture where world-beating ideas are cultivated and rewarded. 

This is a hybrid role based in our new Dallas infrastructure hub where we work on the latest technologies in a cutting-edge environment.

The role

The Observability Platform team manages the doors – both entry and exit – to the telemetry services that are managed by the Platform Reliability and Observability Team. We ensure that engineers can effectively produce and consume telemetry for their services. This involves working with the Observability Engineering team to build robust pipelines to ingest and route data in predictable, composable ways as well as visualising that data after the fact to drive insight and action.

Under the umbrella of the Platform Engineering department, our group also has responsibility to mature the reliability of our full HPC stack, from networks and storage up to the compute and application platform layers. 

We are seeking a Manager with deep expertise in observability stacks. You will understand the unique problems that come with moving cloud-scale volumes of telemetry data, and be excited at the prospect of giving our customers visibility into the same underlying telemetry data so they can run their services as efficiently as possible.

Knowledge of and experience running observability platforms at scale, serving a wide variety of customers with varying degrees of access, is a strong requirement. Knowledge of core SRE principles is highly beneficial.

Key responsibilities of the role include: 

  • Helping to lead the development of our observability and reliability engineering strategy

  • Defining and driving the roadmap for observability tooling, ensuring alignment with business goals and scalability requirements

  • Working with telemetry data at enormous scale, ingesting data from industry-leading GPU clusters

  • Acting as the lead for all observability efforts related to AWS services, ensuring seamless integration with the observability platform

  • Collaborating with engineering leadership to establish observability as a core function of the development lifecycle

  • Working closely with application teams to ensure observability systems are fully integrated and providing the necessary insights

  • Enabling SRE frameworks, promoting SLAs, SLOs and SLIs, and working closely with platform teams to ensure reliability is constantly improving

  • Growing, adapting and investing in your team, fostering a culture of continuous learning and improvement, encouraging adoption of new observability tools and techniques

Who are we looking for?

The ideal candidate will have the following skills and experience: 

  • Proven experience leading observability or SRE teams in a cloud-native or hybrid-cloud environment, running platforms in production and at scale

  • Well versed in reliability engineering concepts, including different types of testing, progressive deployments, error budgets, fault-tolerant design and the role observability plays

  • Hands-on experience with modern observability tools and frameworks, such as Prometheus, OpenTelemetry (OTel) and Grafana, and with enterprise SaaS observability platforms, such as Datadog and Dynatrace

  • Expertise in designing, building and scaling observability solutions for distributed systems

  • Customer focused, with an enthusiasm for providing infrastructure as a service and defaulting to a product lens when evaluating platform scale problems

  • Excellent communication skills and the ability to collaborate with cross-functional teams

  • Leadership experience with demonstrated success in mentoring and developing technical talent

  • Experience with cloud platforms, such as AWS, Azure or Google Cloud

  • Familiarity with microservices architecture and containerized environments, such as Kubernetes and Docker

  • Knowledge of infrastructure as code (IaC) and automation tools, such as Terraform and Ansible

Why should you apply?

  • Market-leading compensation plus annual discretionary bonus

  • Lunch provided in the office (via GrubHub)

  • Informal dress code and excellent work/life balance

  • Excellent paid time off allowance of 25 days

  • Sick days, military leave, and family and medical leave

  • Generous 401(k) plan

  • 16 weeks’ fully paid parental leave

  • Medical and Prescription, Dental, and Vision insurance

  • Life and Accidental Death & Dismemberment (AD&D) insurance

  • Employee Assistance and Wellness programs

  • Generous relocation allowance and support

  • Great selection of office snacks, and hot and cold drinks

  • Free on-site gym and car parking

This role is employed through our US affiliate.

Location: Dallas
Mario, FPGA Manager

"While some people might think working in finance may not be too exciting, at G-Research, it is, especially if you see it as a problem to solve. How do we solve this algorithm? How do we get faster? This is why I think people are really excited to work at G-Research."

Interview process

Online Application

Our assessment process kicks off with our Talent Acquisition team, who will review your application and assess your fit for the role.

Stage One: Technical Interview

You will meet with a team member, or take a remote test, so that we can assess your technical abilities.

Stage Two: Behavioural Interview

We will set aside technical skills and focus on you.

Stage Three: Further Technical Interviews

Here, we will take a deeper dive into your technical skills and competencies.

Stage Four: Management Interviews

The final stage of our interview process is where you will meet members of your team, your future manager, and functional leadership.
