Do you want to tackle the biggest questions in finance with near-infinite compute power at your fingertips?
G-Research is a leading quantitative research and technology firm, with offices in London and Dallas.
We are proud to employ some of the best people in their field and to nurture their talent in a dynamic, flexible and highly stimulating culture where world-beating ideas are cultivated and rewarded.
This is a hybrid role based in our new Dallas infrastructure
The role
The Calc Farm Platform Services team develops and manages a large high-performance compute (HPC) platform to enable the business to conduct complex research at scale. We are seeking a highly motivated person to join our team to help us continue to push the envelope running batch workloads on Kubernetes.
The ideal candidate will have an active interest in Kubernetes and batch computing, a broad range of experience with software engineering and development, as well as experience managing large-scale infrastructure and complex tooling environments.
The main focus will be on Armada - an exciting open-source CNCF project built and maintained by the team - which we use to solve multi-cluster Kubernetes batch job scheduling at scale.
You’ll join an experienced team, working at the cutting edge of ML workloads at scale.
Key responsibilities of the role include:
Designing and developing high-quality software solutions using procedural programming languages, with a focus on Golang
Building and maintaining highly scalable, highly available and globally distributed systems to support large-scale research workloads
Managing and optimising data interactions across relational and non-relational databases, particularly PostgreSQL
Developing and operating containerised applications within Kubernetes, ensuring effective orchestration and workload scheduling
Supporting, tuning and troubleshooting Linux-based systems as part of our core compute platform
Applying core networking knowledge to help debug, optimise and enhance platform connectivity and performance
Independently diagnosing and resolving complex technical issues across infrastructure and software layers
Applying solid software architecture principles, computer science fundamentals and data structure knowledge to guide design decisions and code quality
Driving continuous improvement by contributing to CI/CD pipelines and engineering best practices
Staying up to date with emerging technologies and approaches, and applying new knowledge across disciplines
Who are we looking for?
The ideal candidate will have the following skills and experience:
Experience with developing Kubernetes components, such as controllers and operators
Experience with event-driven programming and message queues, such as Apache Kafka and Apache Pulsar
Experience of high-performance computing, Kubernetes, or DAG (Directed Acyclic Graph) workflows
Experience of running systems at scale using a cloud provider, ideally AWS
Use of operational and runtime tools and practices, including monitoring and logging with systems such as Prometheus and Grafana
Why should you apply?
Market-leading compensation plus annual discretionary bonus
Lunch provided in the office (via GrubHub)
Informal dress code and excellent work/life balance
Excellent paid time off allowance of 25 days
Sick days, military leave, and family and medical leave
Generous 401(k) plan
16 weeks’ fully paid parental leave
Medical and Prescription, Dental, and Vision insurance
Life and Accidental Death & Dismemberment (AD&D) insurance
Employee Assistance and Wellness programs
Generous relocation allowance and support
Great selection of office snacks, and hot and cold drinks
Free on-site gym and car parking
This role is employed through our US affiliate.