This is an exciting opportunity to get involved in a research project building upon innovative AI approaches to define gene x drug exposure x phenotype relationships at scale over continuous time in large human cohorts. We are looking for a highly motivated bioinformatics expert or data scientist to join this project and to drive specific aspects of this cutting-edge research, including the analysis of population-scale genetic cohorts with electronic healthcare and prescription records. You will be based in the Birney research and Open Targets groups at EMBL-EBI and work closely with the Gerstung group at DKFZ. The Birney research group has several ongoing related projects, and you will benefit from its extensive knowledge of population-scale modelling of healthcare records, genetics and exposures. You will also benefit from the expertise in pharmacogenetics and drug target discovery within the Open Targets team, and be embedded in the EMBL Human Ecosystems transversal theme community, connecting to others involved in exposome and drug safety research.

Your role
You will work on direct linkage across a diverse set of human traits with detailed healthcare, genetic and additional information in multiple human cohorts, and as such we are looking for an experienced post-doctoral fellow with a strong interest in health-related data science and proven expertise in modern AI techniques.

You will lead a project funded by the EMBL Human Ecosystems transversal theme programme to investigate gene x drug exposure x phenotype relationships at scale over continuous time in large human cohorts. The successful candidate will join the Birney research team at EMBL-EBI, working closely with Open Targets to apply and further develop an AI framework based on generative transformers (Delphi) for multi-disease and multi-drug modelling across continuous time in human populations. Some of the initial work will include incorporating prescription exposome data into Delphi, developing a pharmacogenetic analysis model that correlates genetic variation with drug exposure outcomes based upon known associations, and exploring novel gene x drug exposure x phenotype associations.

Drug exposure is a strong environmental modifier and is intrinsically linked to disease risk and outcome in human populations. There remain many open questions and undiscovered interactions between drug exposure, genetics and disease onset/outcome that can be investigated using large human cohorts with detailed health records and genetics, providing the candidate with many research opportunities and potential novel findings. One key factor that makes this type of research more powerful is that recent innovations in generative AI make it possible to model all diseases and other important factors, such as drug exposure and genetics, at the same time, mapped to the same internal space (embedding) and across continuous time. This not only allows us to assess the overall impact of genetics and exposures on disease risk across a population but also provides a framework for assessing at which time across a life course these effects most strongly manifest.

Key responsibilities

As part of a dynamic, collaborative, and international team, you will be responsible for:

- Integration of prescribing data and genetic data into the Delphi model, initially using UK Biobank data
- Developing benchmarking and validation datasets
- Benchmarking the model against known associations
- Analysis using the Delphi model to investigate gene x drug exposure x phenotype relationships over continuous time
- Exploring the ability to run the model in other human cohorts with genetic and prescribing data

You have

- Advanced degree (MSc, PhD) in computer science, bioinformatics, software development, or a related field
- Strong proficiency in Python and experience with Large Language Model integration
- Proven experience in applying modern ML/LLM frameworks and concepts
- Good understanding of ML principles including embeddings, cross-validation and fine-tuning
- Proficiency in common data preprocessing tasks and normalisation
- Experience with large-scale compute infrastructure, including high performance compute (HPC) facilities and/or cloud-based workflow managers such as Nextflow
- Exposure to source code version control software such as Git and GitHub
- Experience in independent problem-solving and examples of resolving complex issues
- Fluency in written and spoken English
- Ability to effectively communicate ideas or issues and work with team members from multidisciplinary backgrounds
- Interest in promoting your work and the ways we have solved complex challenges

You might also have

- Experience in MLOps including experiment tracking and model deployment
- Experience with current LLM frameworks, such as LangChain and open-source LLM deployment (e.g., llama-cpp, ggml, Xorbits Inference)
- Knowledge of human genetics, genomics and/or pharmacogenetics – or are interested in learning about these topics
- Experience working with prescribing data, electronic medical records and large-scale cohort biobanks such as UK Biobank
- Enthusiasm for novel research and discovery

Benefits and Contract Information

- Financial incentives: depending on circumstances, monthly family/marriage allowance of £272, monthly child allowance of £328 per child. Generous stipend reviewed yearly, pension scheme, death benefit, long-term care, accident-at-work and unemployment insurances
- Hybrid working arrangements
- Private medical insurance for you and your immediate family (including all prescriptions and generous dental & optical cover)
- Generous time off: 30 days annual leave per year, in addition to eight bank holidays
- Relocation package
- Campus life: Free shuttle bus to and from work, on-site library, subsidised on-site gym and cafeteria, casual dress code, extensive sports and social club activities (on campus and remotely)
- Family benefits: On-site nursery, child sick leave, generous parental leave, holiday clubs on campus and monthly family and child allowances
- Contract duration: This position is a fixed-term 2-year contract
- Salary: Year 1 Stipend – £3,307 per month after tax but excl. pension & insurances (Total package will be dependent on family circumstances)
- International applicants: We recruit internationally and successful candidates are offered visa exemptions. Read more on our page for international applicants.
- Diversity and inclusion: At EMBL-EBI, we strongly believe that inclusive and diverse teams benefit from higher levels of innovation and creative thought. We encourage applications from women, LGBTQ+ individuals and people of all nationalities.
- Job location: This role is based in Hinxton, near Cambridge, UK. You will be required to relocate if you are based overseas and you will receive a generous relocation package to support you.

Application Instructions:

To apply, please submit a cover letter and a CV through our online system before the closing date of 8 December 2024.

Application Closing Date:
8 December 2024
Salary:
Year 1 Stipend - £3,307 per month after tax