
Review process for students

The Lab will help you develop and improve your data science skills, but we require a minimum level of competency so that you can get the most out of your time working with us. If you are interested in joining the Lab to undertake a project (e.g., as part of your MRes), you must answer the questions below and be able to use Python to produce a report summarising your findings.
This assessment task focuses on breast cancer patients, but your project in the Lab is likely to use data relating to patients with brain tumours.


The data details the demographic and tumour characteristics of patients with breast cancer who received chemotherapy before surgery.

Important factors in breast cancer are the ER, PR and HER-2 receptors, which can occur in varying combinations (e.g. ER+ve, PR+ve, HER-2 -ve, or ER-ve, HER-2 +ve), together with tumour stage and size.

Giving chemotherapy before surgery allows clinicians to measure response on imaging (MRI) and eventually correlate it with surgical specimens. Patients' tumours can shrink by different amounts in response to treatment, and some might disappear completely (so-called "pathological complete response" or pCR). One would expect patients who achieve pCR to have a better prognosis.


  1. Produce a report that details the demographics and histological subtype of the cancers, with reference to missing data

  2. How do presurgical MRI tumour sizes relate to pCR?

  3. Report survival based on pCR rates

  4. Report survival based on demographic and tumour factors

  5. Does pCR influence survival? How would you differentiate between an association between pCR and survival and a causal relationship?

  6. What is the earliest timepoint at which response seems to be able to predict survival?
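
As a rough starting point for tasks 1 and 3, the sketch below counts missing values per field and computes a Kaplan-Meier survival estimate for patients with and without pCR. It uses only the Python standard library and a handful of invented toy records. The field names (`age`, `er_status`, `pcr`, `months`, `died`) are assumptions for illustration and are not the columns of the assessment dataset; a real analysis would more likely use pandas and a dedicated survival analysis library.

```python
from collections import Counter

# Toy records invented purely for illustration; field names are hypothetical.
# "months" is follow-up time, "died" is True for a death and False if censored.
patients = [
    {"age": 52, "er_status": "positive", "pcr": True, "months": 60, "died": False},
    {"age": None, "er_status": "negative", "pcr": False, "months": 14, "died": True},
    {"age": 47, "er_status": None, "pcr": False, "months": 30, "died": True},
    {"age": 63, "er_status": "positive", "pcr": True, "months": 48, "died": False},
    {"age": 58, "er_status": "negative", "pcr": False, "months": 30, "died": False},
    {"age": 41, "er_status": "positive", "pcr": None, "months": 22, "died": True},
]


def missing_counts(rows):
    """Count missing (None) values per field across all records."""
    counts = Counter()
    for row in rows:
        for field, value in row.items():
            if value is None:
                counts[field] += 1
    return dict(counts)


def kaplan_meier(rows):
    """Return [(time, survival_probability)] at each observed death time."""
    event_times = sorted({r["months"] for r in rows if r["died"]})
    survival = 1.0
    curve = []
    for t in event_times:
        # Patients still under follow-up just before time t.
        at_risk = sum(1 for r in rows if r["months"] >= t)
        deaths = sum(1 for r in rows if r["months"] == t and r["died"])
        survival *= 1 - deaths / at_risk
        curve.append((t, survival))
    return curve


print("Missing values per field:", missing_counts(patients))
for label, flag in (("pCR", True), ("no pCR", False)):
    # Records with unknown pCR status are excluded from both groups.
    group = [r for r in patients if r["pcr"] is flag]
    print(label, kaplan_meier(group))
```

A group with no observed deaths yields an empty curve, meaning the estimated survival stays at 1.0 throughout follow-up. Excluding patients with unknown pCR status is a choice made here for simplicity, and the report should state how such cases are handled.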

As part of the presentation, you will be asked general questions to test your understanding of the data, and questions about your technical approaches.

Other questions

  1. How do you think we can use data to improve care, treatment, outcomes and quality of life for brain tumour patients?

  2. How can we use data to help us find new treatment for patients with brain tumours?

  3. We live in a "data driven" world. What do you think the limits of data-driven approaches are?

  4. What do you think have been the most interesting and important advances in data science in the last 5-10 years, what do you think they will be in the next 5-10 years, and why?

  5. Where do you think the ethical issues with data lie, particularly in healthcare? Are these different for the brain tumour population?
