WEDNESDAY, JUNE 12TH 2024
14:00-17:00 Pre-conference workshop:

Recent advancements in large language models (LLMs), such as ChatGPT, have revolutionised the field of natural language processing (NLP) and opened new possibilities for healthcare text analytics. This tutorial, structured as a combination of lectures and demonstrations, aims to provide a comprehensive guide to leveraging large language models in the healthcare domain, focusing on advanced techniques and applications. The tutorial will begin with an overview of open-source LLMs, emphasising their potential in addressing complex challenges within healthcare text analytics. Special attention will be given to the unique issues surrounding privacy, security, and domain-specific nuances inherent in healthcare data. Participants will be guided through practical applications of LLMs in two distinct healthcare text domains: 1) Discharge Note Generation and 2) PubMed Abstract Information Extraction. Practical demonstrations will illustrate how LLMs can be tailored for each specific domain using prompting, in-context learning, and instruction tuning (fine-tuning). Furthermore, we will delve into the challenges LLMs face in adapting to multi-modal data representations.

Tutorial organisers: Yunsoo Kim, Jinge Wu and Honghan Wu (University College London, Institute of Health Informatics)

THURSDAY, JUNE 13TH 2024
9:30-10:15 Registration
10:15-10:30 Welcome
10:30-11:15 Keynote: Prof Suzan Verberne (Leiden University).

ChatGPT can do a lot for us: it can serve as a text corrector, as a source of inspiration, as a programming aid, and as an interactive search engine. ChatGPT is also widely used in the health domain, both by doctors and patients. Large language models (LLMs) such as ChatGPT can write very convincing texts, but being able to write fluently is not the same as providing correct information. Should we worry about that? In my presentation I will first discuss our work on text mining from patient experiences, highlighting the challenges of extracting medical information from informal text. Then I will discuss the opportunities of using LLMs, and go into the risks and challenges. I will also make suggestions for responsible use of LLMs for medical applications.

11:15-11:30 Break
11:30-12:20 Presentations: 

Chair:


  • Arlene Casey, Matúš Falis, Franz S. Gruber, Matthew Murrell, Spyro Nita, Amy Tilbrook, Charlie Mayor, Katherine O’Sullivan and Kathy Harrison. Developing a Common Schema for De-identification of Personal Health Identifiers in EHRs across Scotland.
  • Elizabeth Ford, Kerina Jones, Rob Stewart, Angus Roberts, Goran Nenadic, Simon Pillinger and Ben Fell. How can we conceptualise and measure re-identification risk from de-identified clinical free text data?
  • Daisy Monika Lal, Paul Rayson, Erik Van Mulligen and Jan Kors. Leveraging Large Language Models to Extract Cancer Patient Experiences. (lightning talk)
  • Liliana Valles, Alice Tapper, Daniel Goldwater and Matthew Taylor. Accelerating NHS feedback moderation at NHS.UK with NLP. (lightning talk)
12:20-12:35 Open community forum and discussions: 

This is an open slot for colleagues to briefly inform the community about any ongoing or future activities, initiatives, projects, etc. It can be used to invite collaborations, highlight opportunities and challenges, etc. Every speaker will have 3 minutes.

12:35-14:00 Lunch
14:00-15:15 Panel:

Chairs: Arlene Casey (University of Edinburgh) and Vishnu Chandrabalan (Lancaster University)

15:15-15:55 PhD forum:

Chairs: Arlene Casey (University of Edinburgh) and Ruizhe Li (University of Aberdeen)


  • Carly Yung. Measuring Clinical Critical Cultural Awareness, a Sociolinguistic Analysis on Clinicians’ Language Change in Patients’ Mental Health Records Across Different Groups in the UK
  • Yunsoo Kim. Multimodal LLM for Computer Assisted Intervention: Human in the Loop with Eye Gaze of Radiologists

Panel:

16:00-17:30 Posters and demos: 

  • Franz S. Gruber, Matúš Falis, Amy Tilbrook and Arlene Casey. A Privacy Risk Dashboard for Clinical Free-text (DEMO)
  • Fawaz Alarfaj, Hikmat Ullah Khan and Muzamil Ahmed. Context-aware Medical Question Answering: An Extended Transformers-Based Approach with BioBERT Encoding for Restricted Domain Queries
  • Warren Del-Pinto, Jenny Humphreys, Meghna Jani, Prajwal Khairnar, Ana Aldana, Robyn Hamilton, Karim Webb, Goran Nenadic and William Dixon. Annotation of Outpatient Letters to Estimate Prevalence and Misclassification of Musculoskeletal Disease
  • Jack Wu, Jonathan Breeze, Dhruva Biswas, Sam Brown, Brian Tam To, Matthew Ryan, Theresa McDonagh, Daniel Bromage, Antonio Cannata, Thomas Searle, James Teo, Richard Dobson, Ajay Shah and Kevin O’Gallagher. How representative are heart failure clinical trials? A comparative study using natural language processing
  • Sean Farrell, Noura Al Moubayed, Alan Radford, Gina Pinchbeck and Peter-John Mäntylä Noble. Where are all the antimicrobials being used? Large Language Models for Monitoring and Adherence to Stewardship Guidelines in Veterinary Practices
  • Mihael Arcan, David-Paul Niland and Fionn Delahunty. An Assessment on Comprehending Mental Health through Large Language Models
  • Zihao Li, Samuel Belkadi, Nicolo Micheletti, Lifeng Han, Matthew Shardlow and Goran Nenadic. Improving Biomedical Text Readability with LLMs and Controllable Attributes
  • Kawsar Noor, Baptiste Paul Ribyere, Adam Sutton, Xi Bai, Tom Searle, Timothy Roberts and Richard J Dobson. NLP Enriched Research Data Extracts: An OMOP Pipeline for Producing Research Data Extracts
  • Matúš Falis, Aryo Gema, Hang Dong, Luke Daines, Siddharth Basetti, Michael Holder, Rose Penfold, Alexandra Birch and Beatrice Alex. Can GPT-3.5 Generate and Code Discharge Summaries?
  • Yusuf Yildiz, Goran Nenadic, Meghna Jani and David Jenkins. Investigating the Use of Transformer Models for Clinical Prediction Modelling – A Case Study in UK Biobank Secondary Care Data
  • Imane Guellil, Salomé Andres, Bruce Guthrie, Atul Anand, Huayu Zhang, Abul Kalam Hasan, Honghan Wu and Beatrice Alex. Enhancing Natural Language Processing Capabilities in Geriatric Care: An Annotation Scheme and Guidelines
  • Katharine Anderson, Sean Farrell, Robert Christley, Pj Noble and Gina Pinchbeck. The challenge of teasing out language in veterinary electronic healthcare records
  • Samuel Belkadi, Nicolo Micheletti, Lifeng Han, Warren Del-Pinto and Goran Nenadic. Label-To-Text-Transformer: Generating Synthetic Medication Prescriptions
  • Yunsoo Kim, Jinge Wu and Honghan Wu. Exploring Training Methods for Medical LLMs.

17:30-18:30 Birds-of-a-feather meetings:

Chair:

Space will be available for colleagues to self-organise and run birds-of-a-feather or specific project meetings. The following groups will meet:

  • Standards for data modelling and representation (OMOP, FHIR)
  • Computational infrastructure
  • PPIE
  • Clinical NLP governance
  • Funding models for healthcare NLP
18:30-22:00 Drinks reception (18:30) and conference dinner (from 19:00)

FRIDAY, JUNE 14TH 2024
09:15-09:30 Introduction to Day 2
09:30-10:20 Presentations: 

Chair:


  • Jennifer Jiang, James Brandreth, Mairead McErlean, Jack Ross, Maisarah Amran, Enrico Costanza, Yogini Jani, Leilei Zhu, Richard Dobson, Folkert Asselbergs, Wai Keong Wong and Anoop Shah. Feasibility study of ‘MiADE’ point of care natural language processing system: methodology and initial results.
  • Stephen Barlow, Anna Barnes, Gary Cook, Sugama Chicklore, Yulan He and Thomas Wagner. Automatic TNM staging classification for [18F] fluorodeoxyglucose PET-CT reports for lung cancer utilising natural language processing and multi-task learning.
  • Ratchakrit Arreerard and Scott Piao. Exploring GPT-4 for Fine-Grained Emotion Classification.
  • Arlene Casey, Matúš Falis, Franz S. Gruber, Amy Tilbrook, Elizabeth Ford and Kathy Harrison. Beyond the Surface – Exploring and Defining Indirect Risks in Clinical Free-text (lightning talk)
10:20-11:00 PhD forum:

Chairs: Arlene Casey (University of Edinburgh) and Ruizhe Li (University of Aberdeen)


  • Adam Williams. Developing AI Methods for Animal Health and Welfare Monitoring
  • Simon Ellershaw. Automated Generation of Hospital Discharge Summaries Using Clinical Guidelines and Large Language Models

Panel:

11:05-12:20 Posters and demos: 

  • Faizan E Mustafa and Juan G. Diaz Ochoa. Tool for mapping of medical narratives into medical ontologies in low resource setting: A case study for German (DEMO)
  • Lifeng Han, Serge Gladkoff, Betty Galiano, Irina Sorokina, Gleb Erofeev and Goran Nenadic. Neural Machine Translation of Clinical Text between English and Spanish
  • Shubham Agarwal, Thomas Searle, Richard Dobson, Anthony Shek and James Teo. Improving Multi-Task Text Classification Performance in Electronic Health Records
  • Joseph Cronin, Keiran Tait, Jamie Wallis and Robert Durichen. ArcTEX – a precise clinical data enrichment model to support real world evidence studies
  • Wuraola Oyewusi, Eliana Vasquez Osorio, Goran Nenadic, Issy MacGregor and Gareth Price. Data, Dialogue, and Design: Patient and Public Involvement and Engagement for Natural Language Processing with Real-World Cancer Data
  • Lena Almutair, Eric Atwell and Nishant Ravikumar. Advancing Clinical Language Representation: Leveraging Semantic Cues in Clinical narrative
  • Mingyang Li, Viktor Schlegel and Goran Nenadic. How Patient-Level Knowledge Graph Benefits ICD Coding?
  • Imane Guellil, Mike Holder, Aileen Elizabeth Stirling, Beatrice Alex and Bruce Guthrie. Towards one resource for drug prescription within the UK
  • Huda M. Alshammari, Denham Phipps, Penny Lewis, Haifa Alrdahi and Riza Batista-Navarro. Development of Guidelines for Annotating Medication-Related Incident Reports
  • Jose Rodríguez Torres, Antonio Espinosa de Los Monteros, Angelo Santana, David Killick, P-J Noble and Alan Radford. Shedding light about canine and feline cancer in the UK. A text-mining approach to analyse 1,000,000 canine and feline tumour diagnoses between 2010 and 2023.
  • Mengxuan Sun, Ehud Reiter, Lisa Duncan and Rosalind Adam. The role of natural language processing in cancer care: a systematic scoping review with narrative synthesis

12:20-12:35 Open community forum and discussions: 

This is an open slot for colleagues to briefly inform the community about any ongoing or future activities, initiatives, projects, etc. It can be used to invite collaborations, highlight opportunities and challenges, etc. Every speaker will have 3 minutes.

12:35-14:00 Lunch
14:00-14:45 Keynote: Dr Alistair Johnson (Glowyr, Inc.)

The world has been in awe at the recent applications of sophisticated machine learning models derived from large datasets. Yet in medicine, we continue to use decades-old algorithms to support patient care. Models for cancer progression are based upon staging guidelines defined in the 70s, patient severity of illness is estimated using a scoring system from the 90s, and our latest and greatest criteria for sepsis was a model with three input variables. The reasons for the technological naivety in medicine are multifactorial, but one aspect stands out: researchers simply do not have much data. In this talk I will highlight the MIMIC series of databases, a suite of publicly accessible deidentified medical records. I’ll give an insider’s view on how the electronic health records for thousands of individuals were comprehensively deidentified, transformed, and shared for research without harm to the individuals themselves. I’ll give an overview of the utility of this data, and highlight some of our own work on language modeling enabled by the broad access to deidentified free-text clinical notes. I’ll conclude with my thoughts on how the field should better balance the benefits and risks of using patient data for research.

14:45-16:00 Industry forum: 

Chair: Dr Ben Fell (Akrivia)

Panel: Richard Dobson (King’s College London)

16:00-16:15 Final remarks and close