Main image: ART-AI AI Challenge Day 2023
The annual ART-AI ‘AI Challenge Day’ will take place on Monday, 20th January 2025, from 09:00 to 17:00 (GMT) at the APEX City of Bath Hotel, James St W, Bath BA1 2DA.
ART-AI students and industry partners will each present an AI Challenge and host group discussions around a table. The goal is to foster thought-provoking conversations and advance research through collaboration, while also providing valuable networking opportunities. At the conclusion of the day, each group will present a summary of their discussions.
Please register and select which AI Challenges you would like to participate in by Monday, 13th January 2025.
Every effort will be made to assign you to your selected topic(s), and to move you to a new table for the afternoon session if you select more than one topic; however, we will also need to ensure a spread of participants across topics.
Programme 09:00-17:00 (GMT)
09:00-09:30 Arrival Tea/Coffee
09:30-09:45 Introduction to the day
09:45-10:45 Presentation of AI Challenges from the table hosts
10:45-11:00 Mid Morning Tea/Coffee
11:00-12:30 Table discussions
12:30-13:30 Lunch
13:30-15:00 Table discussions
15:00-15:15 Afternoon Tea/Coffee
15:15-15:45 Prepare presentation of table discussion
15:45-16:45 Presentations/Feedback from table discussions
16:45-17:00 Closing the day
17:00 Drinks reception
‘Table Hosts’ and AI Challenges
1. AI as a facilitator for citizen interaction with socio-technical systems with Jaime Sichman
In recent decades, most countries have made great progress in digitalising public services, a process that accelerated significantly between 2020 and 2022 due to the COVID-19 pandemic. Tasks such as scheduling medical appointments, requesting duplicate copies of documents or renewing a driving licence can now be carried out from home, through socio-technical systems designed for these purposes. In several countries, however, a large part of the population is not confident in the digital environment, meaning that this interaction between citizens and public authorities does not occur in the manner planned. The use of AI techniques could mitigate this problem, and this is the challenge we intend to address on the day.
2. Needle-in-a-haystack (NiaH) tasks for classification and regression with James Proudfoot
Needle-in-a-haystack (NiaH) tasks in machine learning (ML) are problems that involve highly imbalanced datasets. Data imbalance refers to situations where class labels (integers) or y-values (floating points) show sharp spikes in their distributions. For example, in medical diagnosis, if 99% of patients do not have a certain disease and 1% are true positives, the training data is imbalanced (the ratio of true negatives to true positives is 99:1). Similarly, in drug discovery, the hit rate of active compounds in a chemical library (as measured by EC50 y-values, the half maximal effective concentrations) ranges from 1% to 0.1% and below. Imbalance has implications both for the evaluation of ML models (accuracy becomes an improper metric, because a model could achieve 99% accuracy by simply predicting negative for all cases) and for the training of models to predict rare events. One common method for handling imbalanced data is training-data re-balancing: randomly down-sampling the more common classes and upweighting the ML model loss for the down-sampled cases, usually by the amount of down-sampling performed (1). However, this method may not be applicable to regression (as opposed to classification) problems and has been criticised for its potential to lose useful information through down-sampling (2). We would like to investigate modifications to the down-sampling/upweighting procedure (for example, intelligent methods to selectively down-sample the larger classes) and applications of re-balancing to regression problems. More generally, we would also like to consider alternative methods for tackling NiaH tasks and the prediction of rare events in ML.
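To make the re-balancing recipe concrete, here is a minimal sketch in Python, assuming scikit-learn and synthetic data; the 99:1 ratio, the down-sampling factor, and the choice of classifier are illustrative assumptions, not part of the hosts' materials.

```python
# Minimal sketch of the down-sample-and-upweight recipe described above.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic imbalanced data: 9,900 negatives, 100 positives (99:1).
X_neg = rng.normal(0.0, 1.0, size=(9900, 5))
X_pos = rng.normal(1.0, 1.0, size=(100, 5))

# Randomly down-sample the majority class by a factor of 10.
keep = rng.choice(len(X_neg), size=990, replace=False)
X = np.vstack([X_neg[keep], X_pos])
y = np.concatenate([np.zeros(990), np.ones(100)])

# Upweight each retained negative by the down-sampling factor (10x),
# so the weighted loss approximates the loss on the full dataset.
weights = np.where(y == 0, 10.0, 1.0)

clf = LogisticRegression().fit(X, y, sample_weight=weights)
print("positive rate on training data:", clf.predict(X).mean())
```

The key point is the pairing: down-sampling alone would bias the model towards the rare class, while the compensating weights keep the loss an approximation of the loss on the full data.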
3. What are the potential future risks of generative AI chatbots for online safety? with Ofcom
As generative AI chatbots continue to evolve and become embedded in different areas of our daily lives, their potential risks to online safety become increasingly complex and multifaceted. Advancements in underlying technologies, such as more sophisticated natural language processing and machine learning algorithms, could lead to more autonomous, convincing, and pervasive fraud attempts, misinformation, or manipulation. Changes in business models, including the monetisation of AI interactions, might prioritise engagement and advertising interests over safety, potentially exposing users to harmful content. Future forms of user interaction and emerging use cases, such as personalised mental health support and use as a social companion, raise concerns about the ethical use of AI. Additionally, these risks may disproportionately affect specific populations, such as children, the elderly, or marginalised communities, who may be more vulnerable to exploitation or harm. Understanding these trends is crucial for developing robust safeguards and ensuring the safe and equitable deployment of generative AI chatbots.
4. Implications of Non-Determinism in Neural Network Optimisation with Dan Beechey and Sophia Jones
Neural networks are powerful but unpredictable, with small changes in factors like parameter initialisation or data pre-processing leading to significant variations in performance. A recent paper shows that many sources of non-determinism can produce similar levels of variability, even from seemingly trivial changes like altering a single neuron’s initialisation. These surprising findings challenge assumptions about neural network stability and raise critical questions about reproducibility and explainability. For this year’s AI Challenge Day, our table will reproduce the most important findings in the paper, to validate their claims and explore their broader implications.
Participants will work in small groups or individually to replicate the paper's experiments, breaking the work into subtasks managed using agile project management and a shared scrum board. Tasks will include reproducing key results, investigating additional sources of non-determinism, and discussing the impact on AI research. We ask that participants be familiar with Python and PyTorch and bring a laptop; we will contact them beforehand with setup instructions. The paper, which participants are encouraged to review before the event, can be found here: https://proceedings.mlr.press/v139/summers21a/summers21a.pdf. By the end of the session, we aim to verify the findings and generate new insights into managing non-determinism in training AI systems.
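To give a flavour of the kind of experiment involved, the toy sketch below trains two copies of the same small PyTorch network that differ only in a tiny perturbation to a single weight, then compares their final accuracies. The architecture, synthetic data, and perturbation size are illustrative assumptions, not the paper's actual setup.

```python
import torch
import torch.nn as nn

def train(perturb: bool) -> float:
    # An identical seed gives both runs the same data and initialisation.
    torch.manual_seed(0)
    X = torch.randn(512, 10)
    y = (X.sum(dim=1) > 0).long()  # simple synthetic labels
    model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 2))
    if perturb:
        with torch.no_grad():
            model[0].weight[0, 0] += 1e-6  # nudge one weight of one neuron
    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(200):  # full-batch gradient descent
        opt.zero_grad()
        loss_fn(model(X), y).backward()
        opt.step()
    return (model(X).argmax(dim=1) == y).float().mean().item()

# On real tasks, the paper reports that such tiny changes can produce
# run-to-run variability comparable to changing the random seed entirely.
print(train(perturb=False), train(perturb=True))
```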
5. No Standards, No Trust: Why We Need Generative AI to Learn to Follow Industry Standards with Joseph Marvin Imperial
Technical standards, or simply standards, are established documented rules, regulations, and guidelines that facilitate the interoperability, quality, and accuracy of systems and processes. In recent years, the adoption of generative AI models such as ChatGPT by OpenAI has increased tremendously, spreading implementation interests across standard-driven industries, including aerospace, security and defence, medicine, manufacturing, education, and language proficiency assessment.
In this table, we will discuss ideas towards the transparent and effective integration of standards into generative AI, to enhance management, oversight, and user trust while ensuring the production of high-quality, standard-compliant outputs. Considering the open-endedness of standards as living documents, we will also cover potential research ideas, challenges, and pitfalls in optimising generative AI models with respect to constraints defined by standards while maintaining extensibility to domain-specific customisations. The main position of this table is that technical standards created by domain experts will play a principal role in the quality compliance, trust, and oversight of larger, more powerful AI systems in the future.
We invite PhD students, staff, industry partners, and guests from interdisciplinary areas (social sciences, natural sciences, engineering, security and defence, education, etc.) who are familiar with, or have worked with, standards, norms, and guidelines from their respective domains. Participants will also be invited to contribute to a draft position paper, developing the position described above, which will be submitted to a major ML conference by mid-2025.
6. Going between quantitative analysis of natural language data and semantic validation of outputs from language models with Jack McKinlay
Generative language models like ChatGPT have given us the opportunity to automatically generate natural language data from different inputs and prompts. While methods of analysing text data are well-developed and established, up until now we have been able to reasonably assume the link between outputs and sources is valid, based on the outputs being created by human processes. Now that we can no longer rely on that assumption, how can we bridge the gap between using existing methods for analysing natural language outputs from generative models, and validating the meaningfulness and appropriateness of these outputs in a robust, automated fashion?
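As a starting point for discussion, the sketch below shows a deliberately naive automated check, assuming scikit-learn: scoring the overlap between a source and a generated output with TF-IDF cosine similarity and flagging low-overlap outputs for human review. The example texts and threshold are illustrative assumptions, and lexical overlap is only a weak proxy for the semantic validity we ultimately want to verify.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

source = "The committee approved the budget for the new library in March."
output = "In March, the committee signed off on funding for the new library."

# Represent both texts as TF-IDF vectors and compare them.
vectors = TfidfVectorizer().fit_transform([source, output])
score = cosine_similarity(vectors[0], vectors[1])[0, 0]

# Flag low-overlap outputs for human checking rather than trusting them.
print(f"similarity={score:.2f}", "OK" if score >= 0.3 else "NEEDS REVIEW")
```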
7. Accurate speech-to-text transcription in group conversations with Eoin Cremen
AI is being used in qualitative research to analyse dialogue transcripts. All of the leading LLMs are able to perform accurate and efficient content, thematic, and other discourse analyses without any additional training or fine-tuning. However, this performance relies on accurate speech-to-text transcription. Researchers have traditionally performed transcription manually. There are multiple AI tools available, both paid and free, that can deliver accurate speech-to-text services, but their accuracy declines rapidly when faced with the challenges of group dialogue: multiple voices, overlapping speech, and other features of group discussions typically result in compromised transcripts. A low-cost solution to this challenge would be of high value to many researchers.
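To illustrate the baseline and the gap, here is a minimal sketch using the open-source Whisper model (installable as openai-whisper); the audio file name is an illustrative assumption. Whisper produces timestamped text but no speaker labels, which is precisely where group conversations break down.

```python
import whisper

# Transcribe a recording with a small open-source model.
model = whisper.load_model("base")
result = model.transcribe("group_discussion.wav")

# Whisper returns timestamped segments but no speaker labels; a separate
# diarization step would be needed to attribute each segment to a voice.
for seg in result["segments"]:
    print(f"[{seg['start']:7.2f}-{seg['end']:7.2f}] {seg['text']}")
```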
8. Designing AI Feedback for Human Understanding and Performance with Joshua Tenn
Giving users feedback on an AI’s performance seems sensible, given overall preferences for human-in-the-loop designs and legislation on human oversight of AI. Yet growing psychological research shows we typically misunderstand such feedback in three ways: (a) we overly penalise AI for errors, (b) when a model has been performing well, our sensitivity to its errors is low, and (c) we have a poor understanding of the relative performance of ourselves and AI. On this table, we will aim to brainstorm and design ways of providing feedback that may improve human understanding and performance. This could be approached from several angles: for example, an explainable AI approach (what information we explain), a policy approach (what information we mandate), and a psychological approach (what sort of messaging we use).