Bournemouth University

PhD Research Project Outline

Kavisha Jayathunge


Emotionally Expressive Speech Synthesis

Supervisor: Dr Richard Southern

Emotionally expressive speech synthesis for a multimodal virtual avatar.

Virtual conversation partners powered by Artificial Intelligence are ubiquitous in today’s world, from personal assistants in smartphones to customer-facing chatbots on retail and utility service helplines. Currently, these are typically limited to conversing through text, and where there is speech output, it tends to be monotone and (funnily enough) robotic. The long-term aim of this research project is to design a virtual avatar that picks up information about a human speaker from multiple sources (i.e. audio, video and text) and uses this information to simulate a realistic conversation partner. For example, it could determine the emotional state of the person speaking to it by examining their face and vocal cadence. We expect that taking such information into account when generating a response would make for a more pleasant conversational experience, particularly when a human needs to speak to a machine about a sensitive matter. The virtual avatar will also be able to speak out loud and project an image of itself onto a screen. Using context cues from the human speaker, the avatar will modulate its voice and facial expressions in ways that are appropriate to the conversation at hand.

The project is a group effort and I'm working with several other CDE researchers to realise this goal. I'm specifically interested in the speech synthesis aspect of the project, and how existing methods could be improved to generate speech that is more emotionally textured.

Background: MEng in Electronics and Software Engineering from the University of Glasgow

Ruibin Wang


Intelligent Dialogue System for Automatic Diagnosis

Supervisor: Dr Xiaosong Yang

The automatic diagnosis of diseases has drawn increasing attention from both research communities and the health industry in recent years. Because the conversation between a patient and a doctor can provide many valuable clues for diagnosis, dialogue systems are a natural fit for simulating the consultation process between doctors and patients. Existing dialogue-based diagnosis systems are mainly data-driven and rely heavily on statistical features extracted from large amounts of data, which are often not available. Previous work has shown that incorporating a medical knowledge graph into a diagnosis prediction system effectively improves the model’s prediction performance and its robustness against data insufficiency and noise.

The aim of my project is to develop a new dialogue-based diagnosis system that can not only communicate efficiently with patients to obtain symptom information, but also be guided by medical knowledge to make accurate diagnoses more efficiently.
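As a toy illustration of how a hand-made disease–symptom knowledge graph could guide both diagnosis and the choice of which symptom to ask about next, the sketch below scores candidate diseases against confirmed and denied symptoms. The graph, scoring rule and function names are all invented for illustration; they are not part of the proposed system.

```python
# Toy disease-symptom "knowledge graph": each disease maps to its symptoms.
TOY_GRAPH = {
    "common cold": {"cough", "sneezing", "sore throat"},
    "influenza":   {"cough", "fever", "muscle ache"},
    "allergy":     {"sneezing", "itchy eyes"},
}

def rank_diseases(confirmed, denied):
    """Score each disease by its confirmed symptoms, penalising denied ones."""
    scores = {}
    for disease, symptoms in TOY_GRAPH.items():
        score = len(symptoms & confirmed) - len(symptoms & denied)
        scores[disease] = score / len(symptoms)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

def next_symptom_to_ask(confirmed, denied):
    """Pick the unasked symptom shared by the most plausible candidates."""
    asked = confirmed | denied
    candidates = [d for d, s in rank_diseases(confirmed, denied) if s > 0]
    counts = {}
    for disease in candidates:
        for symptom in TOY_GRAPH[disease] - asked:
            counts[symptom] = counts.get(symptom, 0) + 1
    return max(counts, key=counts.get) if counts else None

# With cough and fever confirmed and sneezing denied, influenza ranks first,
# so the system would ask about muscle ache next.
ranking = rank_diseases({"cough", "fever"}, {"sneezing"})
```

In a real system the graph would come from curated medical ontologies and the question-selection policy would be learned, but the loop structure (rank, ask, update) is the same.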

Research Interests: natural language processing, chatbots, task-oriented dialogue, deep learning algorithms


BSc, Applied Mathematics, Southwest Jiaotong University

MSc, Vehicle Operation Engineering, Southwest Jiaotong University


Jiajun Huang



Photorealistic and Interactive Audio-Driven Talking Head Video Synthesis

Supervisor: Dr Hongchuan Yu

With the advent of conversational AI technologies such as Google Assistant or Amazon Alexa, having conversations with AI is becoming a ubiquitous way of interacting with machines. To make the experience of talking to machines more engaging and enjoyable, visual information, such as the talking animation of a virtual avatar, is indispensable. However, many existing methods generate unsatisfactory results, and those that can produce convincing output give little consideration to interactive scenarios: such models usually require significant computational resources to run and cannot generate other kinds of animation, such as listening or thinking animations.

The goal of my project is to devise new methods for synthesizing talking head videos that surpass existing models in quality and are capable of generating more versatile animations for interactive use cases.


BSc in Network Engineering from South China Normal University




University of Bath

Will Kerr


Supervisor: Dr Wenbin Li


The PhD topic is Autonomous Filming Systems (AFS) for professional cinematography. Its focus is on developing a camera-equipped, ground-based (wheeled) robot, intended to provide better accuracy, safety, efficiency and artistic control than the current manual method of filming on set (e.g. three people: one moving a camera ‘dolly’, one controlling the camera, and one holding cables).

Current research

Examples of autonomous filming already exist, mostly concentrating on UAV (drone) platforms. These have been realised by commercial companies (DJI, Skydio, Yuneec, etc.) and by the research community (e.g. [1], [2], [3], [4]). They range from simple waypoint-based trajectory planning to partial integration of artistic cinematography principles into camera-pose decisions. Most systems integrate localisation, mapping and visual processing to aid trajectory planning.

Research Plans

This research will advance the above state of the art by focussing on the more nuanced artistic aspects of how professional video is captured. Firstly, it will analyse and distil existing movie content to understand professional techniques using computer vision (the rule of thirds is the most basic example, but the developed algorithms are expected to uncover more). Investigations will cover computer vision analysis (e.g. colour, focus, movement and framing) and the emotional impact that various cinematic techniques have on viewers. Secondly, this learnt behaviour will be applied to physical and simulated ground robots in a representative on-set filming environment, attempting to automate the decision-making process by which artful camera movement is realised. Thirdly, evaluation and iteration will refine performance in conjunction with industrial collaboration, ideally in a real-life filming task.
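As a concrete illustration of the rule of thirds mentioned above, the sketch below scores how closely a subject sits to one of the four thirds intersections ("power points") of the frame. The scoring formula and its normalisation are illustrative assumptions, not the project's algorithm.

```python
def rule_of_thirds_score(subject_xy, frame_wh):
    """Return a score in [0, 1]: 1.0 when the subject sits exactly on one
    of the four rule-of-thirds intersections, falling off with distance."""
    (x, y), (w, h) = subject_xy, frame_wh
    # The four "power points" where the thirds gridlines cross.
    points = [(w * i / 3, h * j / 3) for i in (1, 2) for j in (1, 2)]
    # Distance to the nearest power point.
    d = min(((x - px) ** 2 + (y - py) ** 2) ** 0.5 for px, py in points)
    # Normalise by a third of the frame diagonal (an arbitrary choice here).
    diagonal = (w ** 2 + h ** 2) ** 0.5
    return max(0.0, 1.0 - d / (diagonal / 3))

# A subject on the right-hand upper power point of a 1920x1080 frame
# scores 1.0; a dead-centre subject scores lower.
on_point = rule_of_thirds_score((1280, 360), (1920, 1080))
centred = rule_of_thirds_score((960, 540), (1920, 1080))
```

A real composition analyser would combine many such scores (framing, colour, focus, movement) over detected subjects rather than a single point.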

Manuel Rey Area


Deep View Synthesis For VR Video

Supervisor: Dr Christian Richardt

With the rise of VR, it is key for users to be fully immersed in the virtual world. Users must be able to move their heads freely around the virtual scene, unveiling occluded surfaces, perceiving depth cues and observing the scene down to the last detail. Furthermore, if scenes are captured with casual devices (e.g. a smartphone), anyone could convert their 2D pictures into a fully immersive 3D experience, bringing a new digital representation of the world closer to ordinary users. The aim of this project is to synthesize novel views of a scene from a set of input views captured by the user. Eventually, the scene's full 3D geometry must be reconstructed and depth cues preserved to allow 6-DoF (degrees of freedom) head motion while avoiding the well-known VR sickness. The main challenge lies in generating synthetic views with a high level of detail, including light reflections, shadows and occlusions, resembling reality as closely as possible.
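A minimal sketch of the geometry underlying depth-based view synthesis: a pixel with known depth is unprojected through a pinhole camera model and reprojected into a nearby target view. The intrinsics, pose and function names are made up for illustration and stand in for the far richer scene representations this project will explore.

```python
import numpy as np

def reproject(pixel, depth, K, R, t):
    """Map a source pixel with known depth into a target view.

    K: shared 3x3 pinhole intrinsics; R, t: rotation and translation
    taking source-camera coordinates to target-camera coordinates."""
    u, v = pixel
    # Unproject: pixel + depth -> 3D point in the source camera frame.
    p_src = depth * np.linalg.inv(K) @ np.array([u, v, 1.0])
    # Rigid transform into the target camera frame.
    p_tgt = R @ p_src + t
    # Project back to pixel coordinates in the target view.
    uvw = K @ p_tgt
    return uvw[:2] / uvw[2]

# Made-up intrinsics: focal length 500 px, principal point (320, 240).
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
# A small sideways head movement: pure translation of 0.1 units.
new_uv = reproject((320, 240), 2.0, K, np.eye(3), np.array([0.1, 0.0, 0.0]))
```

Warping every pixel this way exposes exactly the project's challenges: disoccluded regions have no source pixels, and view-dependent effects such as reflections do not survive a rigid reprojection.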


MSc Computer Vision, Autonomous University of Barcelona

BSc Telecommunications Engineering, University of Vigo

Mesar Hameed


Research Project: Immersive Technology to Enable Travel and Transport for the Visually Impaired

Supervisor: Prof Peter Hall 



As part of the European Commission's Marie Skłodowska-Curie Research Fellowship Programme for Doctoral degrees, the Centre for Digital Entertainment at the University of Bath supports 10 Fellows with Industrial Research Enhancement (FIRE).

The FIRE programme is an integrated 4-year Doctoral Training programme, bringing together two national Centres for Doctoral Training: the Centre for Sustainable Chemical Technologies (CSCT) and the Centre for Digital Entertainment (CDE). The Marie Skłodowska-Curie Actions (MSCA) FIRE programme is delivering autonomous, creative, highly skilled scientists and engineers fully ready for careers in international industries, and providing a model for intersectoral, interdisciplinary doctoral training in an international environment. Industrial partners ensure that the research carried out is relevant and enhances the employability of graduates, both in Europe and globally.

Our Fellows receive training in scientific, societal and business aspects of digital entertainment and conduct challenging PhD research. All projects are interdisciplinary and supported by industrial or international partners.

The positions are based at the University of Bath and require some continuing involvement with an appropriate company.

This project receives funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No 665992.

Our current MSCA FIRE Fellows are:

FIRE Fellow Research Project Outlines

Tobias Bertel


Light field synthesis from existing imagery

Image-based rendering of real environments for virtual reality with Dr Christian Richardt, Dr Neill Campbell and Professor Darren Cosker; Industrial Partner: Technical University of Brunswick.

Thu Nguyen Phuoc 


Interactive Fabrication-aware Architectural Modelling

Neural rendering and inverse rendering using physical inductive biases with Dr Yongliang Yang and Professor Eamonn O’Neill; Industrial Partner: Lambda Labs.


Yassir Saquil

Machine learning for semantic-level data generation and exploration with Dr Yongliang Yang and Professor Peter Hall.

Soumya C Barathi

Interactive Feedforward in High Intensity VR Exergaming

Adaptive Exergaming with Dr Christof Lutteroth and Professor Eamonn O’Neill; Industrial Partner: eLearning Studios.

Project completed and fellow graduated.


Youssef Alami Mejjati


Multitask Learning for Heterogeneous Data

Creative editing and synthesis of objects in photographs using generative adversarial networks with Professor Darren Cosker and Dr Wenbin Li.

Jan Malte Lichtenberg


Bounded rationality in machine learning with Professor Özgür Şimşek; Industrial Partner: Max Planck Institute.

Maryam Naghizadeh 

A New Method for Human/Animal Retargeting using Machine Learning

Multi-Character Motion Retargeting for Large Scale Changes with Professor Darren Cosker and Dr Neill Campbell.

Andrew Lawrence 

Learning 3D Models of Deformable/Non-Rigid Bodies

Using Bayesian Non-Parametrics to Learn Multivariate Dependency Structures with Dr Neill Campbell and Professor Darren Cosker.

Tayfun Esenkaya 

Spatially Enhancing Sensory Substitution Devices and Virtual Reality Experiences

One Is All, All Is One: Cross-Modal Displays for Inclusive Design and Technology with Dr Michael Proulx and Professor Eamonn O’Neill; Industrial Partner: Atkins.

Project completed and fellow graduated.

© Centre for Digital Entertainment 2021.