The thesis topic addresses two questions: 1) How can we design a listening system/device that detects the (dimensional) emotional state of the last utterance a user speaks to a voice assistant, using only speech audio and physiological signals? 2) Given the interaction context (e.g., home), how does audio directional information, which is prone to distortion, contribute to recognition performance?
MSc thesis: Few-shot emotion recognition using intelligent voice assistants and wearables
Advisor(s): Alle-Jan van der Veen, Abdallah El Ali (CWI)
Program: MSc Signals and Systems