Generating empathetic responses through emotion tracking and constraint guidance
Graphical abstract. Credit: Frontiers of Computer Science (2023). DOI: 10.1007/s11704-023-2792-7

Enabling machines to communicate like humans is a long-term goal of open-domain dialogue generation. To that end, a growing number of studies on dialogue generation focus on a key factor: emotion. An empathetic dialogue system aims to recognize the user's emotion and situation and then generate responses accordingly.

Such an empathetic dialogue system can improve the user's experience and establish long-term human-machine interaction. However, existing empathetic dialogue generation models ignore the continuity of the parties' emotions across adjacent dialogue turns, resulting in inadequate emotional perception. In addition, the emotions involved in an empathetic response are flexible, which makes it difficult to set a specific empathetic policy.

To address these issues, a research team led by Donghong Han published their research on 15 April 2024 in Frontiers of Computer Science.

The team proposed a novel empathetic dialogue generation model, ETHREED, which relies on hierarchical GRUs (gated recurrent units) to extract and track the emotional representations of both parties in a dialogue separately. Additionally, the model predicts the emotional representation of the response using a stochastic policy network and guided policy search. The experimental results show that the model's responses have better diversity, empathy and relevance.

Within a dialogue, the parties' emotions tend to be continuous, or to shift toward positive or negative depending on context. To model this continuous process for each party, ETHREED uses four GRUs to obtain the global state, party state, emotional representation, and content representation of the dialogue.

The global GRU tracks all utterance representations to obtain context information. The party GRU models the interaction between the parties. The emotion GRU tracks each party's emotions separately. The content GRU extracts the dialogue content representation and mitigates emotion perception errors.
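
The division of labor among the four GRUs can be illustrated with a small sketch. It is not the ETHREED implementation: the bias-free `TinyGRUCell`, the `track_dialogue` helper, the vector sizes, the random weights, and in particular the wiring between the GRUs (which state feeds which) are assumptions made purely for illustration.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

class TinyGRUCell:
    """Bias-free GRU cell with small random weights (illustrative only)."""

    def __init__(self, input_size, hidden_size, rng):
        width = input_size + hidden_size
        def matrix():
            return [[rng.uniform(-0.5, 0.5) for _ in range(width)]
                    for _ in range(hidden_size)]
        self.w_update, self.w_reset, self.w_cand = matrix(), matrix(), matrix()

    def step(self, x, h):
        xh = x + h  # concatenate input and previous hidden state
        z = [sigmoid(dot(row, xh)) for row in self.w_update]   # update gate
        r = [sigmoid(dot(row, xh)) for row in self.w_reset]    # reset gate
        x_rh = x + [ri * hi for ri, hi in zip(r, h)]
        cand = [math.tanh(dot(row, x_rh)) for row in self.w_cand]
        return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]

def track_dialogue(utterances, dim=4, seed=0):
    """utterances: list of (party, vector) pairs, each vector of length dim."""
    rng = random.Random(seed)
    global_gru = TinyGRUCell(dim, dim, rng)       # input: utterance
    party_gru = TinyGRUCell(2 * dim, dim, rng)    # assumed input: utterance + global state
    emotion_gru = TinyGRUCell(dim, dim, rng)      # assumed input: party state
    content_gru = TinyGRUCell(dim, dim, rng)      # input: utterance
    zeros = [0.0] * dim
    global_state, content_state = zeros, zeros
    party_state, emotion_state = {}, {}           # one state per party
    for party, vec in utterances:
        global_state = global_gru.step(vec, global_state)
        party_state[party] = party_gru.step(
            vec + global_state, party_state.get(party, zeros))
        emotion_state[party] = emotion_gru.step(
            party_state[party], emotion_state.get(party, zeros))
        content_state = content_gru.step(vec, content_state)
    return global_state, party_state, emotion_state, content_state

# Toy three-turn dialogue with made-up utterance vectors.
dialogue = [
    ("speaker", [0.9, 0.1, 0.0, 0.2]),
    ("listener", [0.1, 0.8, 0.3, 0.0]),
    ("speaker", [0.7, 0.2, 0.1, 0.4]),
]
global_state, party_state, emotion_state, content_state = track_dialogue(dialogue)
```

The point of the sketch is that the global and content states are shared across the whole dialogue, while the party and emotion states are kept per participant, so each party's emotional trajectory carries over from its own previous turn.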

Additionally, empathy can be seen as the transfer of emotion between the parties, so the researchers define the empathetic policy as the process of predicting the listener's emotional state from the speaker's emotional state. A stochastic policy network is used to model this process, and the emotion distribution of the listener's true response serves as a constraint to guide the policy search.
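
As a rough illustration of this idea, the sketch below maps a speaker emotion vector to a probability distribution over listener emotions, samples from it, and measures how far the prediction is from a reference distribution. The emotion labels, the single linear layer, the random weights, and the use of KL divergence as the guiding signal are all assumptions; the paper's actual network and constraint may differ.

```python
import math
import random

EMOTIONS = ["joy", "sadness", "anger", "caring"]  # hypothetical label set

def softmax(logits):
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def policy_distribution(speaker_emotion, weights):
    """Linear layer plus softmax: P(listener emotion | speaker emotion state)."""
    logits = [sum(w * s for w, s in zip(row, speaker_emotion)) for row in weights]
    return softmax(logits)

def sample_emotion(probs, rng):
    """Stochastic policy: draw one listener emotion instead of taking the argmax."""
    return rng.choices(EMOTIONS, weights=probs, k=1)[0]

def kl_divergence(target, predicted, eps=1e-12):
    """KL(target || predicted), used here as an assumed guidance penalty."""
    return sum(t * math.log((t + eps) / (p + eps))
               for t, p in zip(target, predicted) if t > 0)

rng = random.Random(0)
weights = [[rng.uniform(-1, 1) for _ in range(3)] for _ in EMOTIONS]
speaker_emotion = [0.6, -0.2, 0.1]          # made-up speaker emotion state
probs = policy_distribution(speaker_emotion, weights)
sampled = sample_emotion(probs, rng)
true_dist = [0.1, 0.1, 0.0, 0.8]            # made-up reference distribution
penalty = kl_divergence(true_dist, probs)
```

During training, a penalty of this kind would push the policy's distribution toward the emotions actually expressed in the reference response, while sampling keeps the choice of response emotion flexible.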

Lastly, a pointer generation network dynamically incorporates the predicted emotional representation of the listener and the context information when decoding the response.
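
A common way to build such a decoder is the standard pointer-generator mixture, in which a gate decides at each step how much probability goes to generating a vocabulary word versus copying a word from the dialogue context. The sketch below assumes that formulation; the `generation_gate` function, its inputs, and all numbers are hypothetical and are not taken from the paper.

```python
import math

def generation_gate(decoder_state, context, emotion, weights):
    """Assumed gate: sigmoid of a linear function of decoder state,
    context vector, and predicted listener emotion vector."""
    features = decoder_state + context + emotion
    score = sum(w * f for w, f in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-score))

def final_distribution(p_gen, vocab_probs, attention, source_tokens):
    """Mix generation and copy distributions.

    vocab_probs: dict mapping token -> generation probability.
    attention: copy weights over source_tokens (same length, sums to 1).
    """
    mixed = {tok: p_gen * p for tok, p in vocab_probs.items()}
    for tok, weight in zip(source_tokens, attention):
        mixed[tok] = mixed.get(tok, 0.0) + (1.0 - p_gen) * weight
    return mixed

# Made-up values for a single decoding step.
p_gen = generation_gate(
    decoder_state=[0.2, -0.1],
    context=[0.5, 0.3],
    emotion=[0.4, 0.9],
    weights=[0.5, 0.5, 0.5, 0.5, 0.5, 0.5],
)
vocab_probs = {"i": 0.2, "am": 0.2, "sorry": 0.5, "happy": 0.1}
source_tokens = ["my", "dog", "died"]
attention = [0.1, 0.3, 0.6]
mixed = final_distribution(p_gen, vocab_probs, attention, source_tokens)
```

Because the copy term adds probability to context words that are missing from the vocabulary distribution, a word such as "died" can still appear in the response.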

Future work could introduce dialogue behavior to guide response generation and explore more reasonable evaluation metrics.

More information: Jing Li et al, Generating empathetic responses through emotion tracking and constraint guidance, Frontiers of Computer Science (2023). DOI: 10.1007/s11704-023-2792-7

Provided by Frontiers Journals

Citation: Generating empathetic machine responses through emotion tracking and constraint guidance (2024, May 16) retrieved 16 May 2024 from https://techxplore.com/news/2024-05-generating-empathetic-machine-responses-emotion.html
