Application: enriching and adding structure to audiovisual data
Speaker diarization, also called speaker segmentation and clustering, is the
process of partitioning an input audio stream into homogeneous segments
according to speaker identity. It can improve the readability of an
automatic transcription by structuring the audio stream into speaker
turns.
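
As an illustration of structuring a stream into speaker turns, the following is a minimal sketch, not the system described here. It assumes the diarization output is already available as (speaker label, start, end) tuples in seconds; the function name `to_speaker_turns` and the `max_gap` threshold are hypothetical choices made for the example.

```python
from typing import List, Tuple

# Assumed representation: (speaker label, start time in s, end time in s).
Segment = Tuple[str, float, float]


def to_speaker_turns(segments: List[Segment], max_gap: float = 0.5) -> List[Segment]:
    """Merge consecutive segments of the same speaker into speaker turns.

    Two segments are merged when they carry the same label and the pause
    between them is at most `max_gap` seconds (an illustrative threshold).
    """
    turns: List[Segment] = []
    # Process segments in chronological order of their start time.
    for speaker, start, end in sorted(segments, key=lambda seg: seg[1]):
        if turns and turns[-1][0] == speaker and start - turns[-1][2] <= max_gap:
            # Same speaker resumes after a short pause: extend the current turn.
            _, prev_start, prev_end = turns[-1]
            turns[-1] = (speaker, prev_start, max(prev_end, end))
        else:
            # Speaker change (or a long pause): open a new turn.
            turns.append((speaker, start, end))
    return turns


if __name__ == "__main__":
    example = [("A", 0.0, 2.0), ("A", 2.1, 4.0), ("B", 4.2, 6.0), ("A", 6.5, 7.0)]
    for speaker, start, end in to_speaker_turns(example):
        print(f"{speaker}: {start:.1f}s to {end:.1f}s")
```

Each resulting turn can then be attached to the corresponding stretch of transcript, so that the text is broken at speaker changes instead of appearing as one undivided block.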
One specific use of speaker diarization is as a 'Who's Who' for audio
documents: it provides a means of knowing 'who spoke when'. This technology
was applied, as an aid to human operators, to the task of determining the
speaking time of political speakers during the last presidential election
period in France.
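
Once 'who spoke when' is known, speaking time per speaker follows by summing segment durations. The sketch below is only an illustration under stated assumptions: segments are (speaker label, start, end) tuples in seconds and do not overlap for a given speaker. The function name `speaking_time` and the labels in the example are hypothetical and do not come from the application described above.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Assumed representation: (speaker label, start time in s, end time in s).
Segment = Tuple[str, float, float]


def speaking_time(segments: Iterable[Segment]) -> Dict[str, float]:
    """Return the total speaking time in seconds for each speaker label."""
    totals: Dict[str, float] = defaultdict(float)
    for speaker, start, end in segments:
        if end < start:
            raise ValueError(f"segment ends before it starts: {(speaker, start, end)}")
        # Accumulate the duration of this segment for its speaker.
        totals[speaker] += end - start
    return dict(totals)


if __name__ == "__main__":
    example = [("spk1", 0.0, 10.0), ("spk2", 10.0, 25.0), ("spk1", 25.0, 30.0)]
    for speaker, seconds in sorted(speaking_time(example).items()):
        print(f"{speaker}: {seconds:.1f} s")
```

In a setting where the output assists human operators, such totals would serve as a first estimate to be checked rather than as a final count.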