Transformer Based Multimodal Summarization and Highlight Abstraction Approach for Texts and Speech Audios

Altundogan T.G.; Karakose M.; Tanberk S.

dc.contributor.author	Altundogan T.G.
dc.contributor.author	Karakose M.
dc.contributor.author	Tanberk S.
dc.date.accessioned	2024-07-22T08:01:57Z
dc.date.available	2024-07-22T08:01:57Z
dc.date.issued	2024
dc.description.abstract	Multimodal summarization is a kind of summarization in which the inputs and/or outputs can be of different data types, such as text, video, and audio. In this study, a new approach based on fine-tuning different pre-trained transformers was developed for abstractive and extractive summarization of audio and text data. In the proposed method, abstractive and extractive summaries of text data are provided as text only, while extractive summaries of audio data are presented as both text and audio. Abstractive summaries of audio data are presented as text only. Transformers with a text2text input-output relationship were used in both the extractive and abstractive summarization processes of the proposed method. So that audio data could be handled by these transformers during training and inference, an automatic speech recognition (ASR) step was performed before the summarization step. The experimental results are given in detail and compared with similar approaches in the literature. The comparison showed that the proposed method achieved better performance than similar prior approaches. © 2024 IEEE.
dc.identifier.DOI-ID	10.1109/IT61232.2024.10475775
dc.identifier.uri	http://akademikarsiv.cbu.edu.tr:4000/handle/123456789/11666
dc.language.iso	English
dc.publisher	Institute of Electrical and Electronics Engineers Inc.
dc.subject	Abstract data types
dc.subject	Abstracting
dc.subject	Audio data
dc.subject	Audio summarization
dc.subject	Datatypes
dc.subject	Fine tuning
dc.subject	Input-output
dc.subject	Multi-modal
dc.subject	Multimodal summarization
dc.subject	Speech audio
dc.subject	Text data
dc.subject	Transformer fine-tuning
dc.subject	Data mining
dc.title	Transformer Based Multimodal Summarization and Highlight Abstraction Approach for Texts and Speech Audios
dc.type	Conference paper
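
The abstract describes an ASR step followed by text-based summarization, with extractive summaries of audio returned as both text and audio. The sketch below is one possible reading of that flow, not the authors' implementation. It assumes the ASR step yields timestamped text segments (the `Segment` type is hypothetical), and it substitutes a simple word-frequency scorer for the fine-tuned text2text transformer so that the example runs without any model. Mapping the selected segments back to time spans is likewise an assumption about how an audio summary could be cut.

```python
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class Segment:
    """Hypothetical ASR output unit: transcribed text plus its time span in seconds."""
    text: str
    start: float
    end: float


def _tokens(text):
    # Lowercased word tokens; deliberately simple for illustration.
    return re.findall(r"[a-z']+", text.lower())


def score_segments(segments):
    """Score each segment by the mean corpus frequency of its words.

    Stand-in for a learned scorer; NOT the transformer used in the paper.
    """
    freq = Counter(t for s in segments for t in _tokens(s.text))
    scores = []
    for s in segments:
        toks = _tokens(s.text)
        scores.append(sum(freq[t] for t in toks) / len(toks) if toks else 0.0)
    return scores


def extractive_summary(segments, k):
    """Pick the k highest-scoring segments, kept in original order.

    Returns the summary text and the (start, end) spans that could be
    cut from the source audio to build an audio summary.
    """
    scores = score_segments(segments)
    top = sorted(range(len(segments)), key=lambda i: scores[i], reverse=True)[:k]
    chosen = [segments[i] for i in sorted(top)]
    text = " ".join(s.text for s in chosen)
    spans = [(s.start, s.end) for s in chosen]
    return text, spans
```

Calling `extractive_summary(segments, k)` on a list of transcribed segments gives the text summary together with the time spans to keep. For text-only input the same function applies with the spans ignored.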

Collections

Scopus Collection
