Contextual Summarization of Email Threads

Authors

  • Fozia Khan, Department of Computer Science, MAJU, Karachi, Pakistan
  • Muhammad Hussain Mughal, Department of Computer Science, Sukkur IBA University, Sukkur, Pakistan
  • Shaukat Wasi, Department of Computer Science, MAJU, Karachi, Pakistan

DOI:

https://doi.org/10.52584/QRJ.2302.04

Keywords:

Email Thread Summarization, Contextual Summary, Actions, Natural Language Processing, Machine Learning, Extractive Summarization, Abstractive Summarization, Contexts, Clustering

Abstract

Email has become the primary medium for official communication, enabling the exchange of text, files, and attachments among recipients. A single email thread often carries extensive information shared among multiple participants across diverse topics, resulting in long and complex discussions. Effective summarization of such threads must produce concise, precise summaries without omitting critical details. This study presents a novel methodology for generating contextual summaries of email threads. The proposed framework begins with the construction of a custom dataset of university event-related email threads (2 < thread length < 6), annotated for summarization tasks. Summarization then follows a three-step approach: (1) clustering semantically similar sentences using K-Means and Agglomerative Hierarchical Clustering, (2) extracting contextual information from the clusters using Latent Dirichlet Allocation (LDA) and Key Phrase Extraction, and (3) generating an abstractive summary for each contextual cluster using pre-trained transformer models (BART and T5). The approach was evaluated at each stage using standard automatic evaluation metrics, demonstrating its effectiveness in identifying and condensing essential information from lengthy threads. The results highlight the potential of this methodology to streamline email thread summarization. Future work will explore advanced techniques and models to further enhance the quality and applicability of contextual summarization.
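
As a rough illustration of steps (1) and (2) of the approach described in the abstract, the sketch below groups the sentences of a thread with average-linkage agglomerative clustering and then lists the most frequent terms of each cluster as a crude context label. It is a standard-library stand-in, not the authors' implementation: the bag-of-words vectors, the cosine similarity measure, the stopword list, the function names, and the example sentences are all assumptions made for illustration, and simple term counting replaces LDA and Key Phrase Extraction. Step (3), abstractive summarization with BART or T5, is not shown.

```python
import math
import re
from collections import Counter

# Hypothetical minimal stopword list; a real system would use a fuller one.
STOPWORDS = {"the", "a", "an", "is", "at", "in", "on", "of", "to", "and",
             "for", "will", "be", "please", "by", "we", "your", "this", "are"}


def tokenize(sentence):
    """Lowercase a sentence and keep alphabetic tokens that are not stopwords."""
    return [w for w in re.findall(r"[a-z]+", sentence.lower())
            if w not in STOPWORDS]


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def agglomerative(sentences, k):
    """Merge the most similar pair of clusters (average linkage) until k remain.

    Returns a list of clusters, each a list of sentence indices.
    """
    vectors = [Counter(tokenize(s)) for s in sentences]
    clusters = [[i] for i in range(len(sentences))]
    while len(clusters) > k:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                sims = [cosine(vectors[i], vectors[j])
                        for i in clusters[x] for j in clusters[y]]
                score = sum(sims) / len(sims)
                if best is None or score > best[0]:
                    best = (score, x, y)
        _, x, y = best
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters


def key_terms(sentences, cluster, n=3):
    """Most frequent non-stopword terms in a cluster (stand-in for LDA/key phrases)."""
    counts = Counter()
    for i in cluster:
        counts.update(tokenize(sentences[i]))
    return [term for term, _ in counts.most_common(n)]


if __name__ == "__main__":
    # Hypothetical sentences from a university event thread.
    thread_sentences = [
        "The seminar venue is the main auditorium.",
        "Registration closes on Friday.",
        "The auditorium seats 200 people for the seminar.",
        "Please complete registration before Friday.",
    ]
    for cluster in agglomerative(thread_sentences, k=2):
        print(key_terms(thread_sentences, cluster, n=2), cluster)
```

In a fuller pipeline, each cluster's sentences would then be passed to a pre-trained summarization model to produce one short summary per context.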

Author Biography

  • Shaukat Wasi, Department of Computer Science, MAJU, Karachi, Pakistan

    Dr. Shaukat Wasi received his PhD (CS) and MS (CS) degrees from FAST-National University of Computer and Emerging Sciences (NUCES) in 2015 and 2006, respectively. He completed his Bachelor's degree in Computer Science at the University of Karachi in 2003. He started his teaching career at FAST-NUCES in 2005 and then joined DHA Suffa University (DSU) as founding teaching faculty. He is currently an Associate Professor at Mohammad Ali Jinnah University (MAJU), where he leads the Faculty of Computing as Associate Dean. His research interests include HCI, IR, IE, and text classification and mining. He also leads the research group “Intelligent and Interactive Natural Language Processing (IINLP)”, initiated at the Faculty of Computing, Mohammad Ali Jinnah University. Several projects are in progress at IINLP, mainly addressing problems in multilingual text and document processing with a focus on human factors.

Published

2026-04-14