News Summarization AI
Deep learning news summarizer built on the PRIMERA and LED architectures
Overview
A production-grade news summarization system that leverages state-of-the-art transformer architectures to generate concise, accurate summaries of Indian news articles. The system combines PRIMERA (Pyramid-based Masked Sentence Pre-training for Multi-document Summarization) and LED (Longformer Encoder-Decoder) models to handle long-form news content effectively.
The project demonstrates end-to-end deep learning pipeline design — from data preprocessing and model fine-tuning to containerized deployment with a RESTful API interface.
Tech Stack
- Python deep learning stack with transformer-based summarization models (PRIMERA, LED)
- FastAPI for the async REST serving layer
- Docker for containerized deployment
- ROUGE-1/2/L and BERTScore for evaluation
Architecture
The system follows a modular architecture with clear separation between data processing, model inference, and API serving layers.
Model Architecture
The project evaluates two long-document transformer architectures optimized for summarization tasks:
- PRIMERA — Built on the Longformer architecture with pyramid-based masked-sentence pre-training, enabling efficient processing of documents up to 4,096 tokens. Assigns global attention to special document-separator tokens for multi-document awareness.
- LED (Longformer Encoder-Decoder) — Extends the Longformer's local+global attention mechanism to an encoder-decoder framework. Handles long input sequences with linear complexity through windowed local attention combined with task-motivated global attention.
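To give a rough sense of why windowed attention matters at these input lengths, the number of attention scores under full self-attention versus Longformer-style local+global attention can be compared directly. The helper below is illustrative rather than taken from the project, and it ignores double-counting where the local and global sets overlap:

```python
def attention_pairs(n: int, window: int, n_global: int = 0) -> dict:
    """Count attention scores for full vs. Longformer-style sparse attention.

    Full self-attention scales as n^2; windowed local attention scales as
    n * window, plus a linear term for n_global globally attending tokens.
    Overlap between the local and global sets is ignored (rough estimate).
    """
    full = n * n
    local = n * window
    global_term = 2 * n_global * n  # global tokens attend to all, and vice versa
    return {"full": full, "longformer": local + global_term}

# At the project's 4,096-token input length, with an illustrative
# 512-token local window and 64 global tokens:
counts = attention_pairs(n=4096, window=512, n_global=64)
print(counts["full"] / counts["longformer"])  # full attention costs 6.4x more
```

The gap widens linearly with sequence length, which is what makes 4,096-token inputs tractable where quadratic attention would not be.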
Evaluation Results
The models were evaluated using industry-standard summarization metrics on Indian news datasets:
- ROUGE-1: 71.43% — Measures unigram overlap between generated and reference summaries. This score indicates strong lexical alignment with human-written summaries.
- BERTScore: 0.93 — Uses contextual BERT embeddings to measure semantic similarity. A score of 0.93 demonstrates that generated summaries capture the meaning of the source text with high fidelity.
- ROUGE-2 & ROUGE-L — Additional bigram and longest common subsequence metrics were used to validate consistency across different evaluation dimensions.
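For intuition on what the ROUGE-1 figure measures, unigram-overlap F1 can be computed in a few lines. This is a simplified sketch; production evaluation would use a library such as rouge-score, which adds proper tokenization and optional stemming:

```python
from collections import Counter

def rouge1_f1(candidate: str, reference: str) -> float:
    """ROUGE-1 F1: harmonic mean of unigram precision and recall."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

print(round(rouge1_f1("the cabinet approved the new policy",
                      "cabinet approves new policy today"), 4))  # 0.5455
```

Note how surface-level the metric is: "approved" and "approves" do not match, which is exactly the gap BERTScore's contextual embeddings are meant to close.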
Deployment
The system is fully containerized using Docker for consistent deployment across environments:
- FastAPI — Async REST API with automatic OpenAPI documentation. Endpoints accept raw news text and return structured JSON summaries.
- Docker — Multi-stage Dockerfile with optimized image size. Model weights are loaded at container startup for fast inference.
- Batch Processing — Supports both single-article and batch summarization through the API interface.
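The single-article vs. batch dispatch described above can be sketched as follows. Note that `summarize` here is a placeholder standing in for the fine-tuned model call, not the project's actual inference code:

```python
from typing import List, Union

def summarize(text: str) -> str:
    # Placeholder for the fine-tuned PRIMERA/LED inference call;
    # returns the first sentence so the interface can be exercised.
    first = text.split(". ")[0].rstrip(".")
    return first + "."

def summarize_endpoint(payload: Union[str, List[str]]) -> Union[str, List[str]]:
    """Route a single article or a batch of articles to the summarizer."""
    if isinstance(payload, list):
        return [summarize(t) for t in payload]
    return summarize(payload)

print(summarize_endpoint("Rains hit Mumbai. Schools were shut."))
# prints "Rains hit Mumbai."
```

Accepting either shape at one endpoint keeps the client contract simple while letting the server batch inputs for more efficient GPU inference.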
```bash
# Build and run with Docker
docker build -t news-summarizer .
docker run -p 8000:8000 news-summarizer
```

```bash
# API usage
curl -X POST http://localhost:8000/summarize \
  -H "Content-Type: application/json" \
  -d '{"text": "Your news article text here..."}'
```
Key Features
- Handles long-form news articles (up to 4,096 tokens), eight times the 512-token limit of standard transformer models
- Dual-model evaluation pipeline for comparing PRIMERA vs LED performance
- Comprehensive evaluation using ROUGE-1/2/L and BERTScore metrics
- Production-ready Docker deployment with FastAPI REST endpoints
- Focused on Indian English news for domain-specific summarization quality