International Journal For Multidisciplinary Research

E-ISSN: 2582-2160     Impact Factor: 9.24

A Widely Indexed Open Access Peer Reviewed Multidisciplinary Bi-monthly Scholarly International Journal

Content Based Video Retrieval System using ResNet-50

Author(s) Prof. Kruthika cg, Asim Aryal, Mokshith B C, Rajeshwar Chaubey, Sathvika J
Country India
Abstract With the rapid growth of user-generated content and digital media, there is an increasing need for intelligent systems that can understand and retrieve relevant video data from both visual and textual queries. Traditional content-based video retrieval methods are often limited in handling complex semantic relationships and user intent. In this work, we propose a hybrid multimodal video retrieval framework that uses deep learning to bridge this gap. Our system combines visual features extracted by a fine-tuned ResNet-50 model with semantic text embeddings derived from large language models such as GPT-4 to enable more meaningful video classification and retrieval. To prepare the data, videos are preprocessed by extracting keyframes at regular intervals, resizing and normalizing them for uniform input, and storing them as tensors for efficient access. The classification model is trained and evaluated on the AID (Aerial Image Dataset), which offers diverse land-use categories and is therefore well suited to testing semantic understanding in complex scenes. Once labelled, videos are indexed using both visual and textual representations to support flexible, context-aware retrieval. Initial results show promising performance in recognizing high-level video concepts and returning contextually relevant content in response to natural language prompts. This research showcases the potential of combining visual deep networks with language models to build intelligent, scalable video search systems suited to modern content platforms. Future work will focus on integrating personalization and real-time querying for broader applicability.
Keywords multimodal retrieval, ResNet-50, GPT-4, video classification, semantic search, deep learning, aerial image dataset
Field Engineering
Published In Volume 7, Issue 3, May-June 2025
Published On 2025-06-22
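
The pipeline described in the abstract (keyframe sampling at regular intervals, input normalization, and retrieval over both visual and textual representations) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the per-channel mean and standard deviation (the values commonly used with ImageNet-pretrained ResNet models), the weighted-sum fusion rule, and the weight `alpha` are all assumptions made for illustration. The embeddings are toy vectors standing in for ResNet-50 features and language-model text embeddings.

```python
import math

def keyframe_indices(total_frames, fps, interval_s):
    """Return indices of frames sampled once every `interval_s` seconds."""
    step = max(1, round(fps * interval_s))
    return list(range(0, total_frames, step))

def normalize_pixel(rgb,
                    mean=(0.485, 0.456, 0.406),   # assumed: common ImageNet statistics
                    std=(0.229, 0.224, 0.225)):
    """Scale an 8-bit RGB pixel to [0, 1], then standardize each channel."""
    return tuple((c / 255.0 - m) / s for c, m, s in zip(rgb, mean, std))

def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def rank_videos(query_visual, query_text, index, alpha=0.5):
    """Rank videos by a weighted sum of visual and textual similarity.

    `index` maps a video id to a (visual_embedding, text_embedding) pair.
    `alpha` is a hypothetical fusion weight: 1.0 uses only the visual score,
    0.0 uses only the textual score.
    """
    scored = []
    for video_id, (visual_emb, text_emb) in index.items():
        score = (alpha * cosine(query_visual, visual_emb)
                 + (1 - alpha) * cosine(query_text, text_emb))
        scored.append((video_id, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Toy usage: two indexed videos with 2-dimensional placeholder embeddings.
toy_index = {
    "farmland_clip": ([1.0, 0.0], [1.0, 0.0]),
    "harbour_clip": ([0.0, 1.0], [0.0, 1.0]),
}
frames = keyframe_indices(total_frames=100, fps=25, interval_s=1)
ranking = rank_videos([1.0, 0.0], [1.0, 0.0], toy_index)
```

A weighted sum of per-modality similarities is only one simple way to combine the two representations. The abstract does not state how the visual and textual scores are fused, so a real system might instead concatenate embeddings or use the text embedding only to filter candidates.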
