International Journal For Multidisciplinary Research

E-ISSN: 2582-2160     Impact Factor: 9.24

A Widely Indexed Open Access Peer Reviewed Multidisciplinary Bi-monthly Scholarly International Journal

Imaginary Ink: A Novel Approach to Multi-Modal Text-to-Visual Content Generation System

Author(s) Mr. Umang Garg, Mr. Anshul Raj, Mr. Aaditya Sharma, Mr. Lakshay Tyagi, Prof. Hemant Kumar Bhardwaj
Country India
Abstract This work presents "Imaginary Ink," a novel multi-modal text-to-visual generation system that converts textual descriptions into high-quality 2D images and 3D models. To bridge the gap between natural language processing and computer vision, the system uses recent developments in deep learning architectures, especially diffusion models and transformer-based networks. Unlike current solutions, which usually concentrate on either image generation or 3D modelling alone, Imaginary Ink offers a unified platform that handles both modalities through a new pipeline architecture. The method combines a semantic understanding module, which extracts spatial relationships and visual characteristics from text, with separate specialised rendering engines for 2D and 3D outputs. Experimental results show that Imaginary Ink performs competitively with state-of-the-art single-modal systems while offering greater flexibility and a better user experience. This paper presents the system architecture, implementation details, performance evaluation, and possible applications in fields including education, design, entertainment, and accessibility.
Keywords Text to image generation, Text to 2D/3D model generation
Field Engineering
Published In Volume 7, Issue 3, May-June 2025
Published On 2025-05-16
DOI https://doi.org/10.36948/ijfmr.2025.v07i03.44347
Short DOI https://doi.org/g9kfj8
