International Journal For Multidisciplinary Research

E-ISSN: 2582-2160     Impact Factor: 9.24

A Widely Indexed Open Access Peer Reviewed Multidisciplinary Bi-monthly Scholarly International Journal

Scalable AI Infrastructure on ARM: A Comprehensive Framework for Cost-Efficient Deep Learning Model Lifecycle Management

Author(s): Udaya Kumar Reddy Veeramreddygari
Country: United States
Abstract: This paper presents a comprehensive framework for training and deploying machine learning models optimized for AWS Graviton processors, with a primary focus on cost-performance trade-offs in enterprise environments. Our approach leverages ARM-based Graviton3 processors across EC2, ECS, and Lambda services to achieve significant cost savings while maintaining competitive performance metrics. Through extensive benchmarking across the TensorFlow, PyTorch, and scikit-learn frameworks, we demonstrate up to 40% reduction in operational costs with minimal latency penalties. The framework incorporates advanced optimization techniques, including mixed-precision training, model quantization, and adaptive batching, specifically tuned for the ARM architecture. A production case study in financial services illustrates practical implementation strategies, achieving a 37% cost reduction in ML inference workloads while maintaining sub-100ms response times. The proposed architecture supports both CPU-intensive training workloads and high-throughput inference scenarios, making it particularly suitable for cost-conscious organizations seeking to democratize ML deployment.
Keywords: AWS Graviton, ARM Architecture, Machine Learning Optimization, Cost-Efficient Computing, TensorFlow, PyTorch, Model Serving, Cloud Economics.
Field: Engineering
Published In: Volume 6, Issue 3, May-June 2024
Published On: 2024-06-07
