
SmartFlow AI — Enterprise AI Voice Agent SaaS

SmartFlow AI is a scalable, multi-tenant SaaS platform that enables businesses to deploy human-like AI voice agents for automated customer support, sales, and outbound communication. The platform integrates enterprise telephony with real-time generative AI to deliver low-latency, natural conversations, effectively transforming traditional call centers into intelligent, automated systems.

Next.js, NestJS, WebSockets, PostgreSQL, Prisma ORM, Redis, ElevenLabs API, Tata Smartflo API, Razorpay, Tailwind CSS, Shadcn UI

The Story

## 🎯 Key Features

### 🧠 AI Voice Intelligence

- Integrated ElevenLabs ConvAI for realistic speech synthesis and natural conversations
- Custom AI agents with configurable:
  - System prompts
  - Opening scripts
  - Personality traits
- Multilingual support (including Hinglish)
- Context-aware conversations with interruption handling

---

### 📞 Telephony & Call Automation

- Integrated Tata Smartflo API for enterprise-grade calling
- Click-to-call functionality
- Bulk outbound calling campaigns with intelligent delays
- Real-time call tracking and status updates
- Human-in-the-loop call transfer (AI → Human escalation)

---

### 🏢 Multi-Tenant SaaS Architecture

- Organization-based multi-tenancy
- BYOK (Bring Your Own Key) support for AI APIs
- Role-based access control:
  - Super Admin
  - Organization Owner
- Virtual number management system

---

### 💳 Billing & Analytics

- Razorpay integration for subscriptions and payments
- Usage-based billing (calls, duration, AI usage)
- Real-time dashboards for:
  - Call logs
  - Usage tracking
  - Revenue analytics

---

## 🏗️ Core Engineering Challenge: AI Audio Proxy

### Problem

Telephony systems stream audio as **μ-law (8 kHz)**, while AI engines require **PCM (16 kHz or higher)**. The two formats cannot be passed through to each other unchanged, and every conversion step adds latency to a real-time conversation.
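To make the format mismatch concrete, here is a minimal sketch of the μ-law ↔ 16-bit PCM conversion, following the standard G.711 companding scheme. It is an illustration only, not the project's actual proxy code: the function names (`muLawDecodeSample`, `muLawEncodeSample`, `muLawToPcm`, `pcmToMuLaw`) are hypothetical, and the sketch assumes raw, headerless audio frames.

```typescript
// G.711 mu-law <-> 16-bit linear PCM. Illustrative sketch; names are hypothetical.
const MULAW_BIAS = 0x84; // 132, added before the segment (exponent) search
const MULAW_CLIP = 32635; // largest magnitude that can be encoded after biasing

// Decode one mu-law byte into a signed 16-bit PCM sample.
function muLawDecodeSample(byte: number): number {
  const u = ~byte & 0xff; // mu-law bytes are stored bit-inverted
  const sign = u & 0x80;
  const exponent = (u >> 4) & 0x07;
  const mantissa = u & 0x0f;
  const magnitude = (((mantissa << 3) + MULAW_BIAS) << exponent) - MULAW_BIAS;
  return sign ? -magnitude : magnitude;
}

// Encode one signed 16-bit PCM sample into a mu-law byte.
function muLawEncodeSample(pcm: number): number {
  const sign = pcm < 0 ? 0x80 : 0;
  const magnitude = Math.min(Math.abs(pcm), MULAW_CLIP) + MULAW_BIAS;
  // The exponent is the position of the highest set bit, counted from bit 7.
  let exponent = 7;
  for (let mask = 0x4000; (magnitude & mask) === 0 && exponent > 0; mask >>= 1) {
    exponent--;
  }
  const mantissa = (magnitude >> (exponent + 3)) & 0x0f;
  return ~(sign | (exponent << 4) | mantissa) & 0xff;
}

// Frame-level helpers: one mu-law byte per sample in, one Int16 per sample out.
function muLawToPcm(frame: Uint8Array): Int16Array {
  const out = new Int16Array(frame.length);
  for (let i = 0; i < frame.length; i++) out[i] = muLawDecodeSample(frame[i]);
  return out;
}

function pcmToMuLaw(samples: Int16Array): Uint8Array {
  const out = new Uint8Array(samples.length);
  for (let i = 0; i < samples.length; i++) out[i] = muLawEncodeSample(samples[i]);
  return out;
}
```

Decoding only changes the sample encoding. The output is still at 8 kHz, so a separate resampling step is needed before the audio reaches an engine that expects 16 kHz input.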
---

### Solution

Designed and implemented a **low-latency WebSocket audio proxy** that:

- Intercepts live call audio streams
- Transcodes audio in real time:
  - μ-law → PCM
  - PCM → μ-law
- Streams data between the telephony provider and the AI engine
- Manages:
  - Conversation state
  - Interruptions
  - Tool execution (e.g., call transfer)

---

### 🔄 Data Flow

    Caller ↔ Telephony (Smartflo)
    ↕
    WebSocket Gateway (Audio Proxy)
    ↕
    Audio Transcoding Layer
    ↕
    AI Engine (ElevenLabs)
    ↕
    Response Stream → Caller

---

## 🛠️ Tech Stack

### Frontend

- Next.js (App Router)
- Tailwind CSS
- Shadcn UI + Radix UI

### Backend

- NestJS
- WebSockets (real-time communication)
- REST APIs

### Database & Infrastructure

- PostgreSQL
- Prisma ORM
- Redis (caching + job queues)

### AI & Telephony

- ElevenLabs API
- Tata Smartflo API

### Payments

- Razorpay

---

## 📈 Impact

- Reduced operational costs by up to **70%** (simulated benchmarks)
- Automated customer support and outbound workflows
- Delivered near real-time, human-like voice interactions
- Built a scalable foundation for AI-driven communication platforms

---

## 👨‍💻 My Role

- Architected the complete system (backend + infrastructure)
- Developed real-time WebSocket audio streaming and proxy layer
- Designed multi-tenant database architecture
- Integrated AI engine, telephony APIs, and payment systems
- Optimized latency and streaming performance

---

## 💡 Key Learnings

- Real-time systems design and WebSocket streaming
- Audio transcoding and low-latency communication
- Multi-tenant SaaS architecture
- Scalable backend system design
- Integration of AI + Telephony + Payments in a single platform

---
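As an illustration of the sample-rate half of the transcoding described in the audio proxy section, the sketch below converts 16-bit PCM between 8 kHz and 16 kHz. It assumes an exact 2x ratio and uses linear interpolation and pair averaging, which are cheap enough for a low-latency path but are not the project's confirmed method. A production resampler would more likely use a proper low-pass filter. The names `upsample2x` and `downsample2x` are hypothetical.

```typescript
// Naive 2x sample-rate conversion for 16-bit PCM. Illustrative sketch only.

// 8 kHz -> 16 kHz: keep each sample and insert the midpoint to the next one.
function upsample2x(input: Int16Array): Int16Array {
  const out = new Int16Array(input.length * 2);
  for (let i = 0; i < input.length; i++) {
    const cur = input[i];
    // At the end of the frame there is no next sample, so repeat the last one.
    const next = i + 1 < input.length ? input[i + 1] : cur;
    out[2 * i] = cur;
    out[2 * i + 1] = (cur + next) >> 1;
  }
  return out;
}

// 16 kHz -> 8 kHz: average each pair of samples (a crude low-pass filter).
// A trailing unpaired sample is dropped.
function downsample2x(input: Int16Array): Int16Array {
  const out = new Int16Array(input.length >> 1);
  for (let i = 0; i < out.length; i++) {
    out[i] = (input[2 * i] + input[2 * i + 1]) >> 1;
  }
  return out;
}

// Example: a 20 ms telephony frame holds 160 samples at 8 kHz
// and becomes 320 samples at 16 kHz.
const frame8k = new Int16Array(160);
const frame16k = upsample2x(frame8k);
console.log(frame16k.length); // 320
```

Because each frame is processed independently, the interpolation at a frame boundary repeats the last sample instead of looking ahead. Carrying the last sample of the previous frame across calls would remove that small discontinuity at the cost of a little state per call.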