# 🚀 SmartFlow AI — Enterprise AI Voice Agent SaaS
## 📌 Project Description
SmartFlow AI is a scalable, multi-tenant SaaS platform that enables businesses to deploy human-like AI voice agents for automated customer support, sales, and outbound communication.
The platform integrates enterprise telephony with real-time generative AI to deliver low-latency, natural conversations, transforming traditional call centers into intelligent, automated systems.
---
## 🎯 Key Features
### 🧠 AI Voice Intelligence
- Integrated ElevenLabs ConvAI for realistic speech synthesis and natural conversations
- Custom AI agents with configurable:
  - System prompts
  - Opening scripts
  - Personality traits
- Multilingual support (including Hinglish)
- Context-aware conversations with interruption handling
---
### 📞 Telephony & Call Automation
- Integrated Tata Smartflo API for enterprise-grade calling
- Click-to-call functionality
- Bulk outbound calling campaigns with intelligent delays
- Real-time call tracking and status updates
- Human-in-the-loop call transfer (AI → Human escalation)
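
The campaign pacing mentioned above could be sketched roughly as follows. The source does not say how the "intelligent delays" are computed, so this assumes a fixed base gap plus random jitter between consecutive calls. `scheduleCampaign`, `baseDelayMs`, and `jitterMs` are hypothetical names for illustration and are not part of the Smartflo API.

```typescript
// Hypothetical sketch: spread outbound calls over time with a base gap plus jitter.
interface CampaignCall {
  phoneNumber: string;
  delayMs: number; // offset from campaign start at which the call is placed
}

function scheduleCampaign(
  phoneNumbers: string[],
  baseDelayMs: number,
  jitterMs: number,
  random: () => number = Math.random, // injectable so the schedule can be tested
): CampaignCall[] {
  let offset = 0;
  return phoneNumbers.map((phoneNumber, index) => {
    // The first call starts immediately; every later call waits base + jitter.
    if (index > 0) {
      offset += baseDelayMs + Math.floor(random() * jitterMs);
    }
    return { phoneNumber, delayMs: offset };
  });
}
```

Each `delayMs` value could then be attached to a delayed job in a Redis-backed queue, so the campaign does not dial every number at once.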
---
### 🏢 Multi-Tenant SaaS Architecture
- Organization-based multi-tenancy
- BYOK (Bring Your Own Key) support for AI APIs
- Role-based access control:
  - Super Admin
  - Organization Owner
- Virtual number management system
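
A minimal sketch of how the two roles and organization-based tenancy listed above might interact. The `User` shape, the role identifiers, and both helper functions are assumptions made for illustration; the actual schema is not shown in this write-up.

```typescript
// Hypothetical sketch: organization-scoped access checks for the two roles.
type Role = "SUPER_ADMIN" | "ORG_OWNER";

interface User {
  id: string;
  role: Role;
  organizationId: string | null; // null for platform-level super admins
}

// A super admin may access any organization; an owner only their own.
function canAccessOrganization(user: User, organizationId: string): boolean {
  if (user.role === "SUPER_ADMIN") return true;
  return user.organizationId === organizationId;
}

// Builds a filter object that could be merged into a query's `where` clause
// so that tenant data never leaks across organizations.
function tenantFilter(user: User): { organizationId?: string } {
  if (user.role === "SUPER_ADMIN") return {};
  if (user.organizationId === null) {
    throw new Error("Organization owner without an organization");
  }
  return { organizationId: user.organizationId };
}
```

Applying a filter like this to every query in one place keeps tenant isolation from depending on each individual endpoint remembering to add it.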
---
### 💳 Billing & Analytics
- Razorpay integration for subscriptions and payments
- Usage-based billing (calls, duration, AI usage)
- Real-time dashboards for:
  - Call logs
  - Usage tracking
  - Revenue analytics
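
The usage-based billing above (calls, duration, AI usage) could be computed along these lines. The rate card, the field names, and the rounding rules are all assumptions for illustration; the platform's real pricing is not described here. Amounts are kept in integer paise because payment gateways such as Razorpay take amounts in the smallest currency unit.

```typescript
// Hypothetical sketch: turn metered usage into a charge in paise.
interface UsageRecord {
  calls: number;
  durationSeconds: number;
  aiUnits: number; // whatever unit AI usage is metered in (assumed)
}

interface RateCard {
  perCallPaise: number;
  perMinutePaise: number;
  perThousandAiUnitsPaise: number;
}

function computeUsageChargePaise(usage: UsageRecord, rates: RateCard): number {
  // Assumption: duration is billed per started minute, AI usage per started 1,000 units.
  const billedMinutes = Math.ceil(usage.durationSeconds / 60);
  const billedAiBlocks = Math.ceil(usage.aiUnits / 1000);
  return (
    usage.calls * rates.perCallPaise +
    billedMinutes * rates.perMinutePaise +
    billedAiBlocks * rates.perThousandAiUnitsPaise
  );
}
```

For example, 10 calls lasting 605 seconds in total with 2,500 AI units, at 50, 120, and 200 paise respectively, come to 500 + 1,320 + 600 = 2,420 paise.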
---
## 🏗️ Core Engineering Challenge — AI Audio Proxy
### Problem
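Telephony systems deliver audio as **μ-law (G.711, 8 kHz)**, while AI engines require **linear PCM (16 kHz or higher)**. Every audio frame therefore has to be transcoded and resampled in both directions, and any delay this adds is directly audible in a real-time conversation.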
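
To make the sample-rate gap concrete, here is a minimal sketch of converting 16-bit PCM between 8 kHz and 16 kHz. It uses plain linear interpolation and pair averaging, which is an assumption chosen for brevity; a production resampler would normally apply a proper low-pass filter. The function names are hypothetical.

```typescript
// Hypothetical sketch: naive 2x resampling of 16-bit PCM samples.

// 8 kHz -> 16 kHz: keep each sample and insert the midpoint to its successor.
function upsample2x(input: Int16Array): Int16Array {
  const out = new Int16Array(input.length * 2);
  input.forEach((cur, i) => {
    const next = input[i + 1] ?? cur; // hold the last sample at the end
    out[2 * i] = cur;
    out[2 * i + 1] = Math.round((cur + next) / 2);
  });
  return out;
}

// 16 kHz -> 8 kHz: average each pair of samples (a crude low-pass plus decimation).
function downsample2x(input: Int16Array): Int16Array {
  const out = new Int16Array(Math.floor(input.length / 2));
  out.forEach((_, i) => {
    const a = input[2 * i] ?? 0;
    const b = input[2 * i + 1] ?? a;
    out[i] = Math.round((a + b) / 2);
  });
  return out;
}
```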
---
### Solution
Designed and implemented a **Low-Latency WebSocket Audio Proxy** that:
- Intercepts live call audio streams
- Transcodes audio in real-time:
  - μ-law → PCM
  - PCM → μ-law
- Streams data between telephony provider and AI engine
- Manages:
  - Conversation state
  - Interruptions
  - Tool execution (e.g., call transfer)
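
The transcoding step above can be illustrated with the standard G.711 μ-law companding formulas. This is a sketch of the general technique and not the project's actual proxy code; the function names are hypothetical, and typed arrays are used so the example does not depend on any runtime-specific buffer type.

```typescript
// Sketch of G.711 mu-law <-> 16-bit linear PCM conversion (function names are hypothetical).
const MULAW_BIAS = 0x84; // 132, added before the segment search
const MULAW_CLIP = 32635; // largest magnitude that can be encoded

function mulawDecodeSample(byte: number): number {
  const u = ~byte & 0xff; // mu-law bytes are stored bit-inverted
  const sign = u & 0x80;
  const exponent = (u >> 4) & 0x07;
  const mantissa = u & 0x0f;
  const magnitude = (((mantissa << 3) + MULAW_BIAS) << exponent) - MULAW_BIAS;
  return sign ? -magnitude : magnitude;
}

function mulawEncodeSample(sample: number): number {
  const sign = sample < 0 ? 0x80 : 0;
  const magnitude = Math.min(Math.abs(sample), MULAW_CLIP) + MULAW_BIAS;
  // Find the segment: position of the highest set bit between bit 14 and bit 7.
  let exponent = 7;
  for (let mask = 0x4000; (magnitude & mask) === 0 && exponent > 0; mask >>= 1) {
    exponent--;
  }
  const mantissa = (magnitude >> (exponent + 3)) & 0x0f;
  return ~(sign | (exponent << 4) | mantissa) & 0xff;
}

// Telephony -> AI direction: one mu-law byte becomes one 16-bit PCM sample.
function mulawToPcm16(input: Uint8Array): Int16Array {
  return Int16Array.from(input, (b) => mulawDecodeSample(b));
}

// AI -> telephony direction: one 16-bit PCM sample becomes one mu-law byte.
function pcm16ToMulaw(input: Int16Array): Uint8Array {
  return Uint8Array.from(input, (s) => mulawEncodeSample(s));
}
```

Because both directions are simple per-sample operations, they can run on each incoming WebSocket frame without buffering, which helps keep the added latency small.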
---
### 🔄 Data Flow
Caller ↔ Telephony (Smartflo) ↔ WebSocket Gateway (Audio Proxy) ↔ Audio Transcoding Layer ↔ AI Engine (ElevenLabs)

The AI engine's response stream travels back along the same path to the caller.
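
One way the proxy in this flow might track conversation state and handle interruptions is sketched below. It assumes a simple barge-in policy: when the caller starts speaking, any agent audio still queued for playback is dropped. The `CallSession` class and its methods are hypothetical and only illustrate the idea.

```typescript
// Hypothetical sketch: per-call session state with barge-in handling.
class CallSession {
  private outbound: Uint8Array[] = []; // agent audio waiting to be sent to the caller

  // Called when the AI engine streams a chunk of synthesized audio.
  enqueueAgentAudio(chunk: Uint8Array): void {
    this.outbound.push(chunk);
  }

  // Called by the sender loop to fetch the next chunk for the telephony side.
  nextChunkForCaller(): Uint8Array | undefined {
    return this.outbound.shift();
  }

  // True while there is still agent audio queued for playback.
  isAgentSpeaking(): boolean {
    return this.outbound.length > 0;
  }

  // Called when caller speech is detected: drop queued audio so the agent stops talking.
  onCallerInterrupt(): number {
    const dropped = this.outbound.length;
    this.outbound = [];
    return dropped;
  }
}
```

Returning the number of dropped chunks gives the caller of `onCallerInterrupt` something to log, which can help when tuning how aggressively interruptions cut the agent off.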
---
## 🛠️ Tech Stack
### Frontend
- Next.js (App Router)
- Tailwind CSS
- Shadcn UI + Radix UI
### Backend
- NestJS
- WebSockets (real-time communication)
- REST APIs
### Database & Infrastructure
- PostgreSQL
- Prisma ORM
- Redis (caching + job queues)
### AI & Telephony
- ElevenLabs API
- Tata Smartflo API
### Payments
- Razorpay
---
## 📈 Impact
- Projected operational cost reduction of up to **70%**, based on simulated benchmarks
- Automated customer support and outbound workflows
- Delivered near real-time, human-like voice interactions
- Built a scalable foundation for AI-driven communication platforms
---
## 👨‍💻 My Role
- Architected the complete system (backend + infrastructure)
- Developed real-time WebSocket audio streaming and proxy layer
- Designed multi-tenant database architecture
- Integrated AI engine, telephony APIs, and payment systems
- Optimized latency and streaming performance
---
## 💡 Key Learnings
- Real-time systems design and WebSocket streaming
- Audio transcoding and low-latency communication
- Multi-tenant SaaS architecture
- Scalable backend system design
- Integration of AI + Telephony + Payments in a single platform
---
