SQL Generator 2.0: How I Built an AI Query Wizard for Enterprise Scale with 500+ Tables
This is Part 3 of the series on how I built this system. This post goes beyond simple text-to-SQL and takes it to enterprise scale.
Previous Posts:
- Part 1: Building Multi-Agent AI Application with Bedrock, LlamaIndex and Slack bot
- Part 2: The Challenging Journey of Building an AI-Powered Knowledge Base: Ingestion Pipeline Development

The Journey So Far: Recap of the Confluence Agent
Before we dive into the SQL Agent, let’s briefly revisit the Confluence Agent we developed:
- Metadata Ingestion: Capturing the structure of our knowledge base.
- Content Extraction: Pulling in the meat of our documentation.
- Format Handling: Separating HTML and PDF content for optimal processing.
- Image Analysis: Leveraging LLM parsing to extract and understand image content.
- Performance Boost: Implementing async and multi-threading for a 10x speed improvement.
These enhancements laid the groundwork for a robust information retrieval system. Now, we’re expanding our toolkit to tackle one of the most common challenges in data-driven organizations: SQL query generation.
Why did I build this?
Imagine this scenario: You’re a new data analyst, and your boss drops by your desk with an urgent request:
“I need a comparative analysis of yesterday’s game metrics against last year’s data, focusing on velocity and revenue. Have it on my desk by EOD.”
As the color drains from your face, you realize you’re facing several challenges:
- You’re new and don’t know where to find the relevant data.
- You’re not sure which tables contain the information you need.
- Writing complex SQL queries isn’t your strong suit (yet).
- Your manager is in meetings all day, and you don’t want to bombard them with basic questions.
This scenario highlights three critical challenges many organizations face:
- Data Volume: With…