Building a RAG-Powered Hackathon-Winning App
Crafting Personalized Travel Itineraries with Retrieval-Augmented Generation (RAG) and Vector Search

In this article, I’ll share how we built Hangout AI, a personalized travel itinerary generator that won at the TiDB Future App Hackathon 2024. We’ll explore how RAG was key to the project, its architecture, and future possibilities. Whether you’re interested in RAG or hackathons, you’ll find something useful here.
TiDB Future App Hackathon 🏆

The TiDB Future App Hackathon 2024 provided an exciting platform for developers to showcase their creativity by building innovative AI applications using TiDB Serverless, which integrates Vector Search capabilities.
Overview of the competition
The competition challenged developers to build AI applications using TiDB Serverless with Vector Search. Participants could work on various AI use cases, including image processing, NLP, and RAG, for a chance to win over $30,000 in prizes. Participation was strong, and two teams from Indonesia secured top spots: one took first place, and my project, Hangout AI, took fifth.
Why we chose to focus on personalized travel itineraries
Hangout AI, which I built with Ayu Sudi Dwijayanti, was inspired by our habit of exploring cafes in Jakarta Timur as Work From Cafe (WFC) enthusiasts; we often discussed new hangout spots but rarely followed through. Over time, we realized we shared an interest in exploring activities in Jakarta, Singapore, and Kuala Lumpur, and using AI to create personalized itineraries felt like the perfect solution.
Hangout AI generates custom travel plans based on user input, location, date, and weather, using a Large Language Model (LLM) with Retrieval-Augmented Generation (RAG) to offer relevant recommendations. By pulling data from a vector database using TiDB, it provides tailored suggestions and visual previews, making travel planning easy. Our target audience includes travelers and anyone looking for a personalized itinerary, and the app fits into the Recommendation System and RAG categories for the TiDB Future App Hackathon 2024.
🔍 Understanding RAG (Retrieval-Augmented Generation)

Retrieval-Augmented Generation (RAG) is an advanced AI technique that combines traditional Large Language Models (LLMs) with external data retrieval systems to enhance the quality and relevance of generated outputs.
How does RAG work?
RAG works by first retrieving relevant information from an external data source, such as a vector database or knowledge base, using techniques like similarity search or keyword matching. Once the relevant information is retrieved, it is fed into an LLM, which generates a response or output that is informed by the retrieved data. This two-step process — retrieving and then generating — helps the model produce more accurate and contextually rich content. For example, in a travel itinerary app like Hangout AI, RAG allows the model to pull location-specific data (e.g., weather, user reviews, and other real-time details) to generate personalized travel recommendations, ensuring the output is tailored to the user’s preferences and the current environment.
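To make the two-step flow concrete, here is a minimal, self-contained Python sketch. The bag-of-words "embedding", the prompt-only "generation" step, and the sample destinations are deliberate stand-ins for a real embedding model, vector database, and LLM API:

# A minimal sketch of retrieve-then-generate. The bag-of-words "embedding"
# and prompt-only "generation" are stand-ins for a real embedding model,
# vector database, and LLM call.
def embed(text: str) -> set[str]:
    return set(text.lower().split())

def retrieve(query: str, corpus: list[str], top_k: int = 2) -> list[str]:
    # Step 1: Retrieve - rank documents by word overlap with the query.
    q = embed(query)
    return sorted(corpus, key=lambda doc: -len(q & embed(doc)))[:top_k]

def generate(query: str, context: list[str]) -> str:
    # Step 2: Generate - a real system would send this prompt to an LLM.
    return "Context:\n" + "\n".join(context) + f"\n\nQuestion: {query}"

corpus = [
    "Kafe Hijau in Jakarta Timur: quiet, good wifi, open 08:00-22:00.",
    "Marina Bay in Singapore: waterfront walks, best in the evening.",
]
print(generate("cafe in Jakarta with wifi", retrieve("cafe in Jakarta with wifi", corpus)))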
Why we chose RAG over a traditional LLM pipeline
In our project, we opted for RAG instead of a traditional LLM pipeline because RAG allows us to integrate external data sources into the generation process, leading to more accurate, relevant, and personalized outputs. A traditional LLM pipeline would rely solely on the model’s pre-existing knowledge, which can be limited and static.
With RAG, we can pull in real-time information from a database, ensuring that our responses, such as travel itineraries, reflect the latest conditions like weather, availability, or new attractions. This makes the system more dynamic and adaptable, providing a richer user experience. Furthermore, using RAG with a vector database allows for faster and more efficient data retrieval, enhancing both performance and accuracy.
❓ Why Do We Need RAG?
RAG improves AI models by combining data retrieval with text generation, making them more accurate and effective. While standalone LLMs are powerful, they face challenges like generating incorrect information, providing outdated responses, and struggling to scale. RAG solves these problems by allowing the AI to pull in relevant, real-time data from external sources, making it smarter, more reliable, and more efficient for tasks like personalized recommendations or real-time information retrieval.
Challenges of standalone LLMs:
Standalone LLMs, while powerful, often struggle with several challenges:
- Accuracy: LLMs are trained on large datasets but don’t have access to real-time or up-to-date information. This can lead to inaccurate or outdated responses.
- Hallucinations: LLMs may sometimes generate information that sounds correct but is actually made up. This happens because they lack a connection to real-world facts.
- Scalability: When the task grows more complex or requires processing a lot of data, LLMs can struggle to keep up, as they only work with what they have already learned, without dynamic data.
How RAG addresses these issues with external data retrieval
RAG improves LLMs by using external data:
- Accuracy: RAG pulls in real-time, relevant data from databases or other sources to ensure the information the AI generates is accurate. For example, it can use current weather or business hours to create more accurate recommendations.
- Reducing Hallucinations: By using real data, RAG ensures the AI generates reliable responses, reducing the risk of made-up information. For example, in travel apps, the AI can suggest places based on actual user reviews and real-time availability.
- Scalability: RAG can handle large volumes of data by retrieving relevant information from external sources, allowing the AI to generate more complex responses without slowing down. This helps the system stay efficient as it grows and adapts to changing data.
🏗️ The Architecture of Hangout AI

Hangout AI’s architecture is thoughtfully designed to provide efficient and personalized travel itineraries by combining cutting-edge technologies and seamless data flow between its components.
System overview: Components and data flow
Hangout AI is built around four core stages: data collection, embedding, vector storage and retrieval, and LLM generation.
- Data Collection: A Google Maps Scraper fetches relevant location details such as ratings, descriptions, and coordinates. The scraped data is then cleaned using Pandas to ensure consistency and usability.
- Data Embedding: The Gemini embedding model processes the cleaned data into high-dimensional vector representations that are optimized for similarity searches.
- Storage and Retrieval: These vectors are stored in TiDB, a distributed database with vector search capabilities, allowing quick and scalable retrieval of relevant data points.
- LLM Generation: The Groq llama3-70b-8192 LLM combines the user's input parameters (like location, date, and weather) with retrieved data to generate a detailed and personalized itinerary.
Leveraging vector databases for efficient retrieval
TiDB serves as the backbone of Hangout AI’s retrieval system, enabling:
- Fast searches: With its vector capabilities, TiDB matches user preferences with pre-embedded data in milliseconds.
- Scalability: As the database grows with new locations and itineraries, its distributed nature ensures consistent performance.
- Flexibility: TiDB seamlessly handles complex queries that mix structured data (like dates) with unstructured data (like reviews).
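To make the retrieval step concrete, here is a hedged sketch of what a vector similarity query looks like at the SQL level, using TiDB's VEC_COSINE_DISTANCE function; the connection string, table, and columns are hypothetical, and in Hangout AI LlamaIndex's TiDB integration issues the equivalent queries for us:

# A hedged sketch of a raw TiDB vector search, assuming a hypothetical
# `destinations` table with a VECTOR(768) `embedding` column.
from sqlalchemy import create_engine, text

engine = create_engine("mysql+pymysql://user:password@host:4000/test")  # placeholder credentials

query_vector = str([0.0] * 768)  # in practice, the embedded user query
with engine.connect() as conn:
    rows = conn.execute(text(
        "SELECT title, address "
        "FROM destinations "
        "ORDER BY VEC_COSINE_DISTANCE(embedding, :qv) "
        "LIMIT 5"
    ), {"qv": query_vector})
    for row in rows:
        print(row.title, row.address)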
How the LLM and retrieval components interact
- User Query Input: The user provides parameters such as their location, preferred activity type, and date of travel.
- Data Retrieval: The query triggers a similarity search in TiDB, fetching the top matching location embeddings based on the user’s preferences.
- Context Assembly: Retrieved data is formatted into a structured input prompt, enriched with weather data and other contextual information.
- LLM Processing: The prompt is sent to the llama3-70b-8192 model, which uses its language generation capabilities to create a custom itinerary.
- Result Delivery: The final itinerary is returned to the user through a FastAPI endpoint, ensuring a smooth and interactive experience.
🛠️ Building the RAG system
The development of Hangout AI leverages cutting-edge tools like LlamaIndex for efficient integration of LLMs, TiDB for managing vectorized location data, and external APIs for weather and location information. Data is ingested and indexed using the Gemini embedding model and stored in TiDB, enabling fast similarity searches. The app dynamically combines this data with weather insights and user inputs to generate personalized travel itineraries through Groq's llama3-70b-8192 model, ensuring a seamless and intelligent user experience.
1. Vector Database Setup
In this project, we’re using TiDB for the vector store database. This snippet initializes the TiDB vector store, connecting it to a TiDB instance and configuring it to store vectors for efficient retrieval. It uses the cosine distance strategy and specifies the vector dimension for embedding.
from sqlalchemy import URL, create_engine
from llama_index.vector_stores.tidbvector import TiDBVectorStore
import os

# Construct the TiDB connection URL using environment variables for security
tidb_connection_url = URL(
    "mysql+pymysql",
    username=os.environ['TIDB_USERNAME'],  # TiDB username from environment variable
    password=os.environ['TIDB_PASSWORD'],  # TiDB password from environment variable
    host=os.environ['TIDB_HOST'],          # TiDB host from environment variable
    port=4000,                             # Default TiDB Serverless port
    database="test",                       # Database name
    query={"ssl_verify_cert": False, "ssl_verify_identity": True},  # SSL settings
)

# Create a SQLAlchemy engine with the TiDB connection URL
engine = create_engine(tidb_connection_url, pool_recycle=3600)

# Initialize the TiDBVectorStore with the connection string and other parameters
tidbvec = TiDBVectorStore(
    connection_string=tidb_connection_url,      # Connection string for TiDB
    table_name=os.getenv("VECTOR_TABLE_NAME"),  # Table name from environment variable
    distance_strategy="cosine",                 # Distance strategy for vector similarity search
    vector_dimension=768,                       # Dimension of the Gemini embedding vectors
    drop_existing_table=False,                  # Keep any existing table
)
2. Preprocessing Destination Data
This function preprocesses raw destination data by structuring it into Document objects, embedding relevant metadata such as location, title, and categories. This ensures the data is ready for vectorization and querying.
from llama_index.core import Document
import json

# Function to preprocess raw destination data from a JSON file
def preprocess_data(path):
    documents = []
    # Load data from the specified JSON file
    with open(path) as f:
        data = json.load(f)
    for item in data:
        # Construct the text content for each document
        text = f"""
        Address: {item['address']}
        Title: {item['title']}
        Country: {item['complete_address']['country']}
        Categories: {', '.join(item['categories'] if item['categories'] else [])}
        Description: {item.get('description', 'No description')}
        Review Count: {item['review_count']}
        Review Rating: {item['review_rating']}
        Open Hours: {json.dumps(item['open_hours'])}
        Latitude: {item['latitude']}
        Longitude: {item['longtitude']}
        """  # note: 'longtitude' (sic) matches the key produced by the scraper
        # Create metadata for each document
        metadata = {
            "id": item["cid"],
            "title": item["title"],
            "description": item.get("description", "No description"),
            "address": item["address"],
            "complete_address": item["complete_address"],
        }
        # Create a Document object with the text and metadata
        document = Document(text=text, metadata=metadata)
        # Add the document to the list of documents
        documents.append(document)
    return documents
3. Data Ingestion
This function is responsible for ingesting destination data into the TiDB vector store. It preprocesses the data, converts it into vectorized documents, and indexes it for efficient retrieval.
from llama_index.core import StorageContext, VectorStoreIndex

# Wire the TiDB vector store (tidbvec, created earlier) into a storage context
storage_context = StorageContext.from_defaults(vector_store=tidbvec)

def init():
    """
    Ingests data into the TiDB vector store for use in itinerary planning.

    1. Preprocesses data from a JSON file.
    2. Converts the data into Document objects with metadata.
    3. Creates a vector index in the TiDB vector store.

    Returns:
        VectorStoreIndex: The initialized vector store index.
    """
    # Load and preprocess data from the JSON file
    documents = preprocess_data("./data/destinations.json")
    # Create a vector index from the preprocessed documents
    index = VectorStoreIndex.from_documents(
        documents,
        storage_context=storage_context,
        insert_batch_size=1000,
        show_progress=True,
    )
    return index
- Vector Index Creation: The VectorStoreIndex.from_documents() method builds an index from the documents. This index allows fast similarity searches in the TiDB vector store.
- Batch Insertions: Data is ingested in batches of 1,000 to optimize performance and ensure seamless processing for large datasets.
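Once ingestion finishes, a quick way to sanity-check the index is to run a retrieval against it directly. This is a hedged example using LlamaIndex's standard retriever API with a made-up query string:

# Quick sanity check on the freshly built index (hypothetical query text).
index = init()
retriever = index.as_retriever(similarity_top_k=3)
for node in retriever.retrieve("cozy cafe with wifi in Jakarta Timur"):
    print(node.score, node.metadata["title"])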
4. LLM Integration with Weather and Location Filters
This function combines location-based filtering and weather data integration with the LLM's query engine. It retrieves relevant destinations based on the user's preferences and location.
from llama_index.core.vector_stores import MetadataFilters, MetadataFilter

# Function to query the vector store with specific filters and parameters.
# `create_prompt`, `get_data_from_cids`, `weather_data`, and `index` are
# defined elsewhere in the project.
def query(date, country, startTime, endTime, address, lat, lng):
    # Create metadata filters based on the country
    filters = MetadataFilters(
        filters=[MetadataFilter(key="complete_address.country", value=[country])]
    )
    # Create a query engine with the specified filters and top-k similarity search
    query_engine = index.as_query_engine(filters=filters, similarity_top_k=2)
    # Query the vector store using the generated prompt
    response = query_engine.query(create_prompt(date, startTime, endTime, address))
    # Return the response, metadata, and weather data
    return {
        "response": response.response,  # The response from the query engine
        "metadata": get_data_from_cids(
            [node.node.metadata["id"] for node in response.source_nodes]
        ),  # Full destination records for the retrieved nodes
        "weathers": weather_data,  # Weather data (defined elsewhere)
    }
- Location Filters: The app uses metadata filters (such as country or location) to refine search results, ensuring recommendations are tailored to the user’s preferred area.
- LLM-based Itinerary Generation: The LLM uses the weather summary and location filters to create a customized, practical travel itinerary that adapts to both the user’s preferences and real-time conditions.
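The create_prompt helper called inside query() isn't shown above. Here is a hypothetical sketch of what such a prompt builder could look like; the wording and the optional weather-summary parameter are assumptions, not the project's actual implementation:

# Hypothetical sketch of the prompt builder used by query(); the actual
# implementation in Hangout AI may differ.
def create_prompt(date, startTime, endTime, address, weather_summary="no forecast available"):
    return (
        f"You are a travel itinerary planner. Plan a visit near {address} "
        f"on {date}, from {startTime} to {endTime}. "
        f"Weather forecast: {weather_summary}. "
        "Use only the retrieved destinations, order the stops by time, "
        "and explain briefly why each stop fits the schedule and weather."
    )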
5. Chat-Based Itinerary with History
This function enables chat-based itinerary generation by maintaining context through chat history. It refines the interaction with the LLM to provide dynamic and conversational responses tailored to the user’s input.
from llama_index.core.llms import ChatMessage
from llama_index.core.chat_engine.types import ChatMode
from llama_index.core.vector_stores import MetadataFilters, MetadataFilter

# Function to handle chat queries for travel itinerary planning
def chat_query(messages, query, day, country, startTime, endTime, address):
    # Create metadata filters based on the country
    filters = MetadataFilters(
        filters=[MetadataFilter(key="complete_address.country", value=[country])]
    )
    # Rebuild the chat history as ChatMessage objects
    histories = [
        ChatMessage(content=msg.content, role=msg.role, additional_kwargs={})
        for msg in messages.histories
    ]
    # Insert a system message at the beginning of the chat history
    histories.insert(0, ChatMessage(
        content=f"You are a travel itinerary planner. Plan a visit to {address} on {day} from {startTime} to {endTime}.",
        role="system",
    ))
    # Append the user's query to the chat history
    histories.append(ChatMessage(content=query, role="user"))
    # Create a chat engine with the specified filters and context mode
    query_engine = index.as_chat_engine(filters=filters, chat_mode=ChatMode.CONTEXT)
    # Perform the chat query with the prepared chat history
    response = query_engine.chat(query, chat_history=histories)
    # Return the response and metadata from the source nodes
    return {
        "response": response.response,  # The response from the chat engine
        "metadata": get_data_from_cids(
            [node.node.metadata["id"] for node in response.source_nodes]
        ),  # Full destination records for the retrieved nodes
    }
- Chat Interface: Users interact with the app through a conversational interface, where they can ask for personalized travel itineraries.
- Message History: The app maintains a history of user interactions to provide context, improving the quality of responses over time.
- System Messages: The app uses predefined system messages to guide the LLM in generating relevant itineraries based on the user’s preferences and previous chats.
- Dynamic Itinerary Updates: As the conversation progresses, the itinerary adapts to new inputs, including changes in preferences, dates, or weather conditions, ensuring the itinerary remains up-to-date.
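As noted earlier, results are delivered through a FastAPI endpoint. Here is a hedged sketch of a minimal endpoint wrapping the query() function above; the route path and request schema are hypothetical, not Hangout AI's actual API surface:

# Hypothetical FastAPI endpoint wrapping query(); the route and schema are
# illustrative only.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class ItineraryRequest(BaseModel):
    date: str
    country: str
    startTime: str
    endTime: str
    address: str
    lat: float
    lng: float

@app.post("/itinerary")
def generate_itinerary(req: ItineraryRequest):
    # Delegate to the RAG pipeline defined above
    return query(req.date, req.country, req.startTime, req.endTime,
                 req.address, req.lat, req.lng)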
💡 Attention:
For more detailed code, you can check it out here.
Apologies for the messy code; this is my first LLM project, so it's a bit rough around the edges. There are other LLM repositories that are much cleaner, and we'll discuss those in a future article. Thank you for your understanding!
⛓️ Building RAG Without Frameworks
It’s entirely possible to implement Retrieval-Augmented Generation (RAG) without relying on frameworks like LlamaIndex or LangChain. A custom-built RAG pipeline provides more control over each component, such as vector embeddings, database queries, and prompt engineering. Here’s how it typically works:
- Data Ingestion: Collect and preprocess data, then convert it into embeddings using an embedding model like OpenAI or Hugging Face transformers.
- Vector Storage: Use a vector database like TiDB, Pinecone, Milvus, or self-hosted solutions like FAISS to store and retrieve embeddings.
- Retrieval: Perform similarity searches in the database to retrieve relevant documents for user queries.
- Generation: Combine the retrieved documents with user input to create prompts and send them to an LLM for generating responses.
This manual approach is flexible but requires significant development effort and expertise to handle vector databases, manage pipelines, and ensure scalability.
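For illustration, here is a minimal framework-free sketch of those four steps, using sentence-transformers for embeddings and NumPy as a stand-in for a vector database, with the final LLM call stubbed out; the library and model choices are one option among many:

# Minimal framework-free RAG, assuming `pip install sentence-transformers numpy`.
# The LLM call is stubbed out; plug in any chat-completion API there.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

corpus = [
    "Kafe Hijau, Jakarta Timur: quiet work-friendly cafe, open 08:00-22:00.",
    "Marina Bay, Singapore: waterfront promenade, best in the evening.",
    "Petaling Street, Kuala Lumpur: street food market, busy afternoons.",
]

# 1. Ingestion: embed the corpus once (normalized for cosine similarity).
doc_vecs = model.encode(corpus, normalize_embeddings=True)

def retrieve(query: str, top_k: int = 2) -> list[str]:
    # 2-3. Storage and retrieval: a NumPy matrix stands in for a vector DB.
    q = model.encode([query], normalize_embeddings=True)[0]
    scores = doc_vecs @ q  # cosine similarity, since vectors are normalized
    return [corpus[i] for i in np.argsort(-scores)[:top_k]]

def generate(query: str) -> str:
    # 4. Generation: build the prompt; a real system would call an LLM here.
    context = "\n".join(retrieve(query))
    return f"Answer using this context:\n{context}\n\nQuestion: {query}"

print(generate("Where can I work from a cafe in Jakarta?"))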
Why I Chose LlamaIndex
In this project, I opted to use LlamaIndex because it was the official sponsor of the hackathon and offered tools that streamlined the RAG process. LlamaIndex simplifies tasks like data ingestion, query orchestration, and integration with TiDB, allowing me to focus on implementing key features.
⚠️ Disclaimers: Not Production-Ready Yet
While Hangout AI is a promising prototype that demonstrates the potential of using LLMs for personalized travel itineraries, it’s important to note that it is not yet production-ready. The app is still in development, and there are several areas that require attention to ensure robustness and scalability before it can be used in real-world, high-demand environments.
Why safeguards and improvements are needed
- Accuracy: The app may occasionally provide inaccurate or incomplete itineraries due to limitations in the current data or errors in processing.
- Error Handling: While we’ve implemented retries for failed queries, more advanced error-handling mechanisms are needed to ensure a smooth user experience, especially when dealing with external data like weather and location.
- Security: Proper security measures, such as rate limiting and secure API key management, are needed to safeguard sensitive user data and prevent misuse of the app.
- Scalability: The system’s current architecture may struggle to handle a large number of concurrent users without optimizations for scaling, load balancing, and database management.
The next steps for making the app scalable and reliable
- Optimizing database performance: Improving the efficiency of TiDB with better indexing strategies and query optimization to handle large datasets.
- Improving error management: Implementing a more robust logging and monitoring system to proactively identify and address issues before they affect users.
- Enhancing security: Enforcing stricter security protocols, such as OAuth for user authentication, and encrypting sensitive data.
For more information on securing LLM-based applications, you can learn more about LlamaIndex’s RAG security features through their guide on Secure RAG with LlamaIndex and LLM Guard by Protect AI.
🎉 Conclusion
Hangout AI, built for the TiDB Future App Hackathon 2024, showcases the power of Retrieval-Augmented Generation (RAG) combined with vector search to create personalized travel itineraries. This innovative approach leverages real-time data retrieval and large language models to provide accurate and relevant travel recommendations.
While promising, Hangout AI is still a prototype and not yet ready for production. Key areas for improvement include enhancing accuracy, error handling, security, and scalability. Future efforts will focus on optimizing database performance, implementing robust error management, and enforcing stricter security protocols.
We look forward to refining Hangout AI into a reliable and scalable solution for personalized travel planning.
Congratulations!
You have finished this article. I hope you enjoyed the article. If you have any interesting topics you would like to discuss or questions, please feel free to comment below or reach out to me on my GitHub or LinkedIn.
Thank you and don’t forget to give this post a clap 👏.
🌐 Explore Code
If you want to explore more about this app case and delve deeper into the project, you can find the complete repository at: