Build and Deploy an LLM-Powered Chat with Memory Using Streamlit

In this comprehensive guide, I’ll walk you through building and deploying a chat application powered by Google’s Gemini large language model (LLM) using Streamlit. What makes this chat special is its ability to maintain memory of the conversation, a crucial feature for creating truly interactive AI experiences.

Why Build an LLM-Powered Chat with Streamlit?

Streamlit has revolutionized how data scientists and developers create interactive web applications. Here’s why this combination works so well:

  • Rapid development: Turn Python scripts into web apps with minimal front-end work
  • Native Python: No need to learn additional frameworks or languages
  • Memory capabilities: Streamlit’s session_state maintains conversation history
  • Easy deployment: Streamlit Community Cloud offers free hosting

Project Overview

We’ll build this project in three main phases:

  1. Project Setup (Sections 1-2)
  2. Building the Chat Interface (Sections 3-5)
  3. Deployment and Monitoring (Sections 6-7)

1. Setting Up the Development Environment

Create and Clone GitHub Repository

Start by creating a new repository on GitHub. Then clone it locally:

git clone <your-repository-url>

Virtual Environment Setup

Using a virtual environment ensures project dependencies don’t conflict with your system Python:

pyenv virtualenv 3.9.14 chat-streamlit-tutorial
pyenv activate chat-streamlit-tutorial

Project Structure

Organize your project with these essential files:

chat-streamlit-tutorial/
│
├── .env
├── .gitignore
├── app.py
├── functions.py
├── requirements.txt
└── README.md

Key files:

  • .env: Stores your API key securely
  • app.py: Main Streamlit application
  • functions.py: Helper functions for better code organization
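
A minimal requirements.txt for this project might look like the following (package names assumed from the tools used in this tutorial; pin versions as needed):

```
streamlit
google-generativeai
python-dotenv
```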

2. Configuring the Gemini API

Obtaining Your API Key

Get your free API key from Google AI Studio. The free tier offers generous limits:

  • 15 requests per minute
  • 1 million tokens per minute
  • 1,500 requests per day

Secure API Key Storage

Never hardcode API keys! Use environment variables with this secure approach in functions.py:

import os

import streamlit as st
from dotenv import load_dotenv

def get_secret(key):
    """Read a secret from Streamlit secrets; fall back to a local .env file."""
    try:
        return st.secrets[key]
    except Exception:
        load_dotenv()
        return os.getenv(key)
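
To see the fallback behavior in isolation, here is a simplified, Streamlit-free sketch of the same pattern (the `DEMO_KEY` name and `secrets` mapping are purely illustrative):

```python
import os

def get_secret_fallback(key, secrets=None):
    """Look up key in an optional secrets mapping, else in the environment."""
    if secrets is not None and key in secrets:
        return secrets[key]
    return os.getenv(key)

# Simulate deployment (secrets available) vs. local development (env only)
os.environ["DEMO_KEY"] = "from-env"
print(get_secret_fallback("DEMO_KEY", secrets={"DEMO_KEY": "from-secrets"}))  # from-secrets
print(get_secret_fallback("DEMO_KEY"))                                        # from-env
```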

3. Building the Chat Interface

Core Streamlit Components

We’ll use these key Streamlit features:

  • st.session_state: Maintains chat history between interactions
  • st.chat_message: Creates message bubbles with customizable avatars
  • st.chat_input: Handles user message input

Initializing Chat History

Set up the conversation memory system:

if "chat_history" not in st.session_state:
    st.session_state.chat_history = []

if not st.session_state.chat_history:
    st.session_state.chat_history.append(("assistant", "Hi! How can I help you?"))

Displaying Messages

Render the complete conversation history:

for role, message in st.session_state.chat_history:
    st.chat_message(role).write(message)

4. Implementing Conversation Memory

The key to a truly interactive chat is maintaining context. Here’s how we implement memory:

# Gemini's API expects the roles "user" and "model", so map Streamlit's
# "assistant" role before sending the history
context = [
    *[
        {"role": "model" if role == "assistant" else role, "parts": [{"text": msg}]}
        for role, msg in st.session_state.chat_history
    ],
    {"role": "user", "parts": [{"text": full_input}]},
]

This structure provides the LLM with the complete conversation history, enabling coherent multi-turn discussions.
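
The same conversion can be wrapped in a small, testable helper. Note that Gemini's API uses the role name "model" for its own replies, so any "assistant" entries from the Streamlit history should be remapped (the function name here is illustrative):

```python
def build_context(chat_history, user_input):
    """Convert (role, message) history into Gemini-style content dicts."""
    context = [
        # Gemini expects "user" and "model" roles, not "assistant"
        {"role": "model" if role == "assistant" else role, "parts": [{"text": msg}]}
        for role, msg in chat_history
    ]
    context.append({"role": "user", "parts": [{"text": user_input}]})
    return context

history = [("assistant", "Hi! How can I help you?"), ("user", "What is a list?")]
context = build_context(history, "Show me an example.")
print(len(context))         # 3
print(context[0]["role"])   # model
print(context[-1]["role"])  # user
```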

5. Advanced Features

Prompt Engineering

A well-crafted system prompt dramatically improves responses. Compare these examples:

Basic Prompt:

You are an assistant. Be nice and kind in all your responses.

Advanced Prompt:

You are a friendly and knowledgeable programming tutor. 
Always explain concepts in a simple and clear way, using examples when possible. 
If the user asks something unrelated to programming, politely bring the conversation back to programming topics.
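
One lightweight way to apply such a prompt is to prepend it to the user's message before building the request context; the `full_input` variable used earlier can be assembled like this (a sketch of one approach, not the only one):

```python
SYSTEM_PROMPT = (
    "You are a friendly and knowledgeable programming tutor. "
    "Always explain concepts in a simple and clear way, using examples when possible. "
    "If the user asks something unrelated to programming, politely bring the "
    "conversation back to programming topics."
)

def with_system_prompt(user_input):
    """Prepend the system prompt so it accompanies every request."""
    return f"{SYSTEM_PROMPT}\n\nUser message: {user_input}"

full_input = with_system_prompt("What is a dictionary?")
print(full_input.startswith("You are a friendly"))  # True
```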

Model Configuration

Customize the model’s behavior with these parameters:

  • temperature: Controls response randomness and creativity (0.0-2.0)
  • max_output_tokens: Limits response length

Add a slider for interactive control:

temperature = st.sidebar.slider(
    label="Select the temperature",
    min_value=0.0,
    max_value=2.0,
    value=1.0
)
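
The slider value then feeds into the model's generation settings; with the google-generativeai SDK these are typically collected into a generation_config dictionary (the token limit below is an arbitrary example value):

```python
temperature = 1.0  # in the app this comes from the sidebar slider

generation_config = {
    "temperature": temperature,   # 0.0-2.0; higher values are more creative
    "max_output_tokens": 1024,    # hard cap on the length of each reply
}

print(sorted(generation_config))  # ['max_output_tokens', 'temperature']
```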

6. Deployment to Streamlit Community Cloud

Deploying is straightforward:

  1. Push your code to GitHub
  2. Click “Deploy” in your local Streamlit app
  3. Configure deployment settings
  4. Add your API key in Advanced Settings

Important: Never commit API keys to public repositories!

7. Monitoring API Usage

Track your usage in Google Cloud Console:

  • Monitor request volume and errors
  • Track latency metrics
  • Check quota usage against free tier limits

Key metrics to watch:

  • Requests: Total API calls
  • Error rate: Failed requests percentage
  • Latency (p95): Response time for 95% of requests

Conclusion

You’ve now built a fully functional LLM-powered chat with conversation memory that you can deploy anywhere. This foundation allows for numerous enhancements:

  • Add multi-modal capabilities (images, documents)
  • Implement retrieval-augmented generation (RAG)
  • Create specialized assistants for different domains

The complete code is available on GitHub. Experiment with different prompts and model configurations to create your perfect AI assistant!

Have questions or ideas for improvement? Share them in the comments below!


Jonathan Fernandes (AI Engineer) http://llm.knowlatest.com

Jonathan Fernandes is an accomplished AI Engineer with over 10 years of experience in Large Language Models and Artificial Intelligence. Holding a Master's in Computer Science, he has spearheaded innovative projects that enhance natural language processing. Renowned for his contributions to conversational AI, Jonathan's work has been published in leading journals and presented at major conferences. He is a strong advocate for ethical AI practices, dedicated to developing technology that benefits society while pushing the boundaries of what's possible in AI.
