Tutorial 2: Integrating Django Vector Database
Welcome to Tutorial 2! In the previous tutorial, we setup our Blog Post project that we will use for the rest of the tutorial to demonstrate the feature of django-vectordb
. In this tutorial, we will integrate django-vectordb
into our Blog Site. By the end of this tutorial, we will:
- Integrate
django-vectordb
into our project. - Update our Post model to allow
django-vectordb
to automatically extract the text for vector search, as well as any metadata we want to filter on. - Update the vector database with all the blog Posts that are currently on our site.
- Configure our site to automatically sync with the vector database whenever we
create
,update
ordelete
posts. - Configure our site to display related blog recommendations using the
django-vectordb
.
Let's get started!
Adding VectorDB to the Project
If you haven't installed vectordb already, install it with the following command:
Next lets add django-vectordb
to the INSTALLED_APPS
. Open tutorial/settings.py
and add vectordb
to the installed apps.
# settings.py
INSTALLED_APPS = [
"django.contrib.admin",
"django.contrib.auth",
"django.contrib.contenttypes",
"django.contrib.sessions",
"django.contrib.messages",
"django.contrib.staticfiles",
"blog",
"vectordb", # <--
]
Finally, lets run the migrations to create the vectordb
table.
Update Post Model for VectorDB to Automatically Extract Text and Metadata
Next lets add to methods to our model to allow vectordb to properly index our Posts.
First we need to pick the text we want to search by. For this tutorial we use both the title
and the description
of each blog. We tell vectordb
about the text we want to use through the get_vectordb_text
method.
Second, we need to select the metadata we want to use for filtering. For this example, we will use the user_id
, username
and created_at
of each blog post. We will break down the create_at
into year
, month
and day
to demonstrate nested filtering. We tell django-vectordb
about the metadata we want to use through the get_vectordb_metadata
method. If we don,t provide this method, vectordb will use the django serializer to serialize the entire model. This maybe suboptimal for large models.
Update VectorDB with Existing Posts
Django vector database provides a management command to update the vector database with existing models. It takes two arguments the app_name
and the model_name
, which in this case are blog
and Post
.
Lets run this command to update the vector database with all the existing posts.
Info
Django vector database also provides a management command for reseting the vector database. Note that this is distractive and irreversible.
Summary
In this tutorial, we successfully integrated the Django vector database into our blog post project. Additionally, we synced our existing blog posts with the Django vector database using the vectordb_sync
management command. In the upcoming tutorial, we will set up our blog post app to automatically synchronize with the Django vector database whenever we create
, update
, or delete
blog posts.