Symfony AI - Store Component
============================

The Store component provides a low-level abstraction for storing and retrieving documents in a vector store.

Installation
------------

.. code-block:: terminal

    $ composer require symfony/ai-store

Purpose
-------

A typical use case in agentic applications is dynamically extending the context with similar and useful information,
known as `Retrieval Augmented Generation`_ (RAG). The Store component defines low-level interfaces that are
implemented by concrete, vendor-specific implementations called bridges.
On top of those bridges, the Store component provides higher-level features to populate those stores with documents
and to query them.

Indexing
--------

One higher-level feature is the :class:`Symfony\\AI\\Store\\Indexer\\DocumentIndexer`. Its purpose is to populate a store with documents.
It accepts one or multiple :class:`Symfony\\AI\\Store\\Document\\TextDocument` objects, converts them into embeddings and stores them in the
configured vector store::

    use Symfony\AI\Store\Document\TextDocument;
    use Symfony\AI\Store\Document\Vectorizer;
    use Symfony\AI\Store\Indexer\DocumentIndexer;
    use Symfony\AI\Store\Indexer\DocumentProcessor;

    $vectorizer = new Vectorizer($platform, $model);
    $indexer = new DocumentIndexer(new DocumentProcessor($vectorizer, $store));
    $document = new TextDocument('id-1', 'This is a sample document.');
    $indexer->index($document);

You can find more advanced usage in combination with an Agent using the store for RAG in the examples folder.

Retrieving
----------

The opposite of indexing is retrieving. The :class:`Symfony\\AI\\Store\\Retriever` is a higher-level feature that allows you to
search for documents in a store based on a query string. It vectorizes the query and retrieves similar documents from the store::

    use Symfony\AI\Store\Retriever;

    $retriever = new Retriever($store, $vectorizer);
    $documents = $retriever->retrieve('What is the capital of France?');

    foreach ($documents as $document) {
        echo $document->metadata->get('source');
    }

The retriever accepts optional parameters to customize the retrieval:

* ``$options``: An array of options to pass to the underlying store query (e.g., limit, filters)
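
To make "retrieves similar documents" more concrete, the following sketch ranks a few hand-written vectors by their
cosine similarity to a query vector. It is a conceptual illustration in plain PHP and not the implementation of the
``Retriever`` or of any bridge: the ``cosineSimilarity()`` helper, the sample vectors and the choice of cosine
similarity as the measure are all assumptions made for this example.

.. code-block:: php

    /**
     * Cosine similarity of two equally sized, non-zero vectors.
     *
     * @param list<float> $a
     * @param list<float> $b
     */
    function cosineSimilarity(array $a, array $b): float
    {
        $dot = 0.0;
        $normA = 0.0;
        $normB = 0.0;
        foreach ($a as $i => $value) {
            $dot += $value * $b[$i];
            $normA += $value ** 2;
            $normB += $b[$i] ** 2;
        }

        return $dot / (sqrt($normA) * sqrt($normB));
    }

    // Hypothetical, hand-written embeddings keyed by document ID.
    $vectors = [
        'paris' => [0.9, 0.1, 0.0],
        'berlin' => [0.7, 0.3, 0.1],
        'banana' => [0.0, 0.2, 0.9],
    ];
    $query = [1.0, 0.0, 0.0];

    $scores = [];
    foreach ($vectors as $id => $vector) {
        $scores[$id] = cosineSimilarity($query, $vector);
    }
    arsort($scores); // Highest similarity first, keys are preserved.

    $ranking = array_keys($scores);

In a real setup, the vectors come from the embedding model and the ranking is usually delegated to the vector
database behind the store.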

Example Usage
~~~~~~~~~~~~~

* `Basic Retriever Example`_

Similarity Search Examples
~~~~~~~~~~~~~~~~~~~~~~~~~~

* `Similarity Search with Cloudflare (RAG)`_
* `Similarity Search with Manticore Search (RAG)`_
* `Similarity Search with MariaDB (RAG)`_
* `Similarity Search with Meilisearch (RAG)`_
* `Similarity Search with memory storage (RAG)`_
* `Similarity Search with Milvus (RAG)`_
* `Similarity Search with MongoDB (RAG)`_
* `Similarity Search with Neo4j (RAG)`_
* `Similarity Search with OpenSearch (RAG)`_
* `Similarity Search with Pinecone (RAG)`_
* `Similarity Search with Qdrant (RAG)`_
* `Similarity Search with SurrealDB (RAG)`_
* `Similarity Search with SQLite (RAG)`_
* `Similarity Search with Symfony Cache (RAG)`_
* `Similarity Search with Typesense (RAG)`_
* `Similarity Search with Vektor (RAG)`_
* `Similarity Search with Weaviate (RAG)`_
* `Similarity Search with Supabase (RAG)`_

.. note::

    Both ``InMemory`` and ``PSR-6 cache`` vector stores load all the data into the
    memory of the PHP process. They can be used only when the amount of data fits within the
    PHP memory limit, and are typically used for testing.
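
As a rough back-of-the-envelope check for the note above, the raw size of the vectors can be estimated by multiplying
the number of documents, the number of dimensions and the size of a float. The sketch below is only an illustration:
the ``estimateRawVectorBytes()`` helper and the figures (100,000 documents, 1,536 dimensions, 8 bytes per float) are
assumptions, and the result is a lower bound because PHP arrays add overhead per element.

.. code-block:: php

    // Lower-bound estimate: raw float payload only, without PHP array overhead.
    function estimateRawVectorBytes(int $documents, int $dimensions, int $bytesPerFloat = 8): int
    {
        return $documents * $dimensions * $bytesPerFloat;
    }

    // Assumed figures for illustration: 100,000 documents with 1,536 dimensions each.
    $bytes = estimateRawVectorBytes(100_000, 1536);

    printf("Raw vector payload: %.0f MiB\n", $bytes / 1024 ** 2);
    printf("Configured memory_limit: %s\n", ini_get('memory_limit'));

If the estimate already approaches the configured ``memory_limit``, one of the database-backed stores is likely the
better fit.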

Supported Stores
----------------

* `Azure AI Search`_
* `Chroma`_ (requires ``codewithkyrian/chromadb-php`` as additional dependency)
* `ClickHouse`_
* `Cloudflare`_
* `Elasticsearch`_
* `InMemory`_
* `Manticore Search`_
* `MariaDB`_ (requires ``ext-pdo``)
* `Meilisearch`_
* `Milvus`_
* `MongoDB Atlas`_ (requires ``mongodb/mongodb`` as additional dependency)
* `Neo4j`_
* `OpenSearch`_
* `Pinecone`_ (requires ``probots-io/pinecone-php`` as additional dependency)
* `Postgres`_ (requires ``ext-pdo``)
* `Qdrant`_
* `Redis`_
* `S3 Vectors`_
* `SQLite`_ (requires ``ext-pdo_sqlite``)
* `Supabase`_ (requires manual database setup)
* `SurrealDB`_
* `Symfony Cache`_ (requires ``symfony/cache`` as additional dependency)
* `Typesense`_
* `Vektor`_
* `Weaviate`_

Document Loader
---------------

Creating and loading documents is a critical part of any RAG-based system, because the documents are what the system draws on to understand and respond to queries.
Document loaders are responsible for fetching and preparing documents for indexing and retrieval.

To load documents and integrate them into your RAG system, you can use the provided document loaders or create your own custom loaders to suit your specific needs:

* :class:`Symfony\\AI\\Store\\Document\\Loader\\CsvLoader`
* :class:`Symfony\\AI\\Store\\Document\\Loader\\InMemoryLoader`
* :class:`Symfony\\AI\\Store\\Document\\Loader\\JsonFileLoader`
* :class:`Symfony\\AI\\Store\\Document\\Loader\\MarkdownLoader`
* :class:`Symfony\\AI\\Store\\Document\\Loader\\RssFeedLoader`
* :class:`Symfony\\AI\\Store\\Document\\Loader\\RstLoader`
* :class:`Symfony\\AI\\Store\\Document\\Loader\\RstToctreeLoader`
* :class:`Symfony\\AI\\Store\\Document\\Loader\\TextFileLoader`

Create a Custom Loader
----------------------

The main extension point of the Store component for document loaders is the :class:`Symfony\\AI\\Store\\Document\\LoaderInterface`,
which defines the method to load documents from a source. A loader therefore implements one method::

    use Symfony\AI\Store\Document\LoaderInterface;
    use Symfony\AI\Store\Document\Metadata;
    use Symfony\AI\Store\Document\TextDocument;
    use Symfony\Component\Uid\Uuid;

    class MyDocumentLoader implements LoaderInterface
    {
        public function load(?string $source = null, array $options = []): iterable
        {
            // Fetch the content (and any metadata) from the given source
            $content = ...

            yield new TextDocument(Uuid::v7()->toRfc4122(), $content, new Metadata($metadata));
        }
    }
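
As a more concrete illustration, the following sketch yields one document per non-empty line of a text file. It
assumes the ``LoaderInterface``, ``TextDocument`` and ``Metadata`` signatures shown in the snippet above and that
``symfony/uid`` is installed. The ``LineLoader`` class and the ``source`` and ``line`` metadata keys are hypothetical
names chosen for this example.

.. code-block:: php

    use Symfony\AI\Store\Document\LoaderInterface;
    use Symfony\AI\Store\Document\Metadata;
    use Symfony\AI\Store\Document\TextDocument;
    use Symfony\Component\Uid\Uuid;

    // Hypothetical loader: yields one TextDocument per non-empty line of a text file.
    final class LineLoader implements LoaderInterface
    {
        public function load(?string $source = null, array $options = []): iterable
        {
            if (null === $source || !is_file($source) || !is_readable($source)) {
                throw new \InvalidArgumentException('A readable file path is required as source.');
            }

            $lines = file($source, \FILE_IGNORE_NEW_LINES);
            if (false === $lines) {
                throw new \RuntimeException(sprintf('Unable to read "%s".', $source));
            }

            foreach ($lines as $index => $line) {
                $content = trim($line);
                if ('' === $content) {
                    continue; // Skip blank lines instead of creating empty documents.
                }

                yield new TextDocument(
                    Uuid::v7()->toRfc4122(),
                    $content,
                    new Metadata(['source' => $source, 'line' => $index + 1]),
                );
            }
        }
    }

Because ``load()`` is a generator, the file is only read once the caller starts iterating, which also means the
exceptions are thrown on the first iteration rather than when ``load()`` is called.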

Commands
--------

When using the ``Store`` component in your Symfony application along with the ``AiBundle``,
you can use the ``bin/console ai:store:setup`` command to initialize the store and ``bin/console ai:store:drop`` to clean up the store:

.. code-block:: yaml

    # config/packages/ai.yaml
    ai:
        # ...

        store:
            chromadb:
                symfonycon:
                    collection: 'symfony_blog'

.. code-block:: terminal

    $ php bin/console ai:store:setup symfonycon
    $ php bin/console ai:store:drop symfonycon

Implementing a Bridge
---------------------

The main extension point of the Store component is the :class:`Symfony\\AI\\Store\\StoreInterface`, which defines the methods
for adding, removing and querying vectorized documents in the store.

A store therefore implements the following methods::

    use Symfony\AI\Store\Document\VectorDocument;
    use Symfony\AI\Store\Query\QueryInterface;
    use Symfony\AI\Store\StoreInterface;

    class MyStore implements StoreInterface
    {
        public function add(VectorDocument|array $documents): void
        {
            // Implementation to add one or more documents to the store
        }

        public function remove(string|array $ids, array $options = []): void
        {
            // Implementation to remove documents from the store
        }

        public function query(QueryInterface $query, array $options = []): iterable
        {
            // Implementation to query the store for documents
            return $documents;
        }

        public function supports(string $queryClass): bool
        {
            // Return true if the given query class is supported
            return false;
        }
    }

Managing a Store
----------------

Some vector stores require tables, indexes and so on to be created before vectors can be stored.
The :class:`Symfony\\AI\\Store\\ManagedStoreInterface` defines the methods to set up and drop the store.

A managed store therefore implements two additional methods::

    use Symfony\AI\Store\ManagedStoreInterface;
    use Symfony\AI\Store\StoreInterface;

    class MyCustomStore implements ManagedStoreInterface, StoreInterface
    {
        // ...

        public function setup(array $options = []): void
        {
            // Implementation to create the store
        }

        public function drop(array $options = []): void
        {
            // Implementation to drop the store (and related vectors)
        }
    }
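
What such ``setup()`` and ``drop()`` methods might contain can be sketched with plain PDO and an in-memory SQLite
database. This is a minimal illustration under stated assumptions, not the code of any shipped bridge: the
``SqliteTableManager`` class, its ``exists()`` helper and the table layout are hypothetical, and the class does not
implement the Store interfaces so that it runs with nothing but ``ext-pdo_sqlite``.

.. code-block:: php

    // Hypothetical helper mirroring the setup()/drop() signatures shown above.
    final class SqliteTableManager
    {
        public function __construct(
            private readonly \PDO $connection,
            private readonly string $tableName,
        ) {
            // Table names cannot be bound as parameters, so restrict them to a safe pattern.
            if (!preg_match('/^[A-Za-z_][A-Za-z0-9_]*$/', $tableName)) {
                throw new \InvalidArgumentException(sprintf('Invalid table name "%s".', $tableName));
            }
        }

        public function setup(array $options = []): void
        {
            // IF NOT EXISTS keeps setup safe to run more than once.
            $this->connection->exec(sprintf(
                'CREATE TABLE IF NOT EXISTS %s (id TEXT PRIMARY KEY, vector TEXT NOT NULL, metadata TEXT)',
                $this->tableName,
            ));
        }

        public function drop(array $options = []): void
        {
            // IF EXISTS keeps drop safe to run when the table is already gone.
            $this->connection->exec(sprintf('DROP TABLE IF EXISTS %s', $this->tableName));
        }

        public function exists(): bool
        {
            $statement = $this->connection->prepare(
                "SELECT COUNT(*) FROM sqlite_master WHERE type = 'table' AND name = :name"
            );
            $statement->execute(['name' => $this->tableName]);

            return (int) $statement->fetchColumn() > 0;
        }
    }

    $pdo = new \PDO('sqlite::memory:');
    $pdo->setAttribute(\PDO::ATTR_ERRMODE, \PDO::ERRMODE_EXCEPTION);

    $manager = new SqliteTableManager($pdo, 'my_vectors');
    $manager->setup();
    $manager->setup(); // Second call is a no-op.
    $existsAfterSetup = $manager->exists();

    $manager->drop();
    $existsAfterDrop = $manager->exists();

Making both operations idempotent lets commands such as ``ai:store:setup`` and ``ai:store:drop`` be run repeatedly
without failing on an already prepared or already removed store.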

.. _`Retrieval Augmented Generation`: https://en.wikipedia.org/wiki/Retrieval-augmented_generation
.. _`Basic Retriever Example`: https://github.com/symfony/ai/blob/main/examples/retriever/basic.php
.. _`Similarity Search with Cloudflare (RAG)`: https://github.com/symfony/ai/blob/main/examples/rag/cloudflare.php
.. _`Similarity Search with Manticore Search (RAG)`: https://github.com/symfony/ai/blob/main/examples/rag/manticore.php
.. _`Similarity Search with MariaDB (RAG)`: https://github.com/symfony/ai/blob/main/examples/rag/mariadb-gemini.php
.. _`Similarity Search with Meilisearch (RAG)`: https://github.com/symfony/ai/blob/main/examples/rag/meilisearch.php
.. _`Similarity Search with memory storage (RAG)`: https://github.com/symfony/ai/blob/main/examples/rag/in-memory.php
.. _`Similarity Search with Milvus (RAG)`: https://github.com/symfony/ai/blob/main/examples/rag/milvus.php
.. _`Similarity Search with MongoDB (RAG)`: https://github.com/symfony/ai/blob/main/examples/rag/mongodb.php
.. _`Similarity Search with Neo4j (RAG)`: https://github.com/symfony/ai/blob/main/examples/rag/neo4j.php
.. _`Similarity Search with OpenSearch (RAG)`: https://github.com/symfony/ai/blob/main/examples/rag/opensearch.php
.. _`Similarity Search with Pinecone (RAG)`: https://github.com/symfony/ai/blob/main/examples/rag/pinecone.php
.. _`Similarity Search with SQLite (RAG)`: https://github.com/symfony/ai/blob/main/examples/rag/sqlite.php
.. _`Similarity Search with Symfony Cache (RAG)`: https://github.com/symfony/ai/blob/main/examples/rag/cache.php
.. _`Similarity Search with Qdrant (RAG)`: https://github.com/symfony/ai/blob/main/examples/rag/qdrant.php
.. _`Similarity Search with SurrealDB (RAG)`: https://github.com/symfony/ai/blob/main/examples/rag/surrealdb.php
.. _`Similarity Search with Typesense (RAG)`: https://github.com/symfony/ai/blob/main/examples/rag/typesense.php
.. _`Similarity Search with Supabase (RAG)`: https://github.com/symfony/ai/blob/main/examples/rag/supabase.php
.. _`Similarity Search with Vektor (RAG)`: https://github.com/symfony/ai/blob/main/examples/rag/vektor.php
.. _`Similarity Search with Weaviate (RAG)`: https://github.com/symfony/ai/blob/main/examples/rag/weaviate.php
.. _`Azure AI Search`: https://azure.microsoft.com/products/ai-services/ai-search
.. _`Chroma`: https://www.trychroma.com/
.. _`ClickHouse`: https://clickhouse.com/
.. _`Cloudflare`: https://developers.cloudflare.com/vectorize/
.. _`Elasticsearch`: https://www.elastic.co/elasticsearch
.. _`Manticore Search`: https://manticoresearch.com/
.. _`MariaDB`: https://mariadb.org/projects/mariadb-vector/
.. _`Pinecone`: https://www.pinecone.io/
.. _`Postgres`: https://www.postgresql.org/about/news/pgvector-070-released-2852/
.. _`Meilisearch`: https://www.meilisearch.com/
.. _`Milvus`: https://milvus.io/
.. _`MongoDB Atlas`: https://www.mongodb.com/atlas
.. _`SurrealDB`: https://surrealdb.com/
.. _`InMemory`: https://www.php.net/manual/en/language.types.array.php
.. _`Qdrant`: https://qdrant.tech/
.. _`Redis`: https://redis.io/
.. _`S3 Vectors`: https://docs.aws.amazon.com/AmazonS3/latest/userguide/s3-vectors.html
.. _`Neo4j`: https://neo4j.com/
.. _`OpenSearch`: https://opensearch.org/
.. _`Typesense`: https://typesense.org/
.. _`Symfony Cache`: https://symfony.com/doc/current/components/cache.html
.. _`Vektor`: https://github.com/centamiv/vektor
.. _`Weaviate`: https://weaviate.io/
.. _`SQLite`: https://www.sqlite.org/
.. _`Supabase`: https://supabase.com/