Vector Databases

Weaviate

An open-source vector database with native GraphQL API, object-centric schema, and built-in vectorisation via model provider integrations.

GraphQL
Native API
Auto-vectorise
Built-in models
Multi-tenancy
Supported

Table of Contents

SECTION 01

Weaviate's object-centric model

Most vector databases store vectors and then attach metadata as a side-car. Weaviate flips this: the primary unit is a typed object with properties — the vector is derived from the object's content automatically if you configure a vectoriser module. This means you can insert raw text and Weaviate handles embedding on ingestion, without your application managing the embed API call.

The tradeoff: auto-vectorisation couples your data pipeline to Weaviate's model integrations. For maximum control over embeddings (custom models, batch optimisation), providing your own vectors is usually better.

SECTION 02

Quick start with embedded Weaviate

pip install weaviate-client
import weaviate
import weaviate.classes as wvc

# Embedded Weaviate (in-process, good for dev/testing)
with weaviate.connect_to_embedded() as client:
    print(client.is_ready())   # True

# Weaviate Cloud
client = weaviate.connect_to_weaviate_cloud(
    cluster_url="https://your-cluster.weaviate.network",
    auth_credentials=weaviate.auth.AuthApiKey("your-api-key")
)

# Local Docker: docker run -p 8080:8080 semitechnologies/weaviate:latest
client = weaviate.connect_to_local()
SECTION 03

Schema and collections

import weaviate
import weaviate.classes as wvc
from weaviate.classes.config import Property, DataType, Configure

client = weaviate.connect_to_local()

# Create a collection with explicit schema + bring-your-own vectors
client.collections.create(
    name="Article",
    properties=[
        Property(name="title",   data_type=DataType.TEXT),
        Property(name="content", data_type=DataType.TEXT),
        Property(name="source",  data_type=DataType.TEXT),
    ],
    vectorizer_config=Configure.Vectorizer.none()   # we'll provide vectors ourselves
)

collection = client.collections.get("Article")
print(collection.config.get())
SECTION 04

Inserting and searching objects

from openai import OpenAI
import weaviate, weaviate.classes as wvc

client = weaviate.connect_to_local()
oai = OpenAI()
collection = client.collections.get("Article")

def embed(text):
    return oai.embeddings.create(input=[text], model="text-embedding-3-small").data[0].embedding

# Insert with explicit vector
docs = [
    {"title": "Return Policy", "content": "Refunds accepted within 30 days.", "source": "faq"},
    {"title": "Shipping Info", "content": "Free shipping on orders over $50.", "source": "faq"},
]
with collection.batch.dynamic() as batch:
    for doc in docs:
        batch.add_object(
            properties={"title": doc["title"], "content": doc["content"], "source": doc["source"]},
            vector=embed(doc["content"])
        )

# Vector search
query_vector = embed("What is the refund policy?")
results = collection.query.near_vector(
    near_vector=query_vector,
    limit=3,
    return_metadata=wvc.query.MetadataQuery(distance=True)   # near_vector reports distance, not score
)
for obj in results.objects:
    print(f"Distance {obj.metadata.distance:.3f}: {obj.properties['content']}")
SECTION 05

Hybrid search

import weaviate.classes as wvc
from weaviate.classes.query import HybridFusion

results = collection.query.hybrid(
    query="return policy refund",      # used for BM25 keyword search
    vector=embed("return policy"),     # used for vector search
    alpha=0.5,                         # 0=keyword only, 1=vector only, 0.5=balanced
    fusion_type=HybridFusion.RELATIVE_SCORE,
    limit=5,
    return_metadata=wvc.query.MetadataQuery(score=True)
)
for obj in results.objects:
    print(f"Score {obj.metadata.score:.3f}: {obj.properties['content']}")

alpha=0.5 gives equal weight to BM25 and vector scores. Increase alpha for semantic-heavy queries, decrease it for keyword-heavy queries. RELATIVE_SCORE fusion normalises both score distributions before combining — usually better than RANKED fusion for production.
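To make the fusion arithmetic concrete, here is a pure-Python sketch of relative-score fusion (min-max normalise each result list, then blend with alpha). The document IDs and scores are invented for illustration, and this is a sketch of the idea rather than Weaviate's exact implementation:

```python
def minmax_normalise(scores: dict[str, float]) -> dict[str, float]:
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}

def relative_score_fusion(bm25: dict[str, float], vec: dict[str, float],
                          alpha: float) -> list[tuple[str, float]]:
    # alpha weights the vector side; a doc missing from one list contributes 0 there
    b, v = minmax_normalise(bm25), minmax_normalise(vec)
    fused = {doc: alpha * v.get(doc, 0.0) + (1 - alpha) * b.get(doc, 0.0)
             for doc in set(b) | set(v)}
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)

bm25_scores = {"doc-a": 2.0, "doc-b": 1.0, "doc-c": 0.5}        # raw BM25 scores
vector_scores = {"doc-b": 0.90, "doc-c": 0.80, "doc-a": 0.10}   # cosine similarities

print(relative_score_fusion(bm25_scores, vector_scores, alpha=0.0)[0][0])  # → doc-a (keyword winner)
print(relative_score_fusion(bm25_scores, vector_scores, alpha=1.0)[0][0])  # → doc-b (vector winner)
```

Because each list is normalised before blending, a few extreme BM25 scores cannot drown out the vector side, which is why this fusion tends to behave better than rank-based fusion.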

SECTION 06

Multi-tenancy

import weaviate.classes as wvc
from weaviate.classes.config import Configure

# Enable multi-tenancy on collection creation
client.collections.create(
    name="UserDocs",
    multi_tenancy_config=Configure.multi_tenancy(enabled=True),
    vectorizer_config=Configure.Vectorizer.none()
)

collection = client.collections.get("UserDocs")

# Create tenants
collection.tenants.create([
    wvc.tenants.Tenant(name="tenant-alice"),
    wvc.tenants.Tenant(name="tenant-bob"),
])

# Insert into a specific tenant
tenant_collection = collection.with_tenant("tenant-alice")
tenant_collection.data.insert(
    properties={"content": "Alice's private document."},
    vector=embed("Alice's private document.")
)

# Query — only returns Alice's data
results = tenant_collection.query.near_vector(near_vector=embed("private"), limit=5)
SECTION 07

Gotchas

Collection name capitalisation. Weaviate capitalises the first letter of collection names internally, so article is stored as Article. Both spellings resolve to the same collection, which can silently mask a typo; use capitalised names explicitly in your code.

Schema migrations are destructive. Changing a property's data type requires deleting and recreating the collection. Plan your schema before ingesting large datasets.

Batch error handling. The batch.dynamic() context manager silently swallows individual item errors. After ingestion, check collection.batch.failed_objects to detect partial failures.

gRPC for bulk ingestion. The v4 client uses gRPC for batch operations by default — much faster than REST for large ingestion jobs. Make sure port 50051 is open if running a self-hosted Weaviate.
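A minimal docker-compose sketch exposing both ports (the image tag and environment values are illustrative, not a production config):

```yaml
services:
  weaviate:
    image: semitechnologies/weaviate:latest
    ports:
      - "8080:8080"      # REST / GraphQL
      - "50051:50051"    # gRPC, used by the v4 client for batch operations
    environment:
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: "true"
      PERSISTENCE_DATA_PATH: "/var/lib/weaviate"
```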

SECTION 08

Weaviate deployment and scaling

Weaviate runs as a containerised service and scales horizontally by sharding data across nodes. A single node comfortably serves up to a few million vectors; beyond that, distributed clusters provide resilience and parallel search. Deployment options span Weaviate Cloud (managed), self-hosted Kubernetes, a cloud VM, or local Docker for development.

Performance depends on vector dimensionality, index type, and query patterns. The default HNSW index keeps its graph in RAM and remains fast into the hundreds of millions of vectors if memory allows; at that scale, vector compression (product or binary quantisation) or a disk-backed flat index becomes necessary to control memory cost. Batch imports and periodic maintenance (garbage collection, index compaction) help sustain throughput.

Deployment Model | Vectors Supported | Latency (p50) | Cost | Effort
Local Docker | Up to 1M | <100ms | Free (compute only) | Low
Single node (cloud VM) | Up to 10M | 100–500ms | Low–Medium | Low
Weaviate Cloud | Unlimited | 50–200ms | Medium–High | Very low
Kubernetes cluster | Unlimited | 50–300ms | High (ops) | High
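For sizing a node against the figures above, a rough back-of-envelope for in-memory HNSW is vectors × dimensions × 4 bytes (float32) times a graph-overhead factor. The 1.5× overhead used here is an assumption; the real figure depends on HNSW parameters such as maxConnections:

```python
def hnsw_memory_gb(n_vectors: int, dims: int,
                   bytes_per_dim: int = 4,        # float32
                   graph_overhead: float = 1.5    # assumed multiplier for HNSW graph links
                   ) -> float:
    """Rough RAM estimate for an in-memory HNSW index."""
    return n_vectors * dims * bytes_per_dim * graph_overhead / 1e9

# 1M vectors at 1536 dims (e.g. text-embedding-3-small):
print(f"{hnsw_memory_gb(1_000_000, 1536):.1f} GB")   # → 9.2 GB
```

Estimates like this are for capacity planning only; measure real memory use on a representative sample before committing to node sizes.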

Weaviate advanced features: beyond hybrid search, Weaviate supports cross-references between collections for graph-style traversal, and generative modules that feed retrieved objects to an LLM for in-database answer generation. Vectoriser modules (OpenAI, Cohere, Hugging Face, local models) can be attached at schema definition time so every object is embedded automatically on import; this simplifies ingestion pipelines but adds latency and per-call cost compared with bringing your own vectors.

For large-scale deployments, Weaviate's sharding and replication features provide resilience: a 3-node cluster with replication survives a single-node failure. The GraphQL query language is expressive but requires learning; the REST and gRPC APIs are simpler for basic operations. Community modules extend Weaviate with custom functionality, and the ecosystem includes integrations with LangChain, LlamaIndex, and Hugging Face.

Weaviate vs. other vector databases: Pinecone (serverless, managed), Qdrant (high-performance, self-hosted), Milvus (open-source, built for very large scale), and Chroma (lightweight, for prototyping) are the main alternatives. Weaviate's advantages are expressive GraphQL querying, flexible multi-vectoriser support, and a strong community; its disadvantages are a steeper learning curve and more operational overhead when self-hosting. Teams already using GraphQL will find it natural; teams preferring REST or simple filtering may be better served by Pinecone or Chroma.

Weaviate's hybrid search is particularly useful for retrieval-augmented generation (RAG): a single query matches on semantic similarity and keyword presence simultaneously, getting the best of both worlds. Adding a reranking stage (retrieve ~100 candidates, rescore them with a cross-encoder or LLM, return the top 10) further improves quality at the cost of latency.
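The rerank step can be sketched independently of Weaviate: retrieve a wide candidate set, rescore it with a stronger model, keep the top few. Here score_fn stands in for a cross-encoder or LLM call, and the word-overlap scorer is purely a toy for illustration:

```python
from typing import Callable

def rerank(query: str, candidates: list[str],
           score_fn: Callable[[str, str], float], top_n: int = 10) -> list[str]:
    # In production, score_fn would call a cross-encoder or an LLM.
    return sorted(candidates, key=lambda doc: score_fn(query, doc), reverse=True)[:top_n]

def overlap_score(query: str, doc: str) -> float:
    # Toy scorer: fraction of query words that appear in the document
    q = set(query.lower().split())
    return len(q & set(doc.lower().split())) / len(q)

docs = ["Refunds accepted within 30 days.", "Free shipping on orders over $50."]
print(rerank("refunds within 30 days", docs, overlap_score, top_n=1))
```

The latency cost is one scoring call per candidate, so the candidate-set size (100 here, per the text) is the main tuning knob.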

Scaling Weaviate requires planning: decide on sharding strategy early (by data source, by customer, by region), monitor performance, and adjust gradually. For new projects, start simple (single node) and upgrade to clustering only if needed. Most projects never need more than a 3-node cluster.

Integration Patterns & Best Practices

Weaviate integrates cleanly with ML pipelines and RAG frameworks: it serves as the vector store backend while your application handles orchestration and LLM calls. Its object-centric design enables richer metadata filtering than many vector databases, and official clients exist for Python, TypeScript/JavaScript, Go, and Java.

Best practices: design the schema up front (migrations are destructive), tune batch sizes for imports, and configure replication where reliability matters. GraphQL takes some learning but provides powerful filtering and aggregation; in production, monitor query performance and rebalance shards as data grows.

import weaviate
from weaviate.classes.config import Property, DataType, Configure
from weaviate.classes.query import Filter

client = weaviate.connect_to_local()

# Create a collection with an auto-vectoriser module
client.collections.create(
    name="Document",
    vectorizer_config=Configure.Vectorizer.text2vec_openai(),
    properties=[
        Property(name="content",   data_type=DataType.TEXT),
        Property(name="source",    data_type=DataType.TEXT),
        Property(name="timestamp", data_type=DataType.DATE),
    ],
)

# Query with a metadata filter
documents = client.collections.get("Document")
result = documents.query.fetch_objects(
    filters=Filter.by_property("source").equal("handbook.pdf"),
    limit=10,
)
Feature | Strength | Trade-off
Object-centric model | Rich metadata filtering | More complex queries
GraphQL interface | Powerful & flexible | Steeper learning curve
Hybrid search | Combines vector & keyword | Added latency
Replication | High availability | Complexity & cost

Weaviate continues to evolve with new capabilities. Recent releases have added support for generative modules, enabling in-database answer generation. The community actively contributes integrations and extensions, making Weaviate increasingly valuable in complex RAG architectures.