As of June 2024:
LLM: Gemini 1.5 Flash
Database: Vespa
Embeddings: text-embedding-3-small (OpenAI)
Reranker: Cohere Rerank 3
Relevance Scoring: Hybrid (normalized cosine similarity and BM25), over several fields
Chunking: 1024 tokens per chunk; each document is stored once, with an array of chunks
Overlap: None
Retrieval: Either the entire document, or the best chunk n together with its neighbors, chunks n-1 and n+1
VPS: Hetzner Cloud, Virginia, US
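
The chunking, scoring, and retrieval settings above can be illustrated with a minimal Python sketch. It is not the actual Vespa configuration: the function names, the min-max normalization, and the equal weighting (alpha = 0.5) are all assumptions made for illustration, and the real setup scores over several fields rather than one score per chunk.

```python
from typing import List, Sequence


def chunk_tokens(tokens: Sequence[str], size: int = 1024) -> List[List[str]]:
    """Split tokens into consecutive chunks of at most `size` tokens, no overlap."""
    if size <= 0:
        raise ValueError("size must be positive")
    return [list(tokens[i:i + size]) for i in range(0, len(tokens), size)]


def min_max(values: Sequence[float]) -> List[float]:
    """Rescale values to [0, 1] (assumed normalization; all-equal input maps to 0.0)."""
    if not values:
        return []
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def hybrid_scores(cosine: Sequence[float], bm25: Sequence[float],
                  alpha: float = 0.5) -> List[float]:
    """Blend normalized cosine similarity and BM25; alpha is a hypothetical weight."""
    if len(cosine) != len(bm25):
        raise ValueError("cosine and bm25 must have the same length")
    return [alpha * c + (1 - alpha) * b
            for c, b in zip(min_max(cosine), min_max(bm25))]


def best_chunk_with_neighbors(chunks: Sequence[List[str]],
                              scores: Sequence[float]) -> List[List[str]]:
    """Return the best chunk n plus chunks n-1 and n+1 where they exist."""
    if not chunks or len(chunks) != len(scores):
        raise ValueError("chunks and scores must be non-empty and the same length")
    best = max(range(len(scores)), key=scores.__getitem__)
    return list(chunks[max(best - 1, 0):best + 2])


if __name__ == "__main__":
    tokens = [str(i) for i in range(2500)]      # stand-in for a tokenized document
    chunks = chunk_tokens(tokens)               # chunk sizes: 1024, 1024, 452
    scores = hybrid_scores([0.2, 0.9, 0.4], [1.0, 5.0, 3.0])
    context = best_chunk_with_neighbors(chunks, scores)
    print(len(chunks), len(context))
```

Because there is no overlap between chunks, returning chunks n-1 and n+1 alongside the best chunk is what supplies surrounding context when the whole document is not retrieved.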