Funds for NGOs
AI / RAG · 2024

FUNDSFORNGOS

From 10,000 daily crawled pages to a streaming AI answer in under a second — with 70% lower API costs.

70% lower LLM cost vs naive RAG
10K+ grant pages crawled daily
1K+ daily active NGO users
Client: FundsForNGOs.org
Domain: AI / RAG Infrastructure
Platform: Web + Flutter
Duration: Jan 2024 to Present
Core Stack: Rust · pgvector · Next.js
The Brief

THE PROBLEM

FundsForNGOs needed NGO professionals to ask natural language questions and get precise, sourced answers from a living corpus of 10,000+ grant pages updated daily.

We built a Rust/Axum backend with a Tokio-powered async crawler, content-hash-based deduplication, pgvector HNSW semantic search, and a WordPress → JWT → Rust auth bridge. Two platforms, Next.js web and Flutter mobile, are served from a single API.

Core Engineering Challenge

Build a RAG pipeline that stays accurate as grants expire daily, scales to 1,000+ concurrent users, enforces subscription tiers at the API layer, and does it at a fraction of what a naive LLM integration would cost.

70% Cost Reduction
1K+ Daily Active Users
<1s Time to First Token
10K+ Pages Indexed Daily
Extended stack
Backend: Rust / Axum · Tokio
Database: pgvector · PostgreSQL · Redis
Frontend: Next.js
Mobile: Flutter
AI: OpenAI GPT-4
Auth: JWT
Flutter Mobile — Chat Interface
How we built it

THE ARCHITECTURE

01 Async Rust Crawler

Tokio-powered async crawler ingesting 10,000+ grant pages daily. Content-based hashing skips unchanged pages entirely.

Rust · Tokio · Axum · Content Hashing
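
To make the content-hash skip concrete, here is a minimal sketch in Python (the production crawler is Rust; Python is used here only for brevity). The whitespace normalisation, the choice of SHA-256, and the names `content_hash` and `pages_to_process` are assumptions for illustration, not details taken from the project.

```python
import hashlib


def content_hash(html: str) -> str:
    """Hash page content after collapsing whitespace (normalisation is an assumption)."""
    normalised = " ".join(html.split())
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


def pages_to_process(crawled: dict, seen_hashes: dict) -> list:
    """Return URLs whose content changed since the last crawl.

    `seen_hashes` maps URL -> last stored digest and is updated in place.
    """
    changed = []
    for url, html in crawled.items():
        digest = content_hash(html)
        if seen_hashes.get(url) != digest:
            changed.append(url)
            seen_hashes[url] = digest
    return changed
```

Only the URLs returned by `pages_to_process` would move on to chunking and embedding; everything else is skipped for that crawl cycle.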
02 Chunk Deduplication

Chunks are fingerprinted and compared before embedding. Semantically duplicate content is collapsed to one canonical embedding, cutting embedding API calls by ~45%.

Chunk Fingerprinting · pgvector · Cosine Similarity
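
A rough sketch of the fingerprinting step, in Python for brevity. It covers only exact duplicates after text normalisation; the cosine-similarity comparison mentioned above is not shown. The normalisation rule and the names `fingerprint` and `dedupe_chunks` are hypothetical.

```python
import hashlib
import re


def fingerprint(chunk: str) -> str:
    """Fingerprint a chunk after lowercasing and collapsing punctuation/whitespace."""
    canonical = re.sub(r"[^a-z0-9]+", " ", chunk.lower()).strip()
    return hashlib.sha1(canonical.encode("utf-8")).hexdigest()


def dedupe_chunks(chunks: list) -> tuple:
    """Return (unique_chunks, canonical_of).

    `unique_chunks` is what would be sent to the embedding API;
    `canonical_of` maps each input index to its index in `unique_chunks`.
    """
    unique = []
    first_seen = {}
    canonical_of = {}
    for i, chunk in enumerate(chunks):
        fp = fingerprint(chunk)
        if fp not in first_seen:
            first_seen[fp] = len(unique)
            unique.append(chunk)
        canonical_of[i] = first_seen[fp]
    return unique, canonical_of
```

Every chunk that maps onto an existing canonical entry is one embedding call that never has to be made.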
03 pgvector Semantic Search

HNSW approximate nearest-neighbour search with re-ranking by recency and source authority before LLM context assembly.

pgvector · HNSW Index · Re-ranking
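
One way the re-ranking step could look, sketched in Python. The scoring formula, the weights, the 30-day half-life, and the `authority` field are all assumptions; the source only states that recency and source authority are taken into account.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    url: str
    similarity: float  # cosine similarity from the ANN search, 0..1
    age_days: float    # days since the page was last updated
    authority: float   # hypothetical 0..1 source-authority score


def rerank(cands, half_life_days=30.0, w_sim=0.6, w_rec=0.25, w_auth=0.15):
    """Order candidates by a weighted blend of similarity, recency and authority."""
    def score(c):
        recency = 0.5 ** (c.age_days / half_life_days)  # exponential decay with age
        return w_sim * c.similarity + w_rec * recency + w_auth * c.authority

    return sorted(cands, key=score, reverse=True)
```

With these illustrative weights, a fresh page with slightly lower similarity outranks a year-old page, which matches the goal of keeping expired grants out of the LLM context.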
04 WordPress → JWT → Rust Auth

A custom WP plugin issues short-lived JWTs. The Rust /auth/callback endpoint validates them, creates sessions, and enforces subscription tiers server-side.

WordPress IdP · JWT · Rust Sessions · Rate Limiting
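
The validation step can be illustrated with a stdlib-only Python sketch. It assumes an HS256 token signed with a shared secret and carrying `sub`, `tier`, and `exp` claims; the source does not state the signing algorithm or claim names, so treat all of these as hypothetical. `sign_hs256` stands in for the WP plugin.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url_decode(part: str) -> bytes:
    return base64.urlsafe_b64decode(part + "=" * (-len(part) % 4))


def _b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def sign_hs256(claims: dict, secret: bytes) -> str:
    """Stand-in for the WP plugin: issue a compact HS256 JWT."""
    header = _b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url_encode(json.dumps(claims).encode())
    sig = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64url_encode(sig)}"


def verify_hs256(token: str, secret: bytes, now=None) -> dict:
    """Return the claims if signature and expiry check out, else raise ValueError."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("malformed token")
    header_b64, payload_b64, sig_b64 = parts
    header = json.loads(_b64url_decode(header_b64))
    if header.get("alg") != "HS256":  # reject any algorithm we did not expect
        raise ValueError("unexpected algorithm")
    expected = hmac.new(
        secret, f"{header_b64}.{payload_b64}".encode(), hashlib.sha256
    ).digest()
    if not hmac.compare_digest(expected, _b64url_decode(sig_b64)):
        raise ValueError("bad signature")
    claims = json.loads(_b64url_decode(payload_b64))
    if claims.get("exp", 0) <= (time.time() if now is None else now):
        raise ValueError("token expired")
    return claims
```

The constant-time `hmac.compare_digest` comparison and the explicit algorithm check are the two details that matter most when validating tokens from an external identity provider.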
05 Streaming Chat Interface

Rust API streams completions via SSE through Next.js API routes. Flutter mobile uses the same API with native streaming UI.

SSE · Next.js · Flutter · GPT-4
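
For readers unfamiliar with SSE framing, this small Python sketch shows how completion tokens can be wrapped as Server-Sent Events. The closing `done` event name is an assumption; the source does not describe how the stream is terminated.

```python
def sse_event(data, event=None):
    """Format one SSE frame; multi-line data gets one 'data:' field per line."""
    lines = [f"event: {event}"] if event else []
    lines += [f"data: {line}" for line in data.split("\n")]
    return "\n".join(lines) + "\n\n"  # a blank line terminates the frame


def stream_tokens(tokens):
    """Yield one SSE frame per completion token, then a closing 'done' event."""
    for tok in tokens:
        yield sse_event(tok)
    yield sse_event("", event="done")
```

Because each token is flushed as its own frame, the client can start rendering as soon as the first token arrives rather than waiting for the full completion.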
System Data Flow
Ingestion: Crawler · Content Hash · 10K+ pages
Processing: Chunker · Deduplicator · Embedding API
Storage: pgvector · PostgreSQL · Redis Cache
Auth: Identity Provider · JWT Bridge · Sessions
Query: HNSW Search · Re-ranker · LLM Stream
Delivery: Web App · Mobile App · Users
What we delivered

THE RESULTS

70%
Cost Reduction

Lower OpenAI API spend than naive RAG, achieved through content-hash skipping on crawl, chunk-fingerprint deduplication before embedding, and a Redis embedding cache. Validated in production billing dashboards month over month.

1K+
Daily Active Users

NGO professionals use the AI assistant daily for grant discovery and drafting; autoscaling Rust + connection pooling kept p95 latency flat during campaign spikes without customer-visible outages.

<1s
Time to First Token

Rust Axum streams completion tokens over SSE through Next.js API routes (and analogous Flutter streaming clients), so users see answers begin almost immediately after hitting send.

10K+
Pages Indexed Daily

Fully automated crawl pipeline with sitemap and WordPress REST awareness — zero manual re-index jobs — so the corpus stays aligned with expiring grants without a content ops team babysitting uploads.

45%
Embedding Calls Cut

Chunk deduplication and content-hash skips eliminate redundant OpenAI embedding calls when pages or paragraphs are unchanged, directly lowering variable cost per crawl cycle.

2×
Platforms from One API

Next.js web and Flutter mobile both consume the same Rust Axum API, JWT/session bridge, and rate limits — one auth and billing story instead of divergent backends per client.

Visual documentation

SCREENS & INTERFACES

Next.js Web — Grant Discovery
Admin Dashboard
Usage Analytics
Flutter — Search Results
Engineering decisions

TECH DEEP DIVE

RUST
Why Rust for the Backend

Memory safety without GC pauses matters when streaming LLM responses to 1,000+ concurrent users. Axum's Tower middleware gives per-route rate limiting with negligible overhead.

The Rust backend handles auth, crawling, embedding pipeline, vector search, and streaming — all from a single binary.
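
As an illustration of per-route rate limiting, here is a token-bucket sketch in Python. The production system uses Axum/Tower middleware in Rust; the bucket algorithm, the route names, and the limits below are assumptions chosen for the example.

```python
class TokenBucket:
    """Allow bursts up to `capacity`, refilled at `refill_per_sec` tokens per second."""

    def __init__(self, capacity, refill_per_sec, now=0.0):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.tokens = float(capacity)
        self.last = now

    def allow(self, now):
        """Consume one token if available; `now` is a timestamp in seconds."""
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


# Hypothetical per-route limits: (burst capacity, refill rate per second).
ROUTE_LIMITS = {"/chat": (2, 1.0), "/search": (10, 5.0)}
buckets = {route: TokenBucket(c, r) for route, (c, r) in ROUTE_LIMITS.items()}
```

In a real deployment there would be one bucket per user and route rather than one per route, so that a single heavy user cannot exhaust the limit for everyone.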

VEC
pgvector over Pinecone

pgvector lives in the same PostgreSQL instance as user and subscription data, so filtering search results with a JOIN is trivial and cross-service calls are eliminated.

HNSW gives ~95% recall at 10× the query speed of exact kNN at this corpus size.
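
A recall figure like the one above is typically measured by comparing the approximate index's results against brute-force exact kNN on a sample of queries. The Python sketch below shows that comparison; the function names are hypothetical and the approximate results would come from the HNSW index in practice.

```python
import math


def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def exact_knn(query, vectors, k):
    """Brute-force k nearest neighbours by cosine distance; returns row ids."""
    ranked = sorted(vectors, key=lambda vid: cosine_distance(query, vectors[vid]))
    return ranked[:k]


def recall_at_k(approx_ids, exact_ids):
    """Fraction of the true top-k that the approximate search also returned."""
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)
```

Averaging `recall_at_k` over a representative set of user queries gives the recall number; timing the two search paths on the same queries gives the speed-up.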

COST
The 70% Cost Reduction

The savings come from three mechanisms: content-hash crawl skipping, chunk-fingerprint deduplication, and a Redis embedding cache (24h TTL) for popular query vectors.

Most RAG pipelines re-embed everything on every crawl. The incremental approach cut our bill from ~$2,400/mo to ~$720/mo.
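
The query-embedding cache can be sketched as follows, with an in-memory dictionary standing in for Redis. The key derivation (SHA-256 of the lowercased, trimmed query) and the class name `EmbeddingCache` are assumptions; only the 24h TTL comes from the text above.

```python
import hashlib


class EmbeddingCache:
    """In-memory stand-in for the Redis cache: query text -> embedding with a TTL."""

    def __init__(self, embed_fn, ttl_seconds=24 * 3600):
        self.embed_fn = embed_fn  # the call that would hit the embedding API
        self.ttl = ttl_seconds
        self.store = {}           # key -> (expires_at, embedding)
        self.api_calls = 0

    def get(self, query, now):
        """Return a cached embedding, calling the API only on a miss or expiry."""
        key = hashlib.sha256(query.strip().lower().encode("utf-8")).hexdigest()
        hit = self.store.get(key)
        if hit and hit[0] > now:
            return hit[1]
        self.api_calls += 1
        embedding = self.embed_fn(query)
        self.store[key] = (now + self.ttl, embedding)
        return embedding
```

As a check on the headline number, going from ~$2,400/mo to ~$720/mo is a reduction of 1 − 720/2400 = 0.70, which is the 70% quoted throughout.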

AUTH
The WordPress Auth Bridge

Rather than migrating auth (a 6-month project), we wrote a custom WP plugin that issues a JWT. The Rust backend validates it, creates a server-side session, and maps the WP tier to API permissions.

Session cookies are HttpOnly and bound to the Rust session store. The Next.js frontend has no knowledge of subscription limits.
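
The tier-to-permission mapping might look something like the Python sketch below. The tier names, quotas, claim names, and the `Secure`/`SameSite` cookie attributes are all hypothetical; the source only confirms that sessions live server-side, cookies are HttpOnly, and limits are enforced by the backend.

```python
import secrets

# Hypothetical tiers and limits; the real values are not given in the source.
TIER_PERMISSIONS = {
    "free": {"daily_questions": 5, "can_draft": False},
    "pro": {"daily_questions": 200, "can_draft": True},
}


class SessionStore:
    def __init__(self):
        self._sessions = {}

    def create(self, wp_claims):
        """Create a server-side session from validated JWT claims; return an opaque id."""
        tier = wp_claims.get("tier", "free")
        perms = TIER_PERMISSIONS.get(tier, TIER_PERMISSIONS["free"])
        session_id = secrets.token_urlsafe(32)
        self._sessions[session_id] = {"user": wp_claims["sub"], "used": 0, **perms}
        return session_id

    def authorize_question(self, session_id):
        """Count one question against the quota; False if unknown or over the limit."""
        s = self._sessions.get(session_id)
        if s is None or s["used"] >= s["daily_questions"]:
            return False
        s["used"] += 1
        return True


def session_cookie(session_id):
    """Set-Cookie value; HttpOnly keeps the id out of reach of frontend JavaScript."""
    return f"session={session_id}; HttpOnly; Secure; SameSite=Lax; Path=/"
```

Because the cookie carries only an opaque id, the frontend never sees tier or quota data, and a tier change takes effect the next time a session is created.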

Start a project

LET'S BUILD SOMETHING.

We take on a small number of projects at a time. If the problem is hard, we're interested.

Email: hello@techmusketeers.com
Response time: Within 24 hours
Availability: Open for new projects · 2025