300+ installs in 24h, RAG Firewall now with GraphRAG support.
A couple of extra notes I didn’t fit in the main post:
– The firewall runs entirely client-side, so no data ever leaves your environment.
– It focuses on *retrieval-time* risks, not output moderation — so the LLM never sees poisoned chunks in the first place.
– Policies are YAML: each rule can deny, allow, or just re-rank risky docs (weighted by recency, provenance, and relevance).
– Overhead is low: the scanners are regex/heuristic, so for ~5–20 retrieved chunks they add only a few ms (rough sketch of how the pieces fit together after this list).
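To make those last two points concrete, here's a minimal sketch of what a retrieval-time pass could look like. The YAML field names, regex patterns, and function names here are my own illustration of the idea, not the package's actual schema or API:

    import re
    import yaml  # PyYAML

    # Illustrative policy -- field names are assumptions, not the real schema.
    POLICY_YAML = """
    rules:
      - name: prompt-injection
        action: deny          # drop the chunk before it reaches the LLM
      - name: stale-docs
        action: rerank        # keep the chunk, but push it down the ranking
        max_age_days: 365
    """

    # A couple of cheap regex heuristics for injection-style payloads.
    INJECTION_PATTERNS = [
        re.compile(r"ignore (all )?(previous|prior) instructions", re.I),
        re.compile(r"you are now (in )?developer mode", re.I),
        re.compile(r"reveal (the )?system prompt", re.I),
    ]

    def scan_chunks(chunks, policy_yaml=POLICY_YAML):
        """Filter/re-rank retrieved chunks at retrieval time.

        `chunks` is a list of dicts like {"text": ..., "score": ..., "age_days": ...}.
        Returns the chunks allowed through, re-ranked where the policy says so.
        """
        policy = yaml.safe_load(policy_yaml)
        rules = {r["name"]: r for r in policy["rules"]}
        kept = []
        for chunk in chunks:
            # Deny: injection-looking chunks never make it into the context window.
            if any(p.search(chunk["text"]) for p in INJECTION_PATTERNS):
                if rules["prompt-injection"]["action"] == "deny":
                    continue
            # Re-rank: stale chunks stay, but with a penalised relevance score.
            if chunk.get("age_days", 0) > rules["stale-docs"]["max_age_days"]:
                chunk = {**chunk, "score": chunk["score"] * 0.5}
            kept.append(chunk)
        return sorted(kept, key=lambda c: c["score"], reverse=True)

The only structural point this is meant to show is where the pass sits: between the retriever and the prompt builder, so anything the policy denies never reaches the model at all.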
I’d love feedback on two things in particular:
1. Do you think retrieval-time filtering belongs in the pipeline, or should it all be done at ingest/output?
2. If you’ve got prompt injection payloads or edge cases you use to test your own RAG stacks, I’d love to try them against this.
Thanks for taking a look — always happy to hear critique, especially from folks running LangChain/LlamaIndex in production.