grepctl is a command-line and programmatic utility that enables semantic search across heterogeneous data lakes. By leveraging Google Cloud's advanced AI services and BigQuery's vector search capabilities, grepctl transforms unstructured data into a semantically searchable index. We describe the data ingestion pipeline, multimodal processing architecture, and the multiple interfaces—CLI, Web, Python, and SQL—that make this system both powerful and accessible.
grepctl is a command-line and programmatic utility that enables semantic search across heterogeneous data lakes. By leveraging Google Cloud's advanced AI services and BigQuery's vector search capabilities, grepctl transforms unstructured data into a semantically searchable index. We describe the data ingestion pipeline, multimodal processing architecture, and the multiple interfaces—CLI, Web, Python, and SQL—that make this system both powerful and accessible.