Upload any chip datasheet. Ask anything. CircuitSage answers with the exact page citation — grounded in YOUR document, not ChatGPT’s training data. 17/17 register-intent queries verified across STM32, ESP32, nRF52, and RP2040.
Built by an embedded engineer who got tired of ChatGPT hallucinating bit positions at 11pm.
Drop any PDF datasheet below — ESP32, STM32, nRF52, RP2040, TI, AVR, anything. First upload takes about a minute to index (we embed every page for semantic retrieval). Then ask questions in plain English. Every answer carries the exact page number and real register names from YOUR document.
A new sensor. A 180-page datasheet. Buried register tables, contradictory notes, and a reference code example that's clearly wrong. You've been here before. Every new component is a research project.
GPT-4 confidently gives you register values. You flash the firmware. Nothing works. The values were hallucinated. Generic AI has no concept of a register map, a timing constraint, or a voltage rail. Your hardware paid the price.
STM32 HAL is 40,000 lines. ESP-IDF examples are outdated. Arduino libraries don't expose what you actually need. You're back to reading the datasheet anyway — just angrier.
Every generic ChatGPT-with-PDF tool will confidently cite bit 14 of RCC_APB2ENR for USART1 clock enable. It’s bit 4. Bit 14 is SYSCFGEN. Your hardware will silently fail. CircuitSage is built to NOT do that.
BM25 catches exact hex-address and register-name matches. Embedding cosine catches semantic matches BM25 misses. We rerank BM25’s top 1000 chunks by semantic similarity. Either alone is brittle on dense 1700-page reference manuals. Combined, they surface the canonical register page every time.
Register definitions routinely span page breaks: the heading is on page 189, the bit-field list continues on 190. We track register IDs across pages so the bit “Bit 4 USART1EN” always stays attached to “RCC_APB2ENR” even if they’re in different chunks.
When a page hosts the tail of one register’s bits and the start of the next register’s section, a generic LLM will mis-attribute. CircuitSage’s system prompt enforces: a bit belongs to the register heading that APPEARS BEFORE it, never after. Tested on STM32 page 184 (AHB1ENR tail + AHB2ENR start), nails it every time.
17/17 register-intent queries verified across STM32 RM0090 (1757 pages), ESP32 TRM (783), nRF52840 PS (979), RP2040 datasheet (641). Every claimed register name confirmed to appear within ±2 pages of the citation. When we don’t find it, we SAY so — not make something up.
We extract text per page, sub-chunk long register-definition pages, and embed every chunk with text-embedding-3-small. Your 1700-page STM32 RM becomes ~3500 searchable chunks in under a minute.
BM25 picks its top 1000 for the query. We rerank those by embedding cosine similarity and keep the top 15. This catches both “USART1EN at 0x40023844” (BM25 win) and “enable the clock for UART” (embedding win).
GPT-4o gets the retrieved chunks plus a domain-tuned system prompt: cite the exact page, never guess bit positions, flag the answer if the info isn’t in context. Temperature 0 so you get the same grounded answer every time.
CircuitSage is live and free during MVP. Drop your email to hear when we add vendor-specific optimizations, more models, or a priced tier — so you don’t miss the next thing that saves you a 200-page read.
No spam. One email when there’s something real to say.
ChatGPT-with-PDF reads your doc into a very long context and lets the model retrieve from attention. That works until the doc is bigger than the context window, or the answer is on page 190 while page 201 (a sibling register) scores higher on the LLM’s internal attention. CircuitSage uses a two-stage retriever (BM25 for exact hex/register matches + embedding cosine rerank for semantic matches) that we’ve measurably tuned to land the canonical register page in the top 15 on dense manuals. We also enforce domain rules in the system prompt (bit-to-register attribution, grounded citations, honest “I don’t know”) that generic chat UX doesn’t. Net: 17/17 register-intent queries verified grounded across 4 vendors.
This is the core. Every register name in an answer is checked to appear within ±2 pages of the citation on a live verification harness. Across STM32 RM0090, ESP32 Technical Reference Manual, nRF52840 Product Spec, and RP2040 datasheet, we’ve run 17 register-intent queries and all 17 are either grounded or honestly abstain. GPT-4o plus our system prompt will say “this doesn’t appear in the sections I found” rather than invent.
Any chip whose datasheet is a text-based PDF (not a scanned image). We’ve verified on ESP32, STM32 F4 family, nRF52840, and RP2040. TI, Microchip, Renesas, Nordic — as long as the PDF has extractable text, it works. Scanned-image PDFs are rejected explicitly.
Depends on the PDF. A 78-page ESP32 datasheet takes ~5 seconds. A 1757-page STM32 reference manual takes about 60 seconds (we embed every one of ~3500 chunks). After that, questions answer in 5–15 seconds.
Free during MVP. Chunks and embeddings live in memory only for the duration of your session — we don’t persist uploaded PDFs or your questions. The server has no database.
The repo is private for now while we stabilize. The design decisions (hybrid retriever, cross-page register carryforward, bit-to-register discipline) are described above so you can evaluate the approach before trusting it with your hardware work.