PART II: What Developers Need to Know About AI Workflows

Making AI Reliable for Renewable Developers

This article is Part II of our three-part series on how developers should evaluate AI tools for their workflows. In Part I, we explored why developers should think of AI not as magic, but as infrastructure that transforms scattered data into actionable insights.

Imagine you’re screening a hundred parcels spread across multiple counties. You could ask an AI tool to “check zoning for each site,” but what you’d receive depends entirely on what kind of system is answering. One kind will hand you a structured, reproducible dataset; the other will return a hundred different narratives that sound convincing but can’t be audited and may vary from one LLM to the next.

At the core, this difference comes down to methodology. Large language models such as ChatGPT operate on a probabilistic principle. They generate text token by token, drawing on general web training and whatever they can retrieve at the moment of a query. They are exceptional for reasoning and writing, but they have no built-in concept of coverage or persistence. Each answer exists in isolation, dependent on a prompt and a model state that recedes once the chat ends. Nor is the model always designed to clarify or question the user’s intention, which can lead to an output on the wrong parcel entirely, one that merely has a similar-sounding name. In our internal testing, we asked ChatGPT and Gemini to research the same parcel of land; the two tools ran the query on two completely different parcels with similar-sounding names in two different states.

By contrast, purpose-built, industry-specific systems like Spark use a deterministic, coverage-first approach. These platforms pre-define the universe of sources, such as zoning regulation portals, public meeting archives, local news, and filings, and then crawl them continuously on a schedule. Data is normalized, indexed, and versioned so that every finding has a timestamp and citation. The workflow isn’t conversational; it’s engineered: crawl → extract → normalize → index → alert → cite.
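To make that contrast concrete, here is a minimal Python sketch of what a coverage-first pipeline looks like in principle. Every function and field name below is hypothetical, not Spark’s implementation or API; the point is simply that the source universe is fixed up front and every value carries a citation and a timestamp.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

# Illustrative sketch only -- all names here are hypothetical stand-ins.

@dataclass
class Finding:
    parcel_id: str
    field: str              # e.g. "zoning_district"
    value: Optional[str]    # None means "not found"; never guessed
    source_url: str         # citation back to the exact source
    retrieved_at: str       # timestamp, so results can be versioned and re-checked

def crawl(url: str) -> str:
    """Stub: fetch a zoning portal page, meeting archive, or filing as raw text."""
    return f"raw text fetched from {url}"

def extract(raw: str) -> list[tuple[str, str]]:
    """Stub: pull structured fields out of the raw text."""
    return [("zoning_district", "A-1 Agricultural")]

def normalize(value: str) -> str:
    """Stub: map values onto canonical codes, units, and spellings."""
    return value.strip().upper()

def run_pipeline(parcel_id: str, sources: list[str]) -> list[Finding]:
    """Deterministic flow: crawl -> extract -> normalize -> (index, alert, cite)."""
    findings = []
    for url in sources:                 # the source universe is pre-defined
        raw = crawl(url)
        for field, value in extract(raw):
            findings.append(Finding(
                parcel_id=parcel_id,
                field=field,
                value=normalize(value),
                source_url=url,
                retrieved_at=datetime.now(timezone.utc).isoformat(),
            ))
    return findings                     # same inputs -> same structured outputs

if __name__ == "__main__":
    for f in run_pipeline("parcel-001", ["https://example.gov/zoning"]):
        print(f)
```

In a real system the stubs would be replaced by crawlers, parsers, and an indexed store, but the shape of the flow stays the same.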

For developers, this difference has real implications. Site screening involves multiple dependent steps: mapping a parcel, identifying the zoning district, extracting permitted uses, cross-checking for moratoria, and flagging political risk. In a probabilistic model, those steps blur together; a failure midway is rarely visible because the system doesn’t track state or raise an error. A deterministic pipeline, on the other hand, treats each stage as a verifiable checkpoint. If an ordinance section can’t be found, the field remains blank and flagged for review. Nothing is fabricated, and every value can be traced back to its source.
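The checkpoint idea can be illustrated in a few lines. This is a sketch under our own assumptions rather than Spark’s code: each lookup either returns a value with its citation, or returns nothing, in which case the field stays blank and is added to a review list instead of being filled in.

```python
# Hypothetical checkpoint-style screening: no value is ever fabricated.

def screen_parcel(parcel_id: str, lookups: dict) -> dict:
    report = {"parcel_id": parcel_id, "flags": []}
    for field, lookup in lookups.items():     # e.g. zoning district, moratorium check
        result = lookup(parcel_id)            # each checkpoint returns (value, citation) or None
        if result is None:
            report[field] = None              # blank, not guessed
            report["flags"].append(f"{field}: source not found, needs review")
        else:
            value, citation = result
            report[field] = {"value": value, "citation": citation}
    return report

# One checkpoint succeeds, one fails; the failure is visible, not hidden:
report = screen_parcel("parcel-001", {
    "zoning_district": lambda p: ("A-1", "https://example.gov/zoning#sec-4-2"),
    "moratorium":      lambda p: None,        # ordinance section could not be located
})
print(report["flags"])  # ['moratorium: source not found, needs review']
```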

This structure also enables portfolio-scale reproducibility. When you run 200 parcels through a deterministic workflow, you get standardized outputs that can be compared apples-to-apples. When you run the same 200 prompts through a general LLM, you get variations, because its design prioritizes fluency over consistency.
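That property is easy to check mechanically. The snippet below is purely illustrative, with a made-up run_batch standing in for the pipeline: it fingerprints two runs over the same 200 parcels and asserts the standardized outputs match byte for byte, which is exactly what a fluency-first LLM does not guarantee.

```python
import hashlib
import json

def run_batch(parcel_ids):
    # Stand-in for the deterministic pipeline: same inputs always yield the same rows.
    return [{"parcel_id": p, "zoning_district": "A-1", "moratorium": None} for p in parcel_ids]

def fingerprint(rows):
    # Hash the canonical JSON so two runs can be compared exactly.
    return hashlib.sha256(json.dumps(rows, sort_keys=True).encode()).hexdigest()

parcels = [f"parcel-{i:03d}" for i in range(200)]
assert fingerprint(run_batch(parcels)) == fingerprint(run_batch(parcels))  # reproducible, apples-to-apples
```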

In diligence, consistency often matters more than creativity. The ability to prove where every data point came from, and to re-run the same process tomorrow and get the same result, is what turns AI from a research assistant into an audit-grade tool.

Stay tuned for Part III, where we’ll explore how document-level AI can bring audit-grade diligence to zoning codes, filings, and contracts without compromising security or confidentiality.