
Tender Compliance Check

This example describes a Vane Data pattern for tender or procurement compliance checks.

Vane Data does not ship a dedicated tender parser or compliance engine. It provides SQL, relation APIs, and AI helpers that can be composed into a review workflow.

Goal

Compare tender documents against a checklist and produce structured compliance results that can be reviewed, sampled, and audited.

Input tables

Documents:

  • document_id
  • section
  • text
  • source_uri

Checklist:

  • rule_id
  • requirement
  • severity
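
The two schemas above can be checked before any SQL runs. The following is a minimal sketch in plain Python, not part of Vane Data: the constants and the helper names `missing_columns` and `require_columns` are hypothetical, and the column names are copied from the lists above.

```python
# Column names copied from the Documents and Checklist lists above.
DOCUMENT_COLUMNS = ("document_id", "section", "text", "source_uri")
CHECKLIST_COLUMNS = ("rule_id", "requirement", "severity")


def missing_columns(actual, required):
    """Return the required column names absent from `actual`, in required order."""
    present = set(actual)
    return [name for name in required if name not in present]


def require_columns(table_name, actual, required):
    """Raise ValueError naming every required column that is missing."""
    missing = missing_columns(actual, required)
    if missing:
        raise ValueError(f"{table_name} is missing columns: {', '.join(missing)}")
```

Calling `require_columns("rules", header, CHECKLIST_COLUMNS)` on the header of the rules file fails early, before a malformed checklist reaches the model stage.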

SQL preparation

example.py
import vane


con = vane.connect()


docs = con.sql("""
    select document_id, section, text, source_uri
    from read_parquet('data/tender_docs/*.parquet')
    where text is not null
""")


rules = con.sql("""
    select rule_id, requirement, severity
    from read_csv_auto('data/tender_rules.csv')
""")

Build checks

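Use SQL to make the model workload explicit. The cross join below produces one prompt per (document row, rule) pair, so the number of model calls is the docs row count multiplied by the rules row count. Review that size before sending prompts to a provider.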

example.py
docs.to_table("docs")
rules.to_table("rules")


checks = con.sql("""
    select
        d.document_id,
        d.section,
        d.source_uri,
        r.rule_id,
        r.severity,
        'Requirement: ' || r.requirement || '\nDocument text: ' || d.text as prompt
    from docs d
    cross join rules r
""")
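
To review the cross join size, a rough estimate can be computed from the two row counts. This is a sketch in plain Python under stated assumptions: `estimate_workload` and `check_budget` are hypothetical helpers, the row counts would come from your own `count(*)` queries, and the default of four characters per token is only a rule of thumb, not a tokenizer measurement.

```python
def estimate_workload(doc_rows, rule_rows, avg_prompt_chars, chars_per_token=4.0):
    """Estimate prompt count and rough input tokens for a docs x rules cross join."""
    if doc_rows < 0 or rule_rows < 0:
        raise ValueError("row counts must be non-negative")
    # A cross join yields one row (one prompt) per document/rule pair.
    prompts = doc_rows * rule_rows
    # Assumption: chars_per_token is a coarse average, not a measured value.
    approx_tokens = int(prompts * avg_prompt_chars / chars_per_token)
    return {"prompts": prompts, "approx_input_tokens": approx_tokens}


def check_budget(estimate, max_prompts):
    """Return True when the estimated prompt count fits the given budget."""
    return estimate["prompts"] <= max_prompts
```

For example, 1,200 document rows against 40 rules gives 48,000 prompts, which may justify filtering sections or rules before the model stage.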

Structured model check

prompt(...) returns the configured output column, not the input columns. Explicitly combine that result with the check rows before summarizing. The combination below is positional, so validate that both tables have the same row count before writing the result.

example.py
from pydantic import BaseModel


class ComplianceResult(BaseModel):
    compliant: bool
    evidence: str
    reason: str


result_only = checks.prompt(
    "prompt",
    provider="openai",
    return_format=ComplianceResult,
    output_column="result_json",
    execution_backend="subprocess_actor",
)


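checks_table = checks.to_arrow_table()
result_table = result_only.to_arrow_table()

# append_column pairs rows by position, so the row counts must match.
if checks_table.num_rows != result_table.num_rows:
    raise ValueError("prompt output row count does not match the check rows")

checked = con.from_arrow(checks_table.append_column("result_json", result_table["result_json"]))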

Summarize

example.py
checked.to_table("checked")


summary = con.sql("""
    select
        document_id,
        severity,
        count(*) as checks
    from checked
    group by document_id, severity
""")


summary.show()
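
The summary above counts checks but does not yet separate compliant from non-compliant results. One way to do that outside SQL is sketched below. It assumes, without confirmation from Vane Data, that each `result_json` value is a JSON string carrying the `ComplianceResult` fields; `tally_compliance` is a hypothetical helper operating on plain dictionaries.

```python
import json
from collections import Counter


def tally_compliance(rows):
    """Count compliant, non_compliant, and invalid results per severity.

    Each row is a dict with "severity" and "result_json" keys.
    Assumption: result_json is a JSON object string with a boolean "compliant".
    """
    counts = Counter()
    for row in rows:
        try:
            compliant = json.loads(row["result_json"])["compliant"]
        except (TypeError, ValueError, KeyError):
            # Missing value, malformed JSON, or a payload without "compliant".
            counts[(row["severity"], "invalid")] += 1
            continue
        if not isinstance(compliant, bool):
            counts[(row["severity"], "invalid")] += 1
            continue
        status = "compliant" if compliant else "non_compliant"
        counts[(row["severity"], status)] += 1
    return dict(counts)
```

Keeping an explicit "invalid" bucket makes unparseable model output visible instead of silently counting it as a pass or a failure.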

Scale-out

Use Ray when SQL preparation, UDF stages, or model calls need distributed workers:

example.py
import vane


vane.configure(runner="ray")

Set provider concurrency according to the provider's rate limits and quotas and the available model-serving capacity.
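
A starting value for concurrency can be derived from a rate limit and an observed latency using Little's law (requests in flight = arrival rate x average latency). The sketch below is plain Python and independent of Vane Data; `max_concurrency` and its `safety_factor` parameter are hypothetical, and the result is only a first guess to refine against real provider behavior.

```python
import math


def max_concurrency(requests_per_minute, avg_latency_seconds, safety_factor=0.8):
    """Estimate how many in-flight requests stay under a per-minute rate limit."""
    if requests_per_minute <= 0 or avg_latency_seconds <= 0:
        raise ValueError("rate limit and latency must be positive")
    if not 0 < safety_factor <= 1:
        raise ValueError("safety_factor must be in (0, 1]")
    per_second = requests_per_minute / 60.0
    # Little's law, scaled down to leave headroom for latency variation.
    bound = math.floor(per_second * avg_latency_seconds * safety_factor)
    # Never return less than 1; very low limits still need client-side throttling.
    return max(1, bound)
```

With a limit of 600 requests per minute and an average latency of 2 seconds, the sketch suggests 16 concurrent requests at the default safety factor.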

Validation

Before using results in a compliance workflow:

  • Version the checklist and prompt.
  • Keep source document references.
  • Review high-severity failures manually.
  • Sample compliant and non-compliant outputs.
  • Treat model output as review evidence, not legal advice.
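
Two items from the list above, versioning and sampling, can be sketched in plain Python. Both helpers are hypothetical and not part of Vane Data: `fingerprint` hashes the checklist rows together with the prompt template so that any change yields a new version identifier, and `stratified_sample` draws a fixed number of rows from each group, for example from compliant and non-compliant outputs.

```python
import hashlib
import json
import random


def fingerprint(checklist_rows, prompt_template):
    """Return a short SHA-256 digest that changes when checklist or prompt changes."""
    payload = json.dumps(
        {"checklist": checklist_rows, "prompt": prompt_template},
        sort_keys=True,
        ensure_ascii=False,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]


def stratified_sample(rows, key, per_group, seed=0):
    """Pick up to `per_group` rows from each group defined by row[key]."""
    groups = {}
    for row in rows:
        groups.setdefault(row[key], []).append(row)
    rng = random.Random(seed)  # fixed seed keeps the review sample reproducible
    sample = []
    for group_key in sorted(groups, key=repr):
        members = groups[group_key]
        sample.extend(rng.sample(members, min(per_group, len(members))))
    return sample
```

Storing the fingerprint alongside each result row lets a reviewer tell which checklist and prompt version produced a given decision.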

Scope notes

This page does not provide legal advice or define procurement policy, document parsing, or a tender-specific connector. It documents how to express the workflow with Vane Data SQL and AI helpers.