Pricing & Latency
How Neurovn estimates cost and latency for every node in your workflow. All numbers are computed locally — zero external API calls.
How pricing works
Neurovn maintains a local pricing registry — a JSON file with per-model rates for 38 models across 7 providers. When you assign a model to an Agent node, the estimator looks up that model's input and output token rates.
Actual pricing is set by each provider and can change without notice. Neurovn's registry is updated regularly, but its estimates are approximations for planning; always verify production-critical budgets against the provider's official pricing page.
Cost formulas
LLM (Agent) Node
input_cost = (input_tokens / 1,000,000) × model.input_per_million
output_cost = (output_tokens / 1,000,000) × model.output_per_million
node_cost = input_cost + output_cost

Input tokens are counted from the system prompt + context using native tokenization. Output tokens are estimated via a task-type multiplier.
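The per-node formula above can be sketched in a few lines of Python. This is a minimal illustration, not Neurovn's actual implementation; the function name and rate parameters are hypothetical, mirroring the `input_per_million` / `output_per_million` fields named in the formula.

```python
def llm_node_cost(input_tokens: int, output_tokens: int,
                  input_per_million: float, output_per_million: float) -> float:
    """Estimated USD cost for one Agent node."""
    input_cost = (input_tokens / 1_000_000) * input_per_million
    output_cost = (output_tokens / 1_000_000) * output_per_million
    return input_cost + output_cost

# Example: 12,000 input + 800 output tokens at $3 / $15 per million
cost = llm_node_cost(12_000, 800, 3.0, 15.0)  # 0.036 + 0.012 = 0.048 USD
```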
Tool Node
tool_overhead = schema_tokens + avg_response_tokens (from the tool registry)
fallback = 200 schema tokens + 800 response tokens when tool metadata is missing

Tool latency and cost effects come from registry definitions. The fallback latency is 200 ms when a tool is undefined.
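The registry lookup with fallbacks might look like the sketch below. The `tool_registry` dict shape and key names are assumptions for illustration; only the fallback values (200 schema tokens, 800 response tokens, 200 ms) come from the text above.

```python
FALLBACK_SCHEMA_TOKENS = 200
FALLBACK_RESPONSE_TOKENS = 800
FALLBACK_LATENCY_MS = 200

def tool_overhead(tool_name: str, tool_registry: dict) -> tuple[int, int, int]:
    """Return (schema_tokens, avg_response_tokens, latency_ms) for a tool,
    falling back to defaults when the tool has no registry metadata."""
    meta = tool_registry.get(tool_name)
    if meta is None:
        return (FALLBACK_SCHEMA_TOKENS, FALLBACK_RESPONSE_TOKENS, FALLBACK_LATENCY_MS)
    return (meta["schema_tokens"], meta["avg_response_tokens"], meta["latency_ms"])
```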
Workflow Total
workflow_cost = SUM(node_costs)
+ branch_probabilities × branch_costs
+ loop_iterations × loop_body_costs

For branched workflows, each path's cost is weighted by its probability. For loops, cost scales by expected iterations (with a configurable max).
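The workflow total reduces to a probability-weighted sum. A minimal sketch, with hypothetical argument shapes (branches as `(probability, cost)` pairs, loops as `(expected_iterations, body_cost)` pairs):

```python
def workflow_cost(node_costs, branches=(), loops=()):
    """Total estimated cost: fixed nodes, plus probability-weighted
    branch costs, plus expected-iteration-weighted loop body costs."""
    total = sum(node_costs)
    total += sum(p * cost for p, cost in branches)        # branch weighting
    total += sum(n * cost for n, cost in loops)           # loop scaling
    return total

# Two fixed nodes, one branch taken 30% of the time, a loop expected 3x:
workflow_cost([0.01, 0.02], branches=[(0.3, 0.05)], loops=[(3, 0.01)])
# 0.03 + 0.015 + 0.03 = 0.075 USD
```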
Latency model
Per-Node Latency
agent_latency = (output_tokens / model.tokens_per_sec) × 1000 ms
tool_latency = tool_registry.latency_ms (fallback 200 ms)

Graph Latency
sequential = SUM(node_latencies) along the path
parallel = MAX(branch_latencies) across branches
loop/retry = expected_iterations × single_lap_latency

The critical path — the longest-latency path through the graph — determines the end-to-end P95 latency estimate.
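The three graph rules compose directly: sequential segments sum, parallel branches take the max, and loops multiply by expected laps. A sketch with illustrative numbers (the graph shape and values are made up):

```python
def path_latency(node_latencies_ms: list) -> float:
    return sum(node_latencies_ms)      # sequential: sum along the path

def parallel_latency(branch_latencies_ms: list) -> float:
    return max(branch_latencies_ms)    # parallel: slowest branch wins

def loop_latency(expected_iterations: float, lap_ms: float) -> float:
    return expected_iterations * lap_ms  # loop/retry: expected laps

# Critical path through a fan-out/fan-in graph: two sequential nodes,
# then the slower of two parallel branches, then a retry loop.
pre = path_latency([120.0, 80.0])                                  # 200 ms
fan = parallel_latency([path_latency([300.0]),                     # branch A
                        path_latency([150.0, 100.0])])             # branch B
total = pre + fan + loop_latency(2, 50.0)                          # 200 + 300 + 100
```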
Supported providers
Neurovn ships with pricing data for 38 models across 7 providers. Adding a new model is a single entry in backend/data/model_pricing.json — no frontend code changes needed.
| Provider | Models | Pricing |
|---|---|---|
| OpenAI | GPT-4, GPT-4o, GPT-4o mini, o3, o4 mini + more | Official |
| Anthropic | Claude 4 Sonnet, Claude 4 Opus, Claude 3.7 Sonnet + more | Official |
| Google | Gemini 1.5 Pro/Flash, Gemini 2.0 Pro/Flash, Gemini Exp-1206 + more | Official |
| Meta | Llama 3.1 (405B/70B/8B), Llama 3.2, Llama 3.3 | Official |
| Mistral | Mistral Large/Medium/Small, Codestral | Official |
| DeepSeek | DeepSeek-V3, DeepSeek-R1 | Official |
| Cohere | Command R, Command R+, Command Nightly | Official |
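A new model entry in `backend/data/model_pricing.json` might look like the fragment below. The exact schema is not documented here, so the field names are assumptions inferred from the formulas above (`input_per_million`, `output_per_million`, `tokens_per_sec`); check the existing entries in the file for the authoritative shape.

```json
{
  "example-model-id": {
    "provider": "example-provider",
    "input_per_million": 3.00,
    "output_per_million": 15.00,
    "tokens_per_sec": 80
  }
}
```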