experiment: Add windmill-local crate with libSQL/Turso for local mode preview#7662

Open
rubenfiszel wants to merge 2 commits into main from experiment/libsql-local-mode

@rubenfiszel
Contributor

Summary

Experimental implementation of a local mode for Windmill using libSQL (Turso's SQLite fork) instead of PostgreSQL. This enables lightweight preview execution without requiring a full PostgreSQL setup.

What's included:

  • New windmill-local crate with embedded libSQL database
  • Script preview endpoint supporting bash, python, deno, bun
  • Flow preview with full flow executor supporting:
    • ForloopFlow (iterate over arrays/ranges)
    • WhileloopFlow (execute while condition is true)
    • BranchOne (if/else branching with expression evaluation)
    • BranchAll (parallel branches)
    • RawScript (inline scripts)
    • Identity (pass-through)
  • Expression evaluation for input transforms with comparisons and logical operators
  • Single embedded worker with in-memory job queue

Known limitations (by design for this experiment):

  • Script/Flow references (path-based) not supported
  • FlowScript, AIAgent modules not implemented
  • BranchAll executes sequentially rather than truly in parallel
  • No persistence of scripts/flows (only jobs)

To test:

cargo run -p windmill-local --example local_server

# Script preview
curl -X POST http://localhost:8000/api/w/local/jobs/run_wait_result/preview \
  -H "Content-Type: application/json" \
  -d '{"content": "echo Hello", "language": "bash"}'

# Flow preview with branching
curl -X POST http://localhost:8000/api/w/local/jobs/run_wait_result/preview_flow \
  -H "Content-Type: application/json" \
  -d '{
    "value": {
      "modules": [{
        "id": "branch",
        "value": {
          "type": "branchone",
          "branches": [{"expr": "flow_input.x > 5", "modules": [{"id": "big", "value": {"type": "rawscript", "language": "bash", "content": "echo big"}}]}],
          "default": [{"id": "small", "value": {"type": "rawscript", "language": "bash", "content": "echo small"}}]
        }
      }]
    },
    "args": {"x": 10}
  }'

Test plan

  • All 14 unit tests pass (cargo test -p windmill-local)
  • Script preview works for bash, python
  • Flow preview works with linear flows
  • Flow preview works with branch-one conditions
  • Flow preview works with branch-all parallel execution

🤖 Generated with Claude Code

rubenfiszel and others added 2 commits January 23, 2026 07:01
This experimental crate demonstrates running Windmill preview endpoints
with libSQL (SQLite/Turso) instead of PostgreSQL. Key features:

- Schema: SQLite-compatible schema for jobs, queue, and results
  - ENUMs → TEXT with CHECK constraints
  - JSONB → JSON (TEXT)
  - Arrays → JSON arrays
  - No FOR UPDATE SKIP LOCKED (single worker, mutex coordination)

- Database: Supports three modes via libsql crate:
  - In-memory SQLite (for testing)
  - File-based SQLite (local persistence)
  - Remote Turso (multi-writer scenarios)

- API: Compatible preview endpoints:
  - POST /api/w/{workspace}/jobs/run/preview
  - POST /api/w/{workspace}/jobs/run_wait_result/preview
  - POST /api/w/{workspace}/jobs/run/preview_flow
  - POST /api/w/{workspace}/jobs/run_wait_result/preview_flow

- Executor: Simple script execution for bash, python3, deno, bun

- Worker: Single embedded worker that processes queue

Run with: cargo run -p windmill-local --example local_server

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Add flow_executor.rs that supports executing complex flows in local mode:
- ForloopFlow: iterate over arrays/ranges with sequential execution
- WhileloopFlow: execute modules while condition is true
- BranchOne: if/else branching based on conditions
- BranchAll: parallel branch execution (sequential for now)
- RawScript: inline script execution (bash, python, deno, bun)
- Identity: pass-through module

Key features:
- Uses windmill-common FlowValue types for compatibility
- Expression evaluation for input transforms with comparisons
- Proper flow status tracking with module results
- Recursive async execution with async_recursion crate

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@claude
Contributor

claude bot commented Jan 23, 2026

Claude finished @rubenfiszel's task in 2m 9s


PR Review Complete: windmill-local crate with libSQL/Turso

  • Read all changed files and understand the architecture
  • Review code quality and best practices
  • Check for potential bugs or issues
  • Evaluate performance considerations
  • Assess security implications
  • Provide detailed feedback with inline comments
  • Add testing instructions for verification

Summary

This is a well-structured experimental implementation of a lightweight local mode for Windmill using libSQL instead of PostgreSQL. The architecture is clean with good separation of concerns across modules.

Strengths

  1. Clean Architecture: Good module separation (db, queue, jobs, executor, flow_executor, worker, server)
  2. Proper Use of windmill-common Types: Correctly integrates with existing FlowValue, FlowModule, InputTransform types
  3. Comprehensive Flow Support: Implements ForloopFlow, WhileloopFlow, BranchOne, BranchAll with proper context handling
  4. Good Test Coverage: 14 unit tests covering core functionality
  5. Clear Documentation: Well-documented limitations and non-goals in module comments

Issues Found (9 inline comments)

| Severity | File | Issue |
|---|---|---|
| 🔴 Security | executor.rs:107 | JSON injection risk in Python code wrapper |
| 🟡 Correctness | queue.rs:44 | Non-atomic job pull could cause race conditions in multi-client Turso mode |
| 🟡 Safety | db.rs:78 | Exposing mutex guard could lead to deadlocks |
| 🟡 Safety | flow_executor.rs:551 | String slice on UTF-8 boundaries |
| 🟢 Cleanup | Cargo.toml:41 | Unused rquickjs dependency |
| 🟢 Cleanup | flow_executor.rs:17 | Unused db parameter passed through flow execution |
| 🟢 Info | server.rs:104 | Wide-open CORS should be documented as intentional |
| 🟢 Info | flow_executor.rs:347 | Magic number for MAX_ITERATIONS |
| 🟢 Info | executor.rs:184 | Deprecated --unstable Deno flag |

Recommendations

  1. Fix the Python injection vulnerability - This is the most critical issue. Use base64 encoding or pass args via environment variable
  2. Consider atomic job pull - Especially important if Turso remote mode will be used with multiple clients
  3. Remove or use rquickjs - Either implement proper JS evaluation for complex expressions or remove the unused dependency

Testing Instructions

This is a backend-only experimental crate that runs as a standalone local server and is not integrated into the main Windmill app UI, so there is nothing to verify through the app interface. To test:

  1. Build and run the local server: cargo run -p windmill-local --example local_server
  2. The server starts on http://localhost:8000
  3. Test script preview with curl:
    curl -X POST http://localhost:8000/api/w/local/jobs/run_wait_result/preview \
      -H "Content-Type: application/json" \
      -d '{"content": "echo Hello", "language": "bash"}'
  4. Test flow preview with branching:
    curl -X POST http://localhost:8000/api/w/local/jobs/run_wait_result/preview_flow \
      -H "Content-Type: application/json" \
      -d '{
        "value": {
          "modules": [{
            "id": "branch",
            "value": {"type": "branchone", "branches": [{"expr": "flow_input.x > 5", "modules": [{"id": "big", "value": {"type": "rawscript", "language": "bash", "content": "echo big"}}]}], "default": [{"id": "small", "value": {"type": "rawscript", "language": "bash", "content": "echo small"}}]}
          }]
        },
        "args": {"x": 10}
      }'
  5. Run the unit tests: cargo test -p windmill-local

@cloudflare-workers-and-pages

Deploying windmill with Cloudflare Pages

Latest commit: 741526b
Status: ✅  Deploy successful!
Preview URL: https://45bd47c0.windmill.pages.dev
Branch Preview URL: https://experiment-libsql-local-mode.windmill.pages.dev

import sys

# Args passed as JSON
args = json.loads('''{}''')

Security Concern: JSON Injection in Python Wrapper

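The args are interpolated directly into the Python source inside a triple-quoted string literal. If the serialized JSON contains ''' the literal ends early and the remainder is parsed as Python code, which allows code injection. Backslash sequences in the JSON are also interpreted by Python before json.loads sees them, so even benign args can be corrupted.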

Consider using base64 encoding or passing via environment variable:

Suggested change
- args = json.loads('''{}''')
+ // Wrap the code to handle args and return JSON result
+ // (`.encode` needs `use base64::Engine as _;` in scope)
+ let args_b64 = base64::engine::general_purpose::STANDARD.encode(serde_json::to_string(args)?);
+ let wrapped_code = format!(
+     r#"
+ import json
+ import sys
+ import base64
+ # Args passed as base64-encoded JSON
+ args = json.loads(base64.b64decode('{}').decode('utf-8'))
+ # User code
+ {}
+ # Call main if it exists
+ if 'main' in dir():
+     result = main(**args)
+     print(json.dumps(result))
+ "#,
+     args_b64,
+     code
+ );

Alternatively, pass via stdin or environment variable for better security.
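
The environment-variable alternative could look roughly like the sketch below. It is not the crate's actual executor: the variable name WM_LOCAL_ARGS_JSON, the helpers wrap_python and python_command, and the use of python3 -c are all assumptions made for illustration. The point is that the args never become part of the generated Python source, so no quoting or escaping of user data is needed.

```rust
use std::process::Command;

/// Hypothetical environment variable name used to carry the JSON args.
const ARGS_ENV: &str = "WM_LOCAL_ARGS_JSON";

/// Build the Python wrapper source. Only the fixed variable name and the
/// user's script are interpolated; the args are never part of the source.
fn wrap_python(user_code: &str) -> String {
    format!(
        r#"import json
import os

# Args are read from the environment at runtime
args = json.loads(os.environ["{env}"])

# User code
{code}

# Call main if it exists
if 'main' in dir():
    result = main(**args)
    print(json.dumps(result))
"#,
        env = ARGS_ENV,
        code = user_code
    )
}

/// Prepare (but do not spawn) the interpreter command.
/// Assumes a `python3` binary on PATH and that `-c` is how code is passed.
fn python_command(user_code: &str, args_json: &str) -> Command {
    let mut cmd = Command::new("python3");
    cmd.arg("-c")
        .arg(wrap_python(user_code))
        .env(ARGS_ENV, args_json);
    cmd
}
```

Environment variables have platform-dependent size limits, so very large args may be better sent over stdin.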

}

/// Get a reference to the connection (locked)
pub async fn conn(&self) -> tokio::sync::MutexGuard<'_, Connection> {

Potential Deadlock Risk

Exposing the raw mutex guard can lead to deadlocks if callers hold the guard across await points or while calling other LocalDb methods.

Consider removing this public method and ensuring all database operations go through the internal methods (execute, query, execute_batch) which properly manage the lock scope. If raw connection access is truly needed, document the risks clearly:

/// Get a reference to the connection (locked)
/// 
/// **Warning**: Hold this lock for the minimum time necessary.
/// Do not call other LocalDb methods while holding this guard or
/// await across the guard - this will cause deadlocks.
pub async fn conn(&self) -> tokio::sync::MutexGuard<'_, Connection> {
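
One way to keep the lock scope internal is to accept a closure instead of returning the guard. The sketch below is a simplified model, not the crate's code: it swaps tokio::sync::Mutex for std::sync::Mutex so it compiles on its own, and FakeConnection and with_conn are hypothetical names. Because the closure is synchronous, it cannot contain an .await, and the guard is dropped before with_conn returns.

```rust
use std::sync::Mutex;

/// Stand-in for the real libSQL connection type (hypothetical).
struct FakeConnection {
    executed: Vec<String>,
}

struct LocalDb {
    conn: Mutex<FakeConnection>,
}

impl LocalDb {
    fn new() -> Self {
        LocalDb {
            conn: Mutex::new(FakeConnection { executed: Vec::new() }),
        }
    }

    /// Run `f` with exclusive access to the connection. The guard is created
    /// and dropped inside this method, so it never escapes to the caller.
    fn with_conn<R>(&self, f: impl FnOnce(&mut FakeConnection) -> R) -> R {
        let mut guard = self.conn.lock().expect("connection mutex poisoned");
        f(&mut *guard)
    }
}
```

Calling with_conn again from inside the closure would still deadlock or panic, so the closure should stay small and must not re-enter LocalDb.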

};

let id_str: String = row.get(0)?;
let id = Uuid::parse_str(&id_str).map_err(|e| LocalError::InvalidJobState(e.to_string()))?;

Race Condition: Non-Atomic Job Pull

The SELECT and UPDATE operations are not atomic. Between the SELECT (lines 23-37) and the UPDATE (lines 48-52), another concurrent request could pull the same job.

While this is mitigated by the single-connection mutex design mentioned in comments, if the crate is ever used with multiple connections (e.g., Turso remote mode with multiple clients), this could cause duplicate job processing.

Consider using a single atomic UPDATE...RETURNING statement:

UPDATE v2_job_queue SET running = 1, started_at = ?1 
WHERE id = (
    SELECT q.id FROM v2_job_queue q
    JOIN v2_job j ON q.id = j.id
    WHERE q.running = 0 AND q.scheduled_for <= ?1
    ORDER BY q.priority DESC, q.scheduled_for ASC
    LIMIT 1
)
RETURNING id

Then fetch the full job data separately.
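
The idea behind the single-statement claim can be modelled without a database. The sketch below is an in-memory analogy only, not libSQL code: InMemoryQueue, QueuedJob and claim_next are hypothetical names, and a std::sync::Mutex stands in for the atomicity that UPDATE ... RETURNING would provide. Selecting and marking happen inside one critical section, so two callers can never receive the same job id.

```rust
use std::cmp::Reverse;
use std::sync::Mutex;

#[derive(Debug)]
struct QueuedJob {
    id: u32,
    priority: i32,
    scheduled_for: u64,
    running: bool,
}

struct InMemoryQueue {
    jobs: Mutex<Vec<QueuedJob>>,
}

impl InMemoryQueue {
    /// Pick the best runnable job and mark it running in one critical
    /// section, mirroring `ORDER BY priority DESC, scheduled_for ASC LIMIT 1`.
    fn claim_next(&self, now: u64) -> Option<u32> {
        let mut jobs = self.jobs.lock().expect("queue mutex poisoned");
        let job = jobs
            .iter_mut()
            .filter(|j| !j.running && j.scheduled_for <= now)
            // Highest priority wins; ties go to the earliest scheduled_for.
            .max_by_key(|j| (j.priority, Reverse(j.scheduled_for)))?;
        job.running = true;
        Some(job.id)
    }
}
```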

skip_failures: bool,
ctx: &mut FlowContext,
) -> Result<serde_json::Value> {
const MAX_ITERATIONS: usize = 1000;

Magic Number: MAX_ITERATIONS

The value 1000 is hardcoded without configuration. For an experimental crate this is fine, but consider making it configurable:

const DEFAULT_MAX_ITERATIONS: usize = 1000;

And potentially adding it as a parameter to the flow context or a configuration struct for future flexibility.
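
A configuration struct along those lines might look like the following sketch. FlowLimits and run_while are hypothetical names, and returning an error when the limit is hit is an assumption; the crate may handle the limit differently.

```rust
const DEFAULT_MAX_ITERATIONS: usize = 1000;

/// Hypothetical limits that could live on the flow context.
struct FlowLimits {
    max_iterations: usize,
}

impl Default for FlowLimits {
    fn default() -> Self {
        FlowLimits { max_iterations: DEFAULT_MAX_ITERATIONS }
    }
}

/// Loop while `condition` holds, failing once the configured limit is reached.
/// `condition` receives the number of iterations completed so far.
fn run_while(
    limits: &FlowLimits,
    mut condition: impl FnMut(usize) -> bool,
) -> Result<usize, String> {
    let mut iterations = 0;
    while condition(iterations) {
        if iterations >= limits.max_iterations {
            return Err(format!(
                "exceeded max_iterations ({})",
                limits.max_iterations
            ));
        }
        iterations += 1;
    }
    Ok(iterations)
}
```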

// Handle string literals
if (expr.starts_with('"') && expr.ends_with('"')) ||
(expr.starts_with('\'') && expr.ends_with('\'')) {
return Ok(serde_json::json!(&expr[1..expr.len()-1]));

String Slice Safety Issue

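Slicing &expr[1..expr.len()-1] cannot split a multi-byte UTF-8 character here: both quote characters are single-byte ASCII, so offsets 1 and expr.len()-1 are always character boundaries. It can still panic when expr is a single quote character (length 1), because both checks pass and the range becomes 1..0. A length guard (expr.len() >= 2) avoids that. Binding the slice to a name first only improves readability: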

Suggested change
- return Ok(serde_json::json!(&expr[1..expr.len()-1]));
+ if (expr.starts_with('"') && expr.ends_with('"')) ||
+    (expr.starts_with('\'') && expr.ends_with('\'')) {
+     let inner = &expr[1..expr.len()-1];
+     return Ok(serde_json::json!(inner));

Or even better, use chars():

if (expr.starts_with('"') && expr.ends_with('"')) ||
   (expr.starts_with('\'') && expr.ends_with('\'')) {
    // saturating_sub avoids a usize underflow when expr is a single quote character
    let inner: String = expr.chars().skip(1).take(expr.chars().count().saturating_sub(2)).collect();
    return Ok(serde_json::json!(inner));
}

Though for ASCII quotes this is fine as-is.
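
A panic-free variant can avoid index arithmetic altogether by using str::strip_prefix and str::strip_suffix from the standard library. This is a minimal sketch; unquote is a hypothetical helper name, not a function in the crate.

```rust
/// Return the contents of a string literal wrapped in matching single or
/// double quotes, or None if `expr` is not a quoted literal.
fn unquote(expr: &str) -> Option<&str> {
    for quote in ['"', '\''] {
        // strip_suffix runs on what is left after the opening quote, so a
        // lone quote character yields None instead of an invalid range.
        if let Some(inner) = expr
            .strip_prefix(quote)
            .and_then(|rest| rest.strip_suffix(quote))
        {
            return Some(inner);
        }
    }
    None
}
```

The caller could then write something like `if let Some(inner) = unquote(expr) { return Ok(serde_json::json!(inner)); }` in place of the manual slice.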


/// Create the API router
fn create_router(state: Arc<AppState>) -> Router {
let cors = CorsLayer::new()

Security Note: Wide-Open CORS

Allowing any origin/method/header is appropriate for local development but should be documented:

    // Note: Wide-open CORS is intentional for local development mode.
    // This should NOT be used in production deployments.
    let cors = CorsLayer::new()

Consider adding a feature flag or configuration option to restrict CORS for more secure deployments.

};
use windmill_common::scripts::ScriptLang as WmScriptLang;

use crate::db::LocalDb;

Unused Parameter: db

LocalDb is imported only so that db can be threaded through function signatures. The db parameter itself appears unused by the flow execution logic: it is passed through but never used for persistence during flow execution. This is noted as "by design", but importing LocalDb and passing db everywhere adds noise.

If db is intended for future use (e.g., persisting intermediate results), that's fine - otherwise consider removing it from function signatures to simplify the API.


let mut child = Command::new("deno")
.arg("eval")
.arg("--unstable")

Deprecation Warning: --unstable Flag

The --unstable flag for Deno is deprecated in recent versions. Consider using specific unstable feature flags instead, or removing it entirely if no unstable features are needed:

Suggested change
- .arg("--unstable")
+ let mut child = Command::new("deno")
+     .arg("eval")
+     .arg(&wrapped_code)

Or if specific unstable features are needed:

.arg("--unstable-kv")  // Example: specific unstable feature

windmill-common = { path = "../windmill-common", default-features = false }

# For input transforms (JavaScript evaluation)
rquickjs = { version = "0.8", features = ["bindgen", "classes", "loader", "array-buffer", "futures"] }

Unused Dependency

rquickjs is listed as a dependency but doesn't appear to be used anywhere in the codebase. The expression evaluation in flow_executor.rs uses custom string parsing instead.

If this was intended for a more robust JS evaluation implementation, consider either:

  1. Using it for the evaluate_expr function to handle complex expressions properly
  2. Removing it to reduce compilation time and binary size

Suggested change
- rquickjs = { version = "0.8", features = ["bindgen", "classes", "loader", "array-buffer", "futures"] }
+ # For input transforms (JavaScript evaluation) - TODO: implement proper JS runtime
+ # rquickjs = { version = "0.8", features = ["bindgen", "classes", "loader", "array-buffer", "futures"] }
