Get Started

Quick Start

CrabGraph is a Gremlin engine that attaches to your existing DuckDB connection. Define vertices and edges as DuckDB views named V_* and E_*, then traverse them with the full Gremlin language. No separate server, no schema language, no migrations.

1. Add the Maven dependency
CrabGraph for Java is a plain JVM jar that embeds the engine in-process — no native lib needed. JDK 17+.
pom.xml
<dependency>
  <groupId>net.crabgraph</groupId>
  <artifactId>crabgraph</artifactId>
  <version>0.1.0</version>
</dependency>
<dependency>
  <groupId>org.duckdb</groupId>
  <artifactId>duckdb_jdbc</artifactId>
  <version>1.5.0.0</version>
</dependency>
2. Create your graph as DuckDB views
A view named V_<Label> in the public schema is a vertex label; E_<EdgeLabel> is an edge. Edge views carry endpoint columns of the form "public.<Vertex>__O" (out) and "public.<Vertex>__I" (in).
Java
try (Connection c = DriverManager.getConnection("jdbc:duckdb:/tmp/g.duckdb");
     Statement s = c.createStatement()) {
  s.execute("CREATE SCHEMA IF NOT EXISTS public");
  s.execute("""
      CREATE VIEW public.V_Person AS
        SELECT 1::BIGINT AS "ID", 'Alice' AS "name" UNION ALL
        SELECT 2::BIGINT,                'Bob'
      """);
  s.execute("""
      CREATE VIEW public.E_KNOWS AS
        SELECT 1::BIGINT AS "ID",
               2::BIGINT AS "public.Person__I",
               1::BIGINT AS "public.Person__O"
      """);
}
3. Attach and traverse
Pass an open java.sql.Connection to Crabgraph.attach(...). The engine inspects information_schema, registers your views as topology, and exposes Gremlin.
Java
import net.crabgraph.Crabgraph;
import java.sql.*;

try (Connection c = DriverManager.getConnection("jdbc:duckdb:/tmp/g.duckdb");
     Crabgraph g = Crabgraph.attach(c)) {

  String count   = g.gremlin("g.V().hasLabel('Person').count()");
  String friends = g.gremlin(
      "g.V().has('name','Alice').out('KNOWS').values('name').toList()");
}
1. Install via pip
The wheel bundles a native libcrabgraph — the engine runs embedded, with no separate runtime to install. Python 3.9+.
shell
pip install crabgraph
2. Create V_/E_ views
Use the duckdb package to set up your file. V_<Label> for vertices, E_<EdgeLabel> for edges, with endpoint columns named "public.<Vertex>__O" / "public.<Vertex>__I".
Python
import duckdb

conn = duckdb.connect("/tmp/g.duckdb")
conn.execute("CREATE SCHEMA IF NOT EXISTS public")
conn.execute("""
    CREATE VIEW public.V_Person AS
      SELECT 1::BIGINT AS "ID", 'Alice' AS "name" UNION ALL
      SELECT 2::BIGINT,                'Bob'
""")
conn.execute("""
    CREATE VIEW public.E_KNOWS AS
      SELECT 1::BIGINT AS "ID",
             2::BIGINT AS "public.Person__I",
             1::BIGINT AS "public.Person__O"
""")
3. Attach and traverse
Crabgraph.attach reads the database file path from your connection and auto-discovers the views.
Python
import crabgraph

with crabgraph.Crabgraph.attach(conn) as g:
    print(g.gremlin("g.V().hasLabel('Person').count()"))
    print(g.gremlin(
        "g.V().has('name','Alice').out('KNOWS').values('name').toList()"))
1. Install
Install crabgraph alongside the official DuckDB Node API. A small N-API addon is built at install time. Node.js 18+.
shell
npm install crabgraph @duckdb/node-api
2. Create V_/E_ views
Set up the DuckDB file with @duckdb/node-api. Vertex views are V_<Label>; edge views E_<EdgeLabel> with "public.<Vertex>__O" / "__I" endpoints.
TypeScript
import { DuckDBInstance } from "@duckdb/node-api";

const inst = await DuckDBInstance.create("/tmp/g.duckdb");
const conn = await inst.connect();
await conn.run("CREATE SCHEMA IF NOT EXISTS public");
await conn.run(`
  CREATE VIEW public.V_Person AS
    SELECT 1::BIGINT AS "ID", 'Alice' AS "name" UNION ALL
    SELECT 2::BIGINT,                'Bob'`);
await conn.run(`
  CREATE VIEW public.E_KNOWS AS
    SELECT 1::BIGINT AS "ID",
           2::BIGINT AS "public.Person__I",
           1::BIGINT AS "public.Person__O"`);
3. Attach and traverse
Crabgraph.attach queries PRAGMA database_list on the connection to find the database file, then attaches the engine.
TypeScript
import { Crabgraph } from "crabgraph";

const g = await Crabgraph.attach(conn);
try {
  console.log(g.gremlin("g.V().hasLabel('Person').count()"));
  console.log(g.gremlin(
    "g.V().has('name','Alice').out('KNOWS').values('name').toList()"));
} finally {
  g.close();
}
1. Add the modules
CrabGraph for Go is a cgo wrapper around the embedded libcrabgraph. Pair it with the official duckdb-go/v2 driver. Requires Go 1.24+ and CGO_ENABLED=1.
shell
go get github.com/henneberger/crabgraph/go
go get github.com/duckdb/duckdb-go/v2
2. Create V_/E_ views, attach, traverse
crabgraph.Attach(*sql.DB) reads the DuckDB file path from the connection and auto-discovers your views.
Go
package main

import (
    "database/sql"
    "fmt"

    _ "github.com/duckdb/duckdb-go/v2"
    crabgraph "github.com/henneberger/crabgraph/go"
)

func main() {
    db, err := sql.Open("duckdb", "/tmp/g.duckdb")
    if err != nil { panic(err) }
    defer db.Close()

    db.Exec("CREATE SCHEMA IF NOT EXISTS public")
    db.Exec(`CREATE VIEW public.V_Person AS
        SELECT 1::BIGINT AS "ID", 'Alice' AS "name" UNION ALL
        SELECT 2::BIGINT,                'Bob'`)
    db.Exec(`CREATE VIEW public.E_KNOWS AS
        SELECT 1::BIGINT AS "ID",
               2::BIGINT AS "public.Person__I",
               1::BIGINT AS "public.Person__O"`)

    g, err := crabgraph.Attach(db)
    if err != nil { panic(err) }
    defer g.Close()

    out, _ := g.Gremlin("g.V().hasLabel('Person').count()")
    fmt.Println(out)
}
1. Add the crates
The crate links against the embedded libcrabgraph shared library at build time. The duckdb feature enables the typed attach(&duckdb::Connection) entry point.
Cargo.toml
[dependencies]
crabgraph = { version = "0.1", features = ["duckdb"] }
duckdb    = "1.10"
2. Create V_/E_ views, attach, traverse
Open a file-backed duckdb::Connection, set up your views, and pass the connection to Crabgraph::attach.
Rust
fn main() -> Result<(), Box<dyn std::error::Error>> {
    let conn = duckdb::Connection::open("/tmp/g.duckdb")?;
    conn.execute_batch(r#"
        CREATE SCHEMA IF NOT EXISTS public;
        CREATE VIEW public.V_Person AS
            SELECT 1::BIGINT AS "ID", 'Alice' AS "name" UNION ALL
            SELECT 2::BIGINT,                'Bob';
        CREATE VIEW public.E_KNOWS AS
            SELECT 1::BIGINT AS "ID",
                   2::BIGINT AS "public.Person__I",
                   1::BIGINT AS "public.Person__O";
    "#)?;

    let g = crabgraph::Crabgraph::attach(&conn)?;
    println!("{}", g.gremlin("g.V().hasLabel('Person').count()")?);
    println!("{}", g.gremlin(
        "g.V().has('name','Alice').out('KNOWS').values('name').toList()")?);
    Ok(())
}

No schema configuration. CrabGraph reads information_schema at attach time. Whatever V_* / E_* views are present become your graph — drop or alter them and reattach.

Setup

Installation

CrabGraph is distributed through each language's native package manager. Install the crabgraph package for your language alongside a DuckDB driver — the engine runs embedded in your process, no separate server to deploy.

Language      Package manager    Package                               DuckDB driver
Java 17+      Maven              net.crabgraph:crabgraph:0.1.0         org.duckdb:duckdb_jdbc
Python 3.9+   pip, Poetry, uv    crabgraph==0.1.0                      duckdb (auto-installed)
Node.js 18+   npm, yarn, pnpm    crabgraph@0.1.0                       @duckdb/node-api
Go 1.24+      go get             github.com/henneberger/crabgraph/go   github.com/duckdb/duckdb-go/v2
Rust 1.75+    Cargo              crabgraph = "0.1"                     duckdb = "1.10"

System libduckdb. The Rust binding links against your system libduckdb via duckdb-rs. Set DUCKDB_LIB_DIR / DUCKDB_INCLUDE_DIR if it isn't on the standard search path.

Core Concepts

Define a Schema

Your schema lives in DuckDB itself, as views in the public schema. There is no separate schema file, no annotations, no migrations. CrabGraph discovers everything from information_schema at attach time.
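The discovery rule is simple enough to sketch in a few lines of Python. This is an illustration of the naming convention only (plain string matching over a list of view names), not CrabGraph's actual implementation:

```python
# Sketch of the V_*/E_* discovery rule, using a plain list of view
# names in place of a live information_schema query (illustrative only).
def discover_labels(view_names):
    vertices, edges = [], []
    for name in view_names:
        if name.startswith("V_"):
            vertices.append(name[len("V_"):])   # V_Person -> vertex label Person
        elif name.startswith("E_"):
            edges.append(name[len("E_"):])      # E_KNOWS  -> edge label KNOWS
    return vertices, edges

# A view with neither prefix is simply ignored by discovery.
print(discover_labels(["V_Person", "E_KNOWS", "raw_people"]))
# -> (['Person'], ['KNOWS'])
```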

Vertices: V_<Label>

A view named V_<Label> becomes a vertex label. Each row is a vertex. The view must have a BIGINT column called "ID" as the vertex's primary key; every other column becomes a property.

SQL
CREATE VIEW public.V_Person AS
  SELECT id    AS "ID",    -- BIGINT
         name  AS "name",       -- property
         age   AS "age"
  FROM   raw_people;

Edges: E_<EdgeLabel>

A view named E_<EdgeLabel> becomes an edge label. Required columns:

  • "ID" (BIGINT): the edge's primary key
  • "public.<OutVertexLabel>__O" (BIGINT): the source vertex's ID
  • "public.<InVertexLabel>__I" (BIGINT): the target vertex's ID

Any other columns are edge properties. The __O / __I column-name suffixes tell CrabGraph which vertex labels the edge connects.

SQL
CREATE VIEW public.E_KNOWS AS
  SELECT id           AS "ID",
         src_person   AS "public.Person__O",   -- out-vertex (Person)
         dst_person   AS "public.Person__I",   -- in-vertex (Person)
         since_year   AS "since"              -- edge property
  FROM   raw_friendships;
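To make the endpoint-column pattern concrete, here is a hypothetical parser for it in Python. The regex mirrors the documented "public.<VertexLabel>__O" / "__I" shape; it is an illustration, not engine code:

```python
import re

# Matches endpoint columns of the form "public.<VertexLabel>__O" / "__I".
ENDPOINT = re.compile(r'^public\.(?P<label>\w+)__(?P<dir>[OI])$')

def edge_endpoints(columns):
    """Return (out_vertex_label, in_vertex_label) from an edge view's columns."""
    out_label = in_label = None
    for col in columns:
        m = ENDPOINT.match(col)
        if m and m.group("dir") == "O":
            out_label = m.group("label")     # source vertex label
        elif m:
            in_label = m.group("label")      # target vertex label
    return out_label, in_label

print(edge_endpoints(["ID", "public.Person__O", "public.Person__I", "since"]))
# -> ('Person', 'Person')
```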

Backed by anything DuckDB can read

The view's SELECT is just SQL. It can read parquet files, Iceberg tables (via the iceberg extension), CSVs, PostgreSQL via the postgres extension, MotherDuck — any DuckDB source.

Iceberg (direct warehouse attach)

SQL
INSTALL iceberg; LOAD iceberg;
ATTACH 's3://my-bucket/warehouse/' AS cat (TYPE ICEBERG);

CREATE VIEW public.V_Order AS
  SELECT order_id AS "ID", total AS "total"
  FROM   cat.orders;

Apache Polaris catalog

For a Polaris-managed warehouse, register a secret with your OAuth2 credentials and attach via the catalog endpoint. Polaris hands DuckDB the table metadata; reads still go directly to the underlying storage.

SQL
INSTALL iceberg; LOAD iceberg;

CREATE SECRET polaris_auth (
    TYPE             ICEBERG,
    CLIENT_ID        '<your-principal-id>',
    CLIENT_SECRET    '<your-principal-secret>',
    OAUTH2_SERVER_URI 'https://<polaris-host>/api/catalog/v1/oauth/tokens',
    OAUTH2_SCOPE     'PRINCIPAL_ROLE:ALL'
);

ATTACH '<warehouse-name>' AS polaris (
    TYPE     ICEBERG,
    ENDPOINT 'https://<polaris-host>/api/catalog',
    SECRET   polaris_auth
);

CREATE VIEW public.V_Customer AS
  SELECT id AS "ID", email AS "email"
  FROM   polaris.analytics.customers;

Turn on cache_https. When your views read remote data (Iceberg, Polaris, S3 parquet), DuckDB's HTTPS response cache avoids refetching the same metadata/data files across queries — often a 10× speedup on repeated traversals. Set it once on the connection before attaching:

SET cache_https = true;

Read-only graph. Views are not writable, so g.addV(...) / g.addE(...) aren't supported. Add data by inserting into the underlying tables; the views surface it on the next traversal.

Core Concepts

Your First Query

CrabGraph speaks the Apache TinkerPop Gremlin language via the ANTLR parser — every standard Gremlin step works as expected. Pass a query string to g.gremlin(...) and you get a JSON result.

Gremlin
// Count vertices of a label
g.V().hasLabel("Person").count()

// Friends of Alice
g.V().has("Person", "name", "Alice")
 .out("KNOWS")
 .values("name")
 .order()
 .toList()

// Two-hop expansion
g.V().has("Person", "name", "Alice")
 .repeat(__.out("KNOWS").simplePath())
 .times(2)
 .dedup()
 .values("name")
 .toList()

Results are returned JSON-encoded. Vertices and edges materialize as objects with type, id, label, and properties fields:

JSON
[
  {"type":"vertex","id":"v[Person][1]","label":"Person",
   "properties":{"name":"Alice","age":34}}
]
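Because results arrive as plain JSON strings, any standard JSON parser can consume them. A short Python sketch that unpacks the vertex above; the `v[<Label>][<ID>]` id pattern is taken from the example output:

```python
import json
import re

# The example result from above, as returned by g.gremlin(...).
result = '''[
  {"type":"vertex","id":"v[Person][1]","label":"Person",
   "properties":{"name":"Alice","age":34}}
]'''

rows = json.loads(result)
alice = rows[0]

# The id string encodes label and numeric key: v[<Label>][<ID>]
label, vid = re.match(r'v\[(\w+)\]\[(\d+)\]', alice["id"]).groups()
print(label, vid, alice["properties"]["name"])
# -> Person 1 Alice
```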

New to Gremlin? See the Gremlin Primer below, or the TinkerPop reference for the full spec. The Java binding also exposes a programmatic g.traversal() for typed Gremlin without strings.

Core Concepts

Architecture

CrabGraph is embedded. It runs in your application's process as a library call — no daemon, no port to bind, no extra process to supervise. attach opens its own connection to the DuckDB file you pointed at and exposes Gremlin synchronously through whatever language you called it from.

Engine-managed schema

The first time CrabGraph attaches to a database, it creates a small set of internal tables in a reserved schema alongside your views. These hold topology metadata — which V_* / E_* views exist, their column types, and edge endpoints. The tables are created automatically; there's nothing for you to run or migrate.

Subsequent attaches reuse what's already there. If you add, drop, or re-shape a V_* / E_* view, rebuild the internal schema by dropping it and reattaching — CrabGraph rediscovers your views from scratch.

How each binding embeds

Non-Java bindings load a shared library (libcrabgraph) that's bundled with the package. The Java binding is a plain jar — no native load step needed.

Binding   How it loads
Java      Plain Maven jar
Python    Bundled libcrabgraph via ctypes
Node.js   N-API addon over libcrabgraph
Go        cgo binding to libcrabgraph
Rust      Dynamic link to libcrabgraph

Storage

Storage is whatever DuckDB does. Persist by pointing at a .duckdb file. For ephemeral / testing setups, write your views into a temp file. In-memory connections are not yet supported by attach — file-backed DBs work because the engine opens its own link to the same file you did.

Testing tip. Use a fresh temp file per test (e.g. tmp_path in pytest, t.TempDir() in Go). Each attach rediscovers views from scratch.

Core Concepts

Schema Reference

CrabGraph interprets a small set of view-naming and column-naming conventions. Stick to them and the engine handles topology discovery automatically.

View naming

View name              Becomes                  Notes
public.V_<Label>       Vertex label <Label>     Label is case-sensitive in Gremlin queries
public.E_<EdgeLabel>   Edge label <EdgeLabel>   Endpoints inferred from __O/__I column names

Required columns

Where           Column                    Type     Purpose
Every V_ view   "ID"                      BIGINT   Vertex primary key
Every E_ view   "ID"                      BIGINT   Edge primary key
E_ views        "public.<OutVertex>__O"   BIGINT   Source vertex's ID
E_ views        "public.<InVertex>__I"    BIGINT   Target vertex's ID

Property type mapping

All non-ID, non-endpoint columns are exposed as Gremlin properties. CrabGraph infers the property type from DuckDB's information_schema.columns.data_type:

DuckDB type                       Gremlin property type
BOOLEAN                           Boolean
TINYINT                           Byte
SMALLINT                          Short
INTEGER / INT                     Integer
BIGINT / HUGEINT                  Long
REAL                              Float
DOUBLE / FLOAT                    Double
DATE                              LocalDate
TIMESTAMP                         LocalDateTime
TIMESTAMP WITH TIME ZONE          ZonedDateTime
BLOB / BYTEA                      byte[]
JSON                              JSON (string)
VARCHAR, TEXT, UUID, anything else   String

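The table is easy to encode as a lookup with the documented String fallback. A Python sketch of the mapping (illustrative, not the engine's internal code):

```python
# The type-mapping table above as a lookup with the documented String fallback.
DUCKDB_TO_GREMLIN = {
    "BOOLEAN": "Boolean",  "TINYINT": "Byte",   "SMALLINT": "Short",
    "INTEGER": "Integer",  "INT": "Integer",
    "BIGINT": "Long",      "HUGEINT": "Long",
    "REAL": "Float",       "DOUBLE": "Double",  "FLOAT": "Double",
    "DATE": "LocalDate",   "TIMESTAMP": "LocalDateTime",
    "TIMESTAMP WITH TIME ZONE": "ZonedDateTime",
    "BLOB": "byte[]",      "BYTEA": "byte[]",
    "JSON": "JSON (string)",
}

def property_type(duckdb_type: str) -> str:
    # VARCHAR, TEXT, UUID, and anything unrecognized fall back to String.
    return DUCKDB_TO_GREMLIN.get(duckdb_type.upper(), "String")

print(property_type("BIGINT"), property_type("uuid"))
# -> Long String
```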
Core Concepts

Gremlin Primer

Gremlin is a functional, data-flow traversal language. A traversal starts at a set of elements and threads through a pipeline of steps. Each step transforms the current traversers.

Starting a traversal

Gremlin
g.V()                          // all vertices
g.E()                          // all edges
g.V().hasLabel("Person")      // vertices of a specific label (V_Person)
g.V().has("Person", "name", "Alice")  // filter by property
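The data-flow model can be illustrated with ordinary Python generators. This toy pipeline mimics `g.V().out('KNOWS').values('name')` over a two-vertex in-memory graph; CrabGraph itself translates traversals to SQL, so this is purely conceptual:

```python
# A toy model of Gremlin's data-flow semantics over an in-memory graph.
# Not how CrabGraph executes (it translates to SQL); purely illustrative.
people = {1: {"name": "Alice"}, 2: {"name": "Bob"}}
knows = [(1, 2)]  # (out_vertex, in_vertex): Alice -> Bob

def V():                       # g.V() -- start with all vertices
    return iter(people)

def out(traversers, edges):    # .out('KNOWS') -- follow outgoing edges
    return (dst for v in traversers for (src, dst) in edges if src == v)

def values(traversers, key):   # .values('name') -- project a property
    return (people[v][key] for v in traversers)

# Each step consumes the previous step's traversers lazily.
print(list(values(out(V(), knows), "name")))
# -> ['Bob']
```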

Step reference

.out(label?)
Move to outgoing adjacent vertices. Label narrows to a specific edge type.
.in(label?)
Move to incoming adjacent vertices.
.both(label?)
Move to adjacent vertices in either direction.
.outE() / .inE()
Move to incident edges instead of vertices.
.has(key, val)
Filter traversers where the property matches. Supports P.* predicates.
.hasNot(key)
Filter traversers where the property is absent.
.values(key…)
Extract property values as the new traverser stream.
.valueMap(key…)
Extract properties as a Map. Good for final projection.
.project(k, …)
Build a named result map from sub-traversals. Preferred over valueMap for complex projections.
.select(k, …)
Retrieve labelled steps previously tagged with .as().
.repeat(t).until(c)
Loop traversal t until condition c is met. Use .times(n) for fixed depth.
.path()
Emit the full traversal history (vertices and edges) as a Path object.
.group().by()
Aggregate traversers into a Map grouped by a key.
.order().by()
Sort traversers by a property. Order.desc for descending.
.limit(n)
Take the first n traversers. Prefer a limit over an unbounded toList().
.dedup()
Remove duplicate traversers from the stream.

Predicate reference (P)

Predicate               Meaning
P.eq(x)                 Equal to x
P.neq(x)                Not equal
P.gt(x) / P.lt(x)       Greater / less than
P.gte(x) / P.lte(x)     Greater or equal / less or equal
P.between(lo, hi)       lo ≤ value < hi
P.within(x, y, …)       Value is one of the listed options
P.without(x, y, …)      Value is none of the listed options
TextP.containing(s)     String contains s
TextP.startingWith(s)   String starts with s

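The predicates have straightforward plain-Python equivalents, shown here only to pin down their semantics (note that between is inclusive on the low end and exclusive on the high end):

```python
# Plain-Python equivalents of the predicate table (illustrative only).
def between(lo, hi):  return lambda v: lo <= v < hi   # inclusive lo, exclusive hi
def within(*xs):      return lambda v: v in xs
def without(*xs):     return lambda v: v not in xs
def containing(s):    return lambda v: s in v
def starting_with(s): return lambda v: v.startswith(s)

ages = [29, 34, 41]
print([a for a in ages if between(30, 41)(a)])
# -> [34]  (41 is excluded: the upper bound is exclusive)
```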
Core Concepts

Traversal Patterns

Common graph query patterns expressed in Gremlin. Examples assume V_Person and E_KNOWS views.

Neighbourhood queries

Gremlin
// Direct neighbours
g.V().has("Person", "name", "Alice").out("KNOWS").values("name")

// N-hop expansion (BFS up to 3 hops)
g.V().has("Person", "name", "Alice")
  .repeat(__.out("KNOWS").simplePath())
  .times(3)
  .dedup()
  .values("name")

Shortest path

Gremlin
g.V().has("Person", "name", "Alice")
  .repeat(__.bothE().otherV().simplePath())
  .until(__.has("Person", "name", "Bob"))
  .path()
  .limit(1)
  .next()
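What this traversal computes is a breadth-first search: repeat(bothE().otherV()) walks edges in both directions, simplePath() avoids revisiting vertices, and limit(1) keeps the first (hence shortest) path found. The same computation in plain Python over an adjacency map, for intuition:

```python
from collections import deque

# BFS shortest path over an undirected adjacency map
# (bothE().otherV() walks edges in both directions).
def shortest_path(adj, start, goal):
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path                  # BFS: first hit is the shortest
        for nxt in adj.get(path[-1], []):
            if nxt not in seen:          # keep paths simple, never revisit
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

adj = {"Alice": ["Carol"], "Carol": ["Alice", "Bob"], "Bob": ["Carol"]}
print(shortest_path(adj, "Alice", "Bob"))
# -> ['Alice', 'Carol', 'Bob']
```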

Aggregation and grouping

Gremlin
// Friend count per person
g.V().hasLabel("Person")
  .project("name", "friends")
  .by("name")
  .by(__.out("KNOWS").count())
  .order().by("friends", Order.desc)
  .limit(10)
  .toList()

// Group people by age, alphabetised within each bucket
g.V().hasLabel("Person")
  .group()
  .by("age")
  .by(__.values("name").order().fold())
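The group() result is an ordinary map of buckets. The same shape produced with a plain dictionary, for comparison (illustrative data):

```python
# group().by('age').by(values('name').order().fold()), as plain Python.
people = [{"name": "Carol", "age": 34}, {"name": "Alice", "age": 34},
          {"name": "Bob", "age": 29}]

groups = {}
for p in people:
    groups.setdefault(p["age"], []).append(p["name"])  # .group().by('age')
for names in groups.values():
    names.sort()                                       # .order() within each bucket

print(groups)
# -> {34: ['Alice', 'Carol'], 29: ['Bob']}
```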

Filtering with where

Gremlin
// People who know someone older than them
g.V().hasLabel("Person").as("a")
  .out("KNOWS").as("b")
  .where("a", P.lt("b")).by("age")
  .select("a").values("name")
  .toList()
Languages

Java / Kotlin

The Java binding is a plain JVM jar that pulls in its transitive dependencies alongside duckdb_jdbc. JDK 17+. The engine runs embedded in your application's JVM, sharing memory with whatever else is on your classpath — no native library, no external process.

Programmatic vs. string Gremlin

Two equivalent ways to express a query: a Gremlin string via g.gremlin(...), or the typed GraphTraversalSource from g.traversal().

Java
try (Connection c = DriverManager.getConnection("jdbc:duckdb:/var/g.duckdb");
     Crabgraph g = Crabgraph.attach(c)) {

  // String API — returns JSON
  String json = g.gremlin("g.V().hasLabel('Person').count()");

  // Programmatic API — typed TinkerPop traversal
  GraphTraversalSource t = g.traversal();
  List<Object> friends = t.V()
      .has("Person", "name", "Alice")
      .out("KNOWS")
      .values("name")
      .order().toList();
}

Spring Boot

Build a single Crabgraph bean tied to your DuckDB DataSource. Reuse it across requests — traversals are thread-safe.

Java
@Configuration
public class GraphConfig {

  @Bean(destroyMethod = "close")
  public Crabgraph crabgraph(DataSource duckdb) throws SQLException {
    return Crabgraph.attach(duckdb.getConnection());
  }
}

@Service
public class PersonService {

  private final Crabgraph g;

  public PersonService(Crabgraph g) { this.g = g; }

  public String friendsOf(String name) {
    return g.gremlin(
        "g.V().has('Person','name','" + name + "').out('KNOWS').values('name').toList()");
  }
}

Kotlin

Kotlin
DriverManager.getConnection("jdbc:duckdb:/var/g.duckdb").use { conn ->
  Crabgraph.attach(conn).use { g ->
    val friends = g.traversal().V()
      .has("Person", "name", "Alice")
      .out("KNOWS")
      .values<String>("name")
      .toList()
  }
}
Languages

Python

The Python package wraps the embedded libcrabgraph via ctypes. Crabgraph.attach(conn) takes any open duckdb.DuckDBPyConnection; Crabgraph.open(path) is a path-based shortcut. Python 3.9+.

Synchronous core API

Python
import duckdb
import crabgraph

conn = duckdb.connect("/var/g.duckdb")

with crabgraph.Crabgraph.attach(conn) as g:
    count = g.gremlin("g.V().hasLabel('Person').count()")
    friends = g.gremlin(
        "g.V().has('Person','name','Alice').out('KNOWS').values('name').toList()")
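Because queries are plain strings, user-supplied values should be escaped before being interpolated into a traversal. A hypothetical gremlin_str helper; the backslash-escape convention is an assumption, so verify it against the Gremlin string grammar you target:

```python
# Hypothetical helper: quote a user-supplied value before splicing it into a
# Gremlin query string. The backslash-escape convention is an assumption;
# check how your Gremlin parser handles embedded quotes.
def gremlin_str(value: str) -> str:
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"

name = "O'Brien"
query = f"g.V().has('Person','name',{gremlin_str(name)}).out('KNOWS').values('name').toList()"
print(query)
```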

Using with Flask / FastAPI

Build the Crabgraph instance once at startup and reuse it for the lifetime of the process. The engine is thread-safe; concurrent requests can call gremlin in parallel.

app.py
from flask import Flask
import duckdb, crabgraph

conn = duckdb.connect("/var/g.duckdb")
g    = crabgraph.Crabgraph.attach(conn)

app  = Flask(__name__)

@app.route("/people")
def people():
    # gremlin() already returns a JSON string; pass it through unmodified
    return app.response_class(
        g.gremlin("g.V().hasLabel('Person').values('name').toList()"),
        mimetype="application/json")

Locating the native lib in development

For non-pip installs (e.g. testing against a workspace-local libcrabgraph), set CRABGRAPH_NATIVE_DIR to the directory containing libcrabgraph.{so,dylib,dll}.

shell
export CRABGRAPH_NATIVE_DIR=/path/to/native/build
python my_app.py
Languages

Node.js

Source is TypeScript. The package ships with a small N-API addon (built at install via node-gyp) that bridges to the embedded libcrabgraph. Crabgraph.attach(conn) takes a @duckdb/node-api DuckDBConnection.

TypeScript types

TypeScript
import { DuckDBInstance } from "@duckdb/node-api";
import { Crabgraph } from "crabgraph";

let g: Crabgraph;

export async function initGraph(): Promise<void> {
  const inst = await DuckDBInstance.create(process.env.GRAPH_DB!);
  const conn = await inst.connect();
  g = await Crabgraph.attach(conn);
}

export function getFriends(name: string): string {
  return g.gremlin(
    `g.V().has('Person','name','${name}').out('KNOWS').values('name').toList()`);
}

Path-based attach

If you don't already have a DuckDB connection in scope, attach by path directly. The engine opens its own connection to the file.

TypeScript
const g = Crabgraph.open("/var/g.duckdb");
Languages

Go

The Go binding is a cgo wrapper around libcrabgraph. Pair it with github.com/duckdb/duckdb-go/v2 (the official DuckDB driver) and you get a graph engine attached to your *sql.DB. Requires Go 1.24+ and CGO_ENABLED=1.

HTTP server example

Go
package main

import (
    "database/sql"
    "net/http"

    _ "github.com/duckdb/duckdb-go/v2"
    crabgraph "github.com/henneberger/crabgraph/go"
)

func main() {
    db, err := sql.Open("duckdb", "/var/g.duckdb")
    if err != nil { panic(err) }
    defer db.Close()

    g, err := crabgraph.Attach(db)
    if err != nil { panic(err) }
    defer g.Close()

    http.HandleFunc("/people", func(w http.ResponseWriter, r *http.Request) {
        out, err := g.Gremlin("g.V().hasLabel('Person').valueMap().toList()")
        if err != nil { http.Error(w, err.Error(), 500); return }
        w.Header().Set("Content-Type", "application/json")
        w.Write([]byte(out))
    })
    http.ListenAndServe(":3000", nil)
}

Locating libcrabgraph

By default the binding looks for libcrabgraph.{so,dylib} at ../native/build/ relative to the Go module. To override, set cgo flags at build time:

shell
CGO_CFLAGS="-I/path/to/include" \
CGO_LDFLAGS="-L/path/to/lib -lcrabgraph" \
go build
Languages

Rust

The Rust crate is a thin synchronous wrapper around libcrabgraph. The duckdb feature enables typed integration with the duckdb crate's Connection; without it, only the path-based open entrypoint is available.

Cargo features

Feature           Description
duckdb            Enables Crabgraph::attach(&duckdb::Connection)
download-native   (Reserved) fetch libcrabgraph from a GitHub Release at build time
Cargo.toml
[dependencies]
crabgraph = { version = "0.1", features = ["duckdb"] }
duckdb    = "1.10"

Attach to an existing duckdb-rs connection

Rust
use crabgraph::Crabgraph;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let conn = duckdb::Connection::open("/var/g.duckdb")?;
    conn.execute_batch(r#"
        CREATE SCHEMA IF NOT EXISTS public;
        CREATE VIEW public.V_Person AS
            SELECT 1::BIGINT AS "ID", 'Alice' AS "name" UNION ALL
            SELECT 2::BIGINT,                'Bob';
        CREATE VIEW public.E_KNOWS AS
            SELECT 1::BIGINT AS "ID",
                   2::BIGINT AS "public.Person__I",
                   1::BIGINT AS "public.Person__O";
    "#)?;

    let g = Crabgraph::attach(&conn)?;
    let friends = g.gremlin(
        "g.V().has('Person','name','Alice').out('KNOWS').values('name').toList()")?;
    println!("{friends}");
    Ok(())
}

Without the duckdb feature

If you don't want a hard dep on duckdb-rs, attach by path:

Rust
let g = crabgraph::Crabgraph::open("/var/g.duckdb")?;

duckdb-rs dynamically links a system libduckdb. Set DUCKDB_LIB_DIR / DUCKDB_INCLUDE_DIR if needed. libcrabgraph must be on the runtime loader path (LD_LIBRARY_PATH / DYLD_LIBRARY_PATH).

Cloud

CrabGraph Cloud

CrabGraph Cloud is a hosted notebook experience for exploring graphs without setting up a local DuckDB or any of the language bindings. Drop in a DuckDB file or point at an Iceberg catalog, define your V_* / E_* views in the editor, and run Gremlin queries in the browser.

Skip the local setup.

Same Gremlin, same view convention — running on a managed DuckDB. Try queries against your Iceberg or Parquet data without writing a line of glue code.

Cloud and the embedded library share the same view-discovery convention. A graph that works in cloud.crabgraph.net will work as-is when you switch to attaching from your own application.

Reference

Configuration

The current API surface is intentionally minimal — all you configure is which DuckDB file to attach to. Tuning happens in DuckDB itself (memory limits, threads, extension config) on the connection you pass in.

Build-time / install-time

Variable                              Used by                        Purpose
CRABGRAPH_NATIVE_DIR                  Python, Node, Rust, Go (dev)   Directory containing libcrabgraph.{so,dylib,dll}. Bypasses the bundled library — useful for workspace-local builds.
DUCKDB_LIB_DIR / DUCKDB_INCLUDE_DIR   Rust                           Where duckdb-rs finds your system libduckdb.
CGO_LDFLAGS / CGO_CFLAGS              Go                             Override the cgo flags used to link libcrabgraph.
LD_LIBRARY_PATH / DYLD_LIBRARY_PATH   Runtime (all non-Java)         Loader search path for libcrabgraph if not bundled with the package.

Tuning DuckDB

CrabGraph executes Gremlin by translating it to SQL and running it through your DuckDB connection. Standard DuckDB tuning applies — set memory limits, thread counts, and any extension config before attaching:

SQL
SET memory_limit = '8GB';
SET threads      = 8;

INSTALL iceberg; LOAD iceberg;
ATTACH 's3://...' AS cat (TYPE ICEBERG);
-- now create your V_/E_ views
Reference

API Reference

Methods available on every binding's Crabgraph instance. Names and types are language-idiomatic; semantics are identical.

Crabgraph.attach(conn) -> Crabgraph
Attach to an open DuckDB connection. Reads the file path from the connection and bootstraps from V_* / E_* views.

Crabgraph.open(path) -> Crabgraph
Attach by path to an existing DuckDB file. Convenience for cases without a live connection.

.gremlin(query) -> String (JSON)
Execute a Gremlin query string and return the result JSON-encoded.

.exec_sql(sql) -> void
Run raw SQL on the engine's connection. Useful for ad-hoc DuckDB ops alongside graph queries.

.traversal() -> GraphTraversalSource (Java only)
Returns the typed TinkerPop traversal source for programmatic Gremlin.

.close() -> void
Release the engine and its underlying database connection.

Result format

gremlin(query) returns Jackson-serialized JSON. Primitive results are encoded as scalars ("3", "\"Alice\""). Collection results are JSON arrays. Vertices and edges are JSON objects:

JSON
{
  "type": "vertex",
  "id":   "v[Person][1]",
  "label": "Person",
  "properties": { "name": "Alice", "age": 34 }
}

Gremlin steps

Anything in the TinkerPop 3.7 reference grammar parses. See the TinkerPop reference for the complete step library.

Reference

Changelog

v0.1.0 (Apr 2026)
Initial release.
