Get Started

Quick Start

CrabGraph is a Gremlin engine that attaches to your existing DuckDB connection. Define vertices and edges as DuckDB views named V_* and E_*, then traverse them with the full Gremlin language. No separate server, no schema language, no migrations.

1. Add the Maven dependency
CrabGraph for Java is a plain JVM jar that embeds the engine in-process — no native lib needed. JDK 17+.
pom.xml
<dependency>
  <groupId>net.crabgraph</groupId>
  <artifactId>crabgraph</artifactId>
  <version>0.1.0</version>
</dependency>
<dependency>
  <groupId>org.duckdb</groupId>
  <artifactId>duckdb_jdbc</artifactId>
  <version>1.5.0.0</version>
</dependency>
2. Create your graph as DuckDB views
A view named V_<Label> in the public schema is a vertex label; E_<EdgeLabel> is an edge. Edge views carry endpoint columns of the form "public.<Vertex>__O" (out) and "public.<Vertex>__I" (in).
Java
try (Connection c = DriverManager.getConnection("jdbc:duckdb:/tmp/g.duckdb");
     Statement s = c.createStatement()) {
  s.execute("CREATE SCHEMA IF NOT EXISTS public");
  s.execute("""
      CREATE VIEW public.V_Person AS
        SELECT 1::BIGINT AS "ID", 'Alice' AS "name" UNION ALL
        SELECT 2::BIGINT,                'Bob'
      """);
  s.execute("""
      CREATE VIEW public.E_KNOWS AS
        SELECT 1::BIGINT AS "ID",
               2::BIGINT AS "public.Person__I",
               1::BIGINT AS "public.Person__O"
      """);
}
3. Attach and traverse
Pass an open java.sql.Connection to Crabgraph.attach(...). The engine inspects information_schema, registers your views as topology, and exposes Gremlin.
Java
import net.crabgraph.Crabgraph;
import java.sql.*;

try (Connection c = DriverManager.getConnection("jdbc:duckdb:/tmp/g.duckdb");
     Crabgraph g = Crabgraph.attach(c)) {

  String count   = g.gremlin("g.V().hasLabel('Person').count()");
  String friends = g.gremlin(
      "g.V().has('name','Alice').out('KNOWS').values('name').toList()");
}
1. Install via pip
The wheel bundles a native libcrabgraph — the engine runs embedded, with no separate runtime to install. Python 3.9+.
shell
pip install crabgraph
2. Create V_/E_ views
Use the duckdb package to set up your file. V_<Label> for vertices, E_<EdgeLabel> for edges, with endpoint columns named "public.<Vertex>__O" / "public.<Vertex>__I".
Python
import duckdb

conn = duckdb.connect("/tmp/g.duckdb")
conn.execute("CREATE SCHEMA IF NOT EXISTS public")
conn.execute("""
    CREATE VIEW public.V_Person AS
      SELECT 1::BIGINT AS "ID", 'Alice' AS "name" UNION ALL
      SELECT 2::BIGINT,                'Bob'
""")
conn.execute("""
    CREATE VIEW public.E_KNOWS AS
      SELECT 1::BIGINT AS "ID",
             2::BIGINT AS "public.Person__I",
             1::BIGINT AS "public.Person__O"
""")
3. Attach and traverse
Crabgraph.attach reads the database file path from your connection and auto-discovers the views.
Python
import crabgraph

with crabgraph.Crabgraph.attach(conn) as g:
    print(g.gremlin("g.V().hasLabel('Person').count()"))
    print(g.gremlin(
        "g.V().has('name','Alice').out('KNOWS').values('name').toList()"))
1. Install
Install crabgraph alongside the official DuckDB Node API. A small N-API addon is built at install time. Node.js 18+.
shell
npm install crabgraph @duckdb/node-api
2. Create V_/E_ views
Set up the DuckDB file with @duckdb/node-api. Vertex views are V_<Label>; edge views E_<EdgeLabel> with "public.<Vertex>__O" / "__I" endpoints.
TypeScript
import { DuckDBInstance } from "@duckdb/node-api";

const inst = await DuckDBInstance.create("/tmp/g.duckdb");
const conn = await inst.connect();
await conn.run("CREATE SCHEMA IF NOT EXISTS public");
await conn.run(`
  CREATE VIEW public.V_Person AS
    SELECT 1::BIGINT AS "ID", 'Alice' AS "name" UNION ALL
    SELECT 2::BIGINT,                'Bob'`);
await conn.run(`
  CREATE VIEW public.E_KNOWS AS
    SELECT 1::BIGINT AS "ID",
           2::BIGINT AS "public.Person__I",
           1::BIGINT AS "public.Person__O"`);
3. Attach and traverse
Crabgraph.attach queries PRAGMA database_list on the connection to find the database file, then attaches the engine.
TypeScript
import { Crabgraph } from "crabgraph";

const g = await Crabgraph.attach(conn);
try {
  console.log(g.gremlin("g.V().hasLabel('Person').count()"));
  console.log(g.gremlin(
    "g.V().has('name','Alice').out('KNOWS').values('name').toList()"));
} finally {
  g.close();
}
1. Add the modules
CrabGraph for Go is a cgo wrapper around the embedded libcrabgraph. Pair it with the official duckdb-go/v2 driver. Requires Go 1.24+ and CGO_ENABLED=1.
shell
go get github.com/henneberger/crabgraph/go
go get github.com/duckdb/duckdb-go/v2
2. Create V_/E_ views, attach, traverse
crabgraph.Attach(*sql.DB) reads the DuckDB file path from the connection and auto-discovers your views.
Go
package main

import (
    "database/sql"
    "fmt"

    _ "github.com/duckdb/duckdb-go/v2"
    crabgraph "github.com/henneberger/crabgraph/go"
)

func main() {
    db, err := sql.Open("duckdb", "/tmp/g.duckdb")
    if err != nil { panic(err) }
    defer db.Close()

    db.Exec("CREATE SCHEMA IF NOT EXISTS public")
    db.Exec(`CREATE VIEW public.V_Person AS
        SELECT 1::BIGINT AS "ID", 'Alice' AS "name" UNION ALL
        SELECT 2::BIGINT,                'Bob'`)
    db.Exec(`CREATE VIEW public.E_KNOWS AS
        SELECT 1::BIGINT AS "ID",
               2::BIGINT AS "public.Person__I",
               1::BIGINT AS "public.Person__O"`)

    g, err := crabgraph.Attach(db)
    if err != nil { panic(err) }
    defer g.Close()

    out, _ := g.Gremlin("g.V().hasLabel('Person').count()")
    fmt.Println(out)
}
1. Add the crates
The crate links against the embedded libcrabgraph shared library at build time. The duckdb feature enables the typed attach(&duckdb::Connection) entry point.
Cargo.toml
[dependencies]
crabgraph = { version = "0.1", features = ["duckdb"] }
duckdb    = "1.10"
2. Create V_/E_ views, attach, traverse
Open a file-backed duckdb::Connection, set up your views, and pass the connection to Crabgraph::attach.
Rust
fn main() -> Result<(), Box<dyn std::error::Error>> {
    let conn = duckdb::Connection::open("/tmp/g.duckdb")?;
    conn.execute_batch(r#"
        CREATE SCHEMA IF NOT EXISTS public;
        CREATE VIEW public.V_Person AS
            SELECT 1::BIGINT AS "ID", 'Alice' AS "name" UNION ALL
            SELECT 2::BIGINT,                'Bob';
        CREATE VIEW public.E_KNOWS AS
            SELECT 1::BIGINT AS "ID",
                   2::BIGINT AS "public.Person__I",
                   1::BIGINT AS "public.Person__O";
    "#)?;

    let g = crabgraph::Crabgraph::attach(&conn)?;
    println!("{}", g.gremlin("g.V().hasLabel('Person').count()")?);
    println!("{}", g.gremlin(
        "g.V().has('name','Alice').out('KNOWS').values('name').toList()")?);
    Ok(())
}

No schema configuration. CrabGraph reads information_schema at attach time. Whatever V_* / E_* views are present become your graph — drop or alter them and reattach.

Setup

Installation

CrabGraph is distributed through each language's native package manager. Install the crabgraph package for your language alongside a DuckDB driver — the engine runs embedded in your process, no separate server to deploy.

Language      Package manager    Package                               DuckDB driver
Java 17+      Maven              net.crabgraph:crabgraph:0.1.0         org.duckdb:duckdb_jdbc
Python 3.9+   pip, Poetry, uv    crabgraph==0.1.0                      duckdb (auto-installed)
Node.js 18+   npm, yarn, pnpm    crabgraph@0.1.0                       @duckdb/node-api
Go 1.24+      go get             github.com/henneberger/crabgraph/go   github.com/duckdb/duckdb-go/v2
Rust 1.75+    Cargo              crabgraph = "0.1"                     duckdb = "1.10"

System libduckdb. The Rust binding links against your system libduckdb via duckdb-rs. Set DUCKDB_LIB_DIR / DUCKDB_INCLUDE_DIR if it isn't on the standard search path.

Core Concepts

Define a Schema

Your schema lives in DuckDB itself, as views in the public schema. There is no separate schema file, no annotations, no migrations. CrabGraph discovers everything from information_schema at attach time.
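The discovery rule is simple enough to sketch in a few lines of Python. This is an illustration of the naming convention only (plain string matching over a list of view names), not CrabGraph's actual implementation:

```python
# Sketch of the V_*/E_* discovery rule, using a plain list of view
# names in place of a live information_schema query (illustrative only).
def discover_labels(view_names):
    vertices, edges = [], []
    for name in view_names:
        if name.startswith("V_"):
            vertices.append(name[len("V_"):])   # V_Person -> vertex label Person
        elif name.startswith("E_"):
            edges.append(name[len("E_"):])      # E_KNOWS  -> edge label KNOWS
    return vertices, edges

# A view with neither prefix is simply ignored by discovery.
print(discover_labels(["V_Person", "E_KNOWS", "raw_people"]))
# -> (['Person'], ['KNOWS'])
```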

Vertices: V_<Label>

A view named V_<Label> becomes a vertex label. Each row is a vertex. The view must have a BIGINT column called "ID" as the vertex's primary key; every other column becomes a property.

SQL
CREATE VIEW public.V_Person AS
  SELECT id    AS "ID",    -- BIGINT
         name  AS "name",       -- property
         age   AS "age"
  FROM   raw_people;

Edges: E_<EdgeLabel>

A view named E_<EdgeLabel> becomes an edge label. Required columns:

  • "ID" (BIGINT): the edge's primary key
  • "public.<OutVertexLabel>__O" (BIGINT): the source vertex's ID
  • "public.<InVertexLabel>__I" (BIGINT): the target vertex's ID

Any other columns are edge properties. The __O / __I column-name suffixes tell CrabGraph which vertex labels the edge connects.

SQL
CREATE VIEW public.E_KNOWS AS
  SELECT id           AS "ID",
         src_person   AS "public.Person__O",   -- out-vertex (Person)
         dst_person   AS "public.Person__I",   -- in-vertex (Person)
         since_year   AS "since"              -- edge property
  FROM   raw_friendships;
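To make the endpoint-column pattern concrete, here is a hypothetical parser for it in Python. The regex mirrors the documented "public.<VertexLabel>__O" / "__I" shape; it is an illustration, not engine code:

```python
import re

# Matches endpoint columns of the form "public.<VertexLabel>__O" / "__I".
ENDPOINT = re.compile(r'^public\.(?P<label>\w+)__(?P<dir>[OI])$')

def edge_endpoints(columns):
    """Return (out_vertex_label, in_vertex_label) from an edge view's columns."""
    out_label = in_label = None
    for col in columns:
        m = ENDPOINT.match(col)
        if m and m.group("dir") == "O":
            out_label = m.group("label")     # source vertex label
        elif m:
            in_label = m.group("label")      # target vertex label
    return out_label, in_label

print(edge_endpoints(["ID", "public.Person__O", "public.Person__I", "since"]))
# -> ('Person', 'Person')
```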

Backed by anything DuckDB can read

The view's SELECT is just SQL. It can read parquet files, Iceberg tables (via the iceberg extension), CSVs, PostgreSQL via the postgres extension, MotherDuck — any DuckDB source.

Iceberg (direct warehouse attach)

SQL
INSTALL iceberg; LOAD iceberg;
ATTACH 's3://my-bucket/warehouse/' AS cat (TYPE ICEBERG);

CREATE VIEW public.V_Order AS
  SELECT order_id AS "ID", total AS "total"
  FROM   cat.orders;

Apache Polaris catalog

For a Polaris-managed warehouse, register a secret with your OAuth2 credentials and attach via the catalog endpoint. Polaris hands DuckDB the table metadata; reads still go directly to the underlying storage.

SQL
INSTALL iceberg; LOAD iceberg;

CREATE SECRET polaris_auth (
    TYPE             ICEBERG,
    CLIENT_ID        '<your-principal-id>',
    CLIENT_SECRET    '<your-principal-secret>',
    OAUTH2_SERVER_URI 'https://<polaris-host>/api/catalog/v1/oauth/tokens',
    OAUTH2_SCOPE     'PRINCIPAL_ROLE:ALL'
);

ATTACH '<warehouse-name>' AS polaris (
    TYPE     ICEBERG,
    ENDPOINT 'https://<polaris-host>/api/catalog',
    SECRET   polaris_auth
);

CREATE VIEW public.V_Customer AS
  SELECT id AS "ID", email AS "email"
  FROM   polaris.analytics.customers;

Turn on cache_https. When your views read remote data (Iceberg, Polaris, S3 parquet), DuckDB's HTTPS response cache avoids refetching the same metadata/data files across queries — often a 10× speedup on repeated traversals. Set it once on the connection before attaching:

SET cache_https = true;

Read-only graph. Views are not writable, so g.addV(...) / g.addE(...) aren't supported. Add data by inserting into the underlying tables; the views surface it on the next traversal.

Core Concepts

Your First Query

CrabGraph speaks the Apache TinkerPop Gremlin language via the ANTLR parser — every standard Gremlin step works as expected. Pass a query string to g.gremlin(...) and you get a JSON result.

Gremlin
// Count vertices of a label
g.V().hasLabel("Person").count()

// Friends of Alice
g.V().has("Person", "name", "Alice")
 .out("KNOWS")
 .values("name")
 .order()
 .toList()

// Two-hop expansion
g.V().has("Person", "name", "Alice")
 .repeat(__.out("KNOWS").simplePath())
 .times(2)
 .dedup()
 .values("name")
 .toList()

Results are returned JSON-encoded. Vertices and edges materialize as objects with type, id, label, and properties fields:

JSON
[
  {"type":"vertex","id":"v[Person][1]","label":"Person",
   "properties":{"name":"Alice","age":34}}
]
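Because results arrive as plain JSON strings, any standard JSON parser can consume them. A short Python sketch that unpacks the vertex above; the `v[<Label>][<ID>]` id pattern is taken from the example output:

```python
import json
import re

# The example result from above, as returned by g.gremlin(...).
result = '''[
  {"type":"vertex","id":"v[Person][1]","label":"Person",
   "properties":{"name":"Alice","age":34}}
]'''

rows = json.loads(result)
alice = rows[0]

# The id string encodes label and numeric key: v[<Label>][<ID>]
label, vid = re.match(r'v\[(\w+)\]\[(\d+)\]', alice["id"]).groups()
print(label, vid, alice["properties"]["name"])
# -> Person 1 Alice
```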

New to Gremlin? See the Gremlin Primer below, or the TinkerPop reference for the full spec. The Java binding also exposes a programmatic g.traversal() for typed Gremlin without strings.

Core Concepts

Architecture

CrabGraph is embedded. It runs in your application's process as a library call — no daemon, no port to bind, no extra process to supervise. attach opens its own connection to the DuckDB file you pointed at and exposes Gremlin synchronously through whatever language you called it from.

Engine-managed schema

The first time CrabGraph attaches to a database, it creates a small set of internal tables in a reserved schema alongside your views. These hold topology metadata — which V_* / E_* views exist, their column types, and edge endpoints. The tables are created automatically; there's nothing for you to run or migrate.

Subsequent attaches reuse what's already there. If you add, drop, or re-shape a V_* / E_* view, rebuild the internal schema by dropping it and reattaching — CrabGraph rediscovers your views from scratch.

How each binding embeds

Non-Java bindings load a shared library (libcrabgraph) that's bundled with the package. The Java binding is a plain jar — no native load step needed.

Binding   How it loads
Java      Plain Maven jar
Python    Bundled libcrabgraph via ctypes
Node.js   N-API addon over libcrabgraph
Go        cgo binding to libcrabgraph
Rust      Dynamic link to libcrabgraph

Storage

Storage is whatever DuckDB does. Persist by pointing at a .duckdb file. For ephemeral / testing setups, write your views into a temp file. In-memory connections are not yet supported by attach — file-backed DBs work because the engine opens its own link to the same file you did.

Testing tip. Use a fresh temp file per test (e.g. tmp_path in pytest, t.TempDir() in Go). Each attach rediscovers views from scratch.

Core Concepts

Schema Reference

CrabGraph interprets a small set of view-naming and column-naming conventions. Stick to them and the engine handles topology discovery automatically.

View naming

View name              Becomes                  Notes
public.V_<Label>       Vertex label <Label>     Label is case-sensitive in Gremlin queries
public.E_<EdgeLabel>   Edge label <EdgeLabel>   Endpoints inferred from __O/__I column names

Required columns

Where           Column                    Type     Purpose
Every V_ view   "ID"                      BIGINT   Vertex primary key
Every E_ view   "ID"                      BIGINT   Edge primary key
E_ views        "public.<OutVertex>__O"   BIGINT   Source vertex's ID
E_ views        "public.<InVertex>__I"    BIGINT   Target vertex's ID

Property type mapping

All non-ID, non-endpoint columns are exposed as Gremlin properties. CrabGraph infers the property type from DuckDB's information_schema.columns.data_type:

DuckDB type                       Gremlin property type
BOOLEAN                           Boolean
TINYINT                           Byte
SMALLINT                          Short
INTEGER / INT                     Integer
BIGINT / HUGEINT                  Long
REAL                              Float
DOUBLE / FLOAT                    Double
DATE                              LocalDate
TIMESTAMP                         LocalDateTime
TIMESTAMP WITH TIME ZONE          ZonedDateTime
BLOB / BYTEA                      byte[]
JSON                              JSON (string)
VARCHAR, TEXT, UUID, anything else   String

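The table is easy to encode as a lookup with the documented String fallback. A Python sketch of the mapping (illustrative, not the engine's internal code):

```python
# The type-mapping table above as a lookup with the documented String fallback.
DUCKDB_TO_GREMLIN = {
    "BOOLEAN": "Boolean",  "TINYINT": "Byte",   "SMALLINT": "Short",
    "INTEGER": "Integer",  "INT": "Integer",
    "BIGINT": "Long",      "HUGEINT": "Long",
    "REAL": "Float",       "DOUBLE": "Double",  "FLOAT": "Double",
    "DATE": "LocalDate",   "TIMESTAMP": "LocalDateTime",
    "TIMESTAMP WITH TIME ZONE": "ZonedDateTime",
    "BLOB": "byte[]",      "BYTEA": "byte[]",
    "JSON": "JSON (string)",
}

def property_type(duckdb_type: str) -> str:
    # VARCHAR, TEXT, UUID, and anything unrecognized fall back to String.
    return DUCKDB_TO_GREMLIN.get(duckdb_type.upper(), "String")

print(property_type("BIGINT"), property_type("uuid"))
# -> Long String
```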
Core Concepts

Gremlin Primer

Gremlin is a functional, data-flow traversal language. A traversal starts at a set of elements and threads through a pipeline of steps. Each step transforms the current traversers.

Starting a traversal

Gremlin
g.V()                          // all vertices
g.E()                          // all edges
g.V().hasLabel("Person")      // vertices of a specific label (V_Person)
g.V().has("Person", "name", "Alice")  // filter by property
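The data-flow model can be illustrated with ordinary Python generators. This toy pipeline mimics `g.V().out('KNOWS').values('name')` over a two-vertex in-memory graph; CrabGraph itself translates traversals to SQL, so this is purely conceptual:

```python
# A toy model of Gremlin's data-flow semantics over an in-memory graph.
# Not how CrabGraph executes (it translates to SQL); purely illustrative.
people = {1: {"name": "Alice"}, 2: {"name": "Bob"}}
knows = [(1, 2)]  # (out_vertex, in_vertex): Alice -> Bob

def V():                       # g.V() -- start with all vertices
    return iter(people)

def out(traversers, edges):    # .out('KNOWS') -- follow outgoing edges
    return (dst for v in traversers for (src, dst) in edges if src == v)

def values(traversers, key):   # .values('name') -- project a property
    return (people[v][key] for v in traversers)

# Each step consumes the previous step's traversers lazily.
print(list(values(out(V(), knows), "name")))
# -> ['Bob']
```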

Step reference

.out(label?)
Move to outgoing adjacent vertices. Label narrows to a specific edge type.
.in(label?)
Move to incoming adjacent vertices.
.both(label?)
Move to adjacent vertices in either direction.
.outE() / .inE()
Move to incident edges instead of vertices.
.has(key, val)
Filter traversers where the property matches. Supports P.* predicates.
.hasNot(key)
Filter traversers where the property is absent.
.values(key…)
Extract property values as the new traverser stream.
.valueMap(key…)
Extract properties as a Map. Good for final projection.
.project(k, …)
Build a named result map from sub-traversals. Preferred over valueMap for complex projections.
.select(k, …)
Retrieve labelled steps previously tagged with .as().
.repeat(t).until(c)
Loop traversal t until condition c is met. Use .times(n) for fixed depth.
.path()
Emit the full traversal history (vertices and edges) as a Path object.
.group().by()
Aggregate traversers into a Map grouped by a key.
.order().by()
Sort traversers by a property. Order.desc for descending.
.limit(n)
Take the first n traversers. Prefer a limit over an unbounded toList().
.dedup()
Remove duplicate traversers from the stream.

Predicate reference (P)

Predicate               Meaning
P.eq(x)                 Equal to x
P.neq(x)                Not equal
P.gt(x) / P.lt(x)       Greater / less than
P.gte(x) / P.lte(x)     Greater or equal / less or equal
P.between(lo, hi)       lo ≤ value < hi
P.within(x, y, …)       Value is one of the listed options
P.without(x, y, …)      Value is none of the listed options
TextP.containing(s)     String contains s
TextP.startingWith(s)   String starts with s

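The predicates have straightforward plain-Python equivalents, shown here only to pin down their semantics (note that between is inclusive on the low end and exclusive on the high end):

```python
# Plain-Python equivalents of the predicate table (illustrative only).
def between(lo, hi):  return lambda v: lo <= v < hi   # inclusive lo, exclusive hi
def within(*xs):      return lambda v: v in xs
def without(*xs):     return lambda v: v not in xs
def containing(s):    return lambda v: s in v
def starting_with(s): return lambda v: v.startswith(s)

ages = [29, 34, 41]
print([a for a in ages if between(30, 41)(a)])
# -> [34]  (41 is excluded: the upper bound is exclusive)
```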
Core Concepts

Traversal Patterns

Common graph query patterns expressed in Gremlin. Examples assume V_Person and E_KNOWS views.

Neighbourhood queries

Gremlin
// Direct neighbours
g.V().has("Person", "name", "Alice").out("KNOWS").values("name")

// N-hop expansion (BFS up to 3 hops)
g.V().has("Person", "name", "Alice")
  .repeat(__.out("KNOWS").simplePath())
  .times(3)
  .dedup()
  .values("name")

Shortest path

Gremlin
g.V().has("Person", "name", "Alice")
  .repeat(__.bothE().otherV().simplePath())
  .until(__.has("Person", "name", "Bob"))
  .path()
  .limit(1)
  .next()
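What this traversal computes is a breadth-first search: repeat(bothE().otherV()) walks edges in both directions, simplePath() avoids revisiting vertices, and limit(1) keeps the first (hence shortest) path found. The same computation in plain Python over an adjacency map, for intuition:

```python
from collections import deque

# BFS shortest path over an undirected adjacency map
# (bothE().otherV() walks edges in both directions).
def shortest_path(adj, start, goal):
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path                  # BFS: first hit is the shortest
        for nxt in adj.get(path[-1], []):
            if nxt not in seen:          # keep paths simple, never revisit
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

adj = {"Alice": ["Carol"], "Carol": ["Alice", "Bob"], "Bob": ["Carol"]}
print(shortest_path(adj, "Alice", "Bob"))
# -> ['Alice', 'Carol', 'Bob']
```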

Aggregation and grouping

Gremlin
// Friend count per person
g.V().hasLabel("Person")
  .project("name", "friends")
  .by("name")
  .by(__.out("KNOWS").count())
  .order().by("friends", Order.desc)
  .limit(10)
  .toList()

// Group people by age, alphabetised within each bucket
g.V().hasLabel("Person")
  .group()
  .by("age")
  .by(__.values("name").order().fold())
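The group() result is an ordinary map of buckets. The same shape produced with a plain dictionary, for comparison (illustrative data):

```python
# group().by('age').by(values('name').order().fold()), as plain Python.
people = [{"name": "Carol", "age": 34}, {"name": "Alice", "age": 34},
          {"name": "Bob", "age": 29}]

groups = {}
for p in people:
    groups.setdefault(p["age"], []).append(p["name"])  # .group().by('age')
for names in groups.values():
    names.sort()                                       # .order() within each bucket

print(groups)
# -> {34: ['Alice', 'Carol'], 29: ['Bob']}
```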

Filtering with where

Gremlin
// People who know someone older than them
g.V().hasLabel("Person").as("a")
  .out("KNOWS").as("b")
  .where("a", P.lt("b")).by("age")
  .select("a").values("name")
  .toList()
Languages

Java / Kotlin

The Java binding is a plain JVM jar that pulls in its transitive dependencies alongside duckdb_jdbc. JDK 17+. The engine runs embedded in your application's JVM, sharing memory with whatever else is on your classpath — no native library, no external process.

Programmatic vs. string Gremlin

Two equivalent ways to express a query: a Gremlin string via g.gremlin(...), or the typed GraphTraversalSource from g.traversal().

Java
try (Connection c = DriverManager.getConnection("jdbc:duckdb:/var/g.duckdb");
     Crabgraph g = Crabgraph.attach(c)) {

  // String API — returns JSON
  String json = g.gremlin("g.V().hasLabel('Person').count()");

  // Programmatic API — typed TinkerPop traversal
  GraphTraversalSource t = g.traversal();
  List<Object> friends = t.V()
      .has("Person", "name", "Alice")
      .out("KNOWS")
      .values("name")
      .order().toList();
}

Spring Boot

Build a single Crabgraph bean tied to your DuckDB DataSource. Reuse it across requests — traversals are thread-safe.

Java
@Configuration
public class GraphConfig {

  @Bean(destroyMethod = "close")
  public Crabgraph crabgraph(DataSource duckdb) throws SQLException {
    return Crabgraph.attach(duckdb.getConnection());
  }
}

@Service
public class PersonService {

  private final Crabgraph g;

  public PersonService(Crabgraph g) { this.g = g; }

  public String friendsOf(String name) {
    return g.gremlin(
        "g.V().has('Person','name','" + name + "').out('KNOWS').values('name').toList()");
  }
}

Kotlin

Kotlin
DriverManager.getConnection("jdbc:duckdb:/var/g.duckdb").use { conn ->
  Crabgraph.attach(conn).use { g ->
    val friends = g.traversal().V()
      .has("Person", "name", "Alice")
      .out("KNOWS")
      .values<String>("name")
      .toList()
  }
}
Languages

Python

The Python package wraps the embedded libcrabgraph via ctypes. Crabgraph.attach(conn) takes any open duckdb.DuckDBPyConnection; Crabgraph.open(path) is a path-based shortcut. Python 3.9+.

Synchronous core API

Python
import duckdb
import crabgraph

conn = duckdb.connect("/var/g.duckdb")

with crabgraph.Crabgraph.attach(conn) as g:
    count = g.gremlin("g.V().hasLabel('Person').count()")
    friends = g.gremlin(
        "g.V().has('Person','name','Alice').out('KNOWS').values('name').toList()")
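Because queries are plain strings, user-supplied values should be escaped before being interpolated into a traversal. A hypothetical gremlin_str helper; the backslash-escape convention is an assumption, so verify it against the Gremlin string grammar you target:

```python
# Hypothetical helper: quote a user-supplied value before splicing it into a
# Gremlin query string. The backslash-escape convention is an assumption;
# check how your Gremlin parser handles embedded quotes.
def gremlin_str(value: str) -> str:
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"

name = "O'Brien"
query = f"g.V().has('Person','name',{gremlin_str(name)}).out('KNOWS').values('name').toList()"
print(query)
```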

Using with Flask / FastAPI

Build the Crabgraph instance once at startup and reuse it for the lifetime of the process. The engine is thread-safe; concurrent requests can call gremlin in parallel.

app.py
from flask import Flask
import duckdb, crabgraph

conn = duckdb.connect("/var/g.duckdb")
g    = crabgraph.Crabgraph.attach(conn)

app  = Flask(__name__)

@app.route("/people")
def people():
    # gremlin() already returns a JSON string; pass it through unmodified
    return app.response_class(
        g.gremlin("g.V().hasLabel('Person').values('name').toList()"),
        mimetype="application/json")

Locating the native lib in development

For non-pip installs (e.g. testing against a workspace-local libcrabgraph), set CRABGRAPH_NATIVE_DIR to the directory containing libcrabgraph.{so,dylib,dll}.

shell
export CRABGRAPH_NATIVE_DIR=/path/to/native/build
python my_app.py
Languages

Node.js

Source is TypeScript. The package ships with a small N-API addon (built at install via node-gyp) that bridges to the embedded libcrabgraph. Crabgraph.attach(conn) takes a @duckdb/node-api DuckDBConnection.

TypeScript types

TypeScript
import { DuckDBInstance } from "@duckdb/node-api";
import { Crabgraph } from "crabgraph";

let g: Crabgraph;

export async function initGraph(): Promise<void> {
  const inst = await DuckDBInstance.create(process.env.GRAPH_DB!);
  const conn = await inst.connect();
  g = await Crabgraph.attach(conn);
}

export function getFriends(name: string): string {
  return g.gremlin(
    `g.V().has('Person','name','${name}').out('KNOWS').values('name').toList()`);
}

Path-based attach

If you don't already have a DuckDB connection in scope, attach by path directly. The engine opens its own connection to the file.

TypeScript
const g = Crabgraph.open("/var/g.duckdb");
Languages

Go

The Go binding is a cgo wrapper around libcrabgraph. Pair it with github.com/duckdb/duckdb-go/v2 (the official DuckDB driver) and you get a graph engine attached to your *sql.DB. Requires Go 1.24+ and CGO_ENABLED=1.

HTTP server example

Go
package main

import (
    "database/sql"
    "net/http"

    _ "github.com/duckdb/duckdb-go/v2"
    crabgraph "github.com/henneberger/crabgraph/go"
)

func main() {
    db, err := sql.Open("duckdb", "/var/g.duckdb")
    if err != nil { panic(err) }
    defer db.Close()

    g, err := crabgraph.Attach(db)
    if err != nil { panic(err) }
    defer g.Close()

    http.HandleFunc("/people", func(w http.ResponseWriter, r *http.Request) {
        out, err := g.Gremlin("g.V().hasLabel('Person').valueMap().toList()")
        if err != nil { http.Error(w, err.Error(), 500); return }
        w.Header().Set("Content-Type", "application/json")
        w.Write([]byte(out))
    })
    http.ListenAndServe(":3000", nil)
}

Locating libcrabgraph

By default the binding looks for libcrabgraph.{so,dylib} at ../native/build/ relative to the Go module. To override, set cgo flags at build time:

shell
CGO_CFLAGS="-I/path/to/include" \
CGO_LDFLAGS="-L/path/to/lib -lcrabgraph" \
go build
Languages

Rust

The Rust crate is a thin synchronous wrapper around libcrabgraph. The duckdb feature enables typed integration with the duckdb crate's Connection; without it, only the path-based open entrypoint is available.

Cargo features

Feature           Description
duckdb            Enables Crabgraph::attach(&duckdb::Connection)
download-native   (Reserved) fetch libcrabgraph from a GitHub Release at build time
Cargo.toml
[dependencies]
crabgraph = { version = "0.1", features = ["duckdb"] }
duckdb    = "1.10"

Attach to an existing duckdb-rs connection

Rust
use crabgraph::Crabgraph;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let conn = duckdb::Connection::open("/var/g.duckdb")?;
    conn.execute_batch(r#"
        CREATE SCHEMA IF NOT EXISTS public;
        CREATE VIEW public.V_Person AS
            SELECT 1::BIGINT AS "ID", 'Alice' AS "name" UNION ALL
            SELECT 2::BIGINT,                'Bob';
        CREATE VIEW public.E_KNOWS AS
            SELECT 1::BIGINT AS "ID",
                   2::BIGINT AS "public.Person__I",
                   1::BIGINT AS "public.Person__O";
    "#)?;

    let g = Crabgraph::attach(&conn)?;
    let friends = g.gremlin(
        "g.V().has('Person','name','Alice').out('KNOWS').values('name').toList()")?;
    println!("{friends}");
    Ok(())
}

Without the duckdb feature

If you don't want a hard dep on duckdb-rs, attach by path:

Rust
let g = crabgraph::Crabgraph::open("/var/g.duckdb")?;

duckdb-rs dynamically links a system libduckdb. Set DUCKDB_LIB_DIR / DUCKDB_INCLUDE_DIR if needed. libcrabgraph must be on the runtime loader path (LD_LIBRARY_PATH / DYLD_LIBRARY_PATH).

Cloud

CrabGraph Cloud

CrabGraph Cloud is a hosted notebook experience for exploring graphs without setting up a local DuckDB or any of the language bindings. Drop in a DuckDB file or point at an Iceberg catalog, define your V_* / E_* views in the editor, and run Gremlin queries in the browser.

Skip the local setup.

Same Gremlin, same view convention — running on a managed DuckDB. Try queries against your Iceberg or Parquet data without writing a line of glue code.

Cloud and the embedded library share the same view-discovery convention. A graph that works in cloud.crabgraph.net will work as-is when you switch to attaching from your own application.

Reference

Configuration

The current API surface is intentionally minimal — all you configure is which DuckDB file to attach to. Tuning happens in DuckDB itself (memory limits, threads, extension config) on the connection you pass in.

Build-time / install-time

Variable                              Used by                        Purpose
CRABGRAPH_NATIVE_DIR                  Python, Node, Rust, Go (dev)   Directory containing libcrabgraph.{so,dylib,dll}. Bypasses the bundled library — useful for workspace-local builds.
DUCKDB_LIB_DIR / DUCKDB_INCLUDE_DIR   Rust                           Where duckdb-rs finds your system libduckdb.
CGO_LDFLAGS / CGO_CFLAGS              Go                             Override the cgo flags used to link libcrabgraph.
LD_LIBRARY_PATH / DYLD_LIBRARY_PATH   Runtime (all non-Java)         Loader search path for libcrabgraph if not bundled with the package.

Tuning DuckDB

CrabGraph executes Gremlin by translating it to SQL and running it through your DuckDB connection. Standard DuckDB tuning applies — set memory limits, thread counts, and any extension config before attaching:

SQL
SET memory_limit = '8GB';
SET threads      = 8;

INSTALL iceberg; LOAD iceberg;
ATTACH 's3://...' AS cat (TYPE ICEBERG);
-- now create your V_/E_ views
Reference

API Reference

Methods available on every binding's Crabgraph instance. Names and types are language-idiomatic; semantics are identical.

Crabgraph.attach(conn) -> Crabgraph
Attach to an open DuckDB connection. Reads the file path from the connection and bootstraps from V_* / E_* views.

Crabgraph.open(path) -> Crabgraph
Attach by path to an existing DuckDB file. Convenience for cases without a live connection.

.gremlin(query) -> String (JSON)
Execute a Gremlin query string and return the result JSON-encoded.

.exec_sql(sql) -> void
Run raw SQL on the engine's connection. Useful for ad-hoc DuckDB ops alongside graph queries.

.traversal() -> GraphTraversalSource (Java only)
Returns the typed TinkerPop traversal source for programmatic Gremlin.

.close() -> void
Release the engine and its underlying database connection.

Result format

gremlin(query) returns Jackson-serialized JSON. Primitive results are encoded as scalars ("3", "\"Alice\""). Collection results are JSON arrays. Vertices and edges are JSON objects:

JSON
{
  "type": "vertex",
  "id":   "v[Person][1]",
  "label": "Person",
  "properties": { "name": "Alice", "age": 34 }
}

Gremlin steps

Anything in the TinkerPop 3.7 reference grammar parses. See the TinkerPop reference for the complete step library.

Reference

Changelog

v0.1.0 (Apr 2026)
Initial release.
