Quick Start
CrabGraph is a Gremlin engine that attaches to your existing DuckDB connection. Define vertices and edges as DuckDB views named V_* and E_*, then traverse them with the full Gremlin language. No separate server, no schema language, no migrations.
<dependency>
  <groupId>net.crabgraph</groupId>
  <artifactId>crabgraph</artifactId>
  <version>0.1.0</version>
</dependency>
<dependency>
  <groupId>org.duckdb</groupId>
  <artifactId>duckdb_jdbc</artifactId>
  <version>1.5.0.0</version>
</dependency>
V_<Label> in the public schema is a vertex label; E_<EdgeLabel> is an edge. Edge views carry endpoint columns of the form "public.<Vertex>__O" (out) and "public.<Vertex>__I" (in).

try (Connection c = DriverManager.getConnection("jdbc:duckdb:/tmp/g.duckdb");
     Statement s = c.createStatement()) {
  s.execute("CREATE SCHEMA IF NOT EXISTS public");
  // Text blocks need no escaping for the quoted SQL identifiers.
  s.execute("""
      CREATE VIEW public.V_Person AS
      SELECT 1::BIGINT AS "ID", 'Alice' AS "name"
      UNION ALL SELECT 2::BIGINT, 'Bob'
      """);
  s.execute("""
      CREATE VIEW public.E_KNOWS AS
      SELECT 1::BIGINT AS "ID",
             2::BIGINT AS "public.Person__I",
             1::BIGINT AS "public.Person__O"
      """);
}
java.sql.Connection to Crabgraph.attach(...). The engine inspects information_schema, registers your views as topology, and exposes Gremlin.

import net.crabgraph.Crabgraph;
import java.sql.*;

try (Connection c = DriverManager.getConnection("jdbc:duckdb:/tmp/g.duckdb");
     Crabgraph g = Crabgraph.attach(c)) {
  String count = g.gremlin("g.V().hasLabel('Person').count()");
  String friends = g.gremlin(
      "g.V().has('name','Alice').out('KNOWS').values('name').toList()");
}
libcrabgraph: the engine runs embedded, with no separate runtime to install. Python 3.9+.

pip install crabgraph
duckdb package to set up your file. V_<Label> for vertices, E_<EdgeLabel> for edges, with endpoint columns named "public.<Vertex>__O" / "public.<Vertex>__I".

import duckdb

conn = duckdb.connect("/tmp/g.duckdb")
conn.execute("CREATE SCHEMA IF NOT EXISTS public")
conn.execute("""
    CREATE VIEW public.V_Person AS
    SELECT 1::BIGINT AS "ID", 'Alice' AS "name"
    UNION ALL SELECT 2::BIGINT, 'Bob'
""")
conn.execute("""
    CREATE VIEW public.E_KNOWS AS
    SELECT 1::BIGINT AS "ID",
           2::BIGINT AS "public.Person__I",
           1::BIGINT AS "public.Person__O"
""")
Crabgraph.attach reads the database file path from your connection and auto-discovers the views.

import crabgraph

with crabgraph.Crabgraph.attach(conn) as g:
    print(g.gremlin("g.V().hasLabel('Person').count()"))
    print(g.gremlin(
        "g.V().has('name','Alice').out('KNOWS').values('name').toList()"))
crabgraph alongside the official DuckDB Node API. A small N-API addon is built at install time. Node.js 18+.

npm install crabgraph @duckdb/node-api
@duckdb/node-api. Vertex views are V_<Label>; edge views E_<EdgeLabel> with "public.<Vertex>__O" / "__I" endpoints.

import { DuckDBInstance } from "@duckdb/node-api";

const inst = await DuckDBInstance.create("/tmp/g.duckdb");
const conn = await inst.connect();
await conn.run("CREATE SCHEMA IF NOT EXISTS public");
await conn.run(`
  CREATE VIEW public.V_Person AS
  SELECT 1::BIGINT AS "ID", 'Alice' AS "name"
  UNION ALL SELECT 2::BIGINT, 'Bob'`);
await conn.run(`
  CREATE VIEW public.E_KNOWS AS
  SELECT 1::BIGINT AS "ID",
         2::BIGINT AS "public.Person__I",
         1::BIGINT AS "public.Person__O"`);
Crabgraph.attach queries PRAGMA database_list on the connection to find the database file, then attaches the engine.

import { Crabgraph } from "crabgraph";

const g = await Crabgraph.attach(conn);
try {
  console.log(g.gremlin("g.V().hasLabel('Person').count()"));
  console.log(g.gremlin(
    "g.V().has('name','Alice').out('KNOWS').values('name').toList()"));
} finally {
  g.close();
}
libcrabgraph. Pair it with the official duckdb-go/v2 driver. Requires Go 1.24+ and CGO_ENABLED=1.

go get github.com/henneberger/crabgraph/go
go get github.com/duckdb/duckdb-go/v2
crabgraph.Attach(*sql.DB) reads the DuckDB file path from the connection and auto-discovers your views.

package main

import (
	"database/sql"
	"fmt"

	_ "github.com/duckdb/duckdb-go/v2"
	crabgraph "github.com/henneberger/crabgraph/go"
)

func main() {
	db, err := sql.Open("duckdb", "/tmp/g.duckdb")
	if err != nil {
		panic(err)
	}
	defer db.Close()

	// Create the schema and views, stopping at the first failure.
	for _, stmt := range []string{
		`CREATE SCHEMA IF NOT EXISTS public`,
		`CREATE VIEW public.V_Person AS
		   SELECT 1::BIGINT AS "ID", 'Alice' AS "name"
		   UNION ALL SELECT 2::BIGINT, 'Bob'`,
		`CREATE VIEW public.E_KNOWS AS
		   SELECT 1::BIGINT AS "ID",
		          2::BIGINT AS "public.Person__I",
		          1::BIGINT AS "public.Person__O"`,
	} {
		if _, err := db.Exec(stmt); err != nil {
			panic(err)
		}
	}

	g, err := crabgraph.Attach(db)
	if err != nil {
		panic(err)
	}
	defer g.Close()

	out, err := g.Gremlin("g.V().hasLabel('Person').count()")
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
libcrabgraph shared library at build time. The duckdb feature enables the typed attach(&duckdb::Connection) entry point.

[dependencies]
crabgraph = { version = "0.1", features = ["duckdb"] }
duckdb = "1.10"
duckdb::Connection, set up your views, and pass the connection to Crabgraph::attach.

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let conn = duckdb::Connection::open("/tmp/g.duckdb")?;
    conn.execute_batch(r#"
        CREATE SCHEMA IF NOT EXISTS public;
        CREATE VIEW public.V_Person AS
            SELECT 1::BIGINT AS "ID", 'Alice' AS "name"
            UNION ALL SELECT 2::BIGINT, 'Bob';
        CREATE VIEW public.E_KNOWS AS
            SELECT 1::BIGINT AS "ID",
                   2::BIGINT AS "public.Person__I",
                   1::BIGINT AS "public.Person__O";
    "#)?;

    let g = crabgraph::Crabgraph::attach(&conn)?;
    println!("{}", g.gremlin("g.V().hasLabel('Person').count()")?);
    println!("{}", g.gremlin(
        "g.V().has('name','Alice').out('KNOWS').values('name').toList()")?);
    Ok(())
}
No schema configuration. CrabGraph reads information_schema at attach time. Whatever V_* / E_* views are present become your graph — drop or alter them and reattach.
Installation
CrabGraph is distributed through each language's native package manager. Install the crabgraph package for your language alongside a DuckDB driver — the engine runs embedded in your process, no separate server to deploy.
| Language | Package Manager | Package | DuckDB driver |
|---|---|---|---|
| Java 17+ | Maven | net.crabgraph:crabgraph:0.1.0 | org.duckdb:duckdb_jdbc |
| Python 3.9+ | pip, Poetry, uv | crabgraph==0.1.0 | duckdb (auto-installed) |
| Node.js 18+ | npm, yarn, pnpm | crabgraph@0.1.0 | @duckdb/node-api |
| Go 1.24+ | go get | github.com/henneberger/crabgraph/go | github.com/duckdb/duckdb-go/v2 |
| Rust 1.75+ | Cargo | crabgraph = "0.1" | duckdb = "1.10" |
System libduckdb. The Rust binding links against your system libduckdb via duckdb-rs. Set DUCKDB_LIB_DIR / DUCKDB_INCLUDE_DIR if it isn't on the standard search path.
Define a Schema
Your schema lives in DuckDB itself, as views in the public schema. There is no separate schema file, no annotations, no migrations. CrabGraph discovers everything from information_schema at attach time.
Vertices: V_<Label>
A view named V_<Label> becomes a vertex label. Each row is a vertex. The view must have a BIGINT column called "ID" as the vertex's primary key; every other column becomes a property.
CREATE VIEW public.V_Person AS
SELECT id   AS "ID",    -- BIGINT
       name AS "name",  -- property
       age  AS "age"
FROM raw_people;
Edges: E_<EdgeLabel>
A view named E_<EdgeLabel> becomes an edge label. Required columns:
- "ID": BIGINT, the edge's primary key
- "public.<OutVertexLabel>__O": BIGINT, the source vertex's ID
- "public.<InVertexLabel>__I": BIGINT, the target vertex's ID
Any other columns are edge properties. The __O / __I column-name suffixes tell CrabGraph which vertex labels the edge connects.
CREATE VIEW public.E_KNOWS AS
SELECT id         AS "ID",
       src_person AS "public.Person__O",  -- out-vertex (Person)
       dst_person AS "public.Person__I",  -- in-vertex (Person)
       since_year AS "since"              -- edge property
FROM raw_friendships;
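The endpoint naming rule can be checked outside the engine. The following is a minimal Python sketch of how a column name might be split into schema, vertex label, and direction. It is not CrabGraph's discovery code: `parse_endpoint`, `Endpoint`, and the regular expression are hypothetical, and the sketch assumes the schema name contains no dot.

```python
import re
from typing import NamedTuple, Optional

# Hypothetical pattern for "<schema>.<VertexLabel>__O" / "__I" column names.
_ENDPOINT = re.compile(r"^(?P<schema>[^.]+)\.(?P<label>.+)__(?P<side>[OI])$")


class Endpoint(NamedTuple):
    schema: str
    label: str
    direction: str  # "out" for __O, "in" for __I


def parse_endpoint(column: str) -> Optional[Endpoint]:
    """Return the endpoint a column name describes, or None for other columns."""
    m = _ENDPOINT.match(column)
    if m is None:
        return None
    direction = "out" if m.group("side") == "O" else "in"
    return Endpoint(m.group("schema"), m.group("label"), direction)


# Columns of the E_KNOWS view above: two endpoints, an ID, and one property.
for col in ["ID", "public.Person__O", "public.Person__I", "since"]:
    print(col, "->", parse_endpoint(col))
```

Columns that do not match the pattern, such as "ID" and "since", come back as `None`, which corresponds to the primary key and ordinary edge properties.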
Backed by anything DuckDB can read
The view's SELECT is just SQL. It can read parquet files, Iceberg tables (via the iceberg extension), CSVs, PostgreSQL via the postgres extension, MotherDuck — any DuckDB source.
Iceberg (direct warehouse attach)
INSTALL iceberg;
LOAD iceberg;
ATTACH 's3://my-bucket/warehouse/' AS cat (TYPE ICEBERG);

CREATE VIEW public.V_Order AS
SELECT order_id AS "ID", total AS "total"
FROM cat.orders;
Apache Polaris catalog
For a Polaris-managed warehouse, register a secret with your OAuth2 credentials and attach via the catalog endpoint. Polaris hands DuckDB the table metadata; reads still go directly to the underlying storage.
INSTALL iceberg;
LOAD iceberg;

CREATE SECRET polaris_auth (
  TYPE ICEBERG,
  CLIENT_ID '<your-principal-id>',
  CLIENT_SECRET '<your-principal-secret>',
  OAUTH2_SERVER_URI 'https://<polaris-host>/api/catalog/v1/oauth/tokens',
  OAUTH2_SCOPE 'PRINCIPAL_ROLE:ALL'
);

ATTACH '<warehouse-name>' AS polaris (
  TYPE ICEBERG,
  ENDPOINT 'https://<polaris-host>/api/catalog',
  SECRET polaris_auth
);

CREATE VIEW public.V_Customer AS
SELECT id AS "ID", email AS "email"
FROM polaris.analytics.customers;
Turn on cache_https. When your views read remote data (Iceberg, Polaris, S3 parquet), DuckDB's HTTPS response cache avoids refetching the same metadata/data files across queries — often a 10× speedup on repeated traversals. Set it once on the connection before attaching:
SET cache_https = true;
Read-only graph. Views are not writable, so g.addV(...) / g.addE(...) aren't supported. Add data by inserting into the underlying tables; the views surface it on the next traversal.
Your First Query
CrabGraph speaks the Apache TinkerPop Gremlin language via the ANTLR parser, so standard read traversal steps work as expected. Mutating steps such as addV and addE are not supported because the graph is read-only. Pass a query string to g.gremlin(...) and you get a JSON result.
// Count vertices of a label
g.V().hasLabel("Person").count()

// Friends of Alice
g.V().has("Person", "name", "Alice")
  .out("KNOWS")
  .values("name")
  .order()
  .toList()

// Two-hop expansion
g.V().has("Person", "name", "Alice")
  .repeat(__.out("KNOWS").simplePath())
  .times(2)
  .dedup()
  .values("name")
  .toList()
Results are returned JSON-encoded. Vertices and edges materialize as objects with type, id, label, and properties fields:
[
{"type":"vertex","id":"v[Person][1]","label":"Person",
"properties":{"name":"Alice","age":34}}
]
New to Gremlin? See the Gremlin Primer below, or the TinkerPop reference for the full spec. The Java binding also exposes a programmatic g.traversal() for typed Gremlin without strings.
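Because results arrive as a JSON string, the caller decodes them before use. Here is a small Python sketch that uses only the standard library; the payload is the sample vertex result from this section, and `vertex_names` is a hypothetical helper, not part of the crabgraph package.

```python
import json
from typing import List

# Sample payload copied from the result shape in this section. In an
# application it would come from a g.gremlin(...) call instead.
raw = """
[
  {"type":"vertex","id":"v[Person][1]","label":"Person",
   "properties":{"name":"Alice","age":34}}
]
"""


def vertex_names(payload: str) -> List[str]:
    """Decode a JSON result and collect the "name" property of each vertex."""
    elements = json.loads(payload)
    return [
        e["properties"]["name"]
        for e in elements
        if e.get("type") == "vertex" and "name" in e.get("properties", {})
    ]


print(vertex_names(raw))  # ['Alice']
```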
Architecture
CrabGraph is embedded. It runs in your application's process as a library call — no daemon, no port to bind, no extra process to supervise. attach opens its own connection to the DuckDB file you pointed at and exposes Gremlin synchronously through whatever language you called it from.
Engine-managed schema
The first time CrabGraph attaches to a database, it creates a small set of internal tables in a reserved schema alongside your views. These hold topology metadata — which V_* / E_* views exist, their column types, and edge endpoints. The tables are created automatically; there's nothing for you to run or migrate.
Subsequent attaches reuse what's already there. If you add, drop, or re-shape a V_* / E_* view, rebuild the internal schema by dropping it and reattaching — CrabGraph rediscovers your views from scratch.
How each binding embeds
Non-Java bindings load a shared library (libcrabgraph) that's bundled with the package. The Java binding is a plain jar — no native load step needed.
| Binding | How it loads |
|---|---|
| Java | Plain Maven jar |
| Python | Bundled libcrabgraph via ctypes |
| Node.js | N-API addon over libcrabgraph |
| Go | cgo binding to libcrabgraph |
| Rust | Dynamic link to libcrabgraph |
Storage
Storage is whatever DuckDB does. Persist by pointing at a .duckdb file. For ephemeral / testing setups, write your views into a temp file. In-memory connections are not yet supported by attach — file-backed DBs work because the engine opens its own link to the same file you did.
Testing tip. Use a fresh temp file per test (e.g. tmp_path in pytest, t.TempDir() in Go). Each attach rediscovers views from scratch.
Schema Reference
CrabGraph interprets a small set of view-naming and column-naming conventions. Stick to them and the engine handles topology discovery automatically.
View naming
| View name | Becomes | Notes |
|---|---|---|
| public.V_<Label> | Vertex label <Label> | Label is case-sensitive in Gremlin queries |
| public.E_<EdgeLabel> | Edge label <EdgeLabel> | Endpoints inferred from __O/__I column names |
Required columns
| Where | Column | Type | Purpose |
|---|---|---|---|
| Every V_ view | "ID" | BIGINT | Vertex primary key |
| Every E_ view | "ID" | BIGINT | Edge primary key |
| E_ views | "public.<OutVertex>__O" | BIGINT | Source vertex's ID |
| E_ views | "public.<InVertex>__I" | BIGINT | Target vertex's ID |
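Before attaching, a view's columns can be sanity-checked against the table above. The following Python sketch is illustrative only: `check_view` and its input shape (column name mapped to DuckDB type name, for example collected from information_schema.columns) are assumptions, and the engine's own validation may differ.

```python
from typing import Dict, List


def check_view(view_name: str, columns: Dict[str, str]) -> List[str]:
    """Return a list of problems; an empty list means the view looks usable."""
    problems: List[str] = []
    if not view_name.startswith(("V_", "E_")):
        return ["view name must start with V_ or E_"]
    # Every vertex and edge view needs a BIGINT "ID" column.
    if columns.get("ID", "").upper() != "BIGINT":
        problems.append('"ID" must be a BIGINT column')
    if view_name.startswith("E_"):
        # Edge views also need an out-vertex and an in-vertex column.
        for suffix, role in (("__O", "out"), ("__I", "in")):
            endpoints = [c for c in columns if c.endswith(suffix)]
            if not endpoints:
                problems.append(f"missing {role}-vertex column (*{suffix})")
            for c in endpoints:
                if columns[c].upper() != "BIGINT":
                    problems.append(f'"{c}" must be BIGINT')
    return problems


print(check_view("V_Person", {"ID": "BIGINT", "name": "VARCHAR"}))
print(check_view("E_KNOWS", {"ID": "BIGINT", "public.Person__O": "BIGINT"}))
```

The first call reports no problems; the second reports the missing in-vertex column.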
Property type mapping
All non-ID, non-endpoint columns are exposed as Gremlin properties. CrabGraph infers the property type from DuckDB's information_schema.columns.data_type:
| DuckDB type | Gremlin property type |
|---|---|
| BOOLEAN | Boolean |
| TINYINT | Byte |
| SMALLINT | Short |
| INTEGER / INT | Integer |
| BIGINT / HUGEINT | Long |
| REAL | Float |
| DOUBLE / FLOAT | Double |
| DATE | LocalDate |
| TIMESTAMP | LocalDateTime |
| TIMESTAMP WITH TIME ZONE | ZonedDateTime |
| BLOB / BYTEA | byte[] |
| JSON | JSON (string) |
| VARCHAR, TEXT, UUID, anything else | String |
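The mapping is a lookup with a String fallback. The Python sketch below transcribes the table so the fallback behaviour is explicit. `GREMLIN_TYPES` and `gremlin_type` are hypothetical names, and the case-insensitive exact match on the type name is an assumption rather than documented engine behaviour.

```python
# Transcribed from the property type mapping table; keys are upper-cased
# DuckDB type names as they might appear in information_schema.columns.
GREMLIN_TYPES = {
    "BOOLEAN": "Boolean",
    "TINYINT": "Byte",
    "SMALLINT": "Short",
    "INTEGER": "Integer",
    "INT": "Integer",
    "BIGINT": "Long",
    "HUGEINT": "Long",
    "REAL": "Float",
    "DOUBLE": "Double",
    "FLOAT": "Double",
    "DATE": "LocalDate",
    "TIMESTAMP": "LocalDateTime",
    "TIMESTAMP WITH TIME ZONE": "ZonedDateTime",
    "BLOB": "byte[]",
    "BYTEA": "byte[]",
    "JSON": "JSON (string)",
}


def gremlin_type(duckdb_type: str) -> str:
    """Look up the property type; anything not listed falls back to String."""
    return GREMLIN_TYPES.get(duckdb_type.strip().upper(), "String")


for t in ["bigint", "TIMESTAMP WITH TIME ZONE", "UUID", "DECIMAL(18,3)"]:
    print(t, "->", gremlin_type(t))
```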
Gremlin Primer
Gremlin is a functional, data-flow traversal language. A traversal starts at a set of elements and threads through a pipeline of steps. Each step transforms the current traversers.
Starting a traversal
g.V() // all vertices g.E() // all edges g.V().hasLabel("Person") // vertices of a specific label (V_Person) g.V().has("Person", "name", "Alice") // filter by property
Step reference
P.* predicates.Map. Good for final projection.valueMap for complex projections..as().t until condition c is met. Use .times(n) for fixed depth.Path object.Map grouped by a key.Order.desc for descending.n traversers. Always prefer to toList() unbounded.
Predicate reference (P)
| Predicate | Meaning |
|---|---|
| P.eq(x) | Equal to x |
| P.neq(x) | Not equal |
| P.gt(x) / P.lt(x) | Greater / less than |
| P.gte(x) / P.lte(x) | Greater or equal / less or equal |
| P.between(lo, hi) | lo ≤ value < hi |
| P.within(x, y, …) | Value is one of the listed options |
| P.without(x, y, …) | Value is none of the listed options |
| TextP.containing(s) | String contains s |
| TextP.startingWith(s) | String starts with s |
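The predicate semantics in the table can be restated as plain Python functions, which makes the half-open range of P.between easy to see. This is a sketch of the meanings only, not how the engine evaluates predicates; all function names are hypothetical.

```python
from typing import Any, Callable

Predicate = Callable[[Any], bool]


def between(lo: Any, hi: Any) -> Predicate:
    # Half-open range: lo is included, hi is excluded.
    return lambda v: lo <= v < hi


def within(*options: Any) -> Predicate:
    return lambda v: v in options


def without(*options: Any) -> Predicate:
    return lambda v: v not in options


def containing(s: str) -> Predicate:
    return lambda v: s in v


def starting_with(s: str) -> Predicate:
    return lambda v: v.startswith(s)


ages = [29, 30, 34, 40]
print([a for a in ages if between(30, 40)(a)])  # [30, 34]
print([n for n in ["Alice", "Bob", "Alan"] if starting_with("Al")(n)])  # ['Alice', 'Alan']
```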
Traversal Patterns
Common graph query patterns expressed in Gremlin. Examples assume V_Person and E_KNOWS views.
Neighbourhood queries
// Direct neighbours
g.V().has("Person", "name", "Alice").out("KNOWS").values("name")

// N-hop expansion: vertices reached after exactly 3 hops along simple paths
// (add .emit() after repeat(...) to also return vertices at 1 and 2 hops)
g.V().has("Person", "name", "Alice")
  .repeat(__.out("KNOWS").simplePath())
  .times(3)
  .dedup()
  .values("name")
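The difference between a fixed-depth expansion and one that also emits intermediate hops is easy to miss. The following Python sketch simulates the pattern on an in-memory adjacency list. It models standard TinkerPop semantics for repeat, simplePath, times, emit, and dedup; `expand` and the sample data are hypothetical, and nothing here calls CrabGraph.

```python
from typing import Dict, List, Set


def expand(graph: Dict[str, List[str]], start: str, hops: int,
           emit: bool = False) -> Set[str]:
    """Follow out-edges along simple paths.

    emit=False mirrors repeat(out().simplePath()).times(hops): only vertices
    at the end of a simple path of exactly `hops` edges are returned.
    emit=True mirrors adding emit() after repeat(...): vertices reached
    after 1..hops edges are returned.
    """
    paths = [[start]]
    found: Set[str] = set()  # the set plays the role of dedup()
    for depth in range(1, hops + 1):
        next_paths = []
        for path in paths:
            for nxt in graph.get(path[-1], []):
                if nxt not in path:  # simplePath(): no vertex visited twice
                    next_paths.append(path + [nxt])
        paths = next_paths
        if emit or depth == hops:
            found.update(p[-1] for p in paths)
    return found


knows = {"Alice": ["Bob"], "Bob": ["Carol"], "Carol": ["Dave"], "Dave": []}
print(sorted(expand(knows, "Alice", 3)))             # ['Dave']
print(sorted(expand(knows, "Alice", 3, emit=True)))  # ['Bob', 'Carol', 'Dave']
```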
Shortest path
g.V().has("Person", "name", "Alice")
  .repeat(__.bothE().otherV().simplePath())
  .until(__.has("Person", "name", "Bob"))
  .path()
  .limit(1)
  .next()
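The result this pattern is after is the first path found when the graph is explored level by level, ignoring edge direction. A plain breadth-first search in Python shows the same idea; it is a sketch on in-memory data, with `shortest_path` and the sample edges being hypothetical, and it returns vertices only, whereas the Gremlin path above also contains the edges.

```python
from collections import deque
from typing import Dict, List, Optional, Tuple


def shortest_path(edges: List[Tuple[str, str]], start: str,
                  goal: str) -> Optional[List[str]]:
    """Breadth-first search over edges treated as undirected."""
    neighbours: Dict[str, List[str]] = {}
    for a, b in edges:
        neighbours.setdefault(a, []).append(b)
        neighbours.setdefault(b, []).append(a)

    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path  # first hit in BFS order has the fewest edges
        for nxt in neighbours.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # goal not reachable from start


knows = [("Alice", "Carol"), ("Carol", "Bob"),
         ("Alice", "Dave"), ("Dave", "Erin"), ("Erin", "Bob")]
print(shortest_path(knows, "Alice", "Bob"))  # ['Alice', 'Carol', 'Bob']
```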
Aggregation and grouping
// Friend count per person
g.V().hasLabel("Person")
  .project("name", "friends")
    .by("name")
    .by(__.out("KNOWS").count())
  .order().by("friends", Order.desc)
  .limit(10)
  .toList()

// Group people by age, alphabetised within each bucket
g.V().hasLabel("Person")
  .group()
    .by("age")
    .by(__.values("name").order().fold())
Filtering with where
// People who know someone older than them
g.V().hasLabel("Person").as("a")
  .out("KNOWS").as("b")
  .where("a", P.lt("b")).by("age")
  .select("a").values("name")
  .toList()
Java / Kotlin
The Java binding is a plain JVM jar that pulls in its transitive dependencies alongside duckdb_jdbc. JDK 17+. The engine runs embedded in your application's JVM, sharing memory with whatever else is on your classpath — no native library, no external process.
Programmatic vs. string Gremlin
Two equivalent ways to express a query: a Gremlin string via g.gremlin(...), or the typed GraphTraversalSource from g.traversal().
try (Connection c = DriverManager.getConnection("jdbc:duckdb:/var/g.duckdb");
     Crabgraph g = Crabgraph.attach(c)) {

  // String API: returns JSON
  String json = g.gremlin("g.V().hasLabel('Person').count()");

  // Programmatic API: typed TinkerPop traversal
  GraphTraversalSource t = g.traversal();
  List<Object> friends = t.V()
      .has("Person", "name", "Alice")
      .out("KNOWS")
      .values("name")
      .order().toList();
}
Spring Boot
Build a single Crabgraph bean tied to your DuckDB DataSource. Reuse it across requests — traversals are thread-safe.
@Configuration
public class GraphConfig {
  @Bean(destroyMethod = "close")
  public Crabgraph crabgraph(DataSource duckdb) throws SQLException {
    return Crabgraph.attach(duckdb.getConnection());
  }
}

@Service
public class PersonService {
  private final Crabgraph g;

  public PersonService(Crabgraph g) {
    this.g = g;
  }

  public String friendsOf(String name) {
    // NOTE: name is spliced into the query text unescaped; validate or
    // escape it first if it can come from user input.
    return g.gremlin(
        "g.V().has('Person','name','" + name + "').out('KNOWS').values('name').toList()");
  }
}
Kotlin
DriverManager.getConnection("jdbc:duckdb:/var/g.duckdb").use { conn ->
    Crabgraph.attach(conn).use { g ->
        val friends = g.traversal().V()
            .has("Person", "name", "Alice")
            .out("KNOWS")
            .values<String>("name")
            .toList()
    }
}
Python
The Python package wraps the embedded libcrabgraph via ctypes. Crabgraph.attach(conn) takes any open duckdb.DuckDBPyConnection; Crabgraph.open(path) is a path-based shortcut. Python 3.9+.
Synchronous core API
import duckdb
import crabgraph

conn = duckdb.connect("/var/g.duckdb")
with crabgraph.Crabgraph.attach(conn) as g:
    count = g.gremlin("g.V().hasLabel('Person').count()")
    friends = g.gremlin(
        "g.V().has('Person','name','Alice').out('KNOWS').values('name').toList()")
Using with Flask / FastAPI
Build the Crabgraph instance once at startup and reuse it for the lifetime of the process. The engine is thread-safe; concurrent requests can call gremlin in parallel.
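from flask import Flask, Response
import duckdb, crabgraph

conn = duckdb.connect("/var/g.duckdb")
g = crabgraph.Crabgraph.attach(conn)
app = Flask(__name__)

@app.route("/people")
def people():
    # gremlin() already returns a JSON string, so send it as-is instead of
    # re-encoding it with jsonify(), which would wrap it in a JSON string.
    body = g.gremlin("g.V().hasLabel('Person').values('name').toList()")
    return Response(body, mimetype="application/json")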
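When a route builds its query from request data, the value has to be quoted before it is spliced into the Gremlin string. The sketch below is one way to do that in Python. `gremlin_str` and `friends_query` are hypothetical helpers, not part of the crabgraph package, and the sketch assumes single-quoted string literals accept backslash escapes as in the TinkerPop Gremlin grammar, which is worth verifying against the engine.

```python
def gremlin_str(value: str) -> str:
    """Quote a Python string as a single-quoted Gremlin string literal.

    Assumption: backslash and single quote are escaped with a backslash.
    Control characters such as newlines are not handled here.
    """
    escaped = value.replace("\\", "\\\\").replace("'", "\\'")
    return f"'{escaped}'"


def friends_query(name: str) -> str:
    """Build the friends-of query text for a caller-supplied name."""
    return (
        f"g.V().has('Person','name',{gremlin_str(name)})"
        ".out('KNOWS').values('name').toList()"
    )


print(friends_query("O'Brien"))
```

Escaping the backslash first matters: doing it second would double the backslashes that were just added in front of quotes.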
Locating the native lib in development
For non-pip installs (e.g. testing against a workspace-local libcrabgraph), set CRABGRAPH_NATIVE_DIR to the directory containing libcrabgraph.{so,dylib,dll}.
export CRABGRAPH_NATIVE_DIR=/path/to/native/build
python my_app.py
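To confirm that the directory you exported actually contains the library for your platform, a quick check along the following lines can help. This is a sketch and not the package's loader: `native_lib_name` and `override_path` are hypothetical, and the file name per platform is inferred from the libcrabgraph.{so,dylib,dll} pattern above.

```python
import os
import sys
from pathlib import Path
from typing import Mapping, Optional


def native_lib_name(platform: str = sys.platform) -> str:
    """File name of the shared library for a given sys.platform value."""
    if platform.startswith("win"):
        return "libcrabgraph.dll"
    if platform == "darwin":
        return "libcrabgraph.dylib"
    return "libcrabgraph.so"


def override_path(env: Mapping[str, str] = os.environ) -> Optional[Path]:
    """Path implied by CRABGRAPH_NATIVE_DIR, or None when it is unset."""
    directory = env.get("CRABGRAPH_NATIVE_DIR")
    if not directory:
        return None
    return Path(directory) / native_lib_name()


path = override_path()
if path is None:
    print("CRABGRAPH_NATIVE_DIR is not set")
else:
    print(path, "exists" if path.exists() else "is missing")
```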
Node.js
Source is TypeScript. The package ships with a small N-API addon (built at install via node-gyp) that bridges to the embedded libcrabgraph. Crabgraph.attach(conn) takes a @duckdb/node-api DuckDBConnection.
TypeScript types
import { DuckDBInstance } from "@duckdb/node-api";
import { Crabgraph } from "crabgraph";

let g: Crabgraph;

export async function initGraph(): Promise<void> {
  const inst = await DuckDBInstance.create(process.env.GRAPH_DB!);
  const conn = await inst.connect();
  g = await Crabgraph.attach(conn);
}

export function getFriends(name: string): string {
  // NOTE: name is spliced into the query text unescaped; validate or
  // escape it first if it can come from user input.
  return g.gremlin(
    `g.V().has('Person','name','${name}').out('KNOWS').values('name').toList()`);
}
Path-based attach
If you don't already have a DuckDB connection in scope, attach by path directly. The engine opens its own JDBC link to the file.
const g = Crabgraph.open("/var/g.duckdb");
Go
The Go binding is a cgo wrapper around libcrabgraph. Pair it with github.com/duckdb/duckdb-go/v2 (the official DuckDB driver) and you get a graph engine attached to your *sql.DB. Requires Go 1.24+ and CGO_ENABLED=1.
HTTP server example
package main

import (
	"database/sql"
	"net/http"

	_ "github.com/duckdb/duckdb-go/v2"
	crabgraph "github.com/henneberger/crabgraph/go"
)

func main() {
	db, err := sql.Open("duckdb", "/var/g.duckdb")
	if err != nil {
		panic(err)
	}
	defer db.Close()

	g, err := crabgraph.Attach(db)
	if err != nil {
		panic(err)
	}
	defer g.Close()

	http.HandleFunc("/people", func(w http.ResponseWriter, r *http.Request) {
		out, err := g.Gremlin("g.V().hasLabel('Person').valueMap().toList()")
		if err != nil {
			http.Error(w, err.Error(), http.StatusInternalServerError)
			return
		}
		// Gremlin already returns JSON, so write it through unchanged.
		w.Header().Set("Content-Type", "application/json")
		w.Write([]byte(out))
	})

	// ListenAndServe only returns on failure, so surface the error.
	if err := http.ListenAndServe(":3000", nil); err != nil {
		panic(err)
	}
}
Locating libcrabgraph
By default the binding looks for libcrabgraph.{so,dylib} at ../native/build/ relative to the Go module. To override, set CGo flags at build time:
CGO_CFLAGS="-I/path/to/include" \
CGO_LDFLAGS="-L/path/to/lib -lcrabgraph" \
go build
Rust
The Rust crate is a thin synchronous wrapper around libcrabgraph. The duckdb feature enables typed integration with the duckdb crate's Connection; without it, only the path-based open entrypoint is available.
Cargo features
| Feature | Default | Description |
|---|---|---|
| duckdb | | Enables Crabgraph::attach(&duckdb::Connection) |
| download-native | | (Reserved) fetch libcrabgraph from a GitHub Release at build time |
[dependencies]
crabgraph = { version = "0.1", features = ["duckdb"] }
duckdb = "1.10"
Attach to an existing duckdb-rs connection
use crabgraph::Crabgraph;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let conn = duckdb::Connection::open("/var/g.duckdb")?;
    conn.execute_batch(r#"
        CREATE SCHEMA IF NOT EXISTS public;
        CREATE VIEW public.V_Person AS
            SELECT 1::BIGINT AS "ID", 'Alice' AS "name"
            UNION ALL SELECT 2::BIGINT, 'Bob';
        CREATE VIEW public.E_KNOWS AS
            SELECT 1::BIGINT AS "ID",
                   2::BIGINT AS "public.Person__I",
                   1::BIGINT AS "public.Person__O";
    "#)?;

    let g = Crabgraph::attach(&conn)?;
    let friends = g.gremlin(
        "g.V().has('Person','name','Alice').out('KNOWS').values('name').toList()")?;
    println!("{friends}");
    Ok(())
}
Without the duckdb feature
If you don't want a hard dep on duckdb-rs, attach by path:
let g = crabgraph::Crabgraph::open("/var/g.duckdb")?;
duckdb-rs dynamically links a system libduckdb. Set DUCKDB_LIB_DIR / DUCKDB_INCLUDE_DIR if needed. libcrabgraph must be on the runtime loader path (LD_LIBRARY_PATH / DYLD_LIBRARY_PATH).
CrabGraph Cloud
CrabGraph Cloud is a hosted notebook experience for exploring graphs without setting up a local DuckDB or any of the language bindings. Drop in a DuckDB file or point at an Iceberg catalog, define your V_* / E_* views in the editor, and run Gremlin queries in the browser.
Skip the local setup.
Same Gremlin, same view convention — running on a managed DuckDB. Try queries against your Iceberg or Parquet data without writing a line of glue code.
Cloud and the embedded library share the same view-discovery convention. A graph that works in cloud.crabgraph.net will work as-is when you switch to attaching from your own application.
Configuration
The current API surface is intentionally minimal — all you configure is which DuckDB file to attach to. Tuning happens in DuckDB itself (memory limits, threads, extension config) on the connection you pass in.
Build-time / install-time
| Variable | Used by | Purpose |
|---|---|---|
| CRABGRAPH_NATIVE_DIR | Python, Node, Rust, Go (dev) | Directory containing libcrabgraph.{so,dylib,dll}. Bypasses the bundled library; useful for workspace-local builds. |
| DUCKDB_LIB_DIR / DUCKDB_INCLUDE_DIR | Rust | Where duckdb-rs finds your system libduckdb. |
| CGO_LDFLAGS / CGO_CFLAGS | Go | Override the cgo flags used to link libcrabgraph. |
| LD_LIBRARY_PATH / DYLD_LIBRARY_PATH | Runtime (all non-Java) | Loader search path for libcrabgraph if not bundled with the package. |
Tuning DuckDB
CrabGraph executes Gremlin by translating it to SQL and running it through your DuckDB connection. Standard DuckDB tuning applies — set memory limits, thread counts, and any extension config before attaching:
SET memory_limit = '8GB';
SET threads = 8;
INSTALL iceberg;
LOAD iceberg;
ATTACH 's3://...' AS cat (TYPE ICEBERG);
-- now create your V_/E_ views
API Reference
Methods available on every binding's Crabgraph instance. Names and types are language-idiomatic; semantics are identical.
| Method | Returns | Description |
|---|---|---|
| Crabgraph.attach(conn) | Crabgraph | Attach to an open DuckDB connection. Reads the file path from the connection and bootstraps from V_* / E_* views. |
| Crabgraph.open(path) | Crabgraph | Attach by path to an existing DuckDB file. Convenience for cases without a live connection. |
| .gremlin(query) | String (JSON) | Execute a Gremlin query string and return the result JSON-encoded. |
| .exec_sql(sql) | void | Run raw SQL on the engine's connection. Useful for ad-hoc DuckDB ops alongside graph queries. |
| .traversal() (Java only) | GraphTraversalSource | Returns the typed TinkerPop traversal source for programmatic Gremlin. |
| .close() | void | Release the engine and its underlying JDBC link. |
Result format
gremlin(query) returns Jackson-serialized JSON. Primitive results are encoded as scalars ("3", "\"Alice\""). Collection results are JSON arrays. Vertices and edges are JSON objects:
{
"type": "vertex",
"id": "v[Person][1]",
"label": "Person",
"properties": { "name": "Alice", "age": 34 }
}
Gremlin steps
Anything in the TinkerPop 3.7 reference grammar parses. See the TinkerPop reference for the complete step library.