duckdb-wasm attempt

last updated: Oct 20, 2023
import * as duckdb from '@duckdb/duckdb-wasm/dist/duckdb-esm.js';

// the docs (at https://www.npmjs.com/package/@duckdb/duckdb-wasm) list some
// bizarre webpack thing that I don't understand for loading some shit, so
// let's just choose the jsdelivr option.
//
// I don't like it not being local, but I also want to try and GSD so let's
// settle on this for now
const JSDELIVR_BUNDLES = duckdb.getJsDelivrBundles();

console.log(JSDELIVR_BUNDLES);
out.js: duckdb.js
	./node_modules/.bin/esbuild duckdb.js --bundle --outfile=out.js
$ make out.js
./node_modules/.bin/esbuild duckdb.js --bundle --outfile=out.js
 > node_modules/@duckdb/duckdb-wasm/dist/duckdb-esm.js:1:75: error: Could not resolve "apache-arrow" (mark it as external to exclude it from the bundle)
    1 │ ...mWriter as ee,Table as U}from"apache-arrow";import{AsyncByteQueue as Z}f...
      ╵                                 ~~~~~~~~~~~~~~

1 error
make: *** [out.js] Error 1
import * as duckdb from '@duckdb/duckdb-wasm/dist/duckdb-esm.js';

const JSDELIVR_BUNDLES = duckdb.getJsDelivrBundles();

// Select a bundle based on browser checks
const bundle = await duckdb.selectBundle(JSDELIVR_BUNDLES);

const worker = new Worker(bundle.mainWorker);
const logger = new duckdb.ConsoleLogger();
const db = new duckdb.AsyncDuckDB(logger, worker);
await db.instantiate(bundle.mainModule, bundle.pthreadWorker);
Uncaught TypeError: class heritage stream_1.Readable is not an object or null
    js http://devd.io:8001/dist/viewer_duckdb.js:13475
    __require http://devd.io:8001/dist/viewer_duckdb.js:10
    <anonymous> http://devd.io:8001/dist/viewer_duckdb.js:15606
    <anonymous> http://devd.io:8001/dist/viewer_duckdb.js:17091
    viewer_duckdb.js:13475:30

OK, I learned more than I ever wanted to about importing stuff with esbuild. duckdb-wasm imports a package named apache-arrow, so we need to alias that name to the @apache-arrow/esnext-esm package.

Once I vaguely understood the problem, I tried to use esbuild-plugin-resolve, but that failed because of this issue: the plugin tries to export default, but there's no default to export.

So I took the plugin from that file, removed the line that adds the troublesome import, and created this build script, which works.
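For illustration only, here's a minimal sketch of what an alias plugin along these lines could look like. It is not the build script linked above: `aliasPlugin` and the plugin name are hypothetical, and it assumes esbuild's plugin API (`onResolve` and `build.resolve`) behaves as documented.

```javascript
// Hypothetical helper: builds an esbuild plugin that redirects one import
// name to another package. Names are illustrative, not the real build script.
function aliasPlugin(aliases) {
  return {
    name: 'alias-sketch',
    setup(build) {
      for (const [from, to] of Object.entries(aliases)) {
        // Match the bare specifier exactly, e.g. "apache-arrow".
        const escaped = from.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
        const filter = new RegExp(`^${escaped}$`);
        build.onResolve({ filter }, (args) =>
          // Ask esbuild to resolve the replacement package as if it had
          // been imported from the same place as the original.
          build.resolve(to, { kind: args.kind, resolveDir: args.resolveDir })
        );
      }
    },
  };
}

// Usage (assumes esbuild is installed; file names taken from the Makefile):
//
//   require('esbuild').build({
//     entryPoints: ['duckdb.js'],
//     bundle: true,
//     outfile: 'out.js',
//     plugins: [aliasPlugin({ 'apache-arrow': '@apache-arrow/esnext-esm' })],
//   });
```

Recent esbuild releases also have a built-in `alias` build option, which might make a plugin unnecessary, but I haven't tried it here.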

Now I have a duckdb db that I have no idea how to do anything with!

duckdb sql docs here: https://duckdb.org/docs/sql/introduction

create a table:

const c = await db.connect();
await c.query(`CREATE TABLE weather (
	city VARCHAR,
	temp_lo INTEGER, -- minimum temperature on a day
	temp_hi INTEGER, -- maximum temperature on a day
	prcp REAL,
	date DATE );`);

I then tried to show the tables, but it returns a result object I can't figure out how to do anything with:

let res = await c.query("PRAGMA show_tables;")

maybe let's try inserting a single row?

res = await c.query("INSERT INTO weather VALUES ('San Francisco', 46, 50, 0.25, '1994-11-27');")

OK, to figure out the result API I'm having to read the tests:

I think the object that gets returned might be an Apache Arrow Table object?
- but it seems to be a little bit different?
- helpful observable notebook
- this might be the API docs for the table object
- this doc suggests there ought to be a filter method? but I can't find it
- I think the doc might be written for an older version of arrow, unfortunately
- the table class doesn't show a filter method
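Since there's no obvious filter method, one workaround is to copy the rows into plain objects and use ordinary array methods. This is a sketch under assumptions: `tableToObjects` is a hypothetical helper, and it assumes the result exposes `schema.fields` (each with a `name`) and `toArray()`, and that rows allow property access by column name. I haven't confirmed that against every arrow version.

```javascript
// Hypothetical helper: copy each Arrow row proxy into a plain object so it
// can be logged, filtered, and so on.
// Assumptions: table.schema.fields[i].name, table.toArray(), and row[name]
// all exist on the object that c.query() returns.
function tableToObjects(table) {
  const names = table.schema.fields.map((field) => field.name);
  return table.toArray().map((row) => {
    const obj = {};
    for (const name of names) obj[name] = row[name];
    return obj;
  });
}

// Usage, with `c` being the connection from earlier:
//
//   const rows = tableToObjects(await c.query('SELECT * FROM weather;'));
//   const warm = rows.filter((r) => r.temp_hi >= 50);
//   console.log(warm);
```

Filtering in SQL with a WHERE clause is probably the better option most of the time; this is mainly useful for poking at results in the console.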

table object:

// get the first row of the table and print it
>>> table.get(0).toString()  

"{ \"city\": \"San Francisco\", \"temp_lo\": 46, \"temp_hi\": 50, \"prcp\": 0.25, \"date\": Sat Nov 26 1994 19:00:00 GMT-0500 (Eastern Standard Time) }"
>>> table.toArray()  
Array [ {} ]
0: Object { Symbol("rowIndex"): 0 }
length: 1
<prototype>: Array []
- but you can convert the array to a string to show all data:
>>> table.toArray().join(',')  

"{ \"city\": \"San Francisco\", \"temp_lo\": 46, \"temp_hi\": 50, \"prcp\": 0.25, \"date\": Sat Nov 26 1994 19:00:00 GMT-0500 (Eastern Standard Time) }"
>>> table.serialize()  

Uint8Array(776) [ 255, 255, 255, 255, 96, 1, 0, 0, 16, 0, … ]
>>> table.select("city", "temp_lo", "temp_hi").toArray().toString()

"{ \"city\": \"San Francisco\", \"temp_lo\": 46, \"temp_hi\": 50 }"
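Note the date in the output above: I inserted '1994-11-27' but it prints as Nov 26 19:00 Eastern. That's consistent with the value being a JS Date at midnight UTC that the console renders in local time. A small sketch of printing it back as the calendar date, assuming the value really is a Date built from a UTC timestamp (`utcDateString` is a made-up helper name):

```javascript
// Stand-in for the value seen in the row above.
// Assumption: DuckDB's DATE arrives as a Date at midnight UTC.
const stored = new Date(Date.UTC(1994, 10, 27)); // months are 0-based: 10 = November

// Format from the UTC fields so the local time zone can't shift the day.
function utcDateString(date) {
  return date.toISOString().slice(0, 10);
}

console.log(utcDateString(stored)); // "1994-11-27" regardless of time zone
```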

row object:

column object:

DataFrame

There's a dataframe object
