duckdb-wasm attempt

Nov 01, 2021
import * as duckdb from '@duckdb/duckdb-wasm/dist/duckdb-esm.js';

// the docs list some
// bizarre webpack thing that I don't understand for loading some shit, so
// let's just choose the jsdelivr option.
// I don't like it not being local, but I also want to try and GSD so let's
// settle on this for now
const JSDELIVR_BUNDLES = duckdb.getJsDelivrBundles();

out.js: duckdb.js
    ./node_modules/.bin/esbuild duckdb.js --bundle --outfile=out.js
$ make out.js
./node_modules/.bin/esbuild duckdb.js --bundle --outfile=out.js
 > node_modules/@duckdb/duckdb-wasm/dist/duckdb-esm.js:1:75: error: Could not resolve "apache-arrow" (mark it as external to exclude it from the bundle)
    1 │ ...mWriter as ee,Table as U}from"apache-arrow";import{AsyncByteQueue as Z}f...
      ╵                                 ~~~~~~~~~~~~~~

1 error
make: *** [out.js] Error 1
import * as duckdb from '@duckdb/duckdb-wasm/dist/duckdb-esm.js';
  const JSDELIVR_BUNDLES = duckdb.getJsDelivrBundles();

  // Select a bundle based on browser checks
  const bundle = await duckdb.selectBundle(JSDELIVR_BUNDLES);

  const worker = new Worker(bundle.mainWorker);
  const logger = new duckdb.ConsoleLogger();
  const db = new duckdb.AsyncDuckDB(logger, worker);
  await db.instantiate(bundle.mainModule, bundle.pthreadWorker);
Uncaught TypeError: class heritage stream_1.Readable is not an object or null






ok, I learned more than I ever wanted to about importing stuff with esbuild. We need to alias the @apache-arrow/esnext-esm package to apache-arrow, because that's the name that duckdb-wasm expects.

Once I vaguely understood the problem, I tried to use esbuild-plugin-resolve, but that failed because of this issue which tries to export default but there's no default to export.

So I took the plugin from that file, removed the line with the troublesome import, and created this build script, which works.
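For reference, here's a minimal sketch of what an alias plugin along those lines can look like. It is not the actual build script: `aliasPlugin` and its argument shape are names made up for illustration. It relies only on esbuild's `onResolve` plugin hook and assumes the caller hands it absolute paths; which file inside `@apache-arrow/esnext-esm` to point at is also an assumption you'd want to check.

```javascript
// Hypothetical helper (not the build script mentioned above): returns an
// esbuild plugin that redirects exact-match bare imports to absolute paths.
function aliasPlugin(aliases) {
  // Escape regex metacharacters so package names match literally.
  const escape = (s) => s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  const names = Object.keys(aliases).map(escape).join('|');
  // Match the whole specifier: "apache-arrow" but not "apache-arrow/foo".
  const filter = new RegExp(`^(?:${names})$`);

  return {
    name: 'alias',
    setup(build) {
      // onResolve is called for every import path that matches the filter.
      build.onResolve({ filter }, (args) => ({ path: aliases[args.path] }));
    },
  };
}

// Intended usage in a build script (assumes esbuild and
// @apache-arrow/esnext-esm are installed; left commented out here):
//
// require('esbuild').build({
//   entryPoints: ['duckdb.js'],
//   bundle: true,
//   outfile: 'out.js',
//   plugins: [aliasPlugin({
//     'apache-arrow': require.resolve('@apache-arrow/esnext-esm'),
//   })],
// });
```

The point of matching the full specifier is that only the bare `apache-arrow` import gets rewritten; everything else goes through esbuild's normal resolution.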

Now I have a duckdb db that I have no idea how to do anything with!

duckdb sql docs here:

create a table:

const c = await db.connect();
await c.query(`CREATE TABLE weather (
    city VARCHAR,
    temp_lo INTEGER, -- minimum temperature on a day
    temp_hi INTEGER, -- maximum temperature on a day
    prcp REAL,
    date DATE );`);

I then tried to show the tables, but it returns a result object I can't figure out how to do anything with:

let res = await c.query("PRAGMA show_tables;")

maybe let's try inserting a single row?

 res = await c.query("INSERT INTO weather VALUES ('San Francisco', 46, 50, 0.25, '1994-11-27');")

 OK, to figure out the result API I'm having to read [the tests](

 I _think_ the object that gets returned might be an Apache Arrow `table` [object](
    - but it seems to be a little bit different?
    - helpful [observable notebook](
    - [this might be the API docs]( for the table object
    - [this doc]( suggests there ought to be a `filter` method? but I can't find it
        - I think the doc might be written for an older version of arrow, unfortunately
        - the [table class]( doesn't show a filter method

 table object:
 - `length`: the number of rows in the table
 - `numCols`: the number of columns
 - `getColumnAt(n)`: return a single column's object
 - `get(n)`: return a row
// get the first row of the table and print it
>>> table.get(0).toString()  

"{ \"city\": \"San Francisco\", \"temp_lo\": 46, \"temp_hi\": 50, \"prcp\": 0.25, \"date\": Sat Nov 26 1994 19:00:00 GMT-0500 (Eastern Standard Time) }"
>>> table.toArray()  
Array [ {} ]
0: Object { Symbol("rowIndex"): 0 }
length: 1
<prototype>: Array []
- but you can convert the array to a string to show all data:
>>> table.toArray().join(',')  

"{ \"city\": \"San Francisco\", \"temp_lo\": 46, \"temp_hi\": 50, \"prcp\": 0.25, \"date\": Sat Nov 26 1994 19:00:00 GMT-0500 (Eastern Standard Time) }"
>>> table.serialize()  

Uint8Array(776) [ 255, 255, 255, 255, 96, 1, 0, 0, 16, 0, … ]
>>> table.select("city", "temp_lo", "temp_hi").toArray().toString()  

"{ \"city\": \"San Francisco\", \"temp_lo\": 46, \"temp_hi\": 50 }"

row object:

column object:
 - `get(n)`: return the value of the column at the given row
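Since `toArray()` shows up as opaque objects in the console, here's a rough sketch of copying a result into plain JS objects using only the members noted above (`length`, `numCols`, `getColumnAt(n)`, and the column's `get(n)`). For the column names I'm assuming `table.schema.fields[i].name` is available, as on an Arrow table; `tableToObjects` is a made-up helper name.

```javascript
// Hypothetical helper: copy an Arrow-style table into plain JS objects.
// Assumes table.schema.fields[i].name gives the name of the i-th column.
function tableToObjects(table) {
  const names = table.schema.fields.map((f) => f.name);
  const rows = [];
  for (let r = 0; r < table.length; r++) {
    const row = {};
    for (let c = 0; c < table.numCols; c++) {
      // getColumnAt(c) returns the column; get(r) returns its value at row r.
      row[names[c]] = table.getColumnAt(c).get(r);
    }
    rows.push(row);
  }
  return rows;
}
```

With that, something like `console.table(tableToObjects(res))` should give a readable dump of a query result, assuming `res` really does behave like the table object described here.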


There's a dataframe object
