line-by-line filter extension

last updated: Oct 20, 2023

https://github.com/asg017/sqlite-lines

sqlite-lines is a SQLite extension for reading lines from a file or blob.

sqlite-lines is great for line-oriented datasets, like ndjson or JSON Lines, when paired with SQLite's JSON support. Here, we calculate the top 5 country participants in Google's Quick, Draw! dataset for calendars.ndjson:

select
  line ->> '$.countrycode' as countrycode,
  count(*)
from lines_read('./calendar.ndjson')
group by 1
order by 2 desc
limit 5;
/*
┌─────────────┬──────────┐
│ countrycode │ count(*) │
├─────────────┼──────────┤
│ US          │ 141001   │
│ GB          │ 22560    │
│ CA          │ 11759    │
│ RU          │ 9250     │
│ DE          │ 8748     │
└─────────────┴──────────┘
*/

Really nice observable notebook here with:

$ cat names.txt | sqlite-lines 'rowid || upper(d)' 1ALEX 2BRIAN 3CRAIG
↑ up