Inserting data#

Creating tables#

The dclient create-table command creates a new empty table with an explicit schema. Define columns with --column name type and optionally set primary keys with --pk:

dclient create-table mydb dogs \
  --column id integer \
  --column name text \
  --column age integer \
  --pk id \
  -i myapp

This hits the Datasette create API with a columns array. The response includes the generated schema:

{
  "ok": true,
  "database": "mydb",
  "table": "dogs",
  "schema": "CREATE TABLE [dogs] (\n   [id] INTEGER PRIMARY KEY,\n   [name] TEXT,\n   [age] INTEGER\n)"
}

Compound primary keys are supported by passing --pk multiple times:

dclient create-table mydb events \
  --column user_id integer \
  --column event_id integer \
  --column data text \
  --pk user_id --pk event_id

If you want to create a table and populate it with data in one step, use dclient insert --create instead.

Inserting rows#

The dclient insert command can be used to insert data from a local file directly into a Datasette instance, via the Write API introduced in the Datasette 1.0 alphas.

First you’ll need to authenticate with the instance.

To insert data from a data.csv file into a table called my_table in the data database:

dclient insert data my_table data.csv --create -i myapp

You can also pipe data into standard input:

curl -s 'https://api.github.com/repos/simonw/dclient/issues' | \
  dclient insert data issues - --create -i myapp

Upserting data#

The dclient upsert command works exactly like insert but uses the upsert endpoint, which will update existing rows with matching primary keys rather than raising an error.

dclient upsert data my_table data.csv --csv -i myapp

Streaming data#

dclient insert works for streaming data as well.

If you have a log file containing newline-delimited JSON you can tail it and send it to a Datasette instance like this:

tail -f log.jsonl | \
  dclient insert data logs - --nl -i myapp

When reading from standard input (filename -) you are required to specify the format. In this example that’s --nl for newline-delimited JSON. --csv and --tsv are supported for streaming as well, but --json is not.

In streaming mode records default to being sent to the server every 100 records or every 10 seconds, whichever comes first. You can adjust these values using the --batch-size and --interval settings. For example, here’s how to send every 10 records or if 5 seconds has passed since the last time data was sent to the server:

tail -f log.jsonl | dclient insert \
  data logs - --nl --create -i myapp \
  --batch-size 10 \
  --interval 5

Supported formats#

Data can be inserted from CSV, TSV, JSON or newline-delimited JSON files.

The format of the file will be automatically detected. You can override this by using one of the following options:

  • --csv

  • --tsv

  • --json

  • --nl for newline-delimited JSON

Use --encoding <encoding> to specify the encoding of the file. The default is utf-8.

JSON#

JSON files should be formatted like this:

[
    {
        "id": 1
        "column1": "value1",
        "column2": "value2"
    },
    {
        "id": 2
        "column1": "value1",
        "column2": "value2"
    }
]

Newline-delimited files like this:

{"id": 1, "column1": "value1", "column2": "value2"}
{"id": 2, "column1": "value1", "column2": "value2"}

CSV and TSV#

CSV and TSV files should have a header row containing the names of the columns.

By default, dclient will attempt to detect the types of the different columns in the CSV and TSV files - so if a column only ever contains numeric integers it will be stored as integers in the SQLite database.

You can disable this and have every value treated as a string using --no-detect-types.

Other options#

  • --create - create the table if it doesn’t already exist

  • --replace - replace any rows with a matching primary key

  • --ignore - ignore any rows with a matching existing primary key

  • --alter - alter table to add any columns that are missing

  • --pk id - set a primary key (for if the table is being created)

If you use --create a table will be created with rows to match the columns in your uploaded data - using the correctly detected types, unless you use --no-detect-types in which case every column will be of type text.

dclient insert –help#

Usage: dclient insert [OPTIONS] DATABASE TABLE FILEPATH

  Insert data into a remote Datasette instance

  Example usage:

      dclient insert main mytable data.csv --csv -i myapp
      dclient insert main mytable data.csv --csv --create --pk id

Options:
  -i, --instance TEXT   Datasette instance URL or alias
  --csv                 Input is CSV
  --tsv                 Input is TSV
  --json                Input is JSON
  --nl                  Input is newline-delimited JSON
  --encoding TEXT       Character encoding for CSV/TSV
  --no-detect-types     Don't detect column types for CSV/TSV
  --alter               Alter table to add any missing columns
  --pk TEXT             Columns to use as the primary key when creating the
                        table
  --batch-size INTEGER  Send rows in batches of this size
  --interval FLOAT      Send batch at least every X seconds
  --token TEXT          API token
  --silent              Don't output progress
  -v, --verbose         Verbose output: show HTTP request and response
  --replace             Replace rows with a matching primary key
  --ignore              Ignore rows with a matching primary key
  --create              Create table if it does not exist
  --help                Show this message and exit.

dclient upsert –help#

Usage: dclient upsert [OPTIONS] DATABASE TABLE FILEPATH

  Upsert data into a remote Datasette instance

  Example usage:

      dclient upsert main mytable data.csv --csv -i myapp

Options:
  -i, --instance TEXT   Datasette instance URL or alias
  --csv                 Input is CSV
  --tsv                 Input is TSV
  --json                Input is JSON
  --nl                  Input is newline-delimited JSON
  --encoding TEXT       Character encoding for CSV/TSV
  --no-detect-types     Don't detect column types for CSV/TSV
  --alter               Alter table to add any missing columns
  --pk TEXT             Columns to use as the primary key when creating the
                        table
  --batch-size INTEGER  Send rows in batches of this size
  --interval FLOAT      Send batch at least every X seconds
  --token TEXT          API token
  --silent              Don't output progress
  -v, --verbose         Verbose output: show HTTP request and response
  --help                Show this message and exit.

dclient create-table –help#

Usage: dclient create-table [OPTIONS] DATABASE TABLE_NAME

  Create a new empty table with an explicit schema

  Example usage:

      dclient create-table mydb dogs \
        --column id integer --column name text --pk id

Options:
  -c, --column TEXT...  Column definition: name type (e.g. --column id integer
                        --column name text)
  --pk TEXT             Column(s) to use as primary key
  -i, --instance TEXT   Datasette instance URL or alias
  --token TEXT          API token
  -v, --verbose         Verbose output: show HTTP request and response
  --help                Show this message and exit.