Convert SQLite database to Postgres database

Install the SQLite CLI

Sometimes you need to work with SQLite databases. For example, I was given the Pitchfork database as a database.sqlite file. The first step was to install the SQLite CLI, which can be downloaded from https://sqlite.org/2020/sqlite-tools-osx-x86-3310100.zip. The archive contains a bundle of command-line tools for managing SQLite database files: the command-line shell program (sqlite3), the sqldiff program, and the sqlite3_analyzer program.

cd sqlite-tools-osx-x86-3310100/
(base) shravan-sqlite-tools-osx-x86-3310100$ ls -rtl
total 5080
-rwxr-xr-x@ 1 shravan  staff   700096 Jan 28 03:28 sqldiff
-rwxr-xr-x@ 1 shravan  staff   728116 Jan 28 03:28 sqlite3_analyzer
-rwxr-xr-x@ 1 shravan  staff  1168852 Jan 28 03:28 sqlite3
-rw-r--r--@ 1 shravan  staff  83585024 Sep 20 05:24 database.sqlite
(base) shravan-sqlite-tools-osx-x86-3310100$ ./sqlite3
SQLite version 3.31.1 2020-01-27 19:55:54
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
sqlite> .open database.sqlite

Data

We will be working with the Pitchfork database, made available to us as a .sqlite file. The screenshot below shows the number of rows and columns to expect in each of its tables:

[Screenshot: pitchfork tables, with row and column counts]

View tables and schema

sqlite> .tables
artists  content  genres   labels   reviews  years
sqlite> .schema
CREATE TABLE reviews (
	reviewid INTEGER,
	title TEXT,
	artist TEXT,
	url TEXT,
	score REAL,
	best_new_music INTEGER,
	author TEXT,
	author_type TEXT,
	pub_date TEXT,
	pub_weekday INTEGER,
	pub_day INTEGER,
	pub_month INTEGER,
	pub_year INTEGER);
CREATE TABLE artists (
	reviewid INTEGER, artist TEXT);
CREATE TABLE genres (
	reviewid INTEGER, genre TEXT);
CREATE TABLE labels (
	reviewid INTEGER, label TEXT);
CREATE TABLE years (
	reviewid INTEGER, year INTEGER);
CREATE TABLE content (
	reviewid INTEGER, content TEXT);
sqlite>
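
The same inspection can be scripted. Below is a minimal sketch using Python's standard sqlite3 module; the helper names list_tables and table_columns are made up for illustration, and the demo runs against a small in-memory database that recreates two of the tables above rather than the real database.sqlite.

```python
import sqlite3


def list_tables(conn):
    """Return user table names, similar to the shell's .tables command."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    # Skip SQLite's internal bookkeeping tables (sqlite_sequence and so on).
    return [name for (name,) in rows if not name.startswith("sqlite_")]


def table_columns(conn, table):
    """Return (column_name, declared_type) pairs for one table."""
    # PRAGMA statements cannot take bound parameters, so quote the name by hand.
    quoted = '"' + table.replace('"', '""') + '"'
    # Each row is (cid, name, type, notnull, dflt_value, pk).
    return [(row[1], row[2]) for row in conn.execute(f"PRAGMA table_info({quoted})")]


# Demo database: for the real file, use sqlite3.connect("database.sqlite") instead.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE genres (reviewid INTEGER, genre TEXT)")
conn.execute("CREATE TABLE years (reviewid INTEGER, year INTEGER)")

tables = list_tables(conn)
print(tables)
for table in tables:
    print(table, table_columns(conn, table))
```

Reading sqlite_master and PRAGMA table_info gives roughly the information that .tables and .schema print, in a form that later steps can loop over.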

Query one of these tables

sqlite> SELECT
   ...>   reviewid,
   ...>   title,
   ...>   artist,
   ...>   author
   ...> FROM reviews
   ...> LIMIT 10;
22703|mezzanine|massive attack|nate patrin
22721|prelapsarian|krallice|zoe camp
22659|all of them naturals|uranium club|david glickman
22661|first songs|kleenex, liliput|jenn pelly
22725|new start|taso|kevin lozano
22722|insecure (music from the hbo original series)|various artists|vanessa okoth-obbo
22704|stillness in wonderland|little simz|katherine st. asaph
22694|tehillim|yotam avni|andy beta
22714|reflection|brian eno|andy beta
22724|filthy america its beautiful|the lox|ian cohen
sqlite>

By default, the output mode is list, which separates columns with |.
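
To make the pipe-separated layout concrete, here is a small sketch that reproduces it from Python. The function format_list_mode is a hypothetical helper, not part of sqlite3, and the demo table is a cut-down reviews table holding two of the rows shown above.

```python
import sqlite3


def format_list_mode(rows, separator="|"):
    """Join each row's values with a separator, one row per line (NULL becomes '')."""
    return "\n".join(
        separator.join("" if value is None else str(value) for value in row)
        for row in rows
    )


# Cut-down demo table with only the four columns used in the query above.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE reviews (reviewid INTEGER, title TEXT, artist TEXT, author TEXT)"
)
conn.executemany(
    "INSERT INTO reviews VALUES (?, ?, ?, ?)",
    [
        (22703, "mezzanine", "massive attack", "nate patrin"),
        (22721, "prelapsarian", "krallice", "zoe camp"),
    ],
)

# ORDER BY rowid keeps the demo output deterministic.
rows = conn.execute(
    "SELECT reviewid, title, artist, author FROM reviews ORDER BY rowid LIMIT 10"
).fetchall()
output = format_list_mode(rows)
print(output)
```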

Export the tables as individual CSV files

To export an SQLite table (or part of a table) as CSV, simply set the “mode” to “csv” and then run a query to extract the desired rows of the table.

sqlite> .headers on
sqlite> .mode csv
sqlite> .once 'reviews.csv'
sqlite> SELECT * FROM reviews;
sqlite>

The .headers on line causes column labels to be printed as the first row of output. This means that the first row of the resulting CSV file will contain column labels. If column labels are not desired, set “.headers off” instead. (The “.headers off” setting is the default and can be omitted if the headers have not been previously turned on.)

The .once FILENAME line sends the output of the next query only to the named file instead of printing it on the console. In the example above, it writes the CSV content to a file named “reviews.csv” in the current directory.

Finally, after running the SELECT statement, you can see that reviews.csv has been created. Repeat the same steps for the other tables.

(base) shravan-sqlite-tools-osx-x86-3310100$ ls -rtl
total 243728
-rw-r--r--@ 1 shravan  staff  83585024 Sep 20 05:24 database.sqlite
-rwxr-xr-x@ 1 shravan  staff    700096 Jan 28 03:28 sqldiff
-rwxr-xr-x@ 1 shravan  staff    728116 Jan 28 03:28 sqlite3_analyzer
-rwxr-xr-x@ 1 shravan  staff   1168852 Jan 28 03:28 sqlite3
-rw-r--r--@ 1 shravan  staff  34891456 Feb 25 15:13 database.sqlite.zip
-rw-r--r--  1 shravan  staff   2924489 Feb 25 16:09 reviews.csv
-rw-r--r--  1 shravan  staff    392675 Feb 25 16:22 artists.csv
-rw-r--r--  1 shravan  staff    300689 Feb 25 16:22 genres.csv
-rw-r--r--  1 shravan  staff    358748 Feb 25 16:22 labels.csv
-rw-r--r--  1 shravan  staff    220192 Feb 25 16:23 years.csv
-rw-r--r--  1 shravan  staff  75497159 Feb 25 16:24 content.csv
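
Repeating the export by hand for six tables gets tedious, so here is a hedged sketch of the same step as a loop, using Python's standard sqlite3 and csv modules. The function name export_tables_to_csv is hypothetical, and the demo uses an in-memory database with one made-up genres row, not the real data.

```python
import csv
import sqlite3
import tempfile
from pathlib import Path


def export_tables_to_csv(conn, out_dir):
    """Write one <table>.csv per user table, with a header row; return the paths."""
    out_dir = Path(out_dir)
    names = [
        name
        for (name,) in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
        if not name.startswith("sqlite_")
    ]
    written = []
    for name in names:
        quoted = '"' + name.replace('"', '""') + '"'
        cursor = conn.execute(f"SELECT * FROM {quoted}")
        path = out_dir / f"{name}.csv"
        # newline="" lets the csv module control line endings itself.
        with open(path, "w", newline="", encoding="utf-8") as handle:
            writer = csv.writer(handle)
            # The header row plays the role of ".headers on".
            writer.writerow([column[0] for column in cursor.description])
            # Caveat: csv.writer writes both NULL and '' as an empty field.
            writer.writerows(cursor)
        written.append(path)
    return written


# Demo database: for the real file, use sqlite3.connect("database.sqlite") instead.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE genres (reviewid INTEGER, genre TEXT)")
conn.execute("INSERT INTO genres VALUES (22703, 'electronic')")  # made-up demo row

paths = export_tables_to_csv(conn, tempfile.mkdtemp())
print([path.name for path in paths])
```

One design difference from the shell: this sketch cannot distinguish NULL from an empty string in the output, so check the result if the target columns are not TEXT.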

Import the CSV files into Postgres

Create a database pitchfork using this command: createdb -h localhost -p 5432 -U shravan pitchfork

Open the query tool and run the SQL script provided in the appendix. Then check that the row counts match in SQLite and Postgres:

sqlite> SELECT COUNT(*) FROM reviews;
COUNT(*)
18393
sqlite>
(etl) shravan$ psql pitchfork
psql (11.5)
Type "help" for help.

pitchfork=# SELECT COUNT(*) FROM reviews;
 count
-------
 18393
(1 row)

pitchfork=#
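
Checking counts table by table can also be automated. The sketch below assumes two DB-API connections; the helper names row_counts and count_mismatches are hypothetical. To stay runnable without a Postgres server, the demo uses two in-memory SQLite databases as stand-ins for the source and the target. With a real Postgres target, a connection from a driver such as psycopg2 would take the place of the second one (an assumption, not something the original setup uses).

```python
import sqlite3


def row_counts(conn, tables):
    """Return {table: row count} using any DB-API connection."""
    counts = {}
    for table in tables:
        cursor = conn.cursor()
        # Double-quoted identifiers are accepted by both SQLite and Postgres.
        cursor.execute(f'SELECT COUNT(*) FROM "{table}"')
        counts[table] = cursor.fetchone()[0]
        cursor.close()
    return counts


def count_mismatches(source_counts, target_counts):
    """Return {table: (source, target)} for tables whose counts differ."""
    return {
        table: (count, target_counts.get(table))
        for table, count in source_counts.items()
        if target_counts.get(table) != count
    }


# Stand-in databases: the "target" is deliberately missing one row.
source = sqlite3.connect(":memory:")
target = sqlite3.connect(":memory:")
for conn, ids in ((source, [22703, 22721, 22659]), (target, [22703, 22721])):
    conn.execute("CREATE TABLE reviews (reviewid INTEGER)")
    conn.executemany("INSERT INTO reviews VALUES (?)", [(i,) for i in ids])

source_counts = row_counts(source, ["reviews"])
target_counts = row_counts(target, ["reviews"])
mismatches = count_mismatches(source_counts, target_counts)
print(mismatches)
```

An empty result means every listed table has the same number of rows on both sides. Note that the appendix script loads content.csv into a table named contents, so that pair has to be compared under two different names.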

Appendix

set time zone 'UTC';
DROP TABLE IF EXISTS reviews;
DROP TABLE IF EXISTS artists;
DROP TABLE IF EXISTS genres;
DROP TABLE IF EXISTS labels;
/*DROP TABLE IF EXISTS years;*/
DROP TABLE IF EXISTS contents;

CREATE TABLE IF NOT EXISTS reviews (
    reviewid INTEGER,
    title TEXT,
    artist TEXT,
    url TEXT,
    score NUMERIC,
    best_new_music INTEGER,
    author TEXT,
    author_type TEXT,
    pub_date timestamp,
    pub_weekday INTEGER,
    pub_day INTEGER,
    pub_month INTEGER,
    pub_year INTEGER
);

COPY reviews
    FROM '/Users/shravan/Downloads/sqlite-tools-osx-x86-3310100/reviews.csv' (DELIMITER ',', FORMAT CSV, HEADER, NULL 'NA');

CREATE TABLE IF NOT EXISTS artists (
    reviewid INTEGER,
    artist TEXT
);

COPY artists
    FROM '/Users/shravan/Downloads/sqlite-tools-osx-x86-3310100/artists.csv' (DELIMITER ',', FORMAT CSV, HEADER, NULL 'NA');


CREATE TABLE IF NOT EXISTS genres (
    reviewid INTEGER,
    genre TEXT
);

COPY genres
    FROM '/Users/shravan/Downloads/sqlite-tools-osx-x86-3310100/genres.csv' (DELIMITER ',', FORMAT CSV, HEADER, NULL 'NA');


CREATE TABLE IF NOT EXISTS labels (
    reviewid INTEGER,
    label TEXT
);

COPY labels
    FROM '/Users/shravan/Downloads/sqlite-tools-osx-x86-3310100/labels.csv' (DELIMITER ',', FORMAT CSV, HEADER, NULL 'NA');

/*
CREATE TABLE years (
    reviewid INTEGER,
	year INTEGER
);

COPY years
	FROM '/Users/shravan/Downloads/sqlite-tools-osx-x86-3310100/years.csv' (DELIMITER ',', FORMAT CSV, HEADER, NULL 'NA');
*/

/* Named contents here; the SQLite table and the CSV file are both named content. */
CREATE TABLE contents (
    reviewid INTEGER,
    contents TEXT
);

COPY contents
    FROM '/Users/shravan/Downloads/sqlite-tools-osx-x86-3310100/content.csv' (DELIMITER ',', FORMAT CSV, HEADER, NULL 'NA');
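
The script above was written by hand. As a rough sketch, similar CREATE TABLE and COPY statements could be generated from the SQLite schema itself. Everything here is an assumption layered on the appendix: the function postgres_script and the TYPE_MAP table are made up, the path /path/to/csv is a placeholder, table and column names are assumed to be simple lowercase identifiers, and the COPY options are copied from the script above. Manual choices such as loading pub_date as timestamp are not inferred; TEXT stays TEXT.

```python
import sqlite3

# Assumed mapping from SQLite declared types to Postgres types; REAL -> NUMERIC
# mirrors the choice made for the score column in the script above.
TYPE_MAP = {"INTEGER": "INTEGER", "TEXT": "TEXT", "REAL": "NUMERIC"}


def postgres_script(conn, csv_dir):
    """Build CREATE TABLE and COPY statements for every user table in conn."""
    names = [
        name
        for (name,) in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
        if not name.startswith("sqlite_")
    ]
    statements = []
    for name in names:
        # Each row is (cid, name, type, notnull, dflt_value, pk).
        columns = conn.execute(f'PRAGMA table_info("{name}")').fetchall()
        column_lines = ",\n".join(
            # Unknown declared types fall back to TEXT.
            f"    {column[1]} {TYPE_MAP.get(column[2].upper(), 'TEXT')}"
            for column in columns
        )
        statements.append(f"CREATE TABLE IF NOT EXISTS {name} (\n{column_lines}\n);")
        statements.append(
            f"COPY {name}\n"
            f"    FROM '{csv_dir}/{name}.csv' "
            "(DELIMITER ',', FORMAT CSV, HEADER, NULL 'NA');"
        )
    return "\n\n".join(statements)


# Demo schema: labels as shown earlier, plus a cut-down reviews table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE labels (reviewid INTEGER, label TEXT)")
conn.execute("CREATE TABLE reviews (reviewid INTEGER, score REAL)")

script = postgres_script(conn, "/path/to/csv")  # placeholder directory
print(script)
```

Keep in mind that COPY ... FROM reads the file on the Postgres server's filesystem; when the CSV files live on a different client machine, psql's \copy is the client-side alternative.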

Use Ctrl + L to clear the screen.

Help

sqlite> .help
.archive ...             Manage SQL archives
.auth ON|OFF             Show authorizer callbacks
.backup ?DB? FILE        Backup DB (default "main") to FILE
.bail on|off             Stop after hitting an error.  Default OFF
.binary on|off           Turn binary output on or off.  Default OFF
.cd DIRECTORY            Change the working directory to DIRECTORY
.changes on|off          Show number of rows changed by SQL
.check GLOB              Fail if output since .testcase does not match
.clone NEWDB             Clone data into NEWDB from the existing database
.databases               List names and files of attached databases
.dbconfig ?op? ?val?     List or change sqlite3_db_config() options
.dbinfo ?DB?             Show status information about the database
.dump ?TABLE? ...        Render all database content as SQL
.echo on|off             Turn command echo on or off
.eqp on|off|full|...     Enable or disable automatic EXPLAIN QUERY PLAN
.excel                   Display the output of next command in spreadsheet
.exit ?CODE?             Exit this program with return-code CODE
.expert                  EXPERIMENTAL. Suggest indexes for queries
.explain ?on|off|auto?   Change the EXPLAIN formatting mode.  Default: auto
.filectrl CMD ...        Run various sqlite3_file_control() operations
.fullschema ?--indent?   Show schema and the content of sqlite_stat tables
.headers on|off          Turn display of headers on or off
.help ?-all? ?PATTERN?   Show help text for PATTERN
.import FILE TABLE       Import data from FILE into TABLE
.imposter INDEX TABLE    Create imposter table TABLE on index INDEX
.indexes ?TABLE?         Show names of indexes
.limit ?LIMIT? ?VAL?     Display or change the value of an SQLITE_LIMIT
.lint OPTIONS            Report potential schema issues.
.load FILE ?ENTRY?       Load an extension library
.log FILE|off            Turn logging on or off.  FILE can be stderr/stdout
.mode MODE ?TABLE?       Set output mode
.nullvalue STRING        Use STRING in place of NULL values
.once (-e|-x|FILE)       Output for the next SQL command only to FILE
.open ?OPTIONS? ?FILE?   Close existing database and reopen FILE
.output ?FILE?           Send output to FILE or stdout if FILE is omitted
.parameter CMD ...       Manage SQL parameter bindings
.print STRING...         Print literal STRING
.progress N              Invoke progress handler after every N opcodes
.prompt MAIN CONTINUE    Replace the standard prompts
.quit                    Exit this program
.read FILE               Read input from FILE
.recover                 Recover as much data as possible from corrupt db.
.restore ?DB? FILE       Restore content of DB (default "main") from FILE
.save FILE               Write in-memory database into FILE
.scanstats on|off        Turn sqlite3_stmt_scanstatus() metrics on or off
.schema ?PATTERN?        Show the CREATE statements matching PATTERN
.selftest ?OPTIONS?      Run tests defined in the SELFTEST table
.separator COL ?ROW?     Change the column and row separators
.sha3sum ...             Compute a SHA3 hash of database content
.shell CMD ARGS...       Run CMD ARGS... in a system shell
.show                    Show the current values for various settings
.stats ?on|off?          Show stats or turn stats on or off
.system CMD ARGS...      Run CMD ARGS... in a system shell
.tables ?TABLE?          List names of tables matching LIKE pattern TABLE
.testcase NAME           Begin redirecting output to 'testcase-out.txt'
.testctrl CMD ...        Run various sqlite3_test_control() operations
.timeout MS              Try opening locked tables for MS milliseconds
.timer on|off            Turn SQL timer on or off
.trace ?OPTIONS?         Output each SQL statement as it is run
.vfsinfo ?AUX?           Information about the top-level VFS
.vfslist                 List all available VFSes
.vfsname ?AUX?           Print the name of the VFS stack
.width NUM1 NUM2 ...     Set column widths for "column" mode
sqlite>