to

Convert CSV files to Parquet, PostgreSQL, SQLite, Excel (XLSX), LibreOffice Calc (ODS) and Data Package.

Table of Contents | Source: src/cmd/to.rs | 🚀🐻‍❄️🗄️

Description | Examples | Usage | Arguments | To Options | Common Options

Description

Convert CSV files to Parquet, PostgreSQL, SQLite, Excel XLSX, ODS and Data Package.

Parquet

Convert CSV files to Parquet format. Each input CSV produces a separate .parquet file in the specified output directory. The output directory will be created if it does not exist.

Requires the polars feature to be enabled.

Compression can be specified with --compression (default: zstd). Supported values: zstd, gzip, snappy, lz4raw, uncompressed. Use --compress-level to set the compression level for gzip (default: 6) or zstd (default: 3).

Examples

Convert file1.csv and file2.csv to parquet files in output_dir/

qsv to parquet output_dir file1.csv file2.csv

Convert all CSVs in a directory to parquet.

qsv to parquet output_dir dir1

Convert files listed in the 'input.infile-list' to parquet.

qsv to parquet output_dir input.infile-list

Convert with snappy compression.

qsv to parquet output_dir --compression snappy file1.csv

Convert with zstd compression at level 10.

qsv to parquet output_dir --compression zstd --compress-level 10 file1.csv

Convert from stdin with a custom filename.

cat data.csv | qsv to parquet output_dir --table mydata -
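
Schema inference can also be tuned. As an illustrative (not prescriptive) combination of the --infer-len and --try-parse-dates options listed under To Options, infer the schema from the first 1000 rows and attempt date parsing.

qsv to parquet output_dir --infer-len 1000 --try-parse-dates file1.csv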

Postgresql

To convert to postgres, you need to supply a connection string. The format is described here: https://docs.rs/postgres/latest/postgres/config/struct.Config.html#examples-1. Additionally, you can use env=MY_ENV_VAR and qsv will get the connection string from the environment variable MY_ENV_VAR. If using the --dump option, instead of a connection string, put the name of a dump file or - for stdout.

Load the file1.csv and file2.csv files into the local database test, with user testuser and password pass.

qsv to postgres 'postgres://testuser:pass@localhost/test' file1.csv file2.csv

Load the same files into a new/existing postgres schema myschema.

qsv to postgres 'postgres://testuser:pass@localhost/test' --schema=myschema file1.csv file2.csv

Load the same files into a new/existing postgres database whose connection string is in the DATABASE_URL environment variable.

qsv to postgres 'env=DATABASE_URL' file1.csv file2.csv
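
The environment variable just needs to hold a regular connection string. For example (illustrative), set it in your shell before running qsv:

export DATABASE_URL='postgres://testuser:pass@localhost/test'
qsv to postgres 'env=DATABASE_URL' file1.csv file2.csv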

Load files inside a directory to a local database 'test' with user testuser, password pass.

qsv to postgres 'postgres://testuser:pass@localhost/test' dir1

Load files listed in the 'input.infile-list' to a local database 'test' with user testuser, password pass.

qsv to postgres 'postgres://testuser:pass@localhost/test' input.infile-list

Drop tables if they exist before loading.

qsv to postgres 'postgres://testuser:pass@localhost/test' --drop file1.csv file2.csv

Evolve tables if they exist before loading. See http://datapackage_convert.opendata.coop/evolve.html for an explanation of how evolving works.

qsv to postgres 'postgres://testuser:pass@localhost/test' --evolve file1.csv file2.csv

Create dump file.

qsv to postgres --dump dumpfile.sql file1.csv file2.csv

Print dump to stdout.

qsv to postgres --dump - file1.csv file2.csv
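
The resulting dump is intended for the psql command line tool (as noted in the --dump option below); an illustrative way to load it into a database named test:

psql -d test -f dumpfile.sql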

Sqlite

Convert to a sqlite db file, which will be created if it does not exist. If using the --dump option, instead of a sqlite database file, put the name of the dump file or - for stdout.

Load the file1.csv and file2.csv files into the sqlite database test.db.

qsv to sqlite test.db file1.csv file2.csv

Load all files in dir1 to sqlite database test.db

qsv to sqlite test.db dir1

Load files listed in the 'mydata.infile-list' to sqlite database test.db

qsv to sqlite test.db mydata.infile-list

Drop tables if they exist before loading.

qsv to sqlite test.db --drop file1.csv file2.csv

Evolve tables if they exist. See http://datapackage_convert.opendata.coop/evolve.html for an explanation of how evolving is done.

qsv to sqlite test.db --evolve file1.csv file2.csv

Create dump file.

qsv to sqlite --dump dumpfile.sql file1.csv file2.csv

Print dump to stdout.

qsv to sqlite --dump - file1.csv file2.csv
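
As with postgres, the dump is intended for the sqlite3 command line tool; an illustrative way to load it:

sqlite3 test.db < dumpfile.sql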

Excel XLSX

Convert to a new xlsx file, creating a new sheet for each input file, with the sheet name being the filename without the extension. Note that the output file will be overwritten if it already exists.

Load file1.csv and file2.csv into output.xlsx.

qsv to xlsx output.xlsx file1.csv file2.csv

Load all files in dir1 into xlsx file.

qsv to xlsx output.xlsx dir1

Load files listed in the 'ourdata.infile-list' into xlsx file.

qsv to xlsx output.xlsx ourdata.infile-list

Load a single CSV into xlsx with a custom sheet name.

qsv to xlsx output.xlsx --table "Sales Data" file1.csv

Load from stdin with a custom sheet name.

cat data.csv | qsv to xlsx output.xlsx --table "Monthly Report" -
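
Sheet names are limited to 31 characters (see the --table option below); the --separator option (default: space) sets the character qsv uses to help truncate long sheet names. An illustrative invocation using "_" as the separator:

qsv to xlsx output.xlsx --separator "_" dir1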

ODS

Convert to a new ODS (OpenDocument Spreadsheet) file, creating a new sheet for each input file, with the sheet name being the filename without the extension. Note that the output file will be overwritten if it already exists.

Load file1.csv and file2.csv into output.ods.

qsv to ods output.ods file1.csv file2.csv

Load all files in dir1 into ODS file.

qsv to ods output.ods dir1

Load files listed in the 'ourdata.infile-list' into ODS file.

qsv to ods output.ods ourdata.infile-list

Load a single CSV into ODS with a custom sheet name.

qsv to ods output.ods --table "Sales Data" file1.csv

Load from stdin with a custom sheet name.

cat data.csv | qsv to ods output.ods --table "Monthly Report" -

Data Package

Generate a datapackage, which contains stats and information about what is in the CSV files.

Generate a datapackage.json file from the file1.csv and file2.csv files.

qsv to datapackage datapackage.json file1.csv file2.csv

Add more stats to datapackage.

qsv to datapackage datapackage.json --stats file1.csv file2.csv
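
Write the computed stats to a separate CSV file as well (an illustrative use of the --stats-csv option described under To Options).

qsv to datapackage datapackage.json --stats --stats-csv stats.csv file1.csv file2.csv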

Generate a datapackage.json file from all the files in dir1

qsv to datapackage datapackage.json dir1

Generate a datapackage.json file from all the files listed in the 'data.infile-list'

qsv to datapackage datapackage.json data.infile-list

For all other conversions you can output the datapackage created by specifying --print-package.

qsv to xlsx datapackage.xlsx --stats --print-package file1.csv file2.csv

For more examples, see tests.

Usage

qsv to parquet [options] <destination> [<input>...]
qsv to postgres [options] <destination> [<input>...]
qsv to sqlite [options] <destination> [<input>...]
qsv to xlsx [options] <destination> [<input>...]
qsv to ods [options] <destination> [<input>...]
qsv to datapackage [options] <destination> [<input>...]
qsv to --help

Arguments

Argument        Description
<destination>   The output target, which varies by subcommand:
                • parquet: output directory (created if needed)
                • postgres: connection string or env=VAR_NAME (with --dump: dump file path or - for stdout)
                • sqlite: database file path (with --dump: dump file path or - for stdout)
                • xlsx: output .xlsx file path
                • ods: output .ods file path
                • datapackage: output .json file path
<input>         Input CSV file(s) to convert. Can be file path(s), a directory, an .infile-list file, or - for stdin (not supported by the parquet subcommand).

To Options

Option               Type    Description
-k, --print-package  flag    Print statistics as a datapackage; by default, a field summary is printed.
-u, --dump           flag    Create a database dump file for use with the psql or sqlite3 command line tools (postgres/sqlite only).
-a, --stats          flag    Produce extra statistics about the data beyond just type guessing.
-c, --stats-csv      string  Output stats as CSV to the specified file.
-q, --quiet          flag    Do not print out the field summary.
-s, --schema         string  The schema to load the data into (postgres only).
--infer-len          string  The number of rows to use for schema inference (parquet only). Note that even if a pschema.json file exists for an input file, explicitly specifying --infer-len will cause qsv to ignore the pschema.json and infer the schema from the CSV data instead, including when set to 0. Set to 0 to infer from all rows (not recommended for large files).
--try-parse-dates    flag    Attempt to parse date/datetime columns with polars' date inference logic. This may result in more accurate date parsing, but can be slower on large files (parquet only).
-d, --drop           flag    Drop tables before loading new data into them (postgres/sqlite only).
-e, --evolve         flag    If loading into an existing db, alter existing tables so that new data will load (postgres/sqlite only).
-i, --pipe           flag    Adjust output format for piped data (omits row counts and field format columns).
-t, --table          string  Use this as the table/sheet/file name (postgres/sqlite/xlsx/ods/parquet). Overrides the default name derived from the input filename. When reading from stdin, the default table name is "stdin". Only valid with a single input file. For postgres/sqlite: must start with a letter or underscore and contain only alphanumeric characters and underscores (max 63). For xlsx/ods: used as the sheet name (max 31 chars, cannot contain \ / * [ ] : ?).
-p, --separator      string  For xlsx, use this character to help truncate xlsx sheet names. Defaults to space.
--compression        string  Parquet compression codec (parquet only). Valid values: zstd (default), gzip, snappy, lz4raw, uncompressed.
--compress-level     string  Compression level (parquet only). For gzip: 1-9 (default: 6). For zstd: -7 to 22 (default: 3). Ignored for other codecs.
-A, --all-strings    flag    Convert all fields to strings (see the illustrative example after this table).
-j, --jobs           string  The number of jobs to run in parallel. When not set, the number of jobs is set to the number of CPUs detected.

Common Options

Option           Type    Description
-h, --help       flag    Display this message
-d, --delimiter  string  The field delimiter for reading CSV data. Must be a single character. (default: ,)

Source: src/cmd/to.rs | Table of Contents | README