safenames

Modify headers of a CSV to only have "safe" names - guaranteed "database-ready"/"CKAN-ready" names.

Table of Contents | Source: src/cmd/safenames.rs |

Description | Usage | Safenames Options | Common Options

Description ↩

Modify headers of a CSV to only have "safe" names - guaranteed "database-ready" names (optimized specifically for PostgreSQL column identifiers).

Fold to lowercase. Trim leading & trailing whitespace. Replace whitespace/non-alphanumeric characters with _. If a name starts with a number & check_first_char is true, prepend the unsafe prefix. If a header with the same name already exists, append a sequence suffix (e.g. col, col_2, col_3). Names are limited to 60 characters in length. Empty names are replaced with the unsafe prefix.

In addition, specifically because of CKAN Datastore requirements:

Headers with leading underscores are replaced with "unsafe_" prefix.
Headers that are named "_id" are renamed to "reserved__id".

These CKAN Datastore options can be configured via the --prefix & --reserved options, respectively.
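The renaming rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not qsv's actual implementation (which lives in src/cmd/safenames.rs). The function name `safe_names`, its parameters, the order in which the rules are applied, and truncating before the sequence suffix is appended are all assumptions made for this example. The first-character check is always on here, and only ASCII letters and digits are treated as alphanumeric.

```python
import re


def safe_names(headers, prefix="unsafe_", reserved=("_id",), max_len=60):
    """Hypothetical sketch of the safenames rules; not qsv's real code."""
    reserved_set = {r.lower() for r in reserved}
    seen = {}  # safe name -> number of times it has been produced so far
    result = []
    for raw in headers:
        # Trim leading/trailing whitespace and fold to lowercase.
        name = raw.strip().lower()
        if name in reserved_set:
            # Reserved names (case-insensitive) get the "reserved_" prefix.
            name = "reserved_" + name
        else:
            # Replace each whitespace/non-alphanumeric character with "_".
            name = re.sub(r"[^a-z0-9_]", "_", name)
            if name == "":
                name = prefix  # empty names become the unsafe prefix
            elif name[0].isdigit():
                name = prefix + name  # names starting with a number
            elif name.startswith("_"):
                # Assumption: leading underscores are swapped for the prefix.
                name = prefix + name.lstrip("_")
        # Assumption: the length limit is applied before de-duplication.
        name = name[:max_len]
        count = seen.get(name, 0) + 1
        seen[name] = count
        # Later duplicates get a sequence suffix: col, col_2, col_3, ...
        result.append(name if count == 1 else f"{name}_{count}")
    return result


headers = ["c1", "12_col", "Col with Embedded Spaces", "",
           "Column!@Invalid+Chars", "c1"]
print(",".join(safe_names(headers)))
# c1,unsafe_12_col,col_with_embedded_spaces,unsafe_,column__invalid_chars,c1_2
```

The sketch does not handle a generated suffix colliding with an existing header (for example, an input that already contains both `col` and `col_2`), and it does not model Conditional mode.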

In Always (a) and Conditional (c) mode, writes the number of modified headers to stderr and sends the CSV with safe headers to stdout.

In Verify (v) mode, writes the number of unsafe headers to stderr. In Verbose (V) mode, writes the number of headers, the duplicate count, and the unsafe & safe headers to stderr. No stdout output is generated in Verify and Verbose mode.

In JSON (j) mode, writes the Verbose mode info as minified JSON to stdout. In Pretty JSON (J) mode, writes the Verbose mode info as pretty-printed JSON to stdout.

Given data.csv:

c1,12_col,Col with Embedded Spaces,,Column!@Invalid+Chars,c1
1,a2,a3,a4,a5,a6

$ qsv safenames data.csv

c1,unsafe_12_col,col_with_embedded_spaces,unsafe_,column__invalid_chars,c1_2
1,a2,a3,a4,a5,a6
stderr: 5

Conditionally rename headers, allowing "quoted identifiers":

$ qsv safenames --mode c data.csv

c1,unsafe_12_col,Col with Embedded Spaces,unsafe_,column__invalid_chars,c1_2
1,a2,a3,a4,a5,a6
stderr: 4

Verify how many "unsafe" headers are found:

$ qsv safenames --mode v data.csv

stderr: 4

Verbose mode:

$ qsv safenames --mode V data.csv

stderr: 6 header/s
1 duplicate/s: "c1:2"
4 unsafe header/s: ["12_col", "Col with Embedded Spaces", "", "Column!@Invalid+Chars"]
1 safe header/s: ["c1"]

Note that even if "Col with Embedded Spaces" is technically safe, it is generally discouraged. Though it can be created as a "quoted identifier" in PostgreSQL, it is still marked "unsafe" by default, unless mode is set to "conditional."

It is discouraged because the embedded spaces can cause problems later on. (see https://lerner.co.il/2013/11/30/quoting-postgresql/ for more info).
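To see how the Verbose counts in the example relate to each other (6 headers, 1 duplicate, 4 unsafe, 1 safe), here is a minimal Python sketch of such a report. It is an assumption-laden illustration, not qsv's code. The helper names `is_safe` and `verbose_report` and the dictionary keys are hypothetical and do not reflect qsv's JSON output. The safety test is also a guess: it treats a name as safe only if it is a lowercase ASCII identifier of at most 60 characters that starts with a letter.

```python
import re
from collections import Counter


def is_safe(name):
    """Hypothetical safety check (assumption, not qsv's exact rule)."""
    return len(name) <= 60 and re.fullmatch(r"[a-z][a-z0-9_]*", name) is not None


def verbose_report(headers):
    """Summarize headers the way the Verbose example above does."""
    counts = Counter(headers)  # keeps first-seen order of distinct headers
    unique = list(counts)
    return {
        "header_count": len(headers),
        # Headers that occur more than once, with their occurrence count.
        "duplicates": {h: n for h, n in counts.items() if n > 1},
        # Each distinct header is reported once, as either unsafe or safe.
        "unsafe": [h for h in unique if not is_safe(h)],
        "safe": [h for h in unique if is_safe(h)],
    }


headers = ["c1", "12_col", "Col with Embedded Spaces", "",
           "Column!@Invalid+Chars", "c1"]
report = verbose_report(headers)
print(report["header_count"], report["duplicates"])
# 6 {'c1': 2}
```

Under these assumptions the duplicated `c1` is listed once as safe, which is why the unsafe and safe counts add up to 5 rather than 6.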

For more examples, see https://github.com/dathere/qsv/blob/master/tests/test_safenames.rs.

Usage ↩

qsv safenames [options] [<input>]
qsv safenames --help

Safenames Options ↩

Option	Type	Description	Default
`‑‑mode`	string	Rename header names to "safe" names - i.e. guaranteed "database-ready" names. It has six modes - conditional, always, verify, Verbose, with Verbose having two submodes - JSON & pretty JSON.	`Always`
`‑‑reserved`	string	Comma-delimited list of additional case-insensitive reserved names that should be considered "unsafe." If a header name is found in the reserved list, it will be prefixed with "reserved_".	`_id`
`‑‑prefix`	string	Certain systems do not allow header names to start with "_" (e.g. CKAN Datastore). This option allows the specification of the unsafe prefix to use when a header starts with "_".	`unsafe_`

Common Options ↩

Option	Type	Description
`‑h,` `‑‑help`	flag	Display this message
`‑o,` `‑‑output`	string	Write output to `<file>` instead of stdout. Note that no output is generated for Verify and Verbose modes.
`‑d,` `‑‑delimiter`	string	The field delimiter for reading CSV data. Must be a single character. (default: ,)

Source: src/cmd/safenames.rs | Table of Contents | README
