head -n 10 data/sales.csv
tail -n 10 data/sales.csv
cut -d"|" -f 1,7 data/sales.csv
tail -n +2 data/sales.csv
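These commands compose well in a pipeline. Here is a minimal sketch that drops a header row and then keeps two columns; the sample data and the variable names are made up for illustration and are not taken from data/sales.csv.

```shell
# Made-up pipe-delimited sample with a header row (hypothetical data).
sample=$(printf 'id|name|qty\n1|apple|3\n2|pear|5\n')

# tail -n +2 starts output at line 2 (skipping the header),
# then cut keeps fields 1 and 3 of each remaining line.
body=$(printf '%s\n' "$sample" | tail -n +2 | cut -d'|' -f 1,3)
echo "$body"
```

Note that cut splits on every delimiter it sees, so it is only safe when fields never contain the delimiter themselves.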
sort -t, -k 3
sort -k2 -n -r
-n tells sort to compare numerically
-h sorts numerically, assuming SI suffixes such as 4K or 2M
-r produces descending order
Assuming both files are already sorted on the join columns:
join -1 1 -2 3 a.csv b.csv
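Because join expects both inputs to be sorted on the join field, a typical workflow sorts first and joins afterwards. The sketch below assumes comma-separated files, hence the -t, on both sort and join; the file contents and the temporary directory are hypothetical.

```shell
# Hypothetical sample files; names and contents are made up for illustration.
workdir=$(mktemp -d)
printf '2,bob\n1,alice\n3,carol\n' > "$workdir/a.csv"
printf 'x,10,3\ny,20,1\nz,30,2\n' > "$workdir/b.csv"

# join compares keys as strings, so sort lexically (no -n) on the join field.
sort -t, -k1,1 "$workdir/a.csv" > "$workdir/a.sorted.csv"
sort -t, -k3,3 "$workdir/b.csv" > "$workdir/b.sorted.csv"

# Field 1 of the first file is matched against field 3 of the second.
joined=$(join -t, -1 1 -2 3 "$workdir/a.sorted.csv" "$workdir/b.sorted.csv")
echo "$joined"
rm -r "$workdir"
```

Each output line starts with the join field, followed by the remaining fields of the first file and then those of the second.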
There are many ways to delete a character; the easiest is to use tr.
> echo hello | tr -d 'l'
From the manual: Unix tools re-implemented in simple AWK scripts
A script can be run inline or from a separate file:
awk -f print_fields.awk file.txt
{print $1, $3}
{sum += $2} END {print sum}
BEGIN { FS = "," }
{ print $1, $3 }
{print $0, $1+$2}
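To see a couple of these one-liners in action, here is a small sketch run against made-up comma-separated data. It uses -F, on the command line, which has the same effect as setting FS in a BEGIN block; the sample rows and variable names are assumptions for illustration.

```shell
# Made-up sample data: name,units,price (hypothetical).
data=$(printf 'apple,3,10\npear,5,20\nplum,2,30\n')

# Print the first and third fields of every line.
names_and_prices=$(printf '%s\n' "$data" | awk -F, '{print $1, $3}')

# Accumulate the second field and print the total once input is exhausted.
total_units=$(printf '%s\n' "$data" | awk -F, '{sum += $2} END {print sum}')

echo "$names_and_prices"
echo "$total_units"
```

The comma in print separates the output fields with OFS, which is a single space by default.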
for (line in count) {
    print line, count[line]
}
{count[$2]++} END {for (val in count) print val, count[val]}
{ arr[$1] += $2 }
END {
    for (key in arr) {
        print key, arr[key]
    }
}
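The associative-array pattern above amounts to a group-by. A minimal sketch, assuming whitespace-separated input with a key in the first column and a number in the second (the data is made up):

```shell
# Made-up sample data: category amount (whitespace separated, awk's default).
sales=$(printf 'fruit 3\nveg 5\nfruit 4\nveg 1\ndairy 2\n')

# Sum the second column per key found in the first column.
# The iteration order of for-in is unspecified in awk, so the result
# is piped through sort to get stable output.
totals=$(printf '%s\n' "$sales" |
  awk '{arr[$1] += $2} END {for (key in arr) print key, arr[key]}' |
  sort)
echo "$totals"
```

Replacing `arr[$1] += $2` with `arr[$1]++` turns the same script into a per-key counter.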
Keep only the first and seventh columns:
xsv select 1,7 filename.csv
xsv select 1,7 filename.csv | xsv table
Assuming the input file uses a semicolon (;) as its delimiter, this command will output regular CSV:
xsv fmt --delimiter ";" filename.csv
This is the equivalent in AWK:
awk 'BEGIN {FS=";"; OFS=","} {$1=$1; print}' filename.csv
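The `$1=$1` assignment looks like a no-op, but it forces awk to rebuild the record using OFS. A small sketch with a made-up semicolon-delimited sample:

```shell
# Made-up semicolon-delimited input (hypothetical data).
converted=$(printf 'id;name;city\n1;Ana;Madrid\n' |
  awk 'BEGIN {FS=";"; OFS=","} {$1=$1; print}')
echo "$converted"
```

Unlike a CSV-aware tool, this approach does not understand quoting, so it is only safe when no field contains the delimiter.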
Create a virtual environment, activate it, and install Pandas. With Python 3.10, this takes less than 30 seconds:
python -m venv .wadus
source .wadus/bin/activate
pip install pandas
The User Guide should cover 90% of cases.
Given this file:
{
  "base": "USD",
  "date": "2016-02-05",
  "rates": {
    "AUD": 1.3911,
    "BGN": 1.7459
  }
}
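We would like to produce the following JSONL, one object per rate:
{"base":"usd","quote":"AUD","mid":1.3911}
{"base":"usd","quote":"BGN","mid":1.7459}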
And the script would be:
jq -c '.rates | to_entries[] | {base:"usd", quote: .key, mid: .value}' exchange_rates_usd.json
jq -rcs '.[] | [.code,.name] | @csv' countries.jsonl
You can build an object like {some: .code}, or use the shortcut {code} directly.
jq -rcs '.[] | {code}' countries.jsonl
Arrays are built using [] instead of {}, as expected.
Normally you would keep a longer script in its own file and run it using -f.
For instance, the exchange_rates.json example can also be produced by using:
jq -c --arg provider 'xe' -f exchange_rates.jq exchange_rates_usd.json
where the exchange_rates.jq script is:
.base as $base
| .rates
| to_entries[]
| {
    provider: $provider,
    base: $base,
    quote: .key,
    mid: .value
  }
Notice that base is fetched from the top level and reused afterwards. The provider field is set from the outside with --arg.
Reading the manual is a good starting point. It's quite long, but don't let that put you off: read it before googling.
For some inspiration, you can check out a bigger example.