head -n 10 data/sales.csv
tail -n 10 data/sales.csv
cut -d"|" -f 1,7 data/sales.csv
tail -n +2 data/sales.csv
sort -t, -k 3
sort -k2 -n -r
-n
tells sed to sort numerically-h
sorts numerically assuming SI qualifiers, like
4K, 2M etc-r
produces a descending orderAssuming both files are sorted by those columns.
join -1 1 -2 3 a.csv b.csv
Many ways to do it, easiest is to use tr.
> echo hello | tr -d 'l'
heo
From the manual: Unix tools re-implemented in simple AWK scripts
Script can be run inline, or also in a separated file:
awk -f print_fields.awk file.txt
{print $1, $3}
{sum += $2} END {print sum}
BEGIN {
FS = ","
}
{
print $1, $3
}
{print $0, $1+$2}
{
$0]++
count[}
END {
for (line in count) {
print line, count[line]
}
}
{count[$2]++} END {for (val in count) print val, count[val]}
{
$1] += $2
arr[}
END {
for (key in arr) {
print key, arr[key]
}
}
Remove all columns except first one and 7th.
xsv select 1,7 filename.csv
xsv select 1,7 filename.csv | xsv table
Assuming input file has a semi-colon ;
, this command
will output regular CSV:
xsv fmt --delimiter ";" filename.csv
This is the equivalent in AWK:
awk 'BEGIN {FS="|"; OFS=","} {$1=$1; print}'
Create a virtual environment, activate it and install Pandas. Using python 3.10, this takes less than 30 seconds:
python -m venv .wadus
source .wadus/bin/activate
pip install pandas
User Guide should cover 90% of the cases.
Given this file:
{
"base": "USD",
"date": "2016-02-05",
"rates": {
"AUD": 1.3911,
"BGN": 1.7459
}
}
We would like to produce the following jsonl
file:
{"base":"usd","quote":"AUD","mid":1.3911}
{"base":"usd","quote":"BGN","mid":1.7459}
And the script would be:
jq -c '.rates | to_entries[] | {base:"usd", quote: .key, mid: .value}' exchange_rates_usd.json
jq -rcs '.[] | [.code,.name] | @csv' countries.jsonl
You can build it like { some: .code }
or use the
shortcut directly {code}
:
jq -rcs '.[] | {code}' countries.jsonl
{"code":"YE"}
{"code":"ZM"}
Arrays are built using []
instead of
{}
, as expected.
Normally you would run it using -f
option.
For instance, the exchange_rates.json
example can be
also produced like this:
{"provider":"xe","base":"USD","quote":"AUD","mid":1.3911}
{"provider":"xe","base":"USD","quote":"BGN","mid":1.7459}
by using:
jq -c --arg provider 'xe' -f exchange_rates.jq exchange_rates_usd.json
where exchange_rates.jq
script is:
.base as $base
| .rates
| to_entries[]
| {
provider: $provider,
base: $base,
quote: .key,
mid: .value
}
Notice the base
is fetched from the top level and
reused afterwards. The provider
field is set from the
outside with --arg
.
Reading the manual is a good starting point. It’s quite long, don’t get anxiety and rtfm before googling.
For some inspiration, you may check a bigger example