Using variables in awk within echo statement that prints into a file
We use a script that prints bash commands into a file that is then run on an HPC system. The generated commands are supposed to run through a large text file containing whitespace-separated geographic coordinates and extract a specific region from it (e.g. all lines with an x coordinate between xmin and xmax and a y coordinate between ymin and ymax).
Ideally, I'd like to use awk for that like so (from memory since I don't have my computer available at the moment):
awk -v xmin=-13000 -v xmax=13000 -v ymin=-500 -v ymax=500 -F ' ' {if ($1 > xmin && $1 < xmin && $2 > ymin && $2 < ymin) print $1 $2} $infile > $outfile
That would probably execute fine. However, as suggested by the title, we save this line indirectly for 25 regions, each with their own xmin, xmax etc. There are more operations following after that (using GMT calls etc). Here's a little snippet:
xmin=-13000
xmax=13000
ymin=-500
ymax=500
infile=./full_file.txt
outfile=./filtered_file.yxy
srcfile=./region_1.txt
echo """awk -v xmin=$xmin -v xmax=$xmax -v ymin=$ymin -v ymax=$ymax -F ' ' {if ($1 > $xmin && $1 < $xmin && $2 > $ymin && $2 < $ymin) print $1 $2} $infile > $outfile""" >> $srcfile
Obviously, this raises errors when run, due to variable expansion. I've tried escaping the awk column identifiers, but to no avail; perhaps I didn't understand the pattern correctly. Could someone point me to a solution that allows us to keep the indirect approach?
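To make the failure concrete, here is a minimal sketch (a reduced version of the echo line, not the original script). Inside double quotes the shell substitutes $1 and $xmin before echo runs, so the awk field references never reach the generated file.

```shell
#!/usr/bin/env bash
xmin=-13000

# Make sure this script has no positional parameters, so $1 is empty.
set --

# The shell expands $1 (empty) and $xmin inside the double quotes;
# echo only ever sees the already-substituted text.
broken=$(echo "{if ($1 > $xmin) print $1}")

echo "$broken"    # → {if ( > -13000) print }
```

The generated text is no longer valid awk, which matches the errors described above.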
Asked Nov 16, 2024 at 20:23 by Sacha Viquerat; edited Nov 17, 2024 at 10:23 by tripleee.
2 Answers
IIUC, you have to either escape each dollar sign like this:
{if (\$1 > xmin && \$1 < xmin
or temporarily close a double quote and put a dollar sign in a single quote:
"{if ("'$1'" > xmin && "'$1'" < xmin"
or use the Bash-specific %q printf specifier:
$ read
awk -v xmin=-13000 -v xmax=13000 -v ymin=-500 -v ymax=500 -F ' ' {if ($1 > xmin && $1 < xmin && $2 > ymin && $2 < ymin) print $1 $2} $infile > $outfile
$ printf "%q\n" "$REPLY"
awk\ -v\ xmin=-13000\ -v\ xmax=13000\ -v\ ymin=-500\ -v\ ymax=500\ -F\ \'\ \'\ \{if\ \(\$1\ \>\ xmin\ \&\&\ \$1\ \<\ xmin\ \&\&\ \$2\ \>\ ymin\ \&\&\ \$2\ \<\ ymin\)\ print\ \$1\ \$2\}\ \$infile\ \>\ \$outfile
$ echo awk\ -v\ xmin=-13000\ -v\ xmax=13000\ -v\ ymin=-500\ -v\ ymax=500\ -F\ \'\ \'\ \{if\ \(\$1\ \>\ xmin\ \&\&\ \$1\ \<\ xmin\ \&\&\ \$2\ \>\ ymin\ \&\&\ \$2\ \<\ ymin\)\ print\ \$1\ \$2\}\ \$infile\ \>\ \$outfile
awk -v xmin=-13000 -v xmax=13000 -v ymin=-500 -v ymax=500 -F ' ' {if ($1 > xmin && $1 < xmin && $2 > ymin && $2 < ymin) print $1 $2} $infile > $outfile
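Instead of pasting the escaped line by hand, printf %q can quote each argument while the file is being generated. This is only a sketch under assumptions: the temporary directory, the sample data, the file names and the awk_prog variable are made up for illustration, and the awk condition uses xmax and ymax (with a comma in print), which is presumably what was meant.

```shell
#!/usr/bin/env bash
# Hypothetical sample data and file names, for illustration only.
workdir=$(mktemp -d)
infile="$workdir/full file.txt"      # a space in the name, to show that %q copes with it
outfile="$workdir/filtered_file.yxy"
srcfile="$workdir/region_1.sh"
printf '%s\n' '-20000 0' '100 200' '5000 -600' > "$infile"

xmin=-13000 xmax=13000 ymin=-500 ymax=500
# Single quotes: the shell leaves $1 and $2 alone here.
awk_prog='$1 > xmin && $1 < xmax && $2 > ymin && $2 < ymax { print $1, $2 }'

{
    # The format is reused for every argument, so each word is quoted separately.
    printf '%q ' awk -v "xmin=$xmin" -v "xmax=$xmax" -v "ymin=$ymin" -v "ymax=$ymax" \
        "$awk_prog" "$infile"
    # The redirection operator itself must stay unquoted.
    printf '> %q\n' "$outfile"
} > "$srcfile"

cat "$srcfile"     # the generated, backslash-escaped command
bash "$srcfile"    # run it
cat "$outfile"
```

The generated line is ugly to read, but it survives being parsed by bash again, which is the whole point of %q.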
I also think it would be good to enclose the awk code in single quotes (') if you don't want the shell to expand variables.
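Putting the escaping and the single quotes together, the echo approach might look like the sketch below. The temporary directory, sample data and file names are assumptions for illustration, and the condition uses xmax and ymax on the assumption that this is what was intended.

```shell
#!/usr/bin/env bash
# Hypothetical sample data and file names, for illustration only.
workdir=$(mktemp -d)
infile="$workdir/full_file.txt"
outfile="$workdir/filtered_file.yxy"
srcfile="$workdir/region_1.sh"
printf '%s\n' '-20000 0' '100 200' '5000 -600' '-12999 499' > "$infile"

xmin=-13000 xmax=13000 ymin=-500 ymax=500

# \$1 and \$2 stay literal for awk; $xmin etc. are expanded now, by this script.
# Single quotes inside double quotes are ordinary characters, and \" writes a literal ".
echo "awk -v xmin=$xmin -v xmax=$xmax -v ymin=$ymin -v ymax=$ymax '\$1 > xmin && \$1 < xmax && \$2 > ymin && \$2 < ymax { print \$1, \$2 }' \"$infile\" > \"$outfile\"" > "$srcfile"

cat "$srcfile"     # inspect the generated command
bash "$srcfile"    # run it
cat "$outfile"
```

In the generated file the awk program ends up inside single quotes, so the shell that later runs the file does not touch $1 and $2 either.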
Creating a separate temporary script seems superfluous. Just loop over the parameters.
while read -r xmin xmax ymin ymax infile outfile
do
    # Single quotes keep the shell from expanding $1 and $2; awk receives them verbatim.
    # The comma in print puts the output field separator between the two columns.
    awk -v xmin="$xmin" -v xmax="$xmax" -v ymin="$ymin" -v ymax="$ymax" \
        '$1 > xmin && $1 < xmax && $2 > ymin && $2 < ymax { print $1, $2 }' "$infile" > "$outfile"
done <<____
-13000 13000 -500 500 full_file.txt filtered_file.yxy
17 42 19 21 littlefile.txt other.yxy
-27350 27350 -123 123 another.txt moar.yxy
____
The ____ is just a cute alternative to the more conventional EOF heredoc delimiter. The lines in the here document should each be one set of values for the variables in the read.
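If the 25 regions already live in a file, the same loop can read from it instead of a here document. A sketch, assuming a hypothetical regions.txt with one region per line in the order xmin xmax ymin ymax infile outfile; the sample data is made up.

```shell
#!/usr/bin/env bash
# Hypothetical sample data, for illustration only.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' '-20000 0' '100 200' '5000 -600' > full_file.txt
printf '%s\n' '18 20' '41 20' '50 20' > littlefile.txt

# Hypothetical parameter file: xmin xmax ymin ymax infile outfile
cat > regions.txt <<'EOF'
-13000 13000 -500 500 full_file.txt filtered_file.yxy
17 42 19 21 littlefile.txt other.yxy
EOF

# read splits each line on whitespace into the six variables.
while read -r xmin xmax ymin ymax infile outfile
do
    awk -v xmin="$xmin" -v xmax="$xmax" -v ymin="$ymin" -v ymax="$ymax" \
        '$1 > xmin && $1 < xmax && $2 > ymin && $2 < ymax { print $1, $2 }' \
        "$infile" > "$outfile"
done < regions.txt

cat filtered_file.yxy other.yxy
```

Note that file names containing whitespace would not survive this simple splitting.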
If you really want to print each snippet to a separate file (perhaps to submit each to run on a different cluster node, for example), maybe learn to use printf instead of echo.
while read -r xmin xmax ymin ymax infile outfile srcfile
do
    # '"'"' emits a literal single quote, so the generated file contains a single-quoted awk script.
    printf 'awk -v xmin="%i" -v xmax="%i" -v ymin="%i" -v ymax="%i" \
    '"'"'$1 > xmin && $1 < xmax && $2 > ymin && $2 < ymax { print $1, $2 }'"'"' "./%s" > "./%s"\n' \
        "$xmin" "$xmax" "$ymin" "$ymax" "$infile" "$outfile" >>"./$srcfile"
done <<____
-13000 13000 -500 500 full_file.txt filtered_file.yxy region1.txt
17 42 19 21 littlefile.txt other.yxy region2.txt
-27350 27350 -123 123 another.txt moar.yxy region3.txt
____
(though printing commands to .txt files is still really weird).
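Before submitting generated files to the cluster, it may be worth letting bash parse them without executing anything (bash -n). A rough sketch; region1.sh, the fmt variable and the sample data are assumptions for illustration, and %s is used instead of %i here so that non-integer coordinates would pass through unchanged.

```shell
#!/usr/bin/env bash
# Hypothetical sample data and file names, for illustration only.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' '-20000 0' '100 200' '5000 -600' > full_file.txt

# The format lives in a variable; '\'' is another way to get a literal single quote.
fmt='awk -v xmin="%s" -v xmax="%s" -v ymin="%s" -v ymax="%s" '\''$1 > xmin && $1 < xmax && $2 > ymin && $2 < ymax { print $1, $2 }'\'' "./%s" > "./%s"\n'

printf "$fmt" -13000 13000 -500 500 full_file.txt filtered_file.yxy > region1.sh

cat region1.sh                              # what the cluster job would receive
bash -n region1.sh && echo "syntax OK"      # parse only, run nothing
bash region1.sh                             # now actually run it
cat filtered_file.yxy
```

bash -n only catches shell syntax errors; a mistake inside the awk program would still show up only at run time.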
For what it's worth, the triple quotes in your attempt do nothing useful. Python (for example) has this syntax, but in the shell, """ simply parses into an empty string inside a pair of quotes "" followed by an opening double quote ".
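A quick way to convince yourself of this, as a small sketch with made-up strings:

```shell
#!/usr/bin/env bash
# """x""" is just ""  "x"  "" glued together, i.e. the same word as "x".
a=$(echo """hello world""")
b=$(echo "hello world")
[ "$a" = "$b" ] && echo "identical: $a"

# The text is still inside ordinary double quotes, so expansion still happens.
set -- first-arg
c=$(echo """$1 costs \$5""")
echo "$c"    # → first-arg costs $5
```

So the triple quotes protect nothing: $1 is expanded exactly as it would be inside a single pair of double quotes.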
Similarly, the printf example above demonstrates one way to produce a literal single quote inside a single-quoted string. 'foo'"'"'bar' is (single-quoted) foo next to double-quoted ' next to single-quoted bar, which when pasted together produces foo'bar.
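For comparison, here are a few spellings that should all yield the same string. The variable names are arbitrary, and the $'...' form is a Bash extension rather than plain POSIX sh.

```shell
#!/usr/bin/env bash
s1='foo'"'"'bar'    # close the quote, add a double-quoted ', reopen
s2='foo'\''bar'     # close the quote, add a backslash-escaped ', reopen
s3=$'foo\'bar'      # ANSI-C quoting (Bash): \' is allowed inside
s4="foo'bar"        # inside double quotes a single quote is an ordinary character

printf '%s\n' "$s1" "$s2" "$s3" "$s4"
```

Each line of the output reads foo'bar.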
I also slightly refactored your Awk script to make it more idiomatic, and fixed missing quoting.
Comments:
[...] -F ' ' because it's the default (and trying to use it in your echo context is more complicated). You don't need if...print; you can just do (if I'm right above) awk '$1>xmin&&$1<xmax&&$2>ymin&&$2<ymax' infile >outfile (a pattern with no action defaults to print $0), but to echo that to a file which will work as shell input you need to add quoted quotes around the script. – dave_thompson_085, commented Nov 16, 2024 at 20:33
[...] ' if you don't want shell to expand variables. – Arkadiusz Drabczyk, commented Nov 16, 2024 at 20:40
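The pattern-only form from the first comment can be tried on a few made-up lines. In this sketch the sample data, including the third column, is an assumption for illustration; matching lines are printed whole and unchanged, because a pattern without an action defaults to print $0.

```shell
#!/usr/bin/env bash
# No -F needed: awk splits fields on runs of whitespace by default.
result=$(printf '%s\n' '-20000 0 a' '100   200 b' '5000 -600 c' |
    awk -v xmin=-13000 -v xmax=13000 -v ymin=-500 -v ymax=500 \
        '$1>xmin && $1<xmax && $2>ymin && $2<ymax')

printf '%s\n' "$result"    # the matching line, with its original spacing and third column
```

This differs from print $1, $2, which would drop any extra columns and normalise the spacing.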