admin管理员组文章数量:1430083
I have data in file with pattern as shown below -
TRSSOEC TRBSOEC ADICOES DELETE TROEC
TRSSOEC TRBSOEC ADICOES INSERT TROEC
TRSSOEC TRBSOEC ADICOES SELECT TROEC
SOBCXYZ SCHEMA1 AGENTES_CARGA INSERT TROEC
SOBCXYZ SCHEMA1 AGENTES_CARGA SELECT TROEC
YASU999 ASDF123 BS_ADICOES_SISCOMEX UPDATE TRBSOEC
USER123 WAREHOU BS_ADICOES_TRIBUTOS_BKP DELETE TRBSOEC
USER123 WAREHOU BS_ADICOES_TRIBUTOS_BKP INSERT TRBSOEC
I would like to compare lines with field 1,2,3 and if they are same then capture field 4 to diplay as shown below -
grant DELETE,INSERT,SELECT on TRBSOEC.ADICOES to TRSSOEC;
grant INSERT,SELECT on SCHEMA1.AGENTES_CARGA to SOBCXYZ;
grant UPDATE on ASDF123.BS_ADICOES_SISCOMEX to YASU999;
grant DELETE,INSERT on WAREHOU.BS_ADICOES_TRIBUTOS_BKP to USER123;
Basically logic is to group based on field 1,2,3 and print distinct field 4 along with other common fields. grant <comma separated array of filed 4> on filed2.field3 to field1 ;
So far I tried below but not sure how to capture field 4, any guidance would be helpful.
awk '
# Each line
{
if ($1 == prev[1] && $2 == prev[2] && $3 == prev[3])
{
# Capture and use field 4 of last match
if(!nrMatched) { nrMatched = prev[4] }
print "grant " $nrMatched " on " $2"."$3 " to " $1";"
}
else
{
# No match, reset matched flag and just copy record through
nrMatched = 0
}
# Save fields from this line for next comparison
split($0, prev)
}
' test.txt
I have data in file with pattern as shown below -
TRSSOEC TRBSOEC ADICOES DELETE TROEC
TRSSOEC TRBSOEC ADICOES INSERT TROEC
TRSSOEC TRBSOEC ADICOES SELECT TROEC
SOBCXYZ SCHEMA1 AGENTES_CARGA INSERT TROEC
SOBCXYZ SCHEMA1 AGENTES_CARGA SELECT TROEC
YASU999 ASDF123 BS_ADICOES_SISCOMEX UPDATE TRBSOEC
USER123 WAREHOU BS_ADICOES_TRIBUTOS_BKP DELETE TRBSOEC
USER123 WAREHOU BS_ADICOES_TRIBUTOS_BKP INSERT TRBSOEC
I would like to compare lines with field 1,2,3 and if they are same then capture field 4 to diplay as shown below -
grant DELETE,INSERT,SELECT on TRBSOEC.ADICOES to TRSSOEC;
grant INSERT,SELECT on SCHEMA1.AGENTES_CARGA to SOBCXYZ;
grant UPDATE on ASDF123.BS_ADICOES_SISCOMEX to YASU999;
grant DELETE,INSERT on WAREHOU.BS_ADICOES_TRIBUTOS_BKP to USER123;
Basically logic is to group based on field 1,2,3 and print distinct field 4 along with other common fields. grant <comma separated array of filed 4> on filed2.field3 to field1 ;
So far I tried below but not sure how to capture field 4, any guidance would be helpful.
awk '
# Each line
{
if ($1 == prev[1] && $2 == prev[2] && $3 == prev[3])
{
# Capture and use field 4 of last match
if(!nrMatched) { nrMatched = prev[4] }
print "grant " $nrMatched " on " $2"."$3 " to " $1";"
}
else
{
# No match, reset matched flag and just copy record through
nrMatched = 0
}
# Save fields from this line for next comparison
split($0, prev)
}
' test.txt
Share
Improve this question
asked Nov 19, 2024 at 14:10
YasserYasser
777 bronze badges
5
|
5 Answers
Reset to default 4Using any awk and assuming the input is grouped by the first 3 fields as shown in the sample input and as the OPs code requires:
$ cat tst.awk
{ tgt = "on " $2 "." $3 " to " $1 ";" }
tgt != prev {
if ( NR > 1 ) {
print grant, prev
}
prev = tgt
grant = "grant " $4
next
}
{ grant = grant "," $4 }
END { print grant, prev }
$ awk -f tst.awk file
grant DELETE,INSERT,SELECT on TRBSOEC.ADICOES to TRSSOEC;
grant INSERT,SELECT on SCHEMA1.AGENTES_CARGA to SOBCXYZ;
grant UPDATE on ASDF123.BS_ADICOES_SISCOMEX to YASU999;
grant DELETE,INSERT on WAREHOU.BS_ADICOES_TRIBUTOS_BKP to USER123;
What I would do:
awk '
{
key = $1"|"$2"|"$3
if (!(key in ops)) {
groups[key]=$1","$2","$3
}
if (!seen[key"|"$4]) {
ops[key]=ops[key](ops[key] ? "," : "") $4
seen[key"|"$4]=1
}
}
END {
for (key in groups) {
split(groups[key], fields, ",")
print "grant " ops[key] " on " fields[2] "." fields[3] " to " fields[1]";"
}
}' file
If datamash
is available we can use it for most of the work: sorting (-s
option), grouping by fields 1, 2 and 3 (-g1,2,3
), concatenating the (unique) privileges per group with a comma as separator (uniq 4
)... Then we can use awk
to do the final formating:
datamash -Wsg1,2,3 uniq 4 < test.txt |
awk '{print "grant " $4 " on " $2 "." $3 " to " $1 ";"}'
The -W
option of datamash
changes the default single tab separator for whitespace (one or more spaces and/or tabs), and ignores leading whitespaces.
If datamash
is not available, and your file is not too large to fit in memory, with any awk
, you can try:
awk '{ s = $2 "." $3 " to " $1 ";"
a[s] = a[s] ? a[s] "," $4 : $4 }
END { for(s in a) print "grant " a[s] " on " s }' test.txt
For each line we store the database.table to user;
string in variable s
, and we store the comma-separated list of privileges in a[s]
. In the END
block we print all grant
commands.
If your file is really large this solution that stores everything before printing may exceed the available memory. If it is the case you can sort first:
sort -k1,4 test.txt | awk '
{ s = $2 "." $3 " to " $1 ";" }
s == t { p = p "," $4; next }
NR > 1 { print "grant " p " on " t }
{ t = s; p = $4 }
END { print "grant " p " on " s }'
Overhauling OP's current code and taking into consideration the need to pre-sort the input file before feeding to awk
:
awk '
function print_grant() {
if (prev[1] != "")
print "grant", perms, "on", prev[2] "." prev[3], "to", prev[1] ";"
perms = ""
}
{ if ($1 != prev[1] || $2 != prev[2] || $3 != prev[3]) # if a different key then ...
print_grant() # print previous key to stdout
perms = perms (perms=="" ? "" : ",") $4 # append field $4 to variable "perms"
split($0,prev) # save current fields in prev[] array
}
END { print_grant() } # flush last key to stdout
' <(sort test.txt) # per comment from OP we cannot assume file is already sorted
This generates:
grant INSERT,SELECT on SCHEMA1.AGENTES_CARGA to SOBCXYZ;
grant DELETE,INSERT,SELECT on TRBSOEC.ADICOES to TRSSOEC;
grant DELETE,INSERT on WAREHOU.BS_ADICOES_TRIBUTOS_BKP to USER123;
grant UPDATE on ASDF123.BS_ADICOES_SISCOMEX to YASU999;
If OP needs the output sorted in a particular order we can either pipe this output to an additional sort
call or we can modify the initial sorting of the input file, for example ...
Sorting input so that output is ordered by object name, schema name, grantee and then permissions we can change <(sort file.txt)
to <(sort -k3,3 -k2,2 -k1,1 -k4,4 file.txt)
to generate:
grant DELETE,INSERT,SELECT on TRBSOEC.ADICOES to TRSSOEC;
grant INSERT,SELECT on SCHEMA1.AGENTES_CARGA to SOBCXYZ;
grant UPDATE on ASDF123.BS_ADICOES_SISCOMEX to YASU999;
grant DELETE,INSERT on WAREHOU.BS_ADICOES_TRIBUTOS_BKP to USER123;
Here is a Ruby that allows that type of grouping:
ruby -e 'puts $<.map(&:split).
group_by{|a| "#{a[1]}.#{a[2]} to #{a[0]}"}.
map{|grp, op| [grp, op.map{|x| x[3]}] }.
map{|grp, op| "grant #{op.join(",")} on #{grp};" }
' file
Prints:
grant DELETE,INSERT,SELECT on TRBSOEC.ADICOES to TRSSOEC;
grant INSERT,SELECT on SCHEMA1.AGENTES_CARGA to SOBCXYZ;
grant UPDATE on ASDF123.BS_ADICOES_SISCOMEX to YASU999;
grant DELETE,INSERT on WAREHOU.BS_ADICOES_TRIBUTOS_BKP to USER123;
本文标签: awk to compare fields of multiple lines and combineStack Overflow
版权声明:本文标题:awk to compare fields of multiple lines and combine - Stack Overflow 内容由网友自发贡献,该文观点仅代表作者本人, 转载请联系作者并注明出处:http://www.betaflare.com/web/1745556133a2663197.html, 本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如发现本站有涉嫌抄袭侵权/违法违规的内容,一经查实,本站将立刻删除。
awk '{print "grant",$4,"on",$2 "." $3,"to",$1 ";"}' file
– Cyrus Commented Nov 19, 2024 at 19:57datamash
(see my answer for an example), plus a bit ofawk
for the final formatting. Isdatamash
available? – Renaud Pacalet Commented Nov 20, 2024 at 7:18