I have Iceberg tables on AWS, and I am trying to query them with Athena or Presto.

My question is: how can I find out how much data each table contributes to the result?

E.g., how can I tell how many rows in the result come from table A and how many from table B?

(select * from table A where XXX)
union
(select * from table B where YYY)

For Athena, I tried https://docs.aws.amazon.com/cli/latest/reference/athena/get-query-runtime-statistics.html, but it does not seem to provide what I want.

Is it possible to get this information from the Presto query metadata?

Any comments are welcome.

Thanks

Asked Nov 19, 2024 at 1:20 by hehe123456
  • Can you please elaborate? Or add some sample data and desired output? (Guru Stron, Nov 19, 2024 at 10:20)

1 Answer

There is no direct way to get this information. You can use COUNT instead, adjusting the query as follows:

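-- Count the rows produced on the table_a side of the join.
SELECT 
    'table_a' AS source_table,
    COUNT(*) AS contribution_count
FROM 
    table_a a
LEFT JOIN 
    table_b b 
ON 
    a.common_key = b.common_key
GROUP BY 
    1  -- group by the first output column; Presto does not accept a SELECT alias here
UNION ALL
-- Count the rows produced on the table_b side of the join.
SELECT 
    'table_b' AS source_table,
    COUNT(*) AS contribution_count
FROM 
    table_b b
LEFT JOIN 
    table_a a 
ON 
    b.common_key = a.common_key
GROUP BY 
    1;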
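
A different way to apply the same COUNT idea, shown here only as a minimal sketch, is to tag each branch of the union with a literal source label and then count per label. The example below runs against an in-memory SQLite database through Python's sqlite3 module purely so that it is self-contained; SQLite is a stand-in for Athena/Presto, and the columns `id` and `val`, the sample rows, and the two WHERE filters are all hypothetical placeholders for the XXX and YYY conditions in the question.

```python
import sqlite3

# In-memory SQLite stands in for Athena/Presto (assumption, for runnability only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table_a (id INTEGER, val TEXT);
CREATE TABLE table_b (id INTEGER, val TEXT);
INSERT INTO table_a VALUES (1, 'x'), (2, 'y'), (3, 'z');
INSERT INTO table_b VALUES (3, 'z'), (4, 'w');
""")

# Tag every row with the table it came from, then count rows per tag.
# The WHERE clauses are hypothetical stand-ins for XXX and YYY.
query = """
SELECT source_table, COUNT(*) AS contribution_count
FROM (
    SELECT 'table_a' AS source_table, id, val FROM table_a WHERE id <= 3
    UNION ALL
    SELECT 'table_b' AS source_table, id, val FROM table_b WHERE id >= 3
) AS tagged
GROUP BY source_table
ORDER BY source_table
"""

counts = dict(conn.execute(query).fetchall())
print(counts)  # {'table_a': 3, 'table_b': 2}
```

Note that the row (3, 'z') exists in both tables and is counted once for each source here. A plain UNION without the label would collapse it into a single row, which then has no single source table, so the per-table counts from this sketch can add up to more than the row count of the original UNION result.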
