访问亚马逊Athena的命令行工具
问题描述:
我正在寻找一个命令行工具来查询Amazon Athena。访问亚马逊Athena的命令行工具
它与JDBC,使用驱动程序com.amazonaws.athena.jdbc.AthenaDriver
,但我还没有发现任何命令行工具,它的工作原理。
答
从版本1.11.89开始,AWS command line tool支持Amazon Athena的操作。
首先,您需要将AmazonAthenaFullAccess策略附加到主叫用户的IAM角色。
然后,上手查询,你会按如下方式使用start-query-execution
命令:
aws athena start-query-execution
--query-string "SELECT * FROM MyDb.MyTable"
--result-configuration "OutputLocation"="s3://MyBucket/logs" [Optional: EncryptionConfiguration]
--region <region>
这将返回QueryExecutionId的JSON对象,它可用于使用以下命令来检索查询结果:
aws athena get-query-results
--query-execution-id <id>
--region <region>
其中还返回结果和元数据的JSON对象。
更多信息可在官方AWS Documentation中找到。
希望这会有所帮助!
答
从@MasonWinsauer的前一个答案扩展。需要bash和jq。
#!/bin/bash
# Athena queries are fundamentally Asynchronous. So we have to :
# 1) Make the query, and tell Athena where in s3 to put the results (tell it the same place as the UI uses).
# 2) Wait for the query to finish
# 3) Pull down the results and un-wacky-Jsonify them.
# run the query, use jq to capture the QueryExecutionId, and then capture that into bash variable
queryExecutionId=$(
aws athena start-query-execution \
--query-string "SELECT Count(*) AS numBooks FROM books" \
--query-execution-context "Database=demo_books" \
--result-configuration "OutputLocation"="s3://whatever_is_in_the_athena_UI_settings" \
--region us-east-1 | jq -r ".QueryExecutionId"
)
echo "queryExecutionId was ${queryExecutionId}"
# Wait for the query to finish running.
# This will wait for up to 60 seconds (30 * 2)
for i in $(seq 1 30); do
queryState=$(
aws athena get-query-execution --query-execution-id "${queryExecutionId}" --region us-east-1 | jq -r ".QueryExecution.Status.State"
);
if [[ "${queryState}" == "SUCCEEDED" ]]; then
break;
fi;
echo " Awaiting queryExecutionId ${queryExecutionId} - state was ${queryState}"
if [[ "${queryState}" == "FAILED" ]]; then
# exit with "bad" error code
exit 1;
fi;
sleep 2
done
# Get the results.
aws athena get-query-results \
--query-execution-id "${queryExecutionId}" \
--region us-east-1 > numberOfBooks_wacky.json
# Todo un-wacky the json with jq or something
# cat numberOfBooks_wacky.json | jq -r ".ResultSet.Rows[] | .Data[0].VarCharValue"
从[文档】(https://aws.amazon.com/athena/faqs/):_Amazon雅典娜能够经由AWS管理控制台和JDBC驱动来访问。您可以以编程方式运行查询,使用JDBC驱动程序添加表或分区。它看起来像管理控制台或使用JDBC的代码是列出的唯一选项。 –
@TimBiegeleisen感谢您的回复。是的,我也在文档中阅读过,但我想知道它有任何“非官方”的方式从命令行查询。 – greuze
那么你总是可以用Java编写一个简单的包装控制台程序。 –