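S3's architecture is designed to be programming-language-neutral and exposes REST and SOAP interfaces. SOAP support over HTTP is deprecated, although SOAP is still available over HTTPS. New S3 features will not be supported over SOAP, so the REST API is recommended.
With REST, you can use standard HTTP requests to create, fetch, and delete buckets and objects. Coding directly against the REST API is complex; the AWS SDK wraps the underlying REST API and simplifies programming tasks.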
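To make the REST idea concrete, here is a minimal sketch that only builds (and never sends) the HTTP request for fetching one object, using the JDK's java.net.http API. The class name, bucket name, and key are hypothetical, and the generic s3.amazonaws.com endpoint is an assumption; a real call would also need an AWS Signature Version 4 Authorization header, which is part of the work the SDK does for you.

```java
import java.net.URI;
import java.net.http.HttpRequest;

final class S3RestRequestSketch {
    // Builds (but does not send) a GET request for one object using the
    // virtual-hosted-style URL form: https://<bucket>.s3.amazonaws.com/<key>.
    // The bucket and key are hypothetical placeholders supplied by the caller.
    static HttpRequest buildGetObjectRequest(String bucket, String key) {
        URI uri = URI.create("https://" + bucket + ".s3.amazonaws.com/" + key);
        // Assumption: signing is omitted. A real request must carry an
        // Authorization header with an AWS Signature Version 4 signature.
        return HttpRequest.newBuilder(uri).GET().build();
    }

    public static void main(String[] args) {
        HttpRequest request = buildGetObjectRequest("examplebucket", "docs/hello.txt");
        System.out.println(request.method() + " " + request.uri());
    }
}
```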
Configuring AWS Credentials
To use the AWS SDK you must supply AWS credentials. Create the file ~/.aws/credentials (on Windows, C:\Users\USER_NAME\.aws\credentials) with the following content:
[default]
aws_access_key_id = your_access_key_id
aws_secret_access_key = your_secret_access_key
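The SDK reads this file by itself, so no parsing code is needed in an application. Purely to illustrate the file layout, the sketch below parses INI-style text, assuming a [default] profile header followed by key = value lines. The class and method names are hypothetical and this is not the SDK's own parser.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class CredentialsFileSketch {
    // Parses INI-style text: "[profile]" headers followed by "key = value" lines.
    // Assumption: this mirrors the credentials file layout only loosely.
    static Map<String, Map<String, String>> parse(String text) {
        Map<String, Map<String, String>> profiles = new LinkedHashMap<>();
        Map<String, String> current = null;
        for (String raw : text.split("\\R")) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue; // skip blank lines and comments
            }
            if (line.startsWith("[") && line.endsWith("]")) {
                current = new LinkedHashMap<>();
                profiles.put(line.substring(1, line.length() - 1).trim(), current);
            } else if (current != null && line.contains("=")) {
                int eq = line.indexOf('=');
                current.put(line.substring(0, eq).trim(), line.substring(eq + 1).trim());
            }
        }
        return profiles;
    }

    public static void main(String[] args) {
        String sample = "[default]\n"
                + "aws_access_key_id = your_access_key_id\n"
                + "aws_secret_access_key = your_secret_access_key\n";
        System.out.println(parse(sample).get("default").keySet());
    }
}
```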
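<properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
</properties>
<dependencies>
    <dependency>
        <groupId>com.amazonaws</groupId>
        <artifactId>aws-java-sdk-s3</artifactId>
    </dependency>
</dependencies>
<dependencyManagement>
    <dependencies>
        <dependency>
            <groupId>com.amazonaws</groupId>
            <artifactId>aws-java-sdk-bom</artifactId>
            <version>1.11.433</version>
            <type>pom</type>
            <scope>import</scope>
        </dependency>
    </dependencies>
</dependencyManagement>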
To use the entire SDK, you do not need the BOM; simply declare:
<dependency>
    <groupId>com.amazonaws</groupId>
    <artifactId>aws-java-sdk</artifactId>
    <version>1.11.433</version>
</dependency>
Basic S3 Operations
The following class demonstrates basic S3 operations: createBucket, listBuckets, putObject, getObject, listObjects, deleteObject, and deleteBucket.
package org.itrunner.aws.s3;

import com.amazonaws.HttpMethod;
import com.amazonaws.regions.Regions;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.s3.model.*;
import java.io.File;
import java.net.URL;
import java.util.Date;
import java.util.List;

public class S3Util {
    private static AmazonS3 s3;

    static {
        s3 = AmazonS3ClientBuilder.standard().withRegion(Regions.CN_NORTH_1).build();
    }

    private S3Util() { }

    // Create a new S3 bucket - Amazon S3 bucket names are globally unique
    public static Bucket createBucket(String bucketName) {
        return s3.createBucket(bucketName);
    }

    // List the buckets in your account
    public static List<Bucket> listBuckets() {
        return s3.listBuckets();
    }

    // List objects in your bucket
    public static ObjectListing listObjects(String bucketName) {
        return s3.listObjects(bucketName);
    }

    // List objects in your bucket by prefix
    public static ObjectListing listObjects(String bucketName, String prefix) {
        return s3.listObjects(bucketName, prefix);
    }

    // Upload an object to your bucket
    public static PutObjectResult putObject(String bucketName, String key, File file) {
        return s3.putObject(bucketName, key, file);
    }

    // Download an object - When you download an object, you get all of the object's metadata and a stream from which to read the contents.
    // It's important to read the contents of the stream as quickly as possible since the data is streamed directly from Amazon S3 and your
    // network connection will remain open until you read all the data or close the input stream.
    public static S3Object get(String bucketName, String key) {
        return s3.getObject(bucketName, key);
    }

    // Delete an object - Unless versioning has been turned on for your bucket, there is no way to undelete an object, so use caution when deleting objects.
    public static void deleteObject(String bucketName, String key) {
        s3.deleteObject(bucketName, key);
    }

    // Delete a bucket - A bucket must be completely empty before it can be deleted, so remember to delete any objects from your buckets before
    // you try to delete them.
    public static void deleteBucket(String bucketName) {
        s3.deleteBucket(bucketName);
    }
}
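The comment on get notes that the content stream should be read promptly and then closed, because the connection stays open until that happens. A minimal sketch of that pattern follows, written against a plain InputStream so it runs without the SDK; for a downloaded S3Object the stream would come from getObjectContent(). The class and method names are hypothetical, UTF-8 text is an assumption, and buffering everything in memory only suits small objects.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class StreamDrainSketch {
    // Reads the whole stream into memory and closes it, even if reading fails.
    // Assumption: the content is UTF-8 text and small enough to hold in memory.
    static String readFullyAndClose(InputStream in) {
        try (InputStream stream = in; ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            byte[] buffer = new byte[8192];
            int n;
            while ((n = stream.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Stand-in for the content stream of a downloaded object.
        InputStream fake = new ByteArrayInputStream("hello s3".getBytes(StandardCharsets.UTF_8));
        System.out.println(readFullyAndClose(fake));
    }
}
```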
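Generating a Presigned URL
By default, S3 objects are private and only the owner has access. However, the object owner can use their own security credentials to create a presigned URL that grants permission to download the object for a limited time, and so share the object with other users. Anyone who receives the presigned URL can access the object.
When creating a presigned URL, you must provide your security credentials, the bucket name and object key, the HTTP method (GET to download the object), and the expiration time.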
public static String generatePresignedUrl(String bucketName, String key, int minutes) {
    // Set the expiration time to the given number of minutes from now
    Date expiration = new Date();
    long expTimeMillis = expiration.getTime();
    expTimeMillis += 1000L * 60 * minutes;
    expiration.setTime(expTimeMillis);

    // Generate the presigned URL
    GeneratePresignedUrlRequest generatePresignedUrlRequest = new GeneratePresignedUrlRequest(bucketName, key)
            .withMethod(HttpMethod.GET)
            .withExpiration(expiration);
    URL url = s3.generatePresignedUrl(generatePresignedUrlRequest);
    return url.toString();
}
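The expiration time is the one computed input to the presigned URL, so it is worth isolating. As an alternative sketch, the same Date can be derived with java.time, which avoids hand-written millisecond arithmetic. The class and method names are hypothetical, and passing in the current time is a choice made only to keep the helper easy to check.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Date;

final class PresignExpirySketch {
    // Returns the moment `minutes` minutes after `now` as a java.util.Date,
    // the type expected by GeneratePresignedUrlRequest.withExpiration.
    static Date expirationAfter(Instant now, long minutes) {
        return Date.from(now.plus(Duration.ofMinutes(minutes)));
    }

    public static void main(String[] args) {
        // Example: a URL that should stop working 15 minutes from now.
        System.out.println(expirationAfter(Instant.now(), 15));
    }
}
```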
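With Amazon S3 Select, you can use SQL statements to filter the contents of an S3 object and retrieve only the subset of data you need. Amazon S3 Select works on objects stored in CSV or JSON format; these objects can be compressed with GZIP or BZIP2 and can be server-side encrypted.
S3 Select Requirements and Limits
You must have s3:GetObject permission for the object you are querying.
If the object you are querying is encrypted with a customer-provided key (SSE-C), you must use https and must provide the encryption key in the request.
The maximum length of a SQL expression is 256 KB.
The maximum length of a record in the result is 1 MB.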
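The expression length limit can be checked on the client before a request is sent. The following is a rough sketch, assuming that 256 KB means 256 * 1024 bytes of UTF-8 text; the service is the authority on how the limit is measured, and the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

final class SelectLimitsSketch {
    // Assumption: "256 KB" is interpreted as 256 * 1024 bytes of UTF-8 text.
    static final int MAX_SQL_BYTES = 256 * 1024;

    // Returns true when the expression fits within the assumed limit.
    static boolean isWithinSqlLengthLimit(String sql) {
        return sql.getBytes(StandardCharsets.UTF_8).length <= MAX_SQL_BYTES;
    }

    public static void main(String[] args) {
        System.out.println(isWithinSqlLengthLimit("select * from S3Object s"));
    }
}
```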
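SQL Syntax
Amazon S3 Select supports a subset of SQL, with the following syntax:
SELECT column_name FROM table_name [WHERE condition] [LIMIT number]
Here table_name is S3Object.
The SELECT clause supports *.
When the file format is CSV, columns can be referenced by column number or by column name; column numbers start at 1:
select s._1 from S3Object s
select s.name from S3Object s
When using column names, the program must set FileHeaderInfo to Use.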
SELECT s."name" from S3Object s
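For example, for a CSV file whose header row includes username and email columns, the SQL statement could be:
select s.email from S3Object s where s.username = 'Jason'
For more on SQL, see the SQL Reference for Amazon S3 Select and Amazon Glacier Select.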
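When the filter value comes from user input, the statement has to be assembled with the string literal quoted correctly. The sketch below builds a query of the form shown above. It assumes the standard SQL convention of doubling embedded single quotes, does not validate column names, and uses hypothetical class and method names.

```java
final class SelectSqlSketch {
    // Wraps a value in single quotes, doubling any embedded single quotes.
    // Assumption: standard SQL escaping applies to string literals.
    static String quoteLiteral(String value) {
        return "'" + value.replace("'", "''") + "'";
    }

    // Builds: select s.<column> from S3Object s where s.<filterColumn> = '<value>'
    // Column names are inserted as-is; callers must pass trusted names.
    static String selectWhereEquals(String column, String filterColumn, String value) {
        return "select s." + column + " from S3Object s where s." + filterColumn
                + " = " + quoteLiteral(value);
    }

    public static void main(String[] args) {
        System.out.println(selectWhereEquals("email", "username", "Jason"));
    }
}
```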
Querying a CSV File
The following example saves the query results to the file at outputPath:
// Requires java.io.*, java.util.concurrent.atomic.AtomicBoolean, and: import static com.amazonaws.util.IOUtils.copy;
public static void selectCsvObjectContent(String bucketName, String csvObjectKey, String sql, String outputPath) throws Exception {
    SelectObjectContentRequest request = generateBaseCSVRequest(bucketName, csvObjectKey, sql);
    final AtomicBoolean isResultComplete = new AtomicBoolean(false);
    try (OutputStream fileOutputStream = new FileOutputStream(new File(outputPath));
         SelectObjectContentResult result = s3.selectObjectContent(request)) {
        InputStream resultInputStream = result.getPayload().getRecordsInputStream(new SelectObjectContentEventVisitor() {
            // An End Event informs that the request has finished successfully.
            @Override
            public void visit(SelectObjectContentEvent.EndEvent event) {
                isResultComplete.set(true);
            }
        });
        copy(resultInputStream, fileOutputStream);
    }
    // The End Event indicates all matching records have been transmitted. If the End Event is not received, the results may be incomplete.
    if (!isResultComplete.get()) {
        throw new Exception("S3 Select request was incomplete as End Event was not received.");
    }
}

private static SelectObjectContentRequest generateBaseCSVRequest(String bucket, String key, String query) {
    SelectObjectContentRequest request = new SelectObjectContentRequest();
    request.setBucketName(bucket);
    request.setKey(key);
    request.setExpression(query);
    request.setExpressionType(ExpressionType.SQL);
    InputSerialization inputSerialization = new InputSerialization();
    CSVInput csvInput = new CSVInput();
    // Treat the first line as a header so columns can be referenced by name
    csvInput.setFileHeaderInfo(FileHeaderInfo.USE);
    inputSerialization.setCsv(csvInput);
    request.setInputSerialization(inputSerialization);
    OutputSerialization outputSerialization = new OutputSerialization();
    outputSerialization.setCsv(new CSVOutput());
    request.setOutputSerialization(outputSerialization);
    return request;
}
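The completeness check in this example relies on a flag that the visitor sets when the End Event arrives, tested only after the stream has been consumed. That pattern can be sketched without the SDK as follows. EventVisitor and replay are hypothetical stand-ins that simulate an event stream; they are not AWS SDK types, and the sample data is made up.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

final class EndEventSketch {
    // Hypothetical stand-in for an SDK event visitor; not an AWS SDK type.
    interface EventVisitor {
        void onRecords(String chunk);
        void onEnd();
    }

    // Simulates a response stream: delivers each chunk, then optionally the end event.
    static void replay(List<String> chunks, boolean sendEnd, EventVisitor visitor) {
        for (String chunk : chunks) {
            visitor.onRecords(chunk);
        }
        if (sendEnd) {
            visitor.onEnd();
        }
    }

    // Collects all record chunks and fails if the end event never arrived.
    static String collect(List<String> chunks, boolean sendEnd) {
        // AtomicBoolean is used because the anonymous class can only capture
        // effectively final variables, as in the SDK-based example.
        final AtomicBoolean complete = new AtomicBoolean(false);
        final StringBuilder out = new StringBuilder();
        replay(chunks, sendEnd, new EventVisitor() {
            @Override
            public void onRecords(String chunk) { out.append(chunk); }

            @Override
            public void onEnd() { complete.set(true); }
        });
        if (!complete.get()) {
            throw new IllegalStateException("Result incomplete: end event not received");
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(collect(List.of("a,b\n", "c,d\n"), true));
    }
}
```

Checking the flag after the copy, and not inside the visitor, matters because records and the end marker arrive on the same stream: only once the stream is drained can the absence of the end marker be treated as a truncated result.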
