5 changes: 3 additions & 2 deletions common/remote/hooks/s3/CMakeLists.txt
@@ -14,7 +14,6 @@
# limitations under the License.
################################################################################


# Component: s3file
#####################################################
# Description:
@@ -29,8 +28,11 @@ find_package(aws-cpp-sdk-core REQUIRED)
find_package(aws-cpp-sdk-s3 REQUIRED)

set ( SRCS
    s3api.cpp
    s3file.cpp
    s3file.hpp
    s3utils.cpp
    s3utils.hpp
)

include_directories (
@@ -65,7 +67,6 @@ if (USE_CPPUNIT)

include_directories( ${HPCC_SOURCE_DIR}/testing/unittests )

# Set runtime path to find s3file library in filehooks directory
set_target_properties(s3FileTests PROPERTIES
INSTALL_RPATH "${CMAKE_INSTALL_PREFIX}/filehooks"
)
66 changes: 66 additions & 0 deletions common/remote/hooks/s3/README.md
@@ -0,0 +1,66 @@
# S3 Direct API File Hook

Storage hook for accessing S3 objects as HPCC files using the AWS C++ SDK.

## File Format

```
s3:<planeName>/<key>
s3:<planeName>/d<N>/<key> (multi-device)
```
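
As a rough illustration of the two name forms above, the sketch below splits a hook file name into plane, optional device, and key. `S3Name` and `parseS3Name` are hypothetical names used only for this example and are not the hook's actual parser; it also assumes that a device value of 0 means "no `d<N>` component", and that a key whose first path segment looks like `d<digits>` is read as a device selector.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical result type for illustration; not part of the hook's API.
struct S3Name
{
    std::string plane;
    unsigned device = 0;   // assumption: 0 means no d<N> component was given
    std::string key;
};

// Parses "s3:<planeName>/<key>" or "s3:<planeName>/d<N>/<key>".
// Returns std::nullopt when the name does not match either form.
std::optional<S3Name> parseS3Name(std::string_view name)
{
    constexpr std::string_view prefix = "s3:";
    if (name.substr(0, prefix.size()) != prefix)
        return std::nullopt;
    name.remove_prefix(prefix.size());

    // Require a non-empty plane name and a non-empty remainder after '/'.
    const std::size_t slash = name.find('/');
    if (slash == std::string_view::npos || slash == 0 || slash + 1 == name.size())
        return std::nullopt;

    S3Name out;
    out.plane = std::string(name.substr(0, slash));
    std::string_view rest = name.substr(slash + 1);

    // Optional device component: 'd', one or more digits, '/', then a key.
    if (rest.size() >= 2 && rest[0] == 'd')
    {
        std::size_t i = 1;
        unsigned n = 0;
        while (i < rest.size() && std::isdigit(static_cast<unsigned char>(rest[i])))
        {
            n = n * 10 + static_cast<unsigned>(rest[i] - '0');
            ++i;
        }
        if (i > 1 && i + 1 < rest.size() && rest[i] == '/')
        {
            out.device = n;
            rest.remove_prefix(i + 1);
        }
    }

    out.key = std::string(rest);
    assert(!out.key.empty());
    return out;
}
```

For example, `parseS3Name("s3:s3data/d2/dir/part1")` yields plane `s3data`, device `2`, and key `dir/part1`, while a name without the `s3:` prefix yields `std::nullopt`.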

## Architecture

The hook registers via `installFileHook()` at process startup. When any HPCC
component calls `createIFile("s3:...")`, the hook intercepts it and returns an
`S3File` that implements `IFile`. Opening for read returns `S3FileReadIO`,
opening for write returns `S3FileWriteIO` — both implement `IFileIO`.
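
The interception described above follows a familiar registry pattern, which the following self-contained sketch models. Every type here (`MockFile`, `MockHookRegistry`, `installMockS3Hook`, and so on) is a simplified stand-in invented for this example; the real `IFile`, `installFileHook()`, and `createIFile()` live in jlib and have different signatures.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Simplified stand-in for IFile; the real interface is much larger.
struct MockFile
{
    virtual ~MockFile() = default;
    virtual std::string kind() const = 0;
};
struct MockLocalFile : MockFile { std::string kind() const override { return "local"; } };
struct MockS3File : MockFile { std::string kind() const override { return "s3"; } };

// A hook returns a file object if it recognises the name, otherwise nullptr.
using MockHook = std::function<std::unique_ptr<MockFile>(std::string_view)>;

class MockHookRegistry
{
public:
    void install(MockHook hook)
    {
        assert(static_cast<bool>(hook));   // reject empty callables
        hooks.push_back(std::move(hook));
    }

    // Ask each installed hook in turn; fall back to a local file if none claims the name.
    std::unique_ptr<MockFile> createFile(std::string_view name) const
    {
        for (const auto & hook : hooks)
            if (auto file = hook(name))
                return file;
        return std::make_unique<MockLocalFile>();
    }

private:
    std::vector<MockHook> hooks;
};

// Registers a hook that claims every name starting with "s3:".
inline void installMockS3Hook(MockHookRegistry & registry)
{
    registry.install([](std::string_view name) -> std::unique_ptr<MockFile>
    {
        if (name.substr(0, 3) == "s3:")
            return std::make_unique<MockS3File>();
        return nullptr;
    });
}
```

With the hook installed, `registry.createFile("s3:s3data/a")` returns the S3-backed object, and any other name falls through to the default implementation. Callers never need to know which backend served the request.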

## Features

- Read via `GetObject` with byte-range requests
- Write coalescing: small writes are buffered and the upload path is selected automatically:
  - `PutObject` for files under 8MB (single HTTP request)
  - Multipart upload for files 8MB+ (8MB parts)
- 0-byte objects created for empty file parts (Thor multi-part compatibility)
- Server-side copy via `CopyObject` / multipart copy for large files
- Directory listing via `ListObjectsV2` with prefix/delimiter filtering
- Connection pooling — one S3 client per plane+device combination
- Retry with exponential backoff and jitter for all S3 operations
- IAM credential chain (IRSA on EKS, env vars, ~/.aws/credentials)
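
Two of the items above reduce to small pure functions, sketched here under stated assumptions. `planUpload` and `backoffDelay` are hypothetical helpers, not the functions in `s3utils.hpp`. The sketch assumes "8MB" means 8 MiB, that the total size is known when the choice is made (the real writer buffers incrementally), and that the retry uses the "full jitter" variant with a 100 ms base and 10 s cap. None of these values are confirmed by the source.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstdint>
#include <random>

// Assumption: "8MB" is read as 8 MiB.
constexpr std::uint64_t kPartSize = 8ull * 1024 * 1024;

enum class UploadMode { PutObject, Multipart };

struct UploadPlan
{
    UploadMode mode;
    std::uint64_t parts;   // number of request bodies carrying data
};

// Chooses a single PutObject below the threshold (including 0-byte objects),
// otherwise a multipart upload split into partSize chunks.
UploadPlan planUpload(std::uint64_t totalBytes, std::uint64_t partSize = kPartSize)
{
    assert(partSize > 0);
    if (totalBytes < partSize)
        return {UploadMode::PutObject, 1};
    const std::uint64_t parts = (totalBytes + partSize - 1) / partSize;   // ceiling division
    return {UploadMode::Multipart, parts};
}

// Exponential backoff with full jitter: a uniform delay in
// [0, min(cap, base * 2^attempt)]. Base and cap defaults are illustrative.
std::chrono::milliseconds backoffDelay(
    unsigned attempt, std::mt19937 & rng,
    std::chrono::milliseconds base = std::chrono::milliseconds(100),
    std::chrono::milliseconds cap = std::chrono::milliseconds(10000))
{
    assert(base.count() >= 0 && cap.count() >= 0);
    const unsigned shift = std::min(attempt, 20u);   // bound the shift to avoid overflow
    const std::int64_t grown = static_cast<std::int64_t>(base.count()) << shift;
    const std::int64_t ceiling = std::min<std::int64_t>(cap.count(), grown);
    std::uniform_int_distribution<std::int64_t> dist(0, ceiling);
    return std::chrono::milliseconds(dist(rng));
}
```

Under these assumptions a 20 MiB file maps to a multipart upload of three parts (8 + 8 + 4 MiB), and the delay before the first retry never exceeds the base interval.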

## Configuration

Storage plane in Helm values:

```yaml
storage:
  planes:
  - name: s3data
    prefix: "s3:s3data"
    category: data
    storageapi:
      type: s3
      region: us-east-1
      buckets:
      - name: my-bucket
        secret: my-secret   # optional; omit for IAM roles
```

## Files

| File | Description |
|------|-------------|
| `s3file.cpp` | S3File, S3FileReadIO, S3FileWriteIO, S3MultipartUpload, S3DirectoryIterator |
| `s3file.hpp` | Public API — `installFileHook`, `createS3File`, `isS3FileName` |
| `s3api.cpp` | S3FileHook registration, S3APICopyClient, AWS SDK lifecycle |
| `s3utils.cpp` | S3ClientManager — client cache and creation |
| `s3utils.hpp` | Retry helpers, constants, `getS3Client()` API |
| `s3fileTests.cpp` | Unit tests — URL validation |
| `jplane_compat.hpp` | Build shim for cross-version `getStoragePlaneConfig` resolution |
| `CMakeLists.txt` | Build config — links against jlib + aws-cpp-sdk-{s3,core} |

## Build

Built as part of the platform via CMake, or as a standalone overlay — see
`dockerfiles/s3-hook-overlay.dockerfile` and `helm/examples/s3/README.md`.
27 changes: 27 additions & 0 deletions common/remote/hooks/s3/jplane_compat.hpp
@@ -0,0 +1,27 @@
#ifndef JPLANE_HPP
#define JPLANE_HPP

#include "jfile.hpp"
#include <dlfcn.h>

inline const IPropertyTree * getStoragePlaneConfig(const char * name, bool required)
{
    // Runtime may have getStoragePlaneConfig(const char*, bool) or getStoragePlane(const char*)
    // Use dlsym to find whichever exists
    typedef IPropertyTree * (*fn2_t)(const char *, bool);
    typedef IPropertyTree * (*fn1_t)(const char *);
    static fn2_t fn2 = (fn2_t)dlsym(RTLD_DEFAULT, "_Z21getStoragePlaneConfigPKcb");
    static fn1_t fn1 = (fn1_t)dlsym(RTLD_DEFAULT, "_Z15getStoragePlanePKc");

    IPropertyTree * result = nullptr;
    if (fn2)
        result = fn2(name, required);
    else if (fn1)
        result = fn1(name);

    if (!result && required)
        throw makeStringExceptionV(99, "Storage plane '%s' not found", name);
    return result;
}

#endif