TensorPool Object Storage is S3-compatible storage for TensorPool Organizations. It works with standard S3 tools such as boto3, rclone, and s5cmd.
To get a TensorPool Organization, contact team@tensorpool.dev.
For high-performance storage volumes that mount directly to clusters, see Storage Volumes.

Key Features

  • S3 API compatible: Works with any S3-compatible tool or library
  • No ingress/egress fees: Transfer data in and out without additional costs
  • Globally distributed: Data is cached across all TensorPool cluster regions globally with strong consistency
    • 99.99% of objects are globally available in all regions within 15 minutes. Small files are globally available in milliseconds.
    • This means you always get best-case latency as if you were using the closest region, regardless of your client location.
  • Unlimited storage: Billed on usage with no size limit. See pricing for details.
  • Organization-scoped: Credentials and buckets are shared across your organization

Quick Start

# 1. Enable object storage for your organization
tp object-storage enable
# 2. Create a bucket
tp object-storage bucket create datasets

# 3. Configure your S3 client
tp object-storage configure rclone
# Or view credentials directly with `tp object-storage credentials`

# 4. Use any S3-compatible tool to upload data
rclone copy ./data tp:datasets/ --ignore-checksum
The `--ignore-checksum` flag is required for all rclone commands. TensorPool Object Storage uses CRC32 checksums, which rclone doesn't support, so operations fail without this flag. Setting the `RCLONE_IGNORE_CHECKSUM=true` environment variable in your shell configuration is recommended so the flag doesn't need to be passed each time. See Shell Configuration Files for how to edit your shell config.

Using boto3

boto3 is the official AWS SDK for Python and can be used to interact with any S3-compatible storage, including TensorPool Object Storage.
import boto3

# Get credentials from: tp object-storage credentials
s3 = boto3.client(
    "s3",
    endpoint_url="<your-endpoint>",
    aws_access_key_id="<your-access-key>",
    aws_secret_access_key="<your-secret-key>",
    region_name="global"
)

# Upload a file
s3.upload_file("model.pt", "my-bucket", "checkpoints/model.pt")

# List objects
response = s3.list_objects_v2(Bucket="my-bucket", Prefix="checkpoints/")
for obj in response.get("Contents", []):
    print(obj["Key"])

Bucket Management

# List all buckets
tp object-storage bucket list

# Create a bucket
tp object-storage bucket create my-bucket

# Delete an empty bucket
tp object-storage bucket delete my-bucket

FUSE Mounting

Object storage buckets can optionally be mounted to TensorPool clusters via FUSE. While convenient, FUSE mounts trade performance for filesystem compatibility.
Prefer S3 API tools over FUSE mounts. Using boto3, rclone, or other S3-compatible tools is strongly recommended. FUSE mounts add significant overhead and should only be used when filesystem semantics are required.
If you need a FUSE mount, the following rclone mount command is recommended:
# 1. Install or upgrade rclone and fuse3
sudo apt update && sudo apt install -y --only-upgrade rclone fuse3 || sudo apt install -y rclone fuse3

# 2. Setup rclone config
mkdir -p ~/.config/rclone
cat > ~/.config/rclone/rclone.conf << 'EOF'
[tp]
type = s3
provider = Other
access_key_id = <your_access_key_id_here>
secret_access_key = <your_secret_access_key_here>
endpoint = <your_endpoint_here>
region = global
v2_auth = false
force_path_style = false
disable_checksum = true
EOF

# 3. Find volume with most available space
export LARGEST_MOUNT_POINT=$(df -P -k -x tmpfs -x devtmpfs | tail -n +2 | sort -k4 -nr | head -n 1 | awk '{print $6}')
export AVAILABLE_SPACE_KB=$(df -P -k -x tmpfs -x devtmpfs | tail -n +2 | sort -k4 -nr | head -n 1 | awk '{print $4}')

# 4. Calculate 90% limit
export CACHE_LIMIT_KB=$((AVAILABLE_SPACE_KB * 90 / 100))

# 5. Set cache directory path
export CACHE_DIR="${LARGEST_MOUNT_POINT%/}/rclonecache"

# 6. Create the cache directory
mkdir -p "$CACHE_DIR"

# 7. Create and configure mount point
sudo mkdir -p "$LARGEST_MOUNT_POINT/temp"
sudo chown $USER:$USER "$LARGEST_MOUNT_POINT/temp"

# 8. Increase Go garbage collection threshold to reduce memory overhead
export GOGC=400

# 9. Mount bucket with rclone
rclone mount tp:<bucket name> "$LARGEST_MOUNT_POINT/temp" \
 --no-unicode-normalization \
 --vfs-fast-fingerprint \
 --cache-dir "$CACHE_DIR" \
 --vfs-cache-max-size "${CACHE_LIMIT_KB}K" \
 --vfs-cache-mode full \
 --vfs-cache-max-age 24h \
 --no-modtime \
 --vfs-cache-poll-interval 0 \
 --vfs-cache-min-free-space 50G \
 --no-checksum \
 --ignore-checksum \
 --s3-disable-checksum \
 --vfs-read-ahead 16M \
 --vfs-read-chunk-size 8M \
 --vfs-read-chunk-size-limit 64M \
 --vfs-read-chunk-streams 4 \
 --buffer-size 4M \
 --transfers 48 \
 --checkers 48 \
 --copy-links \
 --max-backlog 2000000 \
 --tpslimit 0 \
 --use-mmap \
 --tpslimit-burst 0 \
 --s3-chunk-size 32M \
 --s3-upload-concurrency 8 \
 --dir-cache-time 15m \
 --allow-non-empty \
 --attr-timeout 1m \
 --allow-other \
 --umask 0000 \
 --vfs-write-back 5s \
 --log-level INFO \
 --log-file /tmp/rclone.log \
 --daemon
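Because `--daemon` returns immediately, the mount may not be ready the instant the command exits. A quick sanity check before pointing workloads at the mount (the helper name and example path are illustrative, not part of any tool):

```python
import os
import time

def wait_for_mount(path: str, timeout: float = 30.0) -> bool:
    """Poll until `path` is an active mount point or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if os.path.ismount(path):
            return True
        time.sleep(0.5)
    return False

# Example, using the $LARGEST_MOUNT_POINT/temp convention from above:
# if not wait_for_mount("/mnt/temp", timeout=30):
#     raise SystemExit("mount not ready; check /tmp/rclone.log")
```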
Due to the request-based nature of object storage, every file operation incurs fixed overhead regardless of file size:
  • FUSE overhead: User-space/kernel context switches per syscall
  • S3 API overhead: HTTP request/response cycle
For large files this overhead is negligible. For small files (under 100KB), the overhead dominates the operation time. For example, a simple `touch file.txt` translates to 3 S3 API calls (HeadObject, PutObject, ListObjectsV2) under the hood. Traditionally cheap operations like `ls` are also time-intensive because object storage has no directory hierarchy; listing requires querying all objects with a matching prefix.
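One practical mitigation for small-file overhead is to bundle many small files into a single archive before uploading, so the per-request cost is paid once rather than per file. A sketch using Python's stdlib `tarfile` (the helper name and the upload step are illustrative):

```python
import tarfile
from pathlib import Path

def bundle_directory(src_dir: str, archive_path: str) -> int:
    """Pack every file under src_dir into one uncompressed tar archive.

    Uploading the single archive costs one set of S3 round trips instead
    of one per file, which matters most for files under ~100KB.
    Returns the number of files added.
    """
    src = Path(src_dir)
    count = 0
    with tarfile.open(archive_path, "w") as tar:
        for path in sorted(src.rglob("*")):
            if path.is_file():
                tar.add(path, arcname=path.relative_to(src))
                count += 1
    return count

# n = bundle_directory("./many_small_files", "/tmp/bundle.tar")
# Then upload once, e.g. with boto3:
# s3.upload_file("/tmp/bundle.tar", "my-bucket", "bundles/bundle.tar")
```

Note the archive should be written outside the source directory so it is not swept up by the traversal.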
Mounted object storage buckets are not POSIX compliant. Unsupported features:
  • Hard links
  • Setting file permissions (chmod)
  • Sticky, set-user-ID (SUID), and set-group-ID (SGID) bits
  • Updating the modification timestamp (mtime)
  • Creating and using FIFO (first-in-first-out) pipes
  • Creating and using Unix sockets
  • Obtaining exclusive file locks
  • Unlinking an open file while it is still readable
While symlinks are supported, their use is discouraged. Symlink targets may not exist across all clusters, which can cause unexpected behavior. Using Python virtual environments within an object storage bucket is not recommended due to their use of symlinks and large number (~1000) of small files.
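Given the symlink and small-file caveats above, it can be worth auditing a directory before syncing it to a bucket. A minimal sketch with stdlib `os.walk` (the helper name and 100KB threshold mirror the guidance above but are otherwise illustrative):

```python
import os

SMALL_FILE_BYTES = 100 * 1024  # per the ~100KB overhead guidance above

def audit_tree(root: str) -> dict:
    """Count symlinks and small files, which perform poorly on object storage."""
    stats = {"symlinks": 0, "small_files": 0, "files": 0}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                stats["symlinks"] += 1
                continue
            stats["files"] += 1
            if os.path.getsize(path) < SMALL_FILE_BYTES:
                stats["small_files"] += 1
    return stats

# stats = audit_tree("./my_project")
# if stats["symlinks"]:
#     print("warning: symlink targets may not exist on all clusters")
```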

Next Steps