Hosting a Static Website on a Private S3 Bucket
I host this blog on AWS for pennies, but I refused to take the easy route of making the S3 bucket public.
S3 Website Hosting is a convenient default, but it exposes the origin directly to the internet. I wanted a setup that was secure by design, highly performant, and treated “static” hosting as a first-class engineering problem.
So I engineered it properly.
The stack:
- Private S3 Bucket: No public access allowed.
- CloudFront: The only entity allowed to read from S3 (via Origin Access Control).
- CloudFront Functions: A lightweight edge function to handle “pretty URLs” (so /about serves /about/index.html) and redirects.
- GitHub Actions + OIDC: Zero-trust CI/CD that assumes a temporary AWS role to deploy.
- Terraform: The entire infrastructure is defined as code.
Here is how (and why) I built it.
Why This Stack?#
I chose CloudFront in front of a private S3 bucket because it minimizes attack surface and maximizes control. By keeping the bucket private, I can enforce HTTPS, manage headers, and centrally log traffic at the CDN level. Origin Access Control (OAC) is the modern way to securely link CloudFront to S3 without managing bucket policies manually.
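Here is roughly what that wiring looks like in Terraform. Treat it as a sketch rather than my exact files: the resource names are placeholders, and the bucket and distribution resources are assumed to be defined elsewhere in the same configuration.

```hcl
# Block every form of public access on the origin bucket.
resource "aws_s3_bucket_public_access_block" "site" {
  bucket                  = aws_s3_bucket.site.id
  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}

# Origin Access Control: CloudFront signs its requests to S3 with SigV4.
resource "aws_cloudfront_origin_access_control" "site" {
  name                              = "site-oac"
  origin_access_control_origin_type = "s3"
  signing_behavior                  = "always"
  signing_protocol                  = "sigv4"
}

# Bucket policy: only this specific CloudFront distribution may read objects.
resource "aws_s3_bucket_policy" "site" {
  bucket = aws_s3_bucket.site.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Sid       = "AllowCloudFrontReadViaOAC"
      Effect    = "Allow"
      Principal = { Service = "cloudfront.amazonaws.com" }
      Action    = "s3:GetObject"
      Resource  = "${aws_s3_bucket.site.arn}/*"
      Condition = {
        StringEquals = { "AWS:SourceArn" = aws_cloudfront_distribution.site.arn }
      }
    }]
  })
}
```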
I also hate storing AWS keys in GitHub secrets. It’s a security leak waiting to happen. Using OIDC allows GitHub Actions to assume a role only for the duration of the deployment.
High-Level Diagram#
Core Components#
- CloudFront Distribution: Offloads TLS, caching, and performance globally. CloudFront is the choke point for security headers and cache rules (a Terraform sketch of the distribution follows this list).
- S3 (Private) with OAC: S3 website hosting is convenient but public. A private S3 + OAC prevents direct access; only CloudFront can fetch objects.
- CloudFront Function: Cheaper/simpler than Lambda@Edge for simple viewer‑request logic. Perfect for “www → apex” and pretty‑URL rewrites.
- GitHub Actions with OIDC: No stored AWS keys. The workflow assumes a short‑lived role on every run; permissions are least‑privilege.
- Terraform: One place to define and review changes. Safer rollbacks and reproducible environments.
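Here is a trimmed Terraform sketch of the distribution itself. It is illustrative rather than my exact configuration: the domain and certificate references are placeholders, and it reuses the OAC resource from the earlier snippet.

```hcl
# AWS managed "CachingOptimized" cache policy.
data "aws_cloudfront_cache_policy" "caching_optimized" {
  name = "Managed-CachingOptimized"
}

resource "aws_cloudfront_distribution" "site" {
  enabled             = true
  is_ipv6_enabled     = true
  default_root_object = "index.html"
  aliases             = ["example.com"] # placeholder apex domain

  origin {
    domain_name              = aws_s3_bucket.site.bucket_regional_domain_name
    origin_id                = "s3-site"
    origin_access_control_id = aws_cloudfront_origin_access_control.site.id
  }

  default_cache_behavior {
    target_origin_id       = "s3-site"
    viewer_protocol_policy = "redirect-to-https"
    allowed_methods        = ["GET", "HEAD"]
    cached_methods         = ["GET", "HEAD"]
    cache_policy_id        = data.aws_cloudfront_cache_policy.caching_optimized.id
    compress               = true
  }

  viewer_certificate {
    acm_certificate_arn      = aws_acm_certificate.site.arn # must live in us-east-1
    ssl_support_method       = "sni-only"
    minimum_protocol_version = "TLSv1.2_2021"
  }

  restrictions {
    geo_restriction {
      restriction_type = "none"
    }
  }
}
```

Forcing viewer_protocol_policy to redirect-to-https keeps all traffic on TLS, and leaning on the managed CachingOptimized policy avoids hand-tuning TTLs.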
Request Flow#
- The user requests a page (e.g., /about/).
- DNS resolves to the CloudFront distribution.
- A CloudFront Function handles www → apex and pretty URL rewriting.
- CloudFront serves from cache; on a miss, it fetches from the private S3 origin via OAC.
- The response is cached and returned to the user over HTTPS.
How changes go live#
Here is how a code change turns into a new page. GitHub Actions checks out the repo and runs Hugo, which writes the static site into the public/ folder. The workflow then assumes an AWS role with OIDC and syncs that folder to S3 with the --delete flag so the bucket mirrors the build. Finally, it asks CloudFront to create an invalidation for /*. That nudges edge caches to pick up the new files right away.
Why this path? You could let caches expire naturally, or version every asset and avoid invalidations. For a personal site, one broad invalidation after deploy is simpler and the cost is tiny. Using OIDC means there are no saved AWS keys in the repository, and the IAM policy stays tight: S3 sync, list, multipart, and cloudfront:CreateInvalidation.
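Sketched in Terraform, the trust and permissions look roughly like this. The repo path, role name, and resource references are placeholders, and it assumes the GitHub OIDC provider is already registered in the account.

```hcl
# Look up the existing GitHub OIDC identity provider.
data "aws_iam_openid_connect_provider" "github" {
  url = "https://token.actions.githubusercontent.com"
}

# Role the workflow assumes; only main-branch runs of this repo may use it.
resource "aws_iam_role" "deploy" {
  name = "blog-deploy" # placeholder name

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Principal = { Federated = data.aws_iam_openid_connect_provider.github.arn }
      Action    = "sts:AssumeRoleWithWebIdentity"
      Condition = {
        StringEquals = {
          "token.actions.githubusercontent.com:aud" = "sts.amazonaws.com"
        }
        StringLike = {
          "token.actions.githubusercontent.com:sub" = "repo:my-user/my-blog:ref:refs/heads/main"
        }
      }
    }]
  })
}

# Least-privilege permissions: sync the bucket, invalidate the distribution.
resource "aws_iam_role_policy" "deploy" {
  name = "blog-deploy"
  role = aws_iam_role.deploy.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect   = "Allow"
        Action   = ["s3:ListBucket"]
        Resource = aws_s3_bucket.site.arn
      },
      {
        Effect = "Allow"
        Action = [
          "s3:PutObject",
          "s3:DeleteObject",
          "s3:AbortMultipartUpload",
          "s3:ListMultipartUploadParts"
        ]
        Resource = "${aws_s3_bucket.site.arn}/*"
      },
      {
        Effect   = "Allow"
        Action   = ["cloudfront:CreateInvalidation"]
        Resource = aws_cloudfront_distribution.site.arn
      }
    ]
  })
}
```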
Deployment Flow#
At a glance: a push to main builds the site, uploads it to S3, and invalidates CloudFront so readers see the update immediately.
CloudFront Function (Redirects + Pretty URLs)#
```javascript
function handler(event) {
  var request = event.request;
  var host = request.headers.host.value;

  // 1) Redirect www → apex
  if (host.startsWith('www.')) {
    var apexDomain = host.substring(4);
    return {
      statusCode: 301,
      statusDescription: 'Moved Permanently',
      headers: { 'location': { value: 'https://' + apexDomain + request.uri } }
    };
  }

  // 2) Pretty URL rewrite
  // If URI ends in '/', append 'index.html'.
  // If URI doesn't contain a dot in the last path segment, append '/index.html'.
  var uri = request.uri;
  if (uri.endsWith('/')) {
    request.uri = uri + 'index.html';
  } else {
    // Check if the last segment contains a dot (heuristic for file extension)
    var lastSegment = uri.split('/').pop();
    if (lastSegment.indexOf('.') === -1) {
      request.uri = uri + '/index.html';
    }
  }

  return request;
}
```

This approach preserves “pretty” URLs by handling standard cases (/about -> /about/index.html) while letting file requests pass through (/style.css). It relies on a dot heuristic for extensions, which is simple and effective for standard static site generators like Hugo.
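The Terraform side of the function is small. The sketch below assumes the JavaScript above is saved alongside the configuration; the file path and function name are illustrative.

```hcl
resource "aws_cloudfront_function" "rewrite" {
  name    = "redirect-and-rewrite"
  runtime = "cloudfront-js-1.0"
  publish = true
  code    = file("${path.module}/functions/rewrite.js") # path is illustrative
}

# Attached inside the distribution's default_cache_behavior:
#
#   function_association {
#     event_type   = "viewer-request"
#     function_arn = aws_cloudfront_function.rewrite.arn
#   }
```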
Security Model#
The origin is private by design: S3 blocks public access and CloudFront authenticates to it with OAC. The CI pipeline doesn’t store AWS credentials; it uses OIDC to assume a role at runtime. That role only allows S3 sync (including multipart) and cloudfront:CreateInvalidation. Traffic is HTTPS end‑to‑end, and the CF Function enforces a single canonical host.
Why keep S3 private? It gives me one front door. Every request goes through CloudFront where I can enforce HTTPS, apply redirects and headers, and see consistent logs. Users cannot bypass the CDN to hit S3 directly, which protects the origin, improves cache hit rate, and avoids duplicate URLs on s3.amazonaws.com. I do not run AWS WAF today, but this setup makes it easy to add WAF at CloudFront later without changing the origin.
Observability and Cost#
CloudFront writes access logs to a dedicated S3 bucket so I can inspect traffic or plug Athena in later. A handful of CloudWatch alarms on 4xx/5xx spikes and origin errors are enough for a personal site. Cost stays low: most of it is data transfer. The Function is essentially free at this scale, and S3 requests are tiny compared to the CDN cache hit rate.
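In Terraform, the monitoring piece stays small too. The sketch below is illustrative: the threshold, names, and the aliased us-east-1 provider are assumptions (CloudFront publishes its metrics in us-east-1).

```hcl
# Access logs: add a logging_config block to the distribution, e.g.
#   logging_config {
#     bucket = aws_s3_bucket.logs.bucket_domain_name
#     prefix = "cloudfront/"
#   }

# Alarm when the distribution's 5xx error rate climbs.
resource "aws_cloudwatch_metric_alarm" "cf_5xx" {
  provider            = aws.us_east_1 # assumes an aliased us-east-1 provider
  alarm_name          = "blog-cloudfront-5xx-rate"
  namespace           = "AWS/CloudFront"
  metric_name         = "5xxErrorRate"
  statistic           = "Average"
  period              = 300
  evaluation_periods  = 2
  threshold           = 5 # percent of requests, sustained for 10 minutes
  comparison_operator = "GreaterThanOrEqualToThreshold"

  dimensions = {
    DistributionId = aws_cloudfront_distribution.site.id
    Region         = "Global"
  }
}
```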
Trade‑offs and alternatives#
S3 Website Hosting is easy to set up and can do some rewrites, but it requires a public bucket and cannot use OAC. The private S3 approach keeps the origin locked down and terminates user traffic at the CDN edge.
Lambda@Edge can run more complex logic and on more events, though it adds latency and cost. For two simple viewer tasks, a CloudFront Function is a better fit.
Other CI/CD tools like CodeBuild or CircleCI can deploy just fine. I used GitHub Actions because the code already lives there, and OIDC integration avoids managing long‑lived credentials.
Why This Design#
The goal is a setup that behaves well without constant attention. The origin stays private, the CDN handles speed and TLS, and the edge logic is as small as it can be. Publishing is just a git push, and the infrastructure lives in a few Terraform files I can come back to later without re‑learning context. Simple to run, quick to load, and easy to change.