Self-Hosted Preview Environments: GitHub Actions + Docker

Self-hosted preview environments are ephemeral deployments that spin up per pull request on infrastructure you own (AWS ECS, Cloud Run, DigitalOcean, Hetzner, or Fly.io), giving every PR a real, isolated URL before merge. This article builds the complete GitHub Actions preview environment with Docker: five components (Build, Deploy, Route, Test, Teardown), a production-ready preview deployment workflow YAML, and a cloud provider comparison. The testing component ships two paths: 30+ lines of Playwright YAML, or one API call to Autonoma.

Vercel preview environments are excellent, until they're not. The moment your stack includes a backend API, a background worker, a websocket server, or a database that needs seeding, Vercel's preview model stops working. You can preview the frontend. You can't preview the system.

That's the gap self-hosted preview environments fill. Every PR gets a Docker-composed, fully wired-up deployment on infrastructure you control. The URL is real. The backend is real. The database state is real. When a reviewer clicks the preview link, they're looking at the actual system, not a shimmed frontend connected to a shared staging API that may or may not match the PR's backend changes.

This article is the reference implementation. We've seen enough fragmented blog posts on this topic that stitch together half a workflow with outdated cloud CLI flags. This one is complete: five components, full YAML, five cloud providers, and an honest comparison of cost and complexity for each.

Why Self-Host Preview Environments?

The open-source Vercel alternatives like Coolify and Dokku get you some of this, but they're platform layers with their own abstractions. Self-hosting with Docker and GitHub Actions puts you in direct control of the primitives: the container, the network, the URL routing, the cleanup strategy. That control matters when you need things that managed platforms can't give you.

Full-stack isolation is the main one. A preview environment that only runs your Next.js app isn't useful when the PR also changes three API endpoints. Self-hosted preview environments run the whole Docker Compose stack (frontend, API, workers, sidecars) in one ephemeral unit per PR.

Cost predictability is the other. Managed preview platforms charge per seat, per deployment, or per bandwidth GB. Your own AWS ECS tasks or Hetzner VPS cost what they cost, on infrastructure you already have billing relationships with, and those costs go to zero when you delete the resources.

The tradeoff is real: you own the plumbing. Five components need to work together correctly. That's what the rest of this article builds.

The DIY Preview Stack

Every self-hosted preview environment is five components working together. Understanding them separately makes the GitHub Actions workflow easier to read and easier to debug when something breaks.

Build creates a Docker image for every PR push. The image is tagged with the PR number and pushed to a registry your cloud can pull from. This is the most portable component: the same Dockerfile works everywhere, and the same GitHub Actions step works for any target cloud.

Deploy takes that image and runs it somewhere with a stable address. The address can't be hardcoded because it changes per PR. The deploy step provisions a new resource (ECS service, Cloud Run revision, Fly app, Docker container on a VPS) and outputs the URL as a step output for downstream steps.

Route gives the deployed resource a human-readable URL. A raw ECS load balancer URL or Cloud Run service URL works technically, but PR-specific subdomains (pr-42.preview.yourapp.com) are far easier to share in Slack and review comments. Wildcard DNS handles this without per-PR DNS writes.

Test runs E2E tests against the deployed URL before the PR is mergeable. This is where most teams underestimate the YAML. Installing Playwright, managing browser binaries, waiting for the deployment to become healthy, running tests with the right base URL, and uploading reports is 30+ lines of workflow that needs maintenance every time Playwright updates. We'll show both the full Playwright path and the one-API-call alternative.

Teardown deletes the cloud resource and, optionally, the DNS record when the PR closes or merges. Without this step, preview environments accumulate like stale branches. A single missed teardown is harmless. Twelve of them running in parallel cost real money and cause port conflicts on shared infrastructure.

[Figure: DIY self-hosted preview environment stack. Five components (Build Docker image, Deploy, Route, Test, Teardown) orchestrated by GitHub Actions.]

Component 1: Building Docker Preview Images with GitHub Actions

The build step is straightforward but has one decision worth getting right upfront: multi-stage builds. A single-stage Dockerfile that installs all dev dependencies, builds, and runs produces an image that's 800MB to 1.2GB for a typical Next.js app. A multi-stage build produces an image under 200MB by discarding the build toolchain from the final image.

Here's the production-ready multi-stage Dockerfile. The deps stage installs dependencies, and the builder stage runs the Next.js build. The runner stage copies only what the server needs at runtime (the .next/standalone directory, plus .next/static and public) into a minimal Node Alpine image:

# syntax=docker/dockerfile:1.6
#
# Multi-stage Dockerfile for a Next.js app using the standalone output mode.
#
# Requires `output: 'standalone'` in next.config.js. With that set, Next.js
# emits a self-contained server under .next/standalone that bundles only the
# runtime dependencies actually used, so the final image stays small.

# ---------- deps ----------
FROM node:20-alpine AS deps
WORKDIR /app

# Install system packages that some Node native modules need during install.
RUN apk add --no-cache libc6-compat

COPY package.json package-lock.json* ./
RUN npm ci

# ---------- builder ----------
FROM node:20-alpine AS builder
WORKDIR /app

ENV NEXT_TELEMETRY_DISABLED=1

COPY --from=deps /app/node_modules ./node_modules
COPY . .

RUN npm run build

# ---------- runner ----------
FROM node:20-alpine AS runner
WORKDIR /app

ENV NODE_ENV=production
ENV NEXT_TELEMETRY_DISABLED=1
ENV PORT=3000
ENV HOSTNAME=0.0.0.0

# Non-root runtime user for a smaller blast radius.
RUN addgroup --system --gid 1001 nodejs \
 && adduser  --system --uid 1001 nextjs

# Public assets and the standalone server bundle.
COPY --from=builder /app/public ./public
COPY --from=builder --chown=nextjs:nodejs /app/.next/standalone ./
COPY --from=builder --chown=nextjs:nodejs /app/.next/static ./.next/static

USER nextjs

EXPOSE 3000

# The standalone output generates a server.js at the image root.
CMD ["node", "server.js"]

The .dockerignore file is equally important. Without it, Docker sends the entire project to the build context, including node_modules, .git, and generated files, which can add 30-60 seconds to every build:

# Version control
.git
.gitignore
.gitattributes

# Dependencies (re-installed inside the image)
node_modules
npm-debug.log*
yarn-debug.log*
yarn-error.log*
pnpm-debug.log*

# Next.js build artifacts (rebuilt inside the image)
.next
out
dist
build

# Environment files (never bake secrets into images)
.env
.env.*
!.env.example

# Tests and fixtures (not needed at runtime)
tests
__tests__
e2e
playwright-report
test-results
coverage

# Docs and meta (not needed at runtime)
README.md
CHANGELOG.md
LICENSE
docs

# Editor / OS noise
.vscode
.idea
.DS_Store
*.swp
*.swo

# CI config (the runner doesn't need these)
.github
.gitlab-ci.yml

# Local tooling
.eslintcache
.turbo

The GitHub Actions build step uses Docker's layer caching via cache-from and cache-to with the GitHub Actions cache backend. Uncached first builds take 3-5 minutes. Cached subsequent builds for the same PR take under 60 seconds because only changed layers rebuild.
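
The tag that the build step pushes is derived from the repository name and the PR number. Here is a minimal sketch of that derivation, assuming GHCR as the registry. The function name image_ref is hypothetical. Docker image references require a lowercase repository name, so the sketch lowercases that part and rejects a non-numeric PR number:

```shell
#!/usr/bin/env bash
# image_ref <registry/owner/repo> <pr-number>
# Prints a per-PR image reference such as ghcr.io/my-org/app:pr-42.
# Hypothetical helper: the name and argument order are illustrative only.
image_ref() {
  local repo pr
  # Image repository names must be lowercase; owner/repo slugs may not be.
  repo=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  pr="$2"
  case "${pr}" in
    ''|*[!0-9]*)
      echo "image_ref: PR number must be numeric, got '${pr}'" >&2
      return 1
      ;;
  esac
  printf '%s:pr-%s\n' "${repo}" "${pr}"
}

image_ref "ghcr.io/My-Org/App" 42   # prints ghcr.io/my-org/app:pr-42
```

Because the tag is a pure function of the PR number, every push to the same PR overwrites the same tag, and teardown can reconstruct the name without reading any workflow state.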

Component 2: Deploying to a Temporary URL

The deploy step is where cloud providers diverge. Five options cover the realistic range of self-hosting choices, each with different trade-offs on cost, setup complexity, and operational model.

AWS ECS/Fargate is the most production-grade option. Each preview is a separate ECS service in a shared cluster. Fargate handles server provisioning. The deploy command registers a new task definition, creates or updates the service, and waits for the service to stabilize. Teardown deletes the service. The URL comes from an Application Load Balancer with a listener rule for the PR-specific host header. This is the highest-complexity option but scales to hundreds of simultaneous previews without manual intervention. Reference: AWS ECS update-service documentation.

GCP Cloud Run is significantly simpler. Each preview is its own Cloud Run service, named after the PR. The deploy command is one gcloud run deploy call. Cloud Run generates a URL automatically, with no load balancer to configure. Teardown deletes the service. The URL pattern is deterministic: preview-pr-NUMBER-HASH-REGION.a.run.app. The main limitation is that Cloud Run requires containerized HTTP services; it doesn't run Docker Compose stacks natively. Reference: Cloud Run deploying docs.

DigitalOcean App Platform sits between managed and DIY. Each preview is a new App. The deploy uses the doctl CLI and a spec file that references your image. Teardown is doctl apps delete. Apps get APP-NAME.ondigitalocean.app URLs. Setup complexity is low; cost per preview is higher than Hetzner for long-lived previews but lower than AWS for short-lived ones.

Hetzner with Docker on a VPS is the cost-optimized option for teams with moderate preview counts. A shared CX21 server (€5-6/month, 2 vCPU, 4 GB RAM) can host 10-20 simultaneous previews as Docker containers on different ports, proxied by Nginx. Deploy is SSH + docker run. Teardown is SSH + docker rm. URL routing requires Nginx config management, which is the added complexity versus cloud-native options.

Fly.io is the developer-experience-first option. Each preview is a Fly app. flyctl deploy handles everything: image push, machine provisioning, and health check waiting. The URL is APP_NAME.fly.dev, which is preview-pr-NUMBER.fly.dev with the naming used in the script below. Teardown is flyctl apps destroy. Fly Machines can be configured to stop when idle and start again on the next request, which means cold starts on first access but little to no idle compute cost. Reference: flyctl deploy documentation.

Here's the deploy script for AWS ECS, the most complex case, which makes the others straightforward by comparison:

#!/usr/bin/env bash
#
# deploy-aws.sh — Deploy (or update) a preview environment on AWS ECS.
#
# Required env vars:
#   PR_NUMBER      Pull request number (e.g. 1234)
#   IMAGE_TAG      Fully qualified image reference, e.g.
#                  123456789012.dkr.ecr.us-east-1.amazonaws.com/app:pr-1234
#   CLUSTER_NAME   ECS cluster name (default: previews)
#   AWS_REGION     AWS region (default: us-east-1)
#   TASK_ROLE_ARN  IAM role ARN the task assumes
#   EXEC_ROLE_ARN  IAM role ARN ECS uses to pull the image / write logs
#   SUBNETS        Comma-separated private subnet IDs for the ENI
#   SECURITY_GROUPS Comma-separated security group IDs for the ENI
#   CONTAINER_PORT Port the container listens on (default: 3000)
#   PREVIEW_DOMAIN Base preview domain (default: preview.example.com)
#
# Outputs (stdout):
#   preview_url=<https url the ALB routes to this service>
#
# The script registers a new task definition, creates or updates the service,
# and blocks until ECS reports the service as stable.

set -euo pipefail

: "${PR_NUMBER:?PR_NUMBER is required}"
: "${IMAGE_TAG:?IMAGE_TAG is required}"
: "${TASK_ROLE_ARN:?TASK_ROLE_ARN is required}"
: "${EXEC_ROLE_ARN:?EXEC_ROLE_ARN is required}"
: "${SUBNETS:?SUBNETS (comma-separated subnet ids) is required}"
: "${SECURITY_GROUPS:?SECURITY_GROUPS (comma-separated sg ids) is required}"

CLUSTER_NAME="${CLUSTER_NAME:-previews}"
AWS_REGION="${AWS_REGION:-us-east-1}"
CONTAINER_PORT="${CONTAINER_PORT:-3000}"
PREVIEW_DOMAIN="${PREVIEW_DOMAIN:-preview.example.com}"

FAMILY="preview-pr-${PR_NUMBER}"
SERVICE_NAME="preview-pr-${PR_NUMBER}"
LOG_GROUP="/ecs/${FAMILY}"

echo ">> Ensuring CloudWatch log group ${LOG_GROUP} exists"
aws logs create-log-group \
  --log-group-name "${LOG_GROUP}" \
  --region "${AWS_REGION}" 2>/dev/null || true

echo ">> Registering task definition ${FAMILY}"
TASK_DEF_JSON=$(cat <<JSON
{
  "family": "${FAMILY}",
  "networkMode": "awsvpc",
  "requiresCompatibilities": ["FARGATE"],
  "cpu": "512",
  "memory": "1024",
  "executionRoleArn": "${EXEC_ROLE_ARN}",
  "taskRoleArn": "${TASK_ROLE_ARN}",
  "containerDefinitions": [
    {
      "name": "app",
      "image": "${IMAGE_TAG}",
      "essential": true,
      "portMappings": [
        { "containerPort": ${CONTAINER_PORT}, "protocol": "tcp" }
      ],
      "environment": [
        { "name": "NODE_ENV", "value": "production" },
        { "name": "PR_NUMBER", "value": "${PR_NUMBER}" }
      ],
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-group": "${LOG_GROUP}",
          "awslogs-region": "${AWS_REGION}",
          "awslogs-stream-prefix": "ecs"
        }
      }
    }
  ]
}
JSON
)

TASK_DEF_ARN=$(aws ecs register-task-definition \
  --region "${AWS_REGION}" \
  --cli-input-json "${TASK_DEF_JSON}" \
  --query 'taskDefinition.taskDefinitionArn' \
  --output text)

echo ">> Registered ${TASK_DEF_ARN}"

# Build the awsvpc network configuration from the comma-separated inputs.
NETWORK_CONFIG=$(cat <<JSON
{
  "awsvpcConfiguration": {
    "subnets": [$(echo "\"${SUBNETS}\"" | sed 's/,/","/g')],
    "securityGroups": [$(echo "\"${SECURITY_GROUPS}\"" | sed 's/,/","/g')],
    "assignPublicIp": "DISABLED"
  }
}
JSON
)

# Create the service if it doesn't exist; otherwise update it.
EXISTING=$(aws ecs describe-services \
  --region "${AWS_REGION}" \
  --cluster "${CLUSTER_NAME}" \
  --services "${SERVICE_NAME}" \
  --query 'services[?status==`ACTIVE`] | length(@)' \
  --output text)

if [ "${EXISTING}" = "0" ]; then
  echo ">> Creating ECS service ${SERVICE_NAME} in cluster ${CLUSTER_NAME}"
  aws ecs create-service \
    --region "${AWS_REGION}" \
    --cluster "${CLUSTER_NAME}" \
    --service-name "${SERVICE_NAME}" \
    --task-definition "${TASK_DEF_ARN}" \
    --desired-count 1 \
    --launch-type FARGATE \
    --network-configuration "${NETWORK_CONFIG}" \
    --tags "key=preview,value=pr-${PR_NUMBER}" >/dev/null
else
  echo ">> Updating existing ECS service ${SERVICE_NAME}"
  aws ecs update-service \
    --region "${AWS_REGION}" \
    --cluster "${CLUSTER_NAME}" \
    --service "${SERVICE_NAME}" \
    --task-definition "${TASK_DEF_ARN}" \
    --desired-count 1 \
    --force-new-deployment >/dev/null
fi

echo ">> Waiting for service ${SERVICE_NAME} to reach a stable state"
aws ecs wait services-stable \
  --region "${AWS_REGION}" \
  --cluster "${CLUSTER_NAME}" \
  --services "${SERVICE_NAME}"

# The ALB in front of the cluster routes by Host header. A listener rule
# keyed to pr-${PR_NUMBER}.${PREVIEW_DOMAIN} should forward to the service's
# target group (provisioned once per preview by separate infra).
PREVIEW_URL="https://pr-${PR_NUMBER}.${PREVIEW_DOMAIN}"

echo ">> Preview ready: ${PREVIEW_URL}"
echo "preview_url=${PREVIEW_URL}"
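
The script above prints its progress lines and the final preview_url= line on the same stream, so a calling step has to pick the URL out of the noise. Here is a minimal sketch of one way to do that. The helper name extract_preview_url is hypothetical, and the canned input stands in for real script output:

```shell
#!/usr/bin/env bash
# extract_preview_url: read deploy-script output on stdin and print the value
# of the last "preview_url=" line. Fails if no such line is present.
# Hypothetical helper, not part of the deploy scripts themselves.
extract_preview_url() {
  local url
  url=$(sed -n 's/^preview_url=//p' | tail -n 1)
  if [ -z "${url}" ]; then
    echo "extract_preview_url: no preview_url= line found" >&2
    return 1
  fi
  printf '%s\n' "${url}"
}

# Canned output standing in for a real deploy run:
printf '%s\n' \
  ">> Registering task definition preview-pr-42" \
  "preview_url=https://pr-42.preview.example.com" \
  | extract_preview_url   # prints https://pr-42.preview.example.com
```

In a workflow step, the result would typically be appended to the file named by GITHUB_OUTPUT (for example as url=<value>) so that later jobs can read it as a step output.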

The Cloud Run equivalent, which shows how much simpler a serverless-container model is:

#!/usr/bin/env bash
#
# deploy-gcp.sh — Deploy a preview environment to Google Cloud Run.
#
# Required env vars:
#   PR_NUMBER   Pull request number (e.g. 1234)
#   IMAGE       Fully qualified image, e.g.
#               us-central1-docker.pkg.dev/my-project/app/app:pr-1234
#   REGION      Cloud Run region (default: us-central1)
#   PROJECT_ID  GCP project id (optional; falls back to gcloud's default)
#
# Outputs (stdout):
#   preview_url=<https url assigned by Cloud Run>
#
# Idempotent: `gcloud run deploy` creates the service on first run and updates
# it on subsequent runs, so the same script works for `opened` and
# `synchronize` PR events.

set -euo pipefail

: "${PR_NUMBER:?PR_NUMBER is required}"
: "${IMAGE:?IMAGE is required}"

REGION="${REGION:-us-central1}"
SERVICE="preview-pr-${PR_NUMBER}"

PROJECT_FLAG=()
if [ -n "${PROJECT_ID:-}" ]; then
  PROJECT_FLAG=(--project "${PROJECT_ID}")
fi

echo ">> Deploying ${SERVICE} to Cloud Run (${REGION})"
gcloud run deploy "${SERVICE}" \
  "${PROJECT_FLAG[@]}" \
  --image "${IMAGE}" \
  --region "${REGION}" \
  --platform managed \
  --allow-unauthenticated \
  --memory 512Mi \
  --cpu 1 \
  --port 3000 \
  --max-instances 2 \
  --set-env-vars "NODE_ENV=production,PR_NUMBER=${PR_NUMBER}" \
  --labels "preview=true,pr=${PR_NUMBER}" \
  --quiet

PREVIEW_URL=$(gcloud run services describe "${SERVICE}" \
  "${PROJECT_FLAG[@]}" \
  --region "${REGION}" \
  --format 'value(status.url)')

if [ -z "${PREVIEW_URL}" ]; then
  echo "ERROR: Cloud Run did not return a service URL for ${SERVICE}" >&2
  exit 1
fi

echo ">> Preview ready: ${PREVIEW_URL}"
echo "preview_url=${PREVIEW_URL}"

The Fly.io script, which is the most minimal:

#!/usr/bin/env bash
#
# deploy-fly.sh — Deploy a preview environment to Fly.io.
#
# Required env vars:
#   PR_NUMBER    Pull request number (e.g. 1234)
#   IMAGE        Fully qualified image reference, e.g.
#                ghcr.io/my-org/app:pr-1234
#   FLY_ORG      Fly.io org slug (default: personal)
#   FLY_REGION   Primary region (default: iad)
#
# Requires the FLY_API_TOKEN env var (or a prior `flyctl auth login`).
#
# Outputs (stdout):
#   preview_url=https://preview-pr-<PR_NUMBER>.fly.dev

set -euo pipefail

: "${PR_NUMBER:?PR_NUMBER is required}"
: "${IMAGE:?IMAGE is required}"

FLY_ORG="${FLY_ORG:-personal}"
FLY_REGION="${FLY_REGION:-iad}"
APP_NAME="preview-pr-${PR_NUMBER}"

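echo ">> Ensuring Fly app ${APP_NAME} exists (org=${FLY_ORG})"
# Capture the list first: piping straight into `grep -q` under `pipefail` can
# fail spuriously when grep exits early and flyctl receives SIGPIPE.
APPS_JSON=$(flyctl apps list --json)
if ! grep -q "\"Name\": *\"${APP_NAME}\"" <<<"${APPS_JSON}"; then
  flyctl apps create "${APP_NAME}" --org "${FLY_ORG}"
else
  echo ">> App ${APP_NAME} already exists, reusing it"
fi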

echo ">> Deploying image ${IMAGE} to ${APP_NAME}"
# --ha=false keeps one machine per region (plenty for a preview).
# --now skips the interactive confirmation on first deploy.
flyctl deploy \
  --app "${APP_NAME}" \
  --image "${IMAGE}" \
  --primary-region "${FLY_REGION}" \
  --ha=false \
  --now \
  --strategy immediate \
  --wait-timeout 300

PREVIEW_URL="https://${APP_NAME}.fly.dev"

echo ">> Preview ready: ${PREVIEW_URL}"
echo "preview_url=${PREVIEW_URL}"

Hetzner and DigitalOcean scripts follow the same pattern: environment-specific CLI commands with the PR number as the identifier. Both are in the companion repo as scripts/deploy-hetzner.sh and scripts/deploy-do.sh.
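
To give a feel for the shared-VPS variant, here is a rough sketch of how a PR number might map to a host port and a docker run command. This is not the companion repo's script. The function names, the 20000-29999 port range, and the container port 3000 are all assumptions made for illustration:

```shell
#!/usr/bin/env bash
# preview_port <pr-number>
# Maps a PR number onto a host port in 20000-29999. Hypothetical scheme:
# two open PRs whose numbers differ by a multiple of 10000 would collide.
preview_port() {
  echo $((20000 + $1 % 10000))
}

# docker_run_cmd <pr-number> <image>
# Prints (does not execute) the docker run command for one preview container.
# Binding to 127.0.0.1 keeps the port reachable only through the reverse proxy.
docker_run_cmd() {
  local pr="$1" image="$2"
  printf 'docker run -d --name preview-pr-%s --restart unless-stopped -p 127.0.0.1:%s:3000 %s\n' \
    "${pr}" "$(preview_port "${pr}")" "${image}"
}

docker_run_cmd 42 ghcr.io/my-org/app:pr-42
```

On a redeploy, the existing container has to be removed first (docker rm -f preview-pr-42), because docker run refuses to reuse a container name.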

Component 3: DNS and Routing for Preview URLs

Raw cloud URLs (an ECS load balancer DNS name or a Cloud Run service URL) work but are ugly and hard to share. PR-specific subdomains are better for every workflow that involves a human clicking a link.

The wildcard DNS approach requires one record: *.preview.yourapp.com CNAME your-alb.us-east-1.elb.amazonaws.com. Every PR subdomain resolves to the same load balancer, which routes by host header to the right ECS service. No per-PR DNS writes. The downside is that you need an SSL certificate for the wildcard domain. On AWS this is a small cost: Certificate Manager can issue a wildcard certificate, and the ALB listener uses it directly.
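
The host-header routing described above comes down to one listener rule per preview. Here is a hedged sketch that prints the AWS CLI call for such a rule rather than running it. The function names are hypothetical, the ARNs are placeholders, and using the PR number as the rule priority is an assumption of this sketch (priorities must be unique per listener):

```shell
#!/usr/bin/env bash
# preview_host <pr-number> <base-domain>
# Prints the PR-specific hostname, e.g. pr-42.preview.yourapp.com.
preview_host() {
  printf 'pr-%s.%s\n' "$1" "$2"
}

# alb_rule_cmd <pr-number> <base-domain> <listener-arn> <target-group-arn>
# Prints (does not run) an `aws elbv2 create-rule` call that forwards requests
# for the preview hostname to the preview's target group.
# Assumption: the PR number doubles as the rule priority.
alb_rule_cmd() {
  local pr="$1" domain="$2" listener="$3" tg="$4"
  printf 'aws elbv2 create-rule --listener-arn %s --priority %s --conditions Field=host-header,Values=%s --actions Type=forward,TargetGroupArn=%s\n' \
    "${listener}" "${pr}" "$(preview_host "${pr}" "${domain}")" "${tg}"
}

# LISTENER_ARN and TARGET_GROUP_ARN are placeholders for real ARNs.
alb_rule_cmd 42 preview.yourapp.com LISTENER_ARN TARGET_GROUP_ARN
```

The rule is the only per-PR routing state in this design: the DNS record and the certificate are shared, so teardown has to remove the rule and the target group but never touches DNS.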

For Cloud Run, you skip DNS entirely if you're willing to accept the generated URL. The URL is stable for the lifetime of the service and is included in the deploy command output. The GitHub Actions step captures it and posts it to the PR as a comment. Many teams find this acceptable: the URL is ugly but functional, and reviewers can still click it.

For Hetzner and shared-VPS setups, Nginx reverse proxy with a dynamic config is the practical approach. The deploy script writes a new Nginx server block for the PR, then reloads Nginx. The teardown script removes the block and reloads again. This works but requires Nginx to be running on the host and the GitHub Actions runner to have SSH access with sudo privileges for Nginx reload.
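
The server block that the deploy script writes can be generated from three values. Here is a minimal sketch, assuming TLS is terminated elsewhere (the block listens on plain port 80) and the preview container is published on a local port. The function name render_nginx_block is hypothetical:

```shell
#!/usr/bin/env bash
# render_nginx_block <pr-number> <base-domain> <host-port>
# Prints an Nginx server block that proxies pr-N.<base-domain> to a container
# published on 127.0.0.1:<host-port>. Hypothetical helper for illustration.
render_nginx_block() {
  local pr="$1" domain="$2" port="$3"
  # Nginx variables are escaped so the shell leaves them for Nginx to expand.
  cat <<EOF
server {
    listen 80;
    server_name pr-${pr}.${domain};

    location / {
        proxy_pass http://127.0.0.1:${port};
        proxy_set_header Host \$host;
        proxy_set_header X-Forwarded-For \$proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto \$scheme;
    }
}
EOF
}

render_nginx_block 42 preview.yourapp.com 20042
```

A deploy script would write this output to a per-PR file in whatever directory the host's nginx.conf includes, validate it with nginx -t, and then reload. Teardown deletes the same file and reloads again.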

A cleaner alternative on a single-VPS setup is Traefik with Docker labels. Each preview container gets labels like traefik.http.routers.pr-42.rule=Host(`pr-42.preview.yourapp.com`) at docker run time. Traefik discovers the labels, provisions a Let's Encrypt cert on demand (once a certificate resolver is configured), and routes traffic automatically. Teardown becomes a plain docker rm: Traefik notices the container is gone and removes the route itself. No Nginx config management, no reload choreography, no sudo privilege for reloads to worry about. For teams running previews on Hetzner or DigitalOcean Droplets, Traefik is the modern default.
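
The full label set for one preview container is short. Here is a sketch that prints it, assuming a Traefik static configuration with an entry point named websecure and a certificate resolver named letsencrypt. Both names, and the function name traefik_labels, are assumptions rather than anything Traefik mandates:

```shell
#!/usr/bin/env bash
# traefik_labels <pr-number> <base-domain> <container-port>
# Prints one Docker label per line for a preview container.
# Assumptions: entry point "websecure" and resolver "letsencrypt" exist in
# the Traefik static configuration; adjust both to match your setup.
traefik_labels() {
  local pr="$1" domain="$2" port="$3"
  local router="pr-${pr}"
  printf 'traefik.enable=true\n'
  printf 'traefik.http.routers.%s.rule=Host(`pr-%s.%s`)\n' "${router}" "${pr}" "${domain}"
  printf 'traefik.http.routers.%s.entrypoints=websecure\n' "${router}"
  printf 'traefik.http.routers.%s.tls.certresolver=letsencrypt\n' "${router}"
  printf 'traefik.http.services.%s.loadbalancer.server.port=%s\n' "${router}" "${port}"
}

traefik_labels 42 preview.yourapp.com 3000
```

Each printed line becomes one --label argument to docker run. The rule label contains backticks and parentheses, so it needs single quotes when typed into a shell.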

For Fly.io, the generated URL is clean enough that most teams use it directly: preview-pr-42.fly.dev reads well in a PR comment.

Wildcard DNS costs one DNS record and one SSL certificate. It buys you unlimited clean PR subdomains forever. The alternative is managing DNS per PR, which is one outage away from being a four-hour debugging session about why your wildcard cert doesn't cover the new subdomain format you introduced last week.

Component 4: Running E2E Tests Against the Preview

This is where the workflow earns its value or exposes its maintenance burden. Deploying a preview is straightforward. Testing it is where the complexity lives, and where most teams end up with CI YAML that nobody wants to touch.

The Playwright path is the default. You add steps to install Node, install Playwright, install browsers, wait for the preview URL to respond with a health check, run npx playwright test with BASE_URL set to the preview URL, and upload the HTML report as an artifact so failed runs are debuggable. Here's the full workflow, 30+ lines and growing:

name: preview-test-playwright

# Reusable workflow: run Playwright E2E tests against a preview URL.
#
# Call it from another workflow with:
#
#   jobs:
#     test:
#       uses: ./.github/workflows/preview-test-playwright.yml
#       with:
#         preview_url: ${{ needs.deploy.outputs.url }}

on:
  workflow_call:
    inputs:
      preview_url:
        description: Fully qualified URL of the preview environment
        required: true
        type: string
      health_path:
        description: Path to poll for readiness
        required: false
        type: string
        default: /
      health_timeout_seconds:
        description: Max seconds to wait for the preview to respond
        required: false
        type: number
        default: 180

jobs:
  playwright:
    name: Playwright (${{ inputs.preview_url }})
    runs-on: ubuntu-latest
    timeout-minutes: 20

    steps:
      - name: Checkout
        uses: actions/checkout@v4

      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: "20"
          cache: npm

      - name: Install dependencies
        run: npm ci

      - name: Install Playwright browsers
        run: npx playwright install --with-deps chromium

      - name: Wait for preview to be reachable
        env:
          URL: ${{ inputs.preview_url }}${{ inputs.health_path }}
          TIMEOUT: ${{ inputs.health_timeout_seconds }}
        run: |
          echo "Polling ${URL} for up to ${TIMEOUT}s"
          deadline=$((SECONDS + TIMEOUT))
          until curl -fsS --max-time 5 -o /dev/null "${URL}"; do
            if [ "${SECONDS}" -ge "${deadline}" ]; then
              echo "Preview never became reachable within ${TIMEOUT}s" >&2
              exit 1
            fi
            echo "  not ready yet, retrying in 5s..."
            sleep 5
          done
          echo "Preview responded OK"

      - name: Run Playwright tests
        env:
          BASE_URL: ${{ inputs.preview_url }}
        run: npx playwright test

      - name: Upload Playwright report
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: playwright-report
          path: playwright-report
          retention-days: 7
          if-no-files-found: ignore

This works. It also requires you to own browser installation failures, flakiness from timing issues, artifact expiration policies, and the inevitable Playwright major version upgrade that breaks your CI every 12-18 months. If your team has a dedicated QA infrastructure engineer, this is manageable. If the platform engineer who built the preview stack is also the person responsible when tests break, it accumulates as technical debt.

The Autonoma path replaces all of that with one API call. Our Planner agent reads your codebase (routes, components, user flows) and generates test cases. Our Automator agent runs them against the preview URL you pass in. Results post back to the PR. Here's the entire testing workflow:

name: preview-test-autonoma

# Reusable workflow: run an Autonoma test folder against a preview URL.
# Requires:
#   - repo/org variable AUTONOMA_FOLDER_ID
#   - repo/org secret  AUTONOMA_API_KEY

on:
  workflow_call:
    inputs:
      preview_url:
        description: Fully qualified URL of the preview environment
        required: true
        type: string

jobs:
  autonoma:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
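      - name: Run Autonoma folder
        # Pass values through env instead of interpolating expressions into
        # the script, so the shell never parses caller-supplied text as code.
        env:
          AUTONOMA_FOLDER_ID: ${{ vars.AUTONOMA_FOLDER_ID }}
          AUTONOMA_API_KEY: ${{ secrets.AUTONOMA_API_KEY }}
          PREVIEW_URL: ${{ inputs.preview_url }}
        run: |
          curl -fsS -X POST \
            "https://api.prod.autonoma.app/v1/run/folder/${AUTONOMA_FOLDER_ID}" \
            -H "Authorization: Bearer ${AUTONOMA_API_KEY}" \
            -H "Content-Type: application/json" \
            -d "{\"base_url\": \"${PREVIEW_URL}\"}"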

The API call is a POST to https://api.prod.autonoma.app/v1/run/folder/<folder-id> with your preview URL as the base_url parameter and your API key in the Authorization header. The folder ID maps to the test suite configuration in Autonoma's dashboard. You can also use the official GitHub Action if you prefer declarative steps over a raw API call, but for self-hosted teams the API approach is more portable: it works in any CI system, not just GitHub Actions.
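
Outside GitHub Actions, the only part of the call that needs care is the JSON body, because the preview URL is interpolated into it. Here is a small sketch of building that body safely in plain shell. The helper names are hypothetical, and only the base_url field described above is assumed:

```shell
#!/usr/bin/env bash
# json_string <text>
# Prints <text> as a JSON string literal, quotes included. Only backslash and
# double quote are escaped, which is enough for URLs but not arbitrary text.
json_string() {
  printf '"%s"' "$(printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g')"
}

# run_folder_payload <preview-url>
# Prints the request body carrying the preview URL as base_url.
run_folder_payload() {
  printf '{"base_url": %s}\n' "$(json_string "$1")"
}

run_folder_payload "https://pr-42.preview.yourapp.com"
# prints {"base_url": "https://pr-42.preview.yourapp.com"}
```

The output is what gets passed to curl -d, with the same endpoint and headers as in the workflow above.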

For a deeper look at E2E testing strategies specifically for preview environments, our E2E testing preview environments guide covers test isolation, database state handling, and authentication patterns in detail.

Component 5: Teardown on Merge or Close

The teardown workflow is the component most teams implement last and regret skipping first. Without it, every closed PR leaves a running cloud resource. On AWS ECS, that's a task consuming Fargate vCPU. On Hetzner, that's a container using RAM. On Fly.io, it's an app that cold-starts by accident.

The teardown trigger is pull_request with types: [closed]. GitHub fires this event for both merged and manually closed PRs. The PR number is available as github.event.number, which gives you the identifier to delete the right resource.

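Here's the teardown script. It dispatches on PROVIDER to one function per cloud (or tries all five in sequence), and every provider call is written to tolerate a resource that no longer exists. With wildcard DNS there is no per-PR record to clean up: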

#!/usr/bin/env bash
#
# teardown.sh — Destroy the preview environment for a given pull request on
# any of the five supported providers.
#
# Required env vars:
#   PR_NUMBER   Pull request number (e.g. 1234)
#   PROVIDER    One of: aws | gcp | fly | digitalocean | hetzner | all
#               Defaults to "all" which attempts each provider in sequence.
#
# Optional env vars:
#   CLUSTER_NAME   (aws) ECS cluster name (default: previews)
#   AWS_REGION     (aws) default: us-east-1
#   REGION         (gcp) Cloud Run region (default: us-central1)
#   PROJECT_ID     (gcp) optional project override
#
# Idempotency pattern
# -------------------
# Every provider call ends in `|| true`. This is deliberate: the script is
# called from the `closed` PR event, and a PR may be closed without ever
# having deployed (draft, CI failed, first push closed), so the resources we
# target may not exist. Treating "already deleted" as success keeps the job
# green and keeps teardown cheap to call repeatedly.

set -uo pipefail

: "${PR_NUMBER:?PR_NUMBER is required}"
PROVIDER="${PROVIDER:-all}"

APP_NAME="preview-pr-${PR_NUMBER}"

teardown_aws() {
  local cluster="${CLUSTER_NAME:-previews}"
  local region="${AWS_REGION:-us-east-1}"
  echo ">> [aws] Scaling ${APP_NAME} to 0 on cluster ${cluster}"
  aws ecs update-service \
    --region "${region}" \
    --cluster "${cluster}" \
    --service "${APP_NAME}" \
    --desired-count 0 >/dev/null 2>&1 || true

  echo ">> [aws] Deleting service ${APP_NAME}"
  aws ecs delete-service \
    --region "${region}" \
    --cluster "${cluster}" \
    --service "${APP_NAME}" \
    --force >/dev/null 2>&1 || true

  echo ">> [aws] Deregistering task definitions for family ${APP_NAME}"
  local arns
  arns=$(aws ecs list-task-definitions \
    --region "${region}" \
    --family-prefix "${APP_NAME}" \
    --query 'taskDefinitionArns' \
    --output text 2>/dev/null || true)
  for arn in ${arns}; do
    aws ecs deregister-task-definition \
      --region "${region}" \
      --task-definition "${arn}" >/dev/null 2>&1 || true
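  done

  # deploy-aws.sh creates one CloudWatch log group per preview; remove it too.
  echo ">> [aws] Deleting log group /ecs/${APP_NAME}"
  aws logs delete-log-group \
    --region "${region}" \
    --log-group-name "/ecs/${APP_NAME}" >/dev/null 2>&1 || true
}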

teardown_gcp() {
  local region="${REGION:-us-central1}"
  local project_flag=()
  if [ -n "${PROJECT_ID:-}" ]; then
    project_flag=(--project "${PROJECT_ID}")
  fi
  echo ">> [gcp] Deleting Cloud Run service ${APP_NAME}"
  gcloud run services delete "${APP_NAME}" \
    "${project_flag[@]}" \
    --region "${region}" \
    --quiet 2>/dev/null || true
}

teardown_fly() {
  echo ">> [fly] Destroying Fly app ${APP_NAME}"
  flyctl apps destroy "${APP_NAME}" -y 2>/dev/null || true
}

teardown_digitalocean() {
  echo ">> [digitalocean] Looking up app id for ${APP_NAME}"
  local app_id
  app_id=$(doctl apps list --format ID,Spec.Name --no-header 2>/dev/null \
    | awk -v name="${APP_NAME}" '$2 == name {print $1}' \
    | head -n1 || true)
  if [ -n "${app_id}" ]; then
    echo ">> [digitalocean] Deleting app ${app_id}"
    doctl apps delete "${app_id}" --force 2>/dev/null || true
  else
    echo ">> [digitalocean] No app found for ${APP_NAME}, skipping"
  fi
}

teardown_hetzner() {
  echo ">> [hetzner] Deleting servers tagged pr=${PR_NUMBER}"
  # hcloud lists label selectors as `server list -l key=value`.
  local server_ids
  server_ids=$(hcloud server list \
    -l "preview=true,pr=${PR_NUMBER}" \
    -o columns=id -o noheader 2>/dev/null || true)
  for id in ${server_ids}; do
    hcloud server delete "${id}" 2>/dev/null || true
  done
}

case "${PROVIDER}" in
  aws)           teardown_aws ;;
  gcp)           teardown_gcp ;;
  fly)           teardown_fly ;;
  digitalocean)  teardown_digitalocean ;;
  hetzner)       teardown_hetzner ;;
  all)
    teardown_aws
    teardown_gcp
    teardown_fly
    teardown_digitalocean
    teardown_hetzner
    ;;
  *)
    echo "ERROR: unknown PROVIDER '${PROVIDER}'" >&2
    echo "Valid values: aws | gcp | fly | digitalocean | hetzner | all" >&2
    exit 2
    ;;
esac

echo ">> Teardown for PR ${PR_NUMBER} (${PROVIDER}) complete"

Two things to get right in teardown: make the delete command idempotent, and fail gracefully on 404. If a PR is force-closed before the deploy step completes, the resource may not exist yet. A teardown script that errors on "resource not found" creates a failed workflow on every force-close. The script above handles this by ending every provider call in || true, which is simple but also hides genuine failures such as expired credentials. Where that matters, wrap the delete commands in a check-then-delete pattern instead.
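
The check-then-delete pattern can be sketched as a small wrapper. The names here are hypothetical, and the two stand-in functions replace real provider calls so the example runs anywhere. A missing resource is a quiet no-op, but a delete that fails still propagates its exit code:

```shell
#!/usr/bin/env bash
# delete_if_exists <name> <exists-cmd> <delete-cmd>
# Runs <delete-cmd> <name> only when <exists-cmd> <name> succeeds. A missing
# resource is a successful no-op; a failed delete still fails the caller.
delete_if_exists() {
  local name="$1" exists_cmd="$2" delete_cmd="$3"
  if "${exists_cmd}" "${name}" >/dev/null 2>&1; then
    "${delete_cmd}" "${name}"
  else
    echo "skip: ${name} not found"
  fi
}

# Stand-ins for real provider calls (hypothetical, for illustration only).
fake_exists() { [ "$1" = "preview-pr-42" ]; }
fake_delete() { echo "deleted: $1"; }

delete_if_exists preview-pr-42 fake_exists fake_delete   # prints "deleted: preview-pr-42"
delete_if_exists preview-pr-99 fake_exists fake_delete   # prints "skip: preview-pr-99 not found"
```

One caveat: the existence check treats any failure as "not found", so an authentication error during the check would still be skipped silently. A stricter version would inspect the provider's error output before deciding.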

For teams on Hetzner that give each preview its own server, add the PR number as a server label at deploy time (the teardown function above selects on preview=true,pr=NUMBER). Teardown finds the server by label and deletes it. This is more reliable than tracking server IDs in workflow state, which can go missing if the deploy workflow is cancelled partway through.

The Complete GitHub Actions Preview Deployment Workflow

The five components compose into one GitHub Actions workflow. The preview.yml workflow runs on every PR push and handles Build, Deploy, Route, and Test in sequence. A separate teardown job runs conditionally when the PR closes.

Here's the complete workflow with all five components wired together, using Cloud Run as the deploy target (swap the deploy step for any of the cloud-specific scripts above):

name: preview

# End-to-end preview pipeline for every pull request:
#   1. build  — build the app image and push to GHCR
#   2. deploy — roll the image to Cloud Run and post the URL as a PR comment
#   3. test   — run Autonoma against the preview URL
#   teardown — on PR close, destroy the preview environment
#
# This is the canonical workflow referenced by the blog post. Swap the
# `deploy` step for scripts/deploy-aws.sh or scripts/deploy-fly.sh to target
# a different cloud — the build, comment, and test shape stay identical.

on:
  pull_request:
    types: [opened, synchronize, reopened, closed]

concurrency:
  # One in-flight run per PR. Newer pushes cancel older deployments so we
  # don't ship a stale commit over a newer one.
  group: preview-${{ github.event.pull_request.number }}
  cancel-in-progress: true

permissions:
  contents: read
  packages: write       # push to GHCR
  pull-requests: write  # post the preview URL comment
  id-token: write       # OIDC for cloud providers

env:
  IMAGE_REPO: ghcr.io/${{ github.repository }}

jobs:
  # ---------------------------------------------------------------------------
  # build
  # ---------------------------------------------------------------------------
  build:
    if: github.event.action != 'closed'
    runs-on: ubuntu-latest
    outputs:
      image: ${{ steps.meta.outputs.image }}
    steps:
      - name: Checkout
        uses: actions/checkout@v4

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3

      - name: Log in to GHCR
        uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Compute image tag
        id: meta
        run: |
          IMAGE="${IMAGE_REPO}:pr-${{ github.event.pull_request.number }}"
          echo "image=${IMAGE}" >> "${GITHUB_OUTPUT}"

      - name: Build and push image
        uses: docker/build-push-action@v5
        with:
          context: .
          push: true
          tags: ${{ steps.meta.outputs.image }}
          cache-from: type=gha
          cache-to: type=gha,mode=max

  # ---------------------------------------------------------------------------
  # deploy
  # ---------------------------------------------------------------------------
  deploy:
    if: github.event.action != 'closed'
    needs: build
    runs-on: ubuntu-latest
    outputs:
      url: ${{ steps.run.outputs.url }}
    steps:
      - name: Checkout
        uses: actions/checkout@v4

      - name: Authenticate to Google Cloud
        uses: google-github-actions/auth@v2
        with:
          workload_identity_provider: ${{ secrets.GCP_WORKLOAD_IDENTITY_PROVIDER }}
          service_account: ${{ secrets.GCP_SERVICE_ACCOUNT }}

      - name: Set up gcloud
        uses: google-github-actions/setup-gcloud@v2

      - name: Deploy to Cloud Run
        id: run
        env:
          PR_NUMBER: ${{ github.event.pull_request.number }}
          IMAGE: ${{ needs.build.outputs.image }}
          PROJECT_ID: ${{ secrets.GCP_PROJECT_ID }}
          REGION: us-central1
        run: |
          set -o pipefail  # keep a deploy-gcp.sh failure from being masked by tee
          chmod +x scripts/deploy-gcp.sh
          OUTPUT=$(./scripts/deploy-gcp.sh | tee /dev/stderr)
          URL=$(echo "${OUTPUT}" | grep '^preview_url=' | tail -n1 | cut -d= -f2-)
          if [ -z "${URL}" ]; then
            echo "deploy-gcp.sh did not emit a preview_url line" >&2
            exit 1
          fi
          echo "url=${URL}" >> "${GITHUB_OUTPUT}"

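      # Look up the comment left by an earlier run so it can be updated in place.
      - name: Find existing preview comment
        id: find-comment
        uses: peter-evans/find-comment@v3
        with:
          issue-number: ${{ github.event.pull_request.number }}
          comment-author: 'github-actions[bot]'
          body-includes: 'Preview deployed:'

      - name: Comment preview URL on PR
        uses: peter-evans/create-or-update-comment@v4
        with:
          comment-id: ${{ steps.find-comment.outputs.comment-id }}
          issue-number: ${{ github.event.pull_request.number }}
          edit-mode: replace
          body: |
            Preview deployed: ${{ steps.run.outputs.url }}

            Commit: `${{ github.event.pull_request.head.sha }}`
            Image:  `${{ needs.build.outputs.image }}`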

  # ---------------------------------------------------------------------------
  # test
  # ---------------------------------------------------------------------------
  test:
    if: github.event.action != 'closed'
    needs: deploy
    uses: ./.github/workflows/preview-test-autonoma.yml
    with:
      preview_url: ${{ needs.deploy.outputs.url }}
    secrets: inherit

  # ---------------------------------------------------------------------------
  # teardown
  # ---------------------------------------------------------------------------
  teardown:
    if: github.event.action == 'closed'
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v4

      - name: Authenticate to Google Cloud
        uses: google-github-actions/auth@v2
        with:
          workload_identity_provider: ${{ secrets.GCP_WORKLOAD_IDENTITY_PROVIDER }}
          service_account: ${{ secrets.GCP_SERVICE_ACCOUNT }}

      - name: Set up gcloud
        uses: google-github-actions/setup-gcloud@v2

      - name: Destroy preview environment
        env:
          PR_NUMBER: ${{ github.event.pull_request.number }}
          PROVIDER: gcp
          PROJECT_ID: ${{ secrets.GCP_PROJECT_ID }}
          REGION: us-central1
        run: |
          chmod +x scripts/teardown.sh
          ./scripts/teardown.sh

A few design decisions in this workflow are worth explaining. The concurrency block at the top cancels in-progress runs for the same PR when a new push arrives. Without this, two pushes in quick succession create two parallel deploy workflows that race to update the same Cloud Run service. The cancel-in-progress: true setting ensures only the latest push is deployed, which is the correct behavior for preview environments. Note that the closed event shares this group as written, so a reopen can cancel a teardown that is still running. If that matters, give teardown its own concurrency group (e.g. teardown-pr-${{ github.event.number }}) so rapid close/reopen cycles don't race each other.

Authentication to your cloud deserves a closer look, because the workflow uses it without explanation. Long-lived AWS or GCP service-account keys stored as GitHub secrets are the old way and a liability for preview environments: every PR runner holds production-capable credentials. The modern pattern is GitHub OIDC, where each run gets a short-lived JWT that the cloud provider exchanges for scoped credentials. Add permissions: id-token: write to the job, then use aws-actions/configure-aws-credentials@v4 with role-to-assume, or google-github-actions/auth@v2 with workload identity federation. There are no long-lived secrets in GitHub, credentials expire after the job, and each PR gets a separate session. This is essential for any preview environment workflow that deploys to production-adjacent infrastructure.

The outputs block on the deploy job passes the preview URL to the test job. GitHub Actions job outputs require explicit declaration: a step output from one job isn't automatically visible to another job. The pattern is: declare the output in the job block, set it in the step with echo "url=..." >> $GITHUB_OUTPUT, and reference it in downstream jobs with needs.deploy.outputs.url.
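The step-level half of that pattern can be sketched in isolation. The snippet below assumes, as the deploy step does, that the deploy script prints a preview_url=<value> line; the sample log text, the example URL, and the temp-file fallback for GITHUB_OUTPUT are illustrative assumptions so the sketch runs outside GitHub Actions.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Read deploy-script output on stdin and print the value of the last
# "preview_url=<value>" line. Returns non-zero if no such line exists.
extract_preview_url() {
  local url
  url="$(grep '^preview_url=' | tail -n1 | cut -d= -f2-)" || true
  [ -n "${url}" ] || { echo "no preview_url line found" >&2; return 1; }
  printf '%s\n' "${url}"
}

# Outside GitHub Actions, fall back to a temp file (assumption for local runs).
GITHUB_OUTPUT="${GITHUB_OUTPUT:-$(mktemp)}"

# Illustrative deploy log; the URL is made up.
deploy_log=$'deploying...\npreview_url=https://pr-42.example.test/?a=b\ndone'
url="$(printf '%s\n' "${deploy_log}" | extract_preview_url)"

# This line is what makes the value available as steps.<id>.outputs.url.
echo "url=${url}" >> "${GITHUB_OUTPUT}"
```

Using cut -d= -f2- rather than -f2 keeps any = characters inside the URL's query string intact.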

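The PR comment step at the end of the deploy job uses the peter-evans/create-or-update-comment action. On its own the action adds a new comment every time it runs; it edits an existing comment only when it is given a comment-id. Pairing it with peter-evans/find-comment, which looks up the previous preview comment, gives the intended behavior: a new comment on first deploy, and an update to that comment on subsequent pushes. This prevents the PR comment thread from filling up with one comment per push.

Figure: Complete GitHub Actions preview pipeline: build the Docker image, push to the registry, deploy to the cloud, post the URL to the PR, run E2E tests, and tear down on close.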

Adding Autonoma as the Testing Layer

For teams who built the infrastructure themselves and want to hand off the testing layer, Autonoma is a clean fit. You built the Docker pipeline. You own the cloud resources. You control the URL routing. Autonoma slots into Component 4 as a single API call and handles everything that would otherwise be 30+ lines of Playwright YAML.

We built Autonoma specifically for workflows like this one. The Planner agent reads your codebase to understand your routes, components, and user flows, with no recording and no test scripts. The Automator agent runs generated test cases against the URL you pass in. The Maintainer agent keeps tests passing as your code changes. When the preview teardown runs, the test run is complete and results are already in the PR.

For self-hosted teams, the API integration is the recommended path over the GitHub Action. It's CI-system-agnostic: the same curl call works in GitHub Actions, GitLab CI, CircleCI, or a bare bash script. No GitHub App installation is required, and there is no action versioning to track.

The Playwright-in-GitHub-Actions guide covers the full Playwright workflow, including browser caching, sharding, and report artifact management. It is useful if you're committed to owning the Playwright path directly.

Cloud Provider Comparison

Cloud provider	Deploy command	URL pattern	Teardown method	Cost per preview per hour	Setup complexity
AWS ECS/Fargate	`aws ecs update-service`	Custom subdomain via ALB	Delete ECS service + task def	~$0.05 (0.5 vCPU / 1 GB)	High: IAM, ALB, VPC, cluster
GCP Cloud Run	`gcloud run deploy`	`pr-N-hash-region.run.app`	`gcloud run revisions delete`	$0.00 idle, ~$0.024 active	Medium: one gcloud CLI call
DigitalOcean App Platform	`doctl apps create`	`app-name.ondigitalocean.app`	`doctl apps delete`	~$0.0139 (512 MB)	Medium: spec file + doctl
Hetzner (Docker on VPS)	SSH + `docker run`	Custom via Nginx reverse proxy	SSH + `docker rm`	~$0.006 (shared VPS slice)	Medium: Nginx config management
Fly.io	`flyctl deploy`	`pr-N-appname.fly.dev`	`flyctl apps destroy`	$0.00 idle, ~$0.019 active	Low: single flyctl command

Choosing between them comes down to three questions. First, do you need zero idle cost? Cloud Run and Fly.io scale to zero between requests. ECS Fargate, DigitalOcean, and Hetzner keep resources running and accruing cost for the PR's lifetime. Second, do you need full Docker Compose support? Hetzner is the only option here where you can run a multi-container stack natively with Docker Compose. Third, how much AWS/GCP footprint do you already have? If your production runs on AWS, ECS reuses the infrastructure you already understand and pay for. Starting from zero, Fly.io has the lowest friction.

Condensed to a single decision rule:

Already on AWS in production: ECS/Fargate. Reuses your IAM, VPC, and billing.
Want zero idle cost and no Docker Compose need: Cloud Run or Fly.io.
Need full Docker Compose natively: Hetzner + Traefik.
Fastest time-to-first-preview: Fly.io. One flyctl command and you have a URL.
Highest concurrent preview count at the lowest per-preview cost: Hetzner. A €6/month CX21 hosts 10-20 previews simultaneously.

The best cloud for your preview environments is the one your team already knows how to operate at 2am when something breaks. Cloud Run is technically elegant. AWS ECS is what most platform engineers can debug without Googling.

What Breaks in Production

Every preview-environment post on the internet makes this look sunny. It isn't. Five failure modes show up within the first month of running self-hosted previews at any real volume. Knowing about them in advance is the difference between a one-hour fix and a four-hour 2am debugging session.

Stuck teardowns when the deploy failed mid-run. A workflow that crashes between "create resource" and "record resource ID" leaves the cloud resource running with no record of it. The teardown trigger fires on PR close, looks for the ID, finds nothing, and exits clean while the orphan keeps accruing cost. The fix is idempotent teardown: delete by a naming convention tied to PR number, not a stored ID. Run your teardown script with || true on each delete so the workflow succeeds even when resources don't exist.

Wildcard SSL certificates don't cover new subdomain levels. Your wildcard *.preview.yourapp.com cert covers pr-42.preview.yourapp.com, and it also covers pr-42-feature.preview.yourapp.com, because a dash stays inside a single label. It does not cover api.pr-42.preview.yourapp.com: the wildcard matches exactly one label. The moment someone on your team changes the subdomain pattern to add an extra level, every preview comes up with a browser warning and the PR feedback loop breaks. Stick to a single-level wildcard pattern and enforce it in the deploy script.
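One way to enforce the single-level rule in a deploy script is a guard like the one below. It is a sketch under stated assumptions: covered_by_wildcard is a hypothetical helper name, and the check relies on the fact that the asterisk in a wildcard certificate stands in for exactly one DNS label.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Succeeds only if HOST is exactly one DNS label under BASE_DOMAIN,
# which is what a certificate for *.BASE_DOMAIN covers.
covered_by_wildcard() {
  local host="$1" base_domain="$2"
  local label="${host%".${base_domain}"}"
  local re='^[a-z0-9]([a-z0-9-]*[a-z0-9])?$'   # one label: no dots allowed
  [ "${label}" != "${host}" ] || return 1      # host is not under base_domain
  [ "${#label}" -le 63 ] || return 1           # DNS label length limit
  [[ "${label}" =~ ${re} ]]
}

for host in pr-42.preview.yourapp.com api.pr-42.preview.yourapp.com; do
  if covered_by_wildcard "${host}" preview.yourapp.com; then
    echo "ok      ${host}"
  else
    echo "REJECT  ${host}"
  fi
done
```

Calling the guard before the deploy step creates any DNS or routing entry turns a silent certificate mismatch into an immediate, readable failure.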

Port exhaustion on shared VPS above ~20 concurrent previews. Hetzner setups commonly assign each preview a port from 8000-9000. Without active cleanup of orphaned containers, that range fills up and new deploys silently fail to bind. Add a health check that counts running containers with the preview label and alerts when it exceeds 80% of the port range.
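A health check along those lines might look like this. It is a rough sketch: it assumes preview containers are started with the label preview=true, uses the 8000-9000 range and 80% threshold from the paragraph above, and only prints a warning where a real setup would send an alert.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Succeeds (exit 0) when COUNT exceeds PCT percent of the ports in the
# inclusive range START..END. Integer arithmetic only.
port_range_nearly_full() {
  local count="$1" start="$2" end="$3" pct="${4:-80}"
  local total=$(( end - start + 1 ))
  (( count * 100 > total * pct ))
}

# Count running preview containers. Assumes the label preview=true was set
# at deploy time; reports 0 if docker is missing or the daemon is unreachable.
running_previews() {
  if command -v docker >/dev/null 2>&1; then
    { docker ps --quiet --filter "label=preview=true" 2>/dev/null || true; } \
      | wc -l | tr -d ' '
  else
    echo 0
  fi
}

count="$(running_previews)"
if port_range_nearly_full "${count}" 8000 9000 80; then
  echo "WARNING: ${count} previews running, over 80% of ports 8000-9000" >&2
fi
```

Run from cron or a scheduled workflow, the check surfaces orphaned containers before new deploys start failing to bind.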

Fork PRs and the pull_request_target trap. The pull_request event from forks doesn't expose secrets, which means forked PRs can't deploy. Switching to pull_request_target exposes secrets but runs the base branch's workflow with the fork's code, a pwn risk if the workflow checks out and executes the PR's code. The safe pattern: build on pull_request (no secrets, isolated), then deploy on pull_request_target with an if: github.event.pull_request.head.repo.full_name == github.repository guard that blocks fork deploys.

PR comment bot races on fast-push workflows. Pushing three times in rapid succession can create three "preview ready" comments because each workflow run reaches the comment step before the concurrency cancel kicks in. Look up the existing comment first, for example with peter-evans/find-comment matching on comment-author and a fixed phrase in the body, and pass its ID to peter-evans/create-or-update-comment as comment-id, so subsequent runs update that comment instead of adding a new one.

The preview environment you never debug at 2am is the one with idempotent teardowns, a single-level wildcard pattern, and a build job that never touches production secrets.

FAQ

What are self-hosted preview environments?

Self-hosted preview environments are ephemeral deployments that spin up automatically for each pull request on infrastructure you own and control (AWS, GCP, DigitalOcean, Hetzner, or Fly.io) rather than on a managed platform like Vercel or Netlify. Each PR gets a unique URL, tests run against the real deployed app, and the environment tears down when the PR closes.

Why build self-hosted preview environments instead of using Vercel?

Vercel preview environments only work for frontend apps. If your stack includes a custom backend, a worker process, a websocket server, or services that need to run alongside your frontend, Vercel can't preview them together. Self-hosted preview environments run your entire Docker Compose stack (frontend, backend, workers) in one ephemeral environment per PR.

How do I generate a preview URL for each pull request?

Use the PR number as part of the subdomain or path. For subdomain-based routing, create a DNS record like pr-123.preview.yourapp.com pointing to your cloud resource. For path-based routing on a shared host, proxy /preview/pr-123 to the container. Wildcard DNS (*.preview.yourapp.com) covers all PR subdomains with one record, avoiding per-PR DNS writes.
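Both URL shapes can be derived from the PR number with a couple of small helpers. The function names below are hypothetical, and the domains are the placeholder ones used in this article.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Reject anything that is not a plain PR number before it reaches DNS or a proxy.
require_pr_number() {
  [[ "$1" =~ ^[0-9]+$ ]] || { echo "PR number must be numeric: $1" >&2; return 1; }
}

# Subdomain routing: pr-<N>.<base domain>
preview_host() {
  local pr_number="$1" base_domain="$2"
  require_pr_number "${pr_number}" || return 1
  printf 'pr-%s.%s\n' "${pr_number}" "${base_domain}"
}

# Path routing on a shared host: https://<host>/preview/pr-<N>
preview_path_url() {
  local pr_number="$1" shared_host="$2"
  require_pr_number "${pr_number}" || return 1
  printf 'https://%s/preview/pr-%s\n' "${shared_host}" "${pr_number}"
}

echo "https://$(preview_host 123 preview.yourapp.com)"
preview_path_url 123 yourapp.com
```

Validating the PR number up front keeps unexpected input out of hostnames and proxy rules.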

How do I tear down preview environments automatically?

Add a teardown job, or a second GitHub Actions workflow, that runs on pull_request with types [closed]. The closed event fires for both merged and unmerged PRs. The job should delete the cloud resource by PR number and optionally remove the DNS record. On AWS ECS, delete the service and task definition. On Cloud Run, delete the revision. On Fly.io, run flyctl apps destroy. On Hetzner, delete the server by label.

How do I run E2E tests against a self-hosted preview environment?

Two paths. The Playwright path: add steps to your preview workflow that install Playwright, install browsers, wait for the preview URL to respond, run npx playwright test with BASE_URL set to the preview URL, and upload the HTML report as an artifact. This requires 30+ lines of YAML and ownership of flakiness, browser management, and reporting. The Autonoma path: one API call to POST https://api.prod.autonoma.app/v1/run/folder/<folder-id> with the preview URL as the base_url parameter. Autonoma's agents read your codebase, generate tests, run them against the preview, and post results back to the PR.
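The "wait for the preview URL to respond" step on the Playwright path is usually a small retry loop. A minimal sketch follows; retry_until_ok is a hypothetical helper, and the attempt count and delay in the commented example are arbitrary values to tune for your deploy times.

```shell
#!/usr/bin/env bash
set -euo pipefail

# retry_until_ok MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_ATTEMPTS is reached.
retry_until_ok() {
  local max_attempts="$1" delay="$2"
  shift 2
  local attempt
  for (( attempt = 1; attempt <= max_attempts; attempt++ )); do
    if "$@"; then
      echo "ready after ${attempt} attempt(s)"
      return 0
    fi
    # Sleep between attempts, but not after the final one.
    if (( attempt < max_attempts )); then
      sleep "${delay}"
    fi
  done
  echo "not ready after ${max_attempts} attempts" >&2
  return 1
}

# Example (needs network access, so it is left commented out):
# retry_until_ok 30 5 curl --fail --silent --show-error --output /dev/null "${PREVIEW_URL}"
```

Because the probe is passed in as a command, the same loop works for a health endpoint, a database port check, or anything else that exits 0 when the preview is ready.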

Which cloud is cheapest for self-hosted preview environments?

Hetzner is cheapest by far for teams running many simultaneous previews: a CX21 server (2 vCPU, 4 GB RAM) costs roughly €0.006/hour. Fly.io is comparable at $0.0032/vCPU/hour with no minimum. DigitalOcean App Platform charges $0.0139/hour per container. AWS ECS Fargate and GCP Cloud Run cost more per active hour; Cloud Run, like Fly.io, scales to zero when idle, while Fargate bills for as long as the task runs. The right choice depends on your concurrent preview count and whether idle cost matters more than per-hour rate.

Self-Hosted Preview Environments with GitHub Actions + Docker

Why Self-Host Preview Environments?

The DIY Preview Stack

Component 1: Building Docker Preview Images with GitHub Actions

Component 2: Deploying to a Temporary URL

Component 3: DNS and Routing for Preview URLs

Component 4: Running E2E Tests Against the Preview

Component 5: Teardown on Merge or Close

The Complete GitHub Actions Preview Deployment Workflow

Adding Autonoma as the Testing Layer

Cloud Provider Comparison

What Breaks in Production

FAQ

What are self-hosted preview environments?

Why build self-hosted preview environments instead of using Vercel?

How do I generate a preview URL for each pull request?

How do I tear down preview environments automatically?

How do I run E2E tests against a self-hosted preview environment?

Which cloud is cheapest for self-hosted preview environments?
