feat(mail): support gzip compressed email storage via ENABLE_MAIL_GZIP (#933)

* feat(mail): support gzip compressed email storage in D1 raw_blob column

Add ENABLE_MAIL_GZIP env var to optionally gzip-compress incoming emails
into a new raw_blob BLOB column, saving D1 storage space. Reading is
backward-compatible: prioritizes raw_blob (decompress) with fallback to
plaintext raw field. Includes DB migration v0.0.7, docs, and changelogs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: gzip fallback on missing column + decouple resolve from handleListQuery

- email/index.ts: gzip INSERT failure now falls back to plaintext INSERT
  instead of silently losing the email (P1: data loss prevention)
- common.ts: add handleMailListQuery for raw_mails-specific list queries
  with resolveRawEmailList, keeping handleListQuery generic
- Replace handleListQuery → handleMailListQuery in mails_api, admin_mail_api,
  user_mail_api (only raw_mails callers)
- Add e2e test infrastructure: worker-gzip service, wrangler.toml.e2e.gzip,
  api-gzip playwright project, mail-gzip.spec.ts with 4 test cases

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: address CodeRabbit review feedback for gzip feature

- Use destructuring in resolveRawEmailRow to truly remove raw_blob key
- Narrow fallback scope: only fallback to plaintext on compression failure
  or missing raw_blob column, re-throw other DB errors
- Clean unused imports in e2e gzip test

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: add try-catch in resolveRawEmail to prevent single corrupt blob from failing entire list

A corrupted raw_blob would cause decompressBlob to throw, which with
Promise.all in resolveRawEmailList would reject the entire batch query.
Now catches decompression errors and falls back to row.raw field.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(mail): align sendAdminInternalMail with gzip storage path

sendAdminInternalMail now respects ENABLE_MAIL_GZIP: compresses to
raw_blob when enabled, with fallback to plaintext on failure.
Added e2e test verifying admin internal mail is readable under gzip.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(e2e): match admin internal mail by body content instead of encoded subject

mimetext base64-encodes the Subject header, so the raw MIME string
does not contain the literal subject text. Match on body content
(balance: 99) which is plaintext.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(e2e): add WORKER_GZIP_URL guard and length assertions in gzip tests

Address CodeRabbit feedback:
- Skip gzip tests when WORKER_GZIP_URL is not set to prevent false positives
- Assert results array length before accessing [0] for clearer error messages

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(mail): narrow gzip fallback scope and fix webhook query compatibility

- sendAdminInternalMail: separate compress vs DB error handling, only
  fallback to plaintext on compression failure or missing raw_blob
  column, rethrow other DB errors (aligns with email/index.ts)
- Webhook test endpoints: use SELECT * instead of explicit raw_blob
  column reference, so pre-migration databases don't 500
- Docs/changelog: clarify that db_migration must run before enabling
  ENABLE_MAIL_GZIP

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(telegram): use generic Record type for raw_mails query result

Align with other query sites — avoid hardcoding raw_blob in the
TypeScript type annotation so the query works with or without the
column after migration.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* refactor(models): add RawMailRow type and unify raw_mails query typing

Add RawMailRow type to models with raw_blob as optional field, replacing
ad-hoc Record<string, unknown> and inline type annotations across
webhook test endpoints, telegram API, and gzip utilities.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
This commit is contained in:
Dream Hunter
2026-04-04 18:46:39 +08:00
committed by GitHub
parent 53c35062c8
commit 7c6d0d7c8a
28 changed files with 563 additions and 40 deletions

View File

@@ -11,7 +11,8 @@ RUN pnpm install --frozen-lockfile || (echo "WARN: frozen-lockfile failed, falli
COPY worker/src/ src/
COPY worker/tsconfig.json ./
COPY e2e/fixtures/wrangler.toml.e2e wrangler.toml
ARG WRANGLER_TOML=e2e/fixtures/wrangler.toml.e2e
COPY ${WRANGLER_TOML} wrangler.toml
EXPOSE 8787

View File

@@ -41,10 +41,28 @@ services:
context: ..
dockerfile: e2e/Dockerfile.worker
ports:
- "8788:8788"
command: ["pnpm", "exec", "wrangler", "dev", "--port", "8788", "--ip", "0.0.0.0"]
- "8790:8790"
command: ["pnpm", "exec", "wrangler", "dev", "--port", "8790", "--ip", "0.0.0.0"]
volumes:
- ./fixtures/wrangler.toml.e2e.env-off:/app/worker/wrangler.toml:ro
depends_on:
- mailpit
healthcheck:
test: ["CMD", "curl", "-sf", "http://localhost:8790/health_check"]
interval: 3s
timeout: 5s
start_period: 10s
retries: 20
worker-gzip:
build:
context: ..
dockerfile: e2e/Dockerfile.worker
args:
WRANGLER_TOML: e2e/fixtures/wrangler.toml.e2e.gzip
ports:
- "8788:8788"
command: ["pnpm", "exec", "wrangler", "dev", "--port", "8788", "--ip", "0.0.0.0"]
depends_on:
- mailpit
healthcheck:
@@ -108,7 +126,8 @@ services:
environment:
WORKER_URL: http://worker:8787
WORKER_URL_SUBDOMAIN: http://worker-subdomain:8789
WORKER_URL_ENV_OFF: http://worker-env-off:8788
WORKER_URL_ENV_OFF: http://worker-env-off:8790
WORKER_GZIP_URL: http://worker-gzip:8788
FRONTEND_URL: https://frontend:5173
MAILPIT_API: http://mailpit:8025/api
SMTP_PROXY_HOST: smtp-proxy
@@ -125,6 +144,8 @@ services:
condition: service_healthy
worker-env-off:
condition: service_healthy
worker-gzip:
condition: service_healthy
frontend:
condition: service_started
smtp-proxy:

View File

@@ -4,6 +4,7 @@ import WebSocket from 'ws';
export const WORKER_URL = process.env.WORKER_URL!;
export const WORKER_URL_SUBDOMAIN = process.env.WORKER_URL_SUBDOMAIN || '';
export const WORKER_URL_ENV_OFF = process.env.WORKER_URL_ENV_OFF || '';
export const WORKER_GZIP_URL = process.env.WORKER_GZIP_URL || '';
export const FRONTEND_URL = process.env.FRONTEND_URL!;
export const MAILPIT_API = process.env.MAILPIT_API!;
export const TEST_DOMAIN = 'test.example.com';

View File

@@ -0,0 +1,34 @@
name = "cloudflare_temp_email_gzip"
main = "src/worker.ts"
compatibility_date = "2025-04-01"
compatibility_flags = [ "nodejs_compat" ]
keep_vars = true
[vars]
PREFIX = "tmp"
DEFAULT_DOMAINS = ["test.example.com"]
DOMAINS = ["test.example.com"]
JWT_SECRET = "e2e-test-secret-key"
BLACK_LIST = ""
ENABLE_USER_CREATE_EMAIL = true
ENABLE_USER_DELETE_EMAIL = true
ENABLE_AUTO_REPLY = true
DEFAULT_SEND_BALANCE = 10
ENABLE_ADDRESS_PASSWORD = true
DISABLE_ADMIN_PASSWORD_CHECK = true
ADMIN_PASSWORDS = '["e2e-admin-pass"]'
ENABLE_WEBHOOK = true
E2E_TEST_MODE = true
ENABLE_MAIL_GZIP = true
SMTP_CONFIG = """
{"test.example.com":{"host":"mailpit","port":1025,"secure":false}}
"""
[[kv_namespaces]]
binding = "KV"
id = "e2e-test-kv-gzip-00000000-0000-0000-0000-000000000000"
[[d1_databases]]
binding = "DB"
database_name = "e2e-temp-email-gzip"
database_id = "e2e-test-db-gzip-00000000-0000-0000-0000-000000000000"

View File

@@ -1,6 +1,7 @@
import { defineConfig, devices } from '@playwright/test';
const WORKER_BASE = process.env.WORKER_URL!;
const WORKER_GZIP_BASE = process.env.WORKER_GZIP_URL || '';
const FRONTEND_BASE = process.env.FRONTEND_URL!;
export default defineConfig({
@@ -16,6 +17,13 @@ export default defineConfig({
baseURL: WORKER_BASE,
},
},
{
name: 'api-gzip',
testDir: './tests/api-gzip',
use: {
baseURL: WORKER_GZIP_BASE,
},
},
{
name: 'smtp-proxy',
testDir: './tests/smtp-proxy',

View File

@@ -44,6 +44,21 @@ if [ -n "${WORKER_URL_ENV_OFF:-}" ]; then
done
fi
if [ -n "${WORKER_GZIP_URL:-}" ]; then
echo "==> Waiting for worker-gzip at $WORKER_GZIP_URL ..."
for i in $(seq 1 60); do
if curl -sf "$WORKER_GZIP_URL/health_check" > /dev/null 2>&1; then
echo " Worker-gzip ready after ${i}s"
break
fi
if [ "$i" -eq 60 ]; then
echo "ERROR: Worker-gzip not ready after 60s"
exit 1
fi
sleep 1
done
fi
echo "==> Waiting for frontend at $FRONTEND_URL ..."
for i in $(seq 1 60); do
if curl -skf "$FRONTEND_URL" > /dev/null 2>&1; then
@@ -88,5 +103,12 @@ if [ -n "${WORKER_URL_ENV_OFF:-}" ]; then
echo " Env-off database initialized"
fi
if [ -n "${WORKER_GZIP_URL:-}" ]; then
echo "==> Initializing gzip worker database"
curl -sf -X POST "$WORKER_GZIP_URL/admin/db_initialize" > /dev/null
curl -sf -X POST "$WORKER_GZIP_URL/admin/db_migration" > /dev/null
echo " Gzip worker database initialized"
fi
echo "==> Running Playwright tests"
exec npx playwright test "$@"

View File

@@ -0,0 +1,242 @@
import { test, expect } from '@playwright/test';
import { WORKER_GZIP_URL, TEST_DOMAIN } from '../../fixtures/test-helpers';
/**
* These tests run against a worker instance with ENABLE_MAIL_GZIP=true.
* They verify gzip-compressed storage and backward-compatible reading.
*/
// Helper: create address on the gzip worker
async function createGzipAddress(ctx: any, name: string) {
const uniqueName = `${name}${Date.now()}`;
const res = await ctx.post(`${WORKER_GZIP_URL}/api/new_address`, {
data: { name: uniqueName, domain: TEST_DOMAIN },
});
if (!res.ok()) throw new Error(`Failed to create address: ${res.status()} ${await res.text()}`);
const body = await res.json();
return { jwt: body.jwt, address: body.address, address_id: body.address_id };
}
// Helper: seed mail via receiveMail (goes through email() handler → gzip compression)
async function receiveGzipMail(
ctx: any, address: string,
opts: { subject?: string; html?: string; text?: string; from?: string }
) {
const from = opts.from || `sender@${TEST_DOMAIN}`;
const subject = opts.subject || 'Test Email';
const boundary = `----E2E${Date.now()}`;
const htmlPart = opts.html || `<p>${opts.text || 'Hello from E2E'}</p>`;
const textPart = opts.text || 'Hello from E2E';
const messageId = `<e2e-${Date.now()}-${Math.random().toString(36).slice(2, 10)}@test>`;
const raw = [
`From: ${from}`,
`To: ${address}`,
`Subject: ${subject}`,
`Message-ID: ${messageId}`,
`MIME-Version: 1.0`,
`Content-Type: multipart/alternative; boundary="${boundary}"`,
``,
`--${boundary}`,
`Content-Type: text/plain; charset=utf-8`,
``,
textPart,
`--${boundary}`,
`Content-Type: text/html; charset=utf-8`,
``,
htmlPart,
`--${boundary}--`,
].join('\r\n');
const res = await ctx.post(`${WORKER_GZIP_URL}/admin/test/receive_mail`, {
data: { from, to: address, raw },
});
if (!res.ok()) throw new Error(`Failed to receive mail: ${res.status()} ${await res.text()}`);
const body = await res.json();
if (!body.success) throw new Error(`Mail was rejected: ${body.rejected || 'unknown reason'}`);
}
// Helper: seed mail via seedMail (direct INSERT → plaintext raw, no gzip)
async function seedPlaintextMail(
ctx: any, address: string,
opts: { subject?: string; text?: string; from?: string }
) {
const from = opts.from || `sender@${TEST_DOMAIN}`;
const subject = opts.subject || 'Plaintext Mail';
const messageId = `<e2e-plain-${Date.now()}-${Math.random().toString(36).slice(2, 10)}@test>`;
const raw = [
`From: ${from}`,
`To: ${address}`,
`Subject: ${subject}`,
`Message-ID: ${messageId}`,
`Content-Type: text/plain; charset=utf-8`,
``,
opts.text || 'Hello plaintext from E2E',
].join('\r\n');
const res = await ctx.post(`${WORKER_GZIP_URL}/admin/test/seed_mail`, {
data: { address, source: from, raw, message_id: messageId },
});
if (!res.ok()) throw new Error(`Failed to seed mail: ${res.status()} ${await res.text()}`);
}
// Helper: delete address on gzip worker
async function deleteGzipAddress(ctx: any, jwt: string) {
await ctx.delete(`${WORKER_GZIP_URL}/api/delete_address`, {
headers: { Authorization: `Bearer ${jwt}` },
});
}
test.describe('Mail Gzip Storage', () => {
test.beforeEach(() => {
test.skip(!WORKER_GZIP_URL, 'WORKER_GZIP_URL not set — skipping gzip tests');
});
test('gzip-compressed mail is readable in list', async ({ request }) => {
const { jwt, address } = await createGzipAddress(request, 'gzip-list');
try {
await receiveGzipMail(request, address, {
subject: 'Gzip List Test',
text: 'compressed content here',
});
const res = await request.get(`${WORKER_GZIP_URL}/api/mails?limit=10&offset=0`, {
headers: { Authorization: `Bearer ${jwt}` },
});
expect(res.ok()).toBe(true);
const { results } = await res.json();
expect(results).toHaveLength(1);
expect(results[0].raw).toContain('Gzip List Test');
expect(results[0].raw).toContain('compressed content here');
} finally {
await deleteGzipAddress(request, jwt);
}
});
test('gzip-compressed mail is readable in detail', async ({ request }) => {
const { jwt, address } = await createGzipAddress(request, 'gzip-detail');
try {
await receiveGzipMail(request, address, {
subject: 'Gzip Detail Test',
html: '<b>bold gzip</b>',
});
const listRes = await request.get(`${WORKER_GZIP_URL}/api/mails?limit=10&offset=0`, {
headers: { Authorization: `Bearer ${jwt}` },
});
const { results } = await listRes.json();
expect(results.length).toBeGreaterThanOrEqual(1);
const mailId = results[0].id;
const detailRes = await request.get(`${WORKER_GZIP_URL}/api/mail/${mailId}`, {
headers: { Authorization: `Bearer ${jwt}` },
});
expect(detailRes.ok()).toBe(true);
const mail = await detailRes.json();
expect(mail.raw).toContain('Gzip Detail Test');
expect(mail.raw).toContain('<b>bold gzip</b>');
} finally {
await deleteGzipAddress(request, jwt);
}
});
test('mixed: plaintext seed + gzip receive both readable in same list', async ({ request }) => {
const { jwt, address } = await createGzipAddress(request, 'gzip-mixed');
try {
// 1. Direct INSERT plaintext (simulates pre-gzip data)
await seedPlaintextMail(request, address, {
subject: 'Old Plaintext Mail',
text: 'legacy plain content',
});
// 2. receiveMail → goes through email() handler → gzip compressed
await receiveGzipMail(request, address, {
subject: 'New Gzip Mail',
text: 'new compressed content',
});
const res = await request.get(`${WORKER_GZIP_URL}/api/mails?limit=10&offset=0`, {
headers: { Authorization: `Bearer ${jwt}` },
});
expect(res.ok()).toBe(true);
const { results } = await res.json();
expect(results).toHaveLength(2);
// Both mails should have readable raw content
const subjects = results.map((r: any) => r.raw);
expect(subjects.some((r: string) => r.includes('Old Plaintext Mail'))).toBe(true);
expect(subjects.some((r: string) => r.includes('New Gzip Mail'))).toBe(true);
expect(subjects.some((r: string) => r.includes('legacy plain content'))).toBe(true);
expect(subjects.some((r: string) => r.includes('new compressed content'))).toBe(true);
} finally {
await deleteGzipAddress(request, jwt);
}
});
test('admin internal mail (sendAdminInternalMail) is gzip-compressed and readable', async ({ request }) => {
const { jwt, address } = await createGzipAddress(request, 'gzip-admin-mail');
try {
// 1. Request send access → creates address_sender row
const reqAccessRes = await request.post(`${WORKER_GZIP_URL}/api/request_send_mail_access`, {
headers: { Authorization: `Bearer ${jwt}` },
});
expect(reqAccessRes.ok()).toBe(true);
// 2. Get address_sender id
const senderListRes = await request.get(
`${WORKER_GZIP_URL}/admin/address_sender?limit=10&offset=0&address=${encodeURIComponent(address)}`,
);
expect(senderListRes.ok()).toBe(true);
const senderList = await senderListRes.json();
expect(senderList.results.length).toBeGreaterThanOrEqual(1);
const senderId = senderList.results[0].id;
// 3. Update send access via admin API → triggers sendAdminInternalMail
const updateRes = await request.post(`${WORKER_GZIP_URL}/admin/address_sender`, {
data: { address, address_id: senderId, balance: 99, enabled: true },
});
expect(updateRes.ok()).toBe(true);
// 4. Verify the internal mail is readable
const mailsRes = await request.get(`${WORKER_GZIP_URL}/api/mails?limit=10&offset=0`, {
headers: { Authorization: `Bearer ${jwt}` },
});
expect(mailsRes.ok()).toBe(true);
const { results } = await mailsRes.json();
expect(results.length).toBeGreaterThanOrEqual(1);
// mimetext base64-encodes the Subject header, so match on body content instead
const internalMail = results.find((m: any) => m.raw?.includes('balance: 99'));
expect(internalMail).toBeDefined();
expect(internalMail.raw).toContain('admin@internal');
expect(internalMail.raw).toContain('balance: 99');
expect(internalMail).not.toHaveProperty('raw_blob');
} finally {
await deleteGzipAddress(request, jwt);
}
});
test('raw_blob field is not exposed in API response', async ({ request }) => {
const { jwt, address } = await createGzipAddress(request, 'gzip-noblob');
try {
await receiveGzipMail(request, address, { subject: 'No Blob Leak' });
// Check list response
const listRes = await request.get(`${WORKER_GZIP_URL}/api/mails?limit=10&offset=0`, {
headers: { Authorization: `Bearer ${jwt}` },
});
const { results } = await listRes.json();
expect(results.length).toBeGreaterThanOrEqual(1);
expect(results[0]).not.toHaveProperty('raw_blob');
// Check detail response
const detailRes = await request.get(`${WORKER_GZIP_URL}/api/mail/${results[0].id}`, {
headers: { Authorization: `Bearer ${jwt}` },
});
const mail = await detailRes.json();
expect(mail).not.toHaveProperty('raw_blob');
} finally {
await deleteGzipAddress(request, jwt);
}
});
});