feat(mail): support gzip compressed email storage via ENABLE_MAIL_GZIP (#933)

* feat(mail): support gzip compressed email storage in D1 raw_blob column

Add ENABLE_MAIL_GZIP env var to optionally gzip-compress incoming emails
into a new raw_blob BLOB column, saving D1 storage space. Reading is
backward-compatible: prioritizes raw_blob (decompress) with fallback to
plaintext raw field. Includes DB migration v0.0.7, docs, and changelogs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: gzip fallback on missing column + decouple resolve from handleListQuery

- email/index.ts: gzip INSERT failure now falls back to plaintext INSERT
  instead of silently losing the email (P1: data loss prevention)
- common.ts: add handleMailListQuery for raw_mails-specific list queries
  with resolveRawEmailList, keeping handleListQuery generic
- Replace handleListQuery → handleMailListQuery in mails_api, admin_mail_api,
  user_mail_api (only raw_mails callers)
- Add e2e test infrastructure: worker-gzip service, wrangler.toml.e2e.gzip,
  api-gzip playwright project, mail-gzip.spec.ts with 4 test cases

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: address CodeRabbit review feedback for gzip feature

- Use destructuring in resolveRawEmailRow to truly remove raw_blob key
- Narrow fallback scope: only fallback to plaintext on compression failure
  or missing raw_blob column, re-throw other DB errors
- Clean unused imports in e2e gzip test

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: add try-catch in resolveRawEmail to prevent single corrupt blob from failing entire list

A corrupted raw_blob would cause decompressBlob to throw, which with
Promise.all in resolveRawEmailList would reject the entire batch query.
Now catches decompression errors and falls back to row.raw field.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(mail): align sendAdminInternalMail with gzip storage path

sendAdminInternalMail now respects ENABLE_MAIL_GZIP: compresses to
raw_blob when enabled, with fallback to plaintext on failure.
Added e2e test verifying admin internal mail is readable under gzip.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(e2e): match admin internal mail by body content instead of encoded subject

mimetext base64-encodes the Subject header, so the raw MIME string
does not contain the literal subject text. Match on body content
(balance: 99) which is plaintext.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(e2e): add WORKER_GZIP_URL guard and length assertions in gzip tests

Address CodeRabbit feedback:
- Skip gzip tests when WORKER_GZIP_URL is not set to prevent false positives
- Assert results array length before accessing [0] for clearer error messages

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(mail): narrow gzip fallback scope and fix webhook query compatibility

- sendAdminInternalMail: separate compress vs DB error handling, only
  fallback to plaintext on compression failure or missing raw_blob
  column, rethrow other DB errors (aligns with email/index.ts)
- Webhook test endpoints: use SELECT * instead of explicit raw_blob
  column reference, so pre-migration databases don't 500
- Docs/changelog: clarify that db_migration must run before enabling
  ENABLE_MAIL_GZIP

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(telegram): use generic Record type for raw_mails query result

Align with other query sites — avoid hardcoding raw_blob in the
TypeScript type annotation so the query works with or without the
column after migration.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* refactor(models): add RawMailRow type and unify raw_mails query typing

Add RawMailRow type to models with raw_blob as optional field, replacing
ad-hoc Record<string, unknown> and inline type annotations across
webhook test endpoints, telegram API, and gzip utilities.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
This commit is contained in:
Dream Hunter
2026-04-04 18:46:39 +08:00
committed by GitHub
parent 53c35062c8
commit 7c6d0d7c8a
28 changed files with 563 additions and 40 deletions

View File

@@ -70,6 +70,7 @@
| `FORWARD_ADDRESS_LIST` | JSON | Global forward address list, disabled if not configured, all emails will be forwarded to listed addresses when enabled | `["xxx@xxx.com"]` |
| `REMOVE_EXCEED_SIZE_ATTACHMENT` | Text/JSON | If attachment exceeds 2MB, remove it, email may lose some information due to parsing | `true` |
| `REMOVE_ALL_ATTACHMENT` | Text/JSON | Remove all attachments, email may lose some information due to parsing | `true` |
| `ENABLE_MAIL_GZIP` | Text/JSON | When enabled, new emails are gzip-compressed and stored in `raw_blob` column to save D1 database space. Existing plaintext `raw` data is automatically compatible for reading. **Run database migration first (Admin → DB Migration) to ensure `raw_blob` column exists before enabling** | `true` |
> [!NOTE]
> `Junk mail checking` and `attachment removal` require email parsing, free tier CPU is limited, may cause large email parsing timeout

View File

@@ -66,6 +66,7 @@
| `FORWARD_ADDRESS_LIST` | JSON | 全局转发地址列表,如果不配置则不启用,启用后所有邮件都会转发到列表中的地址 | `["xxx@xxx.com"]` |
| `REMOVE_EXCEED_SIZE_ATTACHMENT` | 文本/JSON | 如果附件大小超过 2MB则删除附件邮件可能由于解析而丢失一些信息 | `true` |
| `REMOVE_ALL_ATTACHMENT` | 文本/JSON | 移除所有附件,邮件可能由于解析而丢失一些信息 | `true` |
| `ENABLE_MAIL_GZIP` | 文本/JSON | 启用后新邮件将 Gzip 压缩存储到 `raw_blob` 字段,可节省 D1 数据库空间。已有明文 `raw` 数据自动兼容读取。**启用前请先执行数据库迁移(管理后台 → 数据库迁移),确保 `raw_blob` 列已创建** | `true` |
> [!NOTE]
> `垃圾邮件检查` 和 `移除附件功能` 需要解析邮件,免费版 CPU 有限,可能会导致大邮件解析超时