Mirror of https://github.com/dreamhunter2333/cloudflare_temp_email.git (synced 2026-07-05 14:21:54 +08:00)
feat: regex fallback for verification code extraction without Workers AI (#1048)
feat: add regex fallback for verification code extraction without Workers AI

When AI email extraction is enabled but no Workers AI binding is available, fall back to a built-in, zero-dependency regex extractor so self-hosted deployments without Workers AI still surface verification codes in Telegram notifications and webhooks.

- Add worker/src/email/extract_code.ts: rule-based multilingual (English / Chinese / Japanese / Korean) verification-code extractor with year and YYYYMMDD date rejection to avoid false positives.
- ai_extract.ts: share the allowlist check and content parsing across both paths, extract a saveExtractMetadata helper, and use the regex fallback when env.AI is absent.
- Reuse the existing aiExtractResult pipeline (auth_code type), so Telegram and webhook output need no changes.
- Update bilingual CHANGELOG and AI-extract feature docs.
@@ -43,6 +43,18 @@ Or add in Cloudflare Dashboard Worker settings:
- **Variable name**: `AI`
- **Type**: Workers AI
## Fallback Without a Workers AI Binding

If `ENABLE_AI_EMAIL_EXTRACT` is enabled but **no Workers AI binding is configured** (e.g. a self-hosted deployment without Workers AI), the system automatically falls back to a built-in **regex verification-code extractor**:

- Extracts **verification codes** (`auth_code`) only; links are not extracted (link extraction requires AI)
- Zero dependency, zero cost, runs locally inside the Worker
- Supports common verification-code formats in English, Chinese, Japanese and Korean
- Rejects years (e.g. `2026`) and `YYYYMMDD` dates to reduce false positives
- Results are written to `metadata` and reuse the same Telegram / webhook placeholders (`aiExtractType` is `auth_code` in this case)

When a Workers AI binding is configured, AI extraction is still preferred (recognizing both codes and links) and this fallback does not apply.
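
The rules listed above can be illustrated with a minimal TypeScript sketch. This is not the code in `worker/src/email/extract_code.ts`: the function name `extractAuthCode`, the keyword list, the 4 to 8 digit length range, the 1900 to 2099 year range, and the search window around the keyword are all assumptions chosen for illustration.

```typescript
// Hypothetical sketch of a rule-based verification-code extractor.
// Keywords, length limits and window sizes are illustrative assumptions,
// not the values used by the project.

// A small multilingual keyword list (English / Chinese / Japanese / Korean).
const KEYWORDS =
  /verification code|security code|one-time code|\botp\b|验证码|校验码|認証コード|確認コード|인증번호|인증 코드/i;

// A 4-digit number in an assumed plausible year range is treated as a year.
function looksLikeYear(digits: string): boolean {
  if (digits.length !== 4) return false;
  const year = Number(digits);
  return year >= 1900 && year <= 2099;
}

// An 8-digit number that parses as YYYYMMDD is treated as a date.
function looksLikeDate(digits: string): boolean {
  if (digits.length !== 8) return false;
  const year = Number(digits.slice(0, 4));
  const month = Number(digits.slice(4, 6));
  const day = Number(digits.slice(6, 8));
  return year >= 1900 && year <= 2099 && month >= 1 && month <= 12 && day >= 1 && day <= 31;
}

function extractAuthCode(text: string): string | null {
  const keyword = KEYWORDS.exec(text);
  if (keyword === null) return null; // no keyword, so no code is reported

  // Only accept digit runs that start near the first keyword (assumed window).
  const windowStart = keyword.index - 40;
  const windowEnd = keyword.index + keyword[0].length + 80;

  // Fresh regex per call, so the global flag's lastIndex state is never shared.
  const candidates = /\b\d{4,8}\b/g;
  let match: RegExpExecArray | null;
  while ((match = candidates.exec(text)) !== null) {
    if (match.index < windowStart || match.index > windowEnd) continue;
    if (looksLikeYear(match[0]) || looksLikeDate(match[0])) continue;
    return match[0];
  }
  return null;
}
```

With this sketch, `extractAuthCode("Verification code sent on 20260705: 5831")` returns `"5831"` because the date is skipped, and `extractAuthCode("Your verification code expires in 2026")` returns `null` because the only candidate looks like a year. The trade-off of year rejection is that a genuine 4-digit code such as `2026` would also be dropped.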
## Address Allowlist (Optional)

To control costs and resource usage, you can configure an address allowlist in the Admin console's **AI Extract Settings** page:
@@ -43,6 +43,18 @@ binding = "AI"
- **Variable name**: `AI`
- **Type**: Workers AI
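## Regex Fallback Without a Workers AI Binding

If `ENABLE_AI_EMAIL_EXTRACT` is enabled but **no Workers AI binding is configured** (for example, a self-hosted deployment where Workers AI has not been enabled), the system automatically falls back to the built-in **regex verification-code extraction**:

- Extracts **verification codes** (`auth_code`) only, not links (link extraction depends on AI)
- Zero dependency, zero cost, done locally inside the Worker
- Supports common verification-code formats in Chinese, English, Japanese and Korean
- Automatically excludes years (e.g. `2026`) and `YYYYMMDD` dates to reduce false positives
- Results are likewise written to `metadata` and reuse the Telegram push and webhook placeholders (`aiExtractType` is `auth_code` in this case)

When a Workers AI binding is configured, AI extraction is still preferred (it recognizes verification codes and various kinds of links) and is not affected by this fallback.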
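
The selection rule described above can be sketched as a small TypeScript function. The `ExtractEnv` shape, the boolean type of `ENABLE_AI_EMAIL_EXTRACT`, and the name `chooseExtractPath` are assumptions for illustration; the actual logic lives in `ai_extract.ts` and may differ.

```typescript
// Hypothetical environment shape; only the two fields discussed in the docs.
interface ExtractEnv {
  ENABLE_AI_EMAIL_EXTRACT?: boolean; // assumed boolean here
  AI?: unknown; // Workers AI binding; absent when no binding is configured
}

type ExtractPath = "disabled" | "ai" | "regex";

// Prefer AI when the binding exists; otherwise fall back to the regex extractor.
function chooseExtractPath(env: ExtractEnv): ExtractPath {
  if (!env.ENABLE_AI_EMAIL_EXTRACT) return "disabled";
  return env.AI !== undefined && env.AI !== null ? "ai" : "regex";
}
```

For example, `chooseExtractPath({ ENABLE_AI_EMAIL_EXTRACT: true })` yields `"regex"`, while adding any `AI` binding object yields `"ai"`.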
## Address Allowlist (Optional)

To control costs and resource usage, you can configure an address allowlist on the **AI Extract Settings** page of the Admin console: