feat: add CLAUDE.md for project guidance and enhance HTTP proxy validation

- Introduced CLAUDE.md to provide comprehensive guidance for the GoProxy project.
- Enhanced proxy validation logic to include HTTPS CONNECT tunnel verification for HTTP proxies.
- Updated README to reflect new features, including the addition of an HTTP proxy HTTPS access testing script.
- Adjusted configuration parameters to set the default HTTP protocol ratio to 30%.
isboyjc
2026-04-01 05:10:46 +08:00
parent b1555a702f
commit dfe71d0390
6 changed files with 351 additions and 47 deletions

115
CLAUDE.md Normal file

@@ -0,0 +1,115 @@
# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Project Overview
GoProxy is an intelligent proxy pool system written in Go. It automatically fetches HTTP/SOCKS5 proxies from public sources, validates them (exit IP + geolocation + latency), and serves them via 4 proxy ports (HTTP random/stable, SOCKS5 random/stable) plus a WebUI dashboard.
## Build & Run
```bash
# Run directly (requires Go 1.25, CGO enabled for sqlite3)
go run .
# Build and run
go build -o proxygo .
./proxygo
# Docker
docker compose up -d
```
CGO is required (`CGO_ENABLED=1`) because of the `github.com/mattn/go-sqlite3` dependency.
## Testing
There are no Go unit tests, so `go test ./...` finds nothing to run. Testing is done with shell scripts against a running instance:
```bash
# HTTP proxy test (continuous, Ctrl+C to stop)
./test/test_proxy.sh # port 7777 (random)
./test/test_proxy.sh 7776 # port 7776 (stable)
# HTTP proxy HTTPS access test (random visits to Google/OpenAI/GitHub etc.)
./test/test_http_https.sh # port 7777, continuous
./test/test_http_https.sh 7776 20 # port 7776, 20 iterations
# SOCKS5 proxy test
./test/test_socks5.sh localhost 7779 # random
./test/test_socks5.sh localhost 7780 50 # stable, 50 iterations
# Go/Python test scripts
go run test/test_proxy.go 7777
python test/test_proxy.py 7776
```
## Architecture
The system is a single binary with several cooperating goroutines. The module name in `go.mod` is `goproxy`.
### Module Dependency Flow
```
main.go (orchestrator)
├── config/ — Global config (env vars + config.json), thread-safe singleton
├── storage/ — SQLite persistence layer (proxies + source_status tables)
├── fetcher/ — Multi-source proxy fetcher with circuit breaker (SourceManager)
├── validator/ — Concurrent proxy validation (connectivity + exit IP + geo + latency)
├── pool/ — Pool manager (admission control, slot allocation, replacement logic)
├── checker/ — Background health checker (batch-based, skips S-grade when healthy)
├── optimizer/ — Background quality optimizer (replaces slow proxies with faster ones)
├── proxy/ — Outward-facing proxy servers
│ ├── server.go — HTTP proxy (implements http.Handler)
│ └── socks5_server.go — SOCKS5 proxy (raw TCP, manual protocol implementation)
├── webui/ — Dashboard server (embedded HTML in html.go, API in dashboard.go)
└── logger/ — In-memory log collector for WebUI display
```
### Key Design Patterns
- **Pool state machine**: healthy → warning → critical → emergency. State determines fetch mode (optimize/refill/emergency) and latency thresholds.
- **Slot-based capacity**: Pool has fixed size split between HTTP/SOCKS5 by configurable ratio (default 3:7). Each protocol has guaranteed minimum slots.
- **Smart admission**: New proxies enter if slots available, or replace worst existing proxy if significantly faster (30%+ by default via `ReplaceThreshold`). HTTP proxies must also pass an HTTPS CONNECT tunnel test (random real HTTPS site visit with retry) before admission.
- **Protocol-parallel validation**: `smartFetchAndFill` splits candidates by protocol and validates SOCKS5/HTTP concurrently. SOCKS5 fills faster (no HTTPS check overhead); HTTP validation runs in parallel without blocking SOCKS5 admission.
- **Circuit breaker on sources**: `SourceManager` tracks consecutive failures per source URL. 3 consecutive failures mark a source as degraded, and 5 disable it for 30 minutes.
- **Auto-retry on proxy failure**: Both HTTP and SOCKS5 servers retry with different upstream proxies on failure (up to `MaxRetry` times), deleting failed proxies immediately.
- **SOCKS5 service only uses SOCKS5 upstreams** (many free HTTP proxies don't support CONNECT). HTTP service can use either protocol upstream.
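The slot split and the replacement rule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's `pool` code: `splitSlots` and `shouldReplace` are hypothetical names, and the rounding and minimum-slot clamping are guesses at reasonable behavior.

```go
package main

import "fmt"

// splitSlots divides a fixed pool size between HTTP and SOCKS5 by ratio.
// Hypothetical helper: the rounding and clamping rules are assumptions.
func splitSlots(maxSize int, httpRatio float64, minPerProtocol int) (httpSlots, socks5Slots int) {
	httpSlots = int(float64(maxSize)*httpRatio + 0.5) // round to nearest
	// Give each protocol its guaranteed minimum when the pool is large enough.
	if httpSlots < minPerProtocol {
		httpSlots = minPerProtocol
	}
	if maxSize-httpSlots < minPerProtocol {
		httpSlots = maxSize - minPerProtocol
	}
	if httpSlots < 0 {
		httpSlots = 0
	}
	return httpSlots, maxSize - httpSlots
}

// shouldReplace reports whether a candidate is faster than the worst pooled
// proxy by at least the given fraction (0.3 means 30% faster).
func shouldReplace(newLatencyMs, worstLatencyMs int, threshold float64) bool {
	return float64(newLatencyMs) <= float64(worstLatencyMs)*(1-threshold)
}

func main() {
	h, s := splitSlots(100, 0.3, 10)
	fmt.Println(h, s)                           // 30 70
	fmt.Println(shouldReplace(600, 1000, 0.3))  // true
	fmt.Println(shouldReplace(800, 1000, 0.3))  // false
}
```

With the default ratio of 0.3 and a pool of 100, this yields the 3:7 split mentioned above.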
### Background Goroutines (started in main.go)
1. **Status monitor** — every 30s, checks pool state and triggers `smartFetchAndFill` if needed
2. **Health checker** — every `HealthCheckInterval` min, validates a batch of proxies
3. **Optimizer** — every `OptimizeInterval` min, fetches from slow sources and replaces B/C grade proxies
4. **Config watcher** — listens for WebUI config changes and adjusts pool slots
### Ports
| Port | Service |
|------|---------|
| 7776 | HTTP proxy (lowest-latency mode) |
| 7777 | HTTP proxy (random rotation mode) |
| 7778 | WebUI dashboard |
| 7779 | SOCKS5 proxy (random rotation mode) |
| 7780 | SOCKS5 proxy (lowest-latency mode) |
### Configuration
- Environment variables: `WEBUI_PASSWORD`, `PROXY_AUTH_ENABLED`, `PROXY_AUTH_USERNAME`, `PROXY_AUTH_PASSWORD`, `BLOCKED_COUNTRIES`, `DATA_DIR`
- Persistent config: `config.json` (or `$DATA_DIR/config.json`) — pool capacity, latency thresholds, intervals. Editable via WebUI.
- Config is loaded once at startup via `config.Load()`, updated in-memory via `config.Save()`. Thread-safe via `sync.RWMutex`.
### Storage
SQLite with `MaxOpenConns(1)` (single-writer). Two tables: `proxies` (with quality grades S/A/B/C based on latency) and `source_status` (circuit breaker state). Schema auto-migrates on startup.
### WebUI
The entire frontend is embedded as Go string literals in `webui/html.go`. The server (`webui/server.go`) serves HTML and API endpoints. `webui/dashboard.go` contains API handlers. Dual-role auth: guest (read-only) and admin (full control via password).
## Code Conventions
- All log messages use `[module]` prefix: `[pool]`, `[fetch]`, `[health]`, `[optimize]`, `[monitor]`, `[socks5]`, `[proxy]`, `[tunnel]`, `[storage]`, `[source]`
- Comments and log messages are in Chinese
- Quality grades: S (≤500ms), A (501-1000ms), B (1001-2000ms), C (>2000ms)
- `storage.Proxy` is the shared data type across all modules
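The grade boundaries above map directly onto a small function. The sketch below is illustrative only; `gradeFor` is a hypothetical name and not necessarily how the project computes grades.

```go
package main

import "fmt"

// gradeFor maps a latency in milliseconds to a quality grade using the
// boundaries listed above: S (<=500), A (501-1000), B (1001-2000), C (>2000).
func gradeFor(latencyMs int) string {
	switch {
	case latencyMs <= 500:
		return "S"
	case latencyMs <= 1000:
		return "A"
	case latencyMs <= 2000:
		return "B"
	default:
		return "C"
	}
}

func main() {
	for _, ms := range []int{120, 500, 501, 1500, 2001} {
		fmt.Printf("%dms -> %s\n", ms, gradeFor(ms))
	}
}
```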


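@@ -27,15 +27,17 @@ GoProxy automatically fetches HTTP/SOCKS5 proxies from multiple public proxy sources, through strict
## ✨ Core Features
### 🎯 Smart Pool Mechanism
-- **Fixed-capacity management**: configurable pool size and HTTP/SOCKS5 protocol ratio
+- **Fixed-capacity management**: configurable pool size and HTTP/SOCKS5 protocol ratio (default 3:7)
- **Quality grading**: four grades, S/A/B/C (based on latency), used to select high-quality proxies
- **Dynamic state awareness**: adapts across four states, Healthy → Warning → Critical → Emergency
- **Strict admission**: a proxy must pass exit IP, geolocation, and latency checks before it enters the pool
+- **HTTPS availability check**: before admission, HTTP proxies are additionally tested for HTTPS CONNECT tunnel support by visiting a random real HTTPS site (a failure triggers a retry against a different site), to ensure that admitted HTTP proxies can reach HTTPS sites
- **Smart replacement**: a new proxy must be significantly better than an existing one (30% faster by default) to trigger replacement
### 🚀 On-Demand Fetching
- **Source grouping**: fast-updating sources (5-30 min) are used for emergency refills; slow-updating sources (daily) are used for optimization rotation
- **Circuit breaker protection**: sources that fail consecutively are automatically degraded or disabled, and recover after a cooldown
+- **Protocol-parallel validation**: fetched candidates are grouped by protocol, and SOCKS5 and HTTP are validated and admitted concurrently. SOCKS5 has no extra check, so it is naturally faster and fills first; HTTP, with its HTTPS CONNECT check, is slower but does not block SOCKS5 admission
- **Multi-mode fetching**:
  - **Emergency**: a protocol is missing or the pool is below 10%; uses all available sources
  - **Refill**: the pool is below 80%; uses fast-updating sources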
@@ -133,6 +135,7 @@ GoProxy automatically fetches HTTP/SOCKS5 proxies from multiple public proxy sources, through strict
├── test/ # 🧪 Test scripts and docs
│ ├── test_proxy.sh # HTTP proxy test script (Bash)
│ ├── test_socks5.sh # SOCKS5 proxy test script (Bash)
+│ ├── test_http_https.sh # HTTP proxy HTTPS access test script (Bash)
│ ├── test_proxy.go # Go test script
│ ├── test_proxy.py # Python test script
│ └── README.md # Test script usage guide
@@ -571,7 +574,7 @@ proxies = {'http': 'socks5://myuser:secure_pass_123@server-ip:7779', 'https': 's
```json
{
"pool_max_size": 100,
-"pool_http_ratio": 0.5,
+"pool_http_ratio": 0.3,
"pool_min_per_protocol": 10,
"max_latency_ms": 2000,
"max_latency_healthy": 1500,
@@ -615,7 +618,7 @@ proxies = {'http': 'socks5://myuser:secure_pass_123@server-ip:7779', 'https': 's
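| Parameter | Default | Description | Recommended range |
| --- | --- | --- | --- |
| `pool_max_size` | `100` | Total pool capacity | 50-150 ⚠️ |
-| `pool_http_ratio` | `0.5` | HTTP share of the pool | 0.3-0.8 |
+| `pool_http_ratio` | `0.3` | HTTP share of the pool | 0.2-0.5 |
| `pool_min_per_protocol` | `10` | Guaranteed minimum per protocol | 5-50 |
> ⚠️ **Capacity note**: public proxy sources are of limited quality, and the validation pass rate is typically only 1-3%. Because of geo-filtering, latency standards, exit checks, and other factors, **the actual fill rate is about 70-90%**. With a capacity of 150, for example, the pool may settle at 105-135 proxies. Choose a capacity that matches actual needs.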
@@ -657,7 +660,7 @@ proxies = {'http': 'socks5://myuser:secure_pass_123@server-ip:7779', 'https': 's
```json
{
"pool_max_size": 50,
-"pool_http_ratio": 0.5,
+"pool_http_ratio": 0.3,
"validate_concurrency": 100,
"health_check_interval": 10,
"health_check_batch_size": 10,
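@@ -949,6 +952,7 @@ Emergency (total <10% or a protocol missing)
3. **Geolocation lookup**: get the country/city of the exit IP
4. **Latency test**: measure connection latency
5. **Quality assessment**: compute the quality grade from latency
+6. **HTTPS tunnel check** (HTTP proxies only): visits a random HTTPS site (Google/OpenAI/GitHub/Cloudflare/httpbin) through the proxy to verify that the CONNECT tunnel works; if the first attempt fails, it retries against a different site
**Admission logic**:
- ✅ Protocol slots not full: add directly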
@@ -1140,6 +1144,18 @@ go run test/test_proxy.go 7777
python test/test_proxy.py 7776
```
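+**HTTP proxy HTTPS access test**:
+```bash
+# Continuously test HTTPS site access through the HTTP proxy (random visits to Google/OpenAI/GitHub, etc.)
+./test/test_http_https.sh
+# Specify the port
+./test/test_http_https.sh 7776
+# Specify the port and the number of iterations
+./test/test_http_https.sh 7777 20
+```
**SOCKS5 proxy test**:
```bash
# Test SOCKS5 random rotation mode (port 7779)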
@@ -1347,7 +1363,8 @@ docker logs proxygo --tail 200 | grep -i "socks5.*failed"
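### Enhancements in This Project
Building on the original project, we made many improvements and feature enhancements:
-- 🆕 **Smart pool mechanism**: fixed-capacity management, quality grading (S/A/B/C), smart replacement logic
+- 🆕 **Smart pool mechanism**: fixed-capacity management, quality grading (S/A/B/C), smart replacement logic, default 3:7 HTTP/SOCKS5 ratio
+- 🆕 **HTTPS availability check**: on admission and refresh, HTTP proxies are additionally tested for HTTPS CONNECT tunnel support by visiting a random real site
- 🆕 **On-demand fetching**: source grouping, circuit breaker protection, Emergency/Refill/Optimize modes
- 🆕 **Tiered health management**: batch checks, smart skipping of S-grade proxies, scheduled optimization rotation
- 🆕 **Smart retry**: automatic failover, delete on failure, no repeated attempts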
@@ -1357,7 +1374,7 @@ docker logs proxygo --tail 200 | grep -i "socks5.*failed"
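- 🆕 **Hacker-style WebUI**: Matrix aesthetic, real-time dashboard, full configuration UI, Chinese/English switching
- 🆕 **Dual-role access**: guest mode (read-only) + admin mode (full control), safe to expose on the public internet
- 🆕 **Extended storage layer**: quality grades, usage statistics, source status management
-- 🆕 **Test suite**: HTTP + SOCKS5 test scripts, continuous mode, country flag emoji display
+- 🆕 **Test suite**: HTTP + SOCKS5 + HTTPS access test scripts, continuous mode, country flag emoji display
- 🆕 **CI/CD automation**: GitHub Actions builds multi-arch images (amd64/arm64) automatically and publishes to two registries
- 🆕 **Environment-variable configuration**: docker-compose + .env file for flexible deployment scenarios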


@@ -161,7 +161,7 @@ func DefaultConfig() *Config {
		// Pool capacity settings
		PoolMaxSize:        100, // total capacity
-		PoolHTTPRatio:      0.5, // HTTP share: 50%
+		PoolHTTPRatio:      0.3, // HTTP share: 30%
		PoolMinPerProtocol: 10,  // at least 10 per protocol
		// Latency threshold settings

118
main.go

@@ -172,41 +172,49 @@ func smartFetchAndFill(fetch *fetcher.Fetcher, validate *validator.Validator, st
		return
	}
-	log.Printf("[main] 抓取到 %d 个候选代理,开始严格验证...", len(candidates))
+	// Group candidates by protocol
+	var httpCandidates, socks5Candidates []storage.Proxy
+	for _, c := range candidates {
+		if c.Protocol == "http" {
+			httpCandidates = append(httpCandidates, c)
+		} else {
+			socks5Candidates = append(socks5Candidates, c)
+		}
+	}
-	// Strictly validate and try to admit into the pool
-	addedCount := 0
-	validCount := 0
-	rejectedNoExit := 0
-	rejectedLatency := 0
-	rejectedGeo := 0
-	rejectedFull := 0
+	log.Printf("[main] 抓取到 %d 个候选代理SOCKS5=%d HTTP=%d按协议并发验证...",
+		len(candidates), len(socks5Candidates), len(httpCandidates))
-	for result := range validate.ValidateStream(candidates) {
+	// Shared counters
+	var addedCount atomic.Int32
+	var validCount atomic.Int32
+	var rejectedNoExit atomic.Int32
+	var rejectedLatency atomic.Int32
+	var rejectedGeo atomic.Int32
+	var rejectedFull atomic.Int32
+	// Admission handler (shared by both goroutines)
+	processResult := func(result validator.Result) {
		if !result.Valid {
-			continue
+			return
		}
-		validCount++
+		validCount.Add(1)
		latencyMs := int(result.Latency.Milliseconds())
		// Adjust the latency threshold dynamically based on pool state
		cfg := config.Get()
		maxLatency := cfg.GetLatencyThreshold(status.State)
		// Check: exit IP and location are present
		if result.ExitIP == "" || result.ExitLocation == "" {
-			rejectedNoExit++
-			continue
+			rejectedNoExit.Add(1)
+			return
		}
		// Check: latency is within the threshold
		if latencyMs > maxLatency {
-			rejectedLatency++
-			continue
+			rejectedLatency.Add(1)
+			return
		}
		// Try to add to the pool
		proxyToAdd := storage.Proxy{
			Address:  result.Proxy.Address,
			Protocol: result.Proxy.Protocol,
@@ -216,41 +224,71 @@ func smartFetchAndFill(fetch *fetcher.Fetcher, validate *validator.Validator, st
		}
		if added, reason := poolMgr.TryAddProxy(proxyToAdd); added {
-			addedCount++
+			addedCount.Add(1)
		} else if reason == "slots_full" {
-			rejectedFull++
+			rejectedFull.Add(1)
		} else if len(result.ExitLocation) >= 2 {
			// Check whether the proxy was geo-filtered
			countryCode := result.ExitLocation[:2]
			for _, blocked := range cfg.BlockedCountries {
				if countryCode == blocked {
-					rejectedGeo++
+					rejectedGeo.Add(1)
					break
				}
			}
		}
-		// In emergency mode, stop validating once the minimum is met
-		if mode == "emergency" && status.HTTP >= cfg.PoolMinPerProtocol && status.SOCKS5 >= cfg.PoolMinPerProtocol {
-			log.Println("[main] 🎉 紧急模式:达到最小要求,停止验证")
-			break
-		}
-		// Periodically check whether the pool is already full
-		if addedCount > 0 && addedCount%20 == 0 {
-			currentStatus, _ := poolMgr.GetStatus()
-			if !poolMgr.NeedsFetchQuick(currentStatus) {
-				log.Println("[main] ✅ 池子已填满,停止验证")
-				break
-			}
-		}
	}
+	// Reports whether the pool is already full
+	poolFilled := func() bool {
+		currentStatus, _ := poolMgr.GetStatus()
+		return !poolMgr.NeedsFetchQuick(currentStatus)
+	}
+	var wg sync.WaitGroup
+	// SOCKS5 goroutine: validation is fast, so it fills first
+	if len(socks5Candidates) > 0 {
+		wg.Add(1)
+		go func() {
+			defer wg.Done()
+			count := 0
+			for result := range validate.ValidateStream(socks5Candidates) {
+				processResult(result)
+				count++
+				if count%20 == 0 && poolFilled() {
+					log.Println("[main] ✅ SOCKS5 验证中检测到池子已满,停止")
+					break
+				}
+			}
+			log.Printf("[main] SOCKS5 验证完成,处理 %d 个", count)
+		}()
+	}
+	// HTTP goroutine: slower because of the extra HTTPS check
+	if len(httpCandidates) > 0 {
+		wg.Add(1)
+		go func() {
+			defer wg.Done()
+			count := 0
+			for result := range validate.ValidateStream(httpCandidates) {
+				processResult(result)
+				count++
+				if count%20 == 0 && poolFilled() {
+					log.Println("[main] ✅ HTTP 验证中检测到池子已满,停止")
+					break
+				}
+			}
+			log.Printf("[main] HTTP 验证完成,处理 %d 个", count)
+		}()
+	}
+	wg.Wait()
	// Final status
	finalStatus, _ := poolMgr.GetStatus()
	log.Printf("[main] 填充完成: 验证%d 通过%d 入池%d | 拒绝[无出口:%d 延迟:%d 地理:%d 满:%d] | 最终: %s HTTP=%d SOCKS5=%d",
-		len(candidates), validCount, addedCount,
-		rejectedNoExit, rejectedLatency, rejectedGeo, rejectedFull,
+		len(candidates), validCount.Load(), addedCount.Load(),
+		rejectedNoExit.Load(), rejectedLatency.Load(), rejectedGeo.Load(), rejectedFull.Load(),
		finalStatus.State, finalStatus.HTTP, finalStatus.SOCKS5)
}
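
The pattern in this change (split by protocol, one goroutine per group, shared `atomic.Int32` counters, a `sync.WaitGroup` to join) can be tried in isolation with the sketch below. It is not the project's code: `candidate`, `stream`, and `validateByProtocol` are hypothetical stand-ins, and the `Valid` field replaces real network validation.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// candidate is a hypothetical stand-in for a proxy awaiting validation.
type candidate struct {
	Address  string
	Protocol string // "http" or "socks5"
	Valid    bool   // stand-in for the outcome of a real validation
}

// stream stands in for a streaming validator: it emits each candidate on a channel.
func stream(cands []candidate) <-chan candidate {
	out := make(chan candidate)
	go func() {
		defer close(out)
		for _, c := range cands {
			out <- c
		}
	}()
	return out
}

// validateByProtocol groups candidates by protocol and drains each group in
// its own goroutine, updating shared counters atomically.
func validateByProtocol(cands []candidate) (processed, accepted int32) {
	groups := map[string][]candidate{}
	for _, c := range cands {
		groups[c.Protocol] = append(groups[c.Protocol], c)
	}
	var nProcessed, nAccepted atomic.Int32
	var wg sync.WaitGroup
	for _, group := range groups {
		wg.Add(1)
		go func(group []candidate) {
			defer wg.Done()
			for c := range stream(group) {
				nProcessed.Add(1)
				if c.Valid {
					nAccepted.Add(1)
				}
			}
		}(group)
	}
	wg.Wait() // both groups finish before the counters are read
	return nProcessed.Load(), nAccepted.Load()
}

func main() {
	p, a := validateByProtocol([]candidate{
		{"1.1.1.1:8080", "http", true},
		{"2.2.2.2:1080", "socks5", true},
		{"3.3.3.3:1080", "socks5", false},
	})
	fmt.Println(p, a) // 3 2
}
```

In this sketch each loop drains its channel completely. If a consumer breaks out early, as the diff does when the pool fills, the producer in `stream` would stay blocked on its send unless the channel is buffered or the stream supports cancellation; whether that applies to the real `ValidateStream` depends on its implementation.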

81
test/test_http_https.sh Executable file

@@ -0,0 +1,81 @@
#!/bin/bash
# GoProxy HTTPS access test script for the HTTP proxy
# Visits several HTTPS sites at random to verify the HTTP proxy's CONNECT tunnel support
# Usage: ./test_http_https.sh [port, default 7777] [iterations, default: run continuously]
# Press Ctrl+C to stop
PROXY_HOST="127.0.0.1"
PROXY_PORT="${1:-7777}"
MAX_COUNT="${2:-0}" # 0 = run continuously
DELAY=2
# Test targets (HTTPS sites)
TARGETS=(
    "https://www.google.com"
    "https://www.openai.com"
    "https://www.github.com"
    "https://www.cloudflare.com"
    "https://httpbin.org/ip"
)
# Counters
total=0
success=0
fail=0
# Current time as a millisecond timestamp
get_ms_time() {
    python3 -c 'import time; print(int(time.time() * 1000))'
}
# Handle Ctrl+C: print a summary and exit
trap ctrl_c INT
function ctrl_c() {
    echo ""
    echo "---"
    if [ $total -gt 0 ]; then
        loss_rate=$(awk "BEGIN {printf \"%.1f\", ($total - $success)/$total*100}")
        success_rate=$(awk "BEGIN {printf \"%.1f\", $success/$total*100}")
        echo "$total requests transmitted, $success succeeded, $fail failed, ${loss_rate}% loss, ${success_rate}% success rate"
    fi
    exit 0
}
echo "HTTP PROXY HTTPS TEST — $PROXY_HOST:$PROXY_PORT"
echo "targets: ${#TARGETS[@]} HTTPS sites"
echo ""
while true; do
    # Pick a random target
    idx=$((RANDOM % ${#TARGETS[@]}))
    target="${TARGETS[$idx]}"
    total=$((total + 1))
    start_time=$(get_ms_time)
    response=$(curl -x "http://${PROXY_HOST}:${PROXY_PORT}" \
        -s -k \
        -o /dev/null \
        -w "%{http_code}" \
        --connect-timeout 10 \
        --max-time 15 \
        "${target}" 2>&1)
    end_time=$(get_ms_time)
    elapsed=$((end_time - start_time))
    if [[ "$response" =~ ^[23] ]]; then
        echo "✅ seq=$total ${target} -> HTTP $response time=${elapsed}ms"
        success=$((success + 1))
    else
        echo "❌ seq=$total ${target} -> HTTP $response time=${elapsed}ms"
        fail=$((fail + 1))
    fi
    # Stop once the requested number of iterations is reached
    if [ "$MAX_COUNT" -gt 0 ] && [ "$total" -ge "$MAX_COUNT" ]; then
        ctrl_c
    fi
    sleep $DELAY
done


@@ -83,6 +83,52 @@ func getExitIPInfo(client *http.Client) (string, string) {
	return result.Query, location
}
+// HTTPS test targets; one is picked at random to verify the proxy's CONNECT tunnel support
+var httpsTestTargets = []string{
+	"https://www.google.com",
+	"https://www.openai.com",
+	"https://www.github.com",
+	"https://www.cloudflare.com",
+	"https://httpbin.org/ip",
+}
+// checkHTTPSConnect visits a random HTTPS site through the HTTP proxy to verify that the CONNECT tunnel works.
+// If the first attempt fails, it retries once with a different target, so that a transient failure of one target site does not wrongly reject the proxy.
+func checkHTTPSConnect(proxyAddr string, timeout time.Duration) bool {
+	proxyURL, err := url.Parse(fmt.Sprintf("http://%s", proxyAddr))
+	if err != nil {
+		return false
+	}
+	client := &http.Client{
+		Transport: &http.Transport{
+			Proxy:               http.ProxyURL(proxyURL),
+			TLSHandshakeTimeout: timeout,
+		},
+		Timeout: timeout,
+	}
+	// Random starting index
+	start := int(time.Now().UnixNano() % int64(len(httpsTestTargets)))
+	for attempt := 0; attempt < 2; attempt++ {
+		idx := (start + attempt) % len(httpsTestTargets)
+		resp, err := client.Get(httpsTestTargets[idx])
+		if err != nil {
+			continue
+		}
+		io.Copy(io.Discard, resp.Body)
+		resp.Body.Close()
+		// Both 2xx and 3xx count as success (some sites redirect)
+		if resp.StatusCode >= 200 && resp.StatusCode < 400 {
+			return true
+		}
+	}
+	return false
+}
// ValidateAll validates all proxies concurrently and returns the results
func (v *Validator) ValidateAll(proxies []storage.Proxy) []Result {
	var results []Result
@@ -172,6 +218,13 @@ func (v *Validator) ValidateOne(p storage.Proxy) (bool, time.Duration, string, s
		}
	}
+	// Extra check for HTTP proxies: HTTPS CONNECT tunnels must be supported
+	if p.Protocol == "http" {
+		if !checkHTTPSConnect(p.Address, v.timeout) {
+			return false, latency, exitIP, exitLocation
+		}
+	}
	return true, latency, exitIP, exitLocation
}