Commit Graph

15 Commits

Author SHA1 Message Date
程序员阿江(Relakkes)
0c5f281212 fix: 避免复用浏览器时跨域 Cookie 过长导致请求失败
连接已有 Chrome 会把整个浏览器上下文的 cookie 带入平台 client。
除 xhs 外,多数平台仍直接读取全量 cookies,导致请求头过长并放大跨域污染。
本次将各平台的 cookie 读取统一收口到平台域名,并补上基础回归测试。

Constraint: 必须继续复用用户真实浏览器里的平台登录态
Rejected: 仅修复 xhs | 其他平台在连接已有浏览器时仍会携带超长 Cookie
Confidence: high
Scope-risk: moderate
Reversibility: clean
Directive: 后续新增平台或调整 update_cookies 和 create client 流程时,只按平台域名读取 cookies
Tested: uv run pytest test/test_utils.py; python3 -m compileall tools/crawler_util.py media_platform/douyin/core.py media_platform/douyin/client.py media_platform/kuaishou/core.py media_platform/kuaishou/client.py media_platform/bilibili/core.py media_platform/bilibili/client.py media_platform/zhihu/core.py media_platform/zhihu/client.py media_platform/tieba/core.py media_platform/tieba/client.py media_platform/xhs/core.py media_platform/xhs/client.py media_platform/weibo/core.py media_platform/weibo/client.py test/test_utils.py
Not-tested: 各平台在真实 CDP 浏览器连接下的端到端抓取流程
2026-04-21 13:49:37 +08:00
程序员阿江(Relakkes)
d614ccf247 docs: translate comments and metadata to English
Update Chinese comments, variable descriptions, and metadata across
multiple configuration and core files to English. This improves
codebase accessibility for international developers. Additionally,
removed the sponsorship section from README files.
2026-02-12 05:30:11 +08:00
程序员阿江(Relakkes)
157ddfb21b i18n: translate all Chinese comments, docstrings, and logger messages to English
Comprehensive translation of Chinese text to English across the entire codebase:

- api/: FastAPI server documentation and logger messages
- cache/: Cache abstraction layer comments and docstrings
- database/: Database models and MongoDB store documentation
- media_platform/: All platform crawlers (Bilibili, Douyin, Kuaishou, Tieba, Weibo, Xiaohongshu, Zhihu)
- model/: Data model documentation
- proxy/: Proxy pool and provider documentation
- store/: Data storage layer comments
- tools/: Utility functions and browser automation
- test/: Test file documentation

Preserved: Chinese disclaimer header (lines 10-18) for legal compliance

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-26 23:27:19 +08:00
程序员阿江(Relakkes)
f1e7124654 fix: proxy extract error 2025-11-26 16:01:54 +08:00
程序员阿江(Relakkes)
ff8c92daad chore: add copyright to every file 2025-11-18 12:24:02 +08:00
yt210
7888f4c6bd 优化mongodb配置获取逻辑,移动存储基类位置。集成测试 2025-11-07 17:42:50 +08:00
persist-1
a38058856f test: 添加数据库同步测试脚本用于ORM与数据库结构对比与同步
fix(database): 修复大量不适当的字段类型
2025-09-08 00:13:00 +08:00
Relakkes
9fe3e47b0f chore: 增加代码学习声明,严格禁止非法、禁止商业、不当用途 2024-10-20 00:43:25 +08:00
Relakkes
4bba1447f8 feat: cache impl done 2024-06-02 19:57:13 +08:00
Relakkes
dde3c0429e refactor: IP代理池重构 2024-04-06 10:58:35 +08:00
Relakkes
1cec23f73d feat: 代理IP功能 Done 2023-12-08 00:10:04 +08:00
Relakkes
c530bd4219 feat: 代理IP缓存到redis中 2023-12-06 23:49:56 +08:00
Relakkes
986179b9c9 feat: 增加 IP 代理的最新实现 2023-12-02 16:14:36 +08:00
Relakkes
e75707443b feat: 增加配置项支持自由选择数据是否保存到关系型数据库中 2023-07-24 20:59:43 +08:00
Nanmi
745e59c875 feat: 完善类型注释,增加 mypy 类型检测 2023-07-16 17:57:18 +08:00