程序员阿江(Relakkes)
0c5f281212
fix: avoid request failures caused by oversized cross-domain Cookie headers when reusing a browser
...
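Connecting to an existing Chrome instance brings the cookies of the entire browser context into the platform client.
Apart from xhs, most platforms still read the full cookie set directly, which makes request headers too long and amplifies cross-domain contamination.
This change restricts cookie reads on every platform to the platform's own domain and adds basic regression tests.
Constraint: must keep reusing the platform login state from the user's real browser
Rejected: fix xhs only | other platforms would still carry oversized Cookies when connecting to an existing browser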
Confidence: high
Scope-risk: moderate
Reversibility: clean
Directive: when adding a platform or changing the update_cookies and create client flows in the future, read cookies only by platform domain
Tested: uv run pytest test/test_utils.py; python3 -m compileall tools/crawler_util.py media_platform/douyin/core.py media_platform/douyin/client.py media_platform/kuaishou/core.py media_platform/kuaishou/client.py media_platform/bilibili/core.py media_platform/bilibili/client.py media_platform/zhihu/core.py media_platform/zhihu/client.py media_platform/tieba/core.py media_platform/tieba/client.py media_platform/xhs/core.py media_platform/xhs/client.py media_platform/weibo/core.py media_platform/weibo/client.py test/test_utils.py
Not-tested: end-to-end crawling flows for each platform over a real CDP browser connection
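The domain-scoped cookie read described in this commit can be illustrated with a minimal sketch. It is not the repository's actual code: the helper names `cookie_matches_domain` and `filter_cookies_by_domain`, the cookie names, and the sample domains are all hypothetical, and the cookie dicts only assume the `name` / `value` / `domain` keys that browser automation tools commonly return.

```python
from typing import Dict, Iterable, Mapping, Tuple


def cookie_matches_domain(cookie_domain: str, platform_domain: str) -> bool:
    """Return True if cookie_domain equals platform_domain or is a subdomain of it."""
    cookie_domain = cookie_domain.lstrip(".").lower()
    platform_domain = platform_domain.lstrip(".").lower()
    return (
        cookie_domain == platform_domain
        or cookie_domain.endswith("." + platform_domain)
    )


def filter_cookies_by_domain(
    cookies: Iterable[Mapping[str, str]], platform_domain: str
) -> Tuple[str, Dict[str, str]]:
    """Keep only the platform's cookies; return (Cookie header string, name-to-value dict).

    Hypothetical helper for illustration, not the project's real function.
    """
    kept = [
        c for c in cookies
        if cookie_matches_domain(c.get("domain", ""), platform_domain)
    ]
    cookie_str = "; ".join(f"{c['name']}={c['value']}" for c in kept)
    cookie_dict = {c["name"]: c["value"] for c in kept}
    return cookie_str, cookie_dict


# Sample cookies as a shared browser context might expose them (made-up values).
sample_cookies = [
    {"name": "session", "value": "abc", "domain": ".baidu.com"},
    {"name": "sid", "value": "1", "domain": "tieba.baidu.com"},
    {"name": "other", "value": "x", "domain": ".example.com"},
    {"name": "lookalike", "value": "y", "domain": "notbaidu.com"},
]

header, as_dict = filter_cookies_by_domain(sample_cookies, "baidu.com")
```

Matching on the full domain or on a `.`-prefixed suffix, rather than a plain substring, keeps lookalike domains such as `notbaidu.com` out of the resulting header.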
2026-04-21 13:49:37 +08:00
程序员阿江(Relakkes)
157ddfb21b
i18n: translate all Chinese comments, docstrings, and logger messages to English
...
Comprehensive translation of Chinese text to English across the entire codebase:
- api/: FastAPI server documentation and logger messages
- cache/: Cache abstraction layer comments and docstrings
- database/: Database models and MongoDB store documentation
- media_platform/: All platform crawlers (Bilibili, Douyin, Kuaishou, Tieba, Weibo, Xiaohongshu, Zhihu)
- model/: Data model documentation
- proxy/: Proxy pool and provider documentation
- store/: Data storage layer comments
- tools/: Utility functions and browser automation
- test/: Test file documentation
Preserved: Chinese disclaimer header (lines 10-18) for legal compliance
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-26 23:27:19 +08:00
程序员阿江(Relakkes)
6eef02d08c
feat: ip proxy expired check
2025-11-25 12:39:10 +08:00
程序员阿江(Relakkes)
ff8c92daad
chore: add copyright to every file
2025-11-18 12:24:02 +08:00
程序员阿江(Relakkes)
ed6e0bfb5f
refactor: tieba now fetches data through the browser
2025-10-19 17:09:55 +08:00
未来可欺
0b81240aed
Upgrade httpx to 0.28.1 and rename the keyword argument proxies to proxy
2025-07-31 22:48:02 +08:00
Relakkes
fbbead814a
fix: Tieba creator bug fix
2025-01-02 20:29:05 +08:00
unknown
7e53c4acfc
All_platform_comments_restrict
2024-10-23 16:32:02 +08:00
Relakkes
9fe3e47b0f
chore: add a learning-purposes-only code notice that strictly prohibits illegal, commercial, and improper use
2024-10-20 00:43:25 +08:00
Relakkes Yang
acb29add28
feat: Baidu Tieba supports crawling posts from creator homepages
2024-08-24 11:03:23 +08:00
Relakkes
8adb593ba6
temp commit
2024-08-24 09:12:03 +08:00
Relakkes
3f42368c02
feat: Baidu Tieba done
2024-08-08 14:19:32 +08:00
Relakkes
df0f5c1113
feat: Baidu Tieba sub-comments done
2024-08-07 04:13:15 +08:00
Relakkes
026d81e131
feat: Baidu Tieba first-level comments done
2024-08-07 02:34:56 +08:00
Relakkes
3c98808409
feat: Tieba search refactor
2024-08-07 01:01:21 +08:00
Relakkes
1b585cb215
temp commit
2024-08-06 19:21:34 +08:00
Relakkes
d347cf5a2c
feat: post search & remove login code, use IP proxy
2024-08-06 03:37:55 +08:00
Relakkes
a87094f2fd
feat: Baidu Tieba scaffolding & login done
2024-08-05 18:51:51 +08:00