程序员阿江(Relakkes)
4de2a325a9
feat: ks comment api upgrade to v2
2026-01-09 21:09:39 +08:00
程序员阿江(Relakkes)
157ddfb21b
i18n: translate all Chinese comments, docstrings, and logger messages to English
...
Comprehensive translation of Chinese text to English across the entire codebase:
- api/: FastAPI server documentation and logger messages
- cache/: Cache abstraction layer comments and docstrings
- database/: Database models and MongoDB store documentation
- media_platform/: All platform crawlers (Bilibili, Douyin, Kuaishou, Tieba, Weibo, Xiaohongshu, Zhihu)
- model/: Data model documentation
- proxy/: Proxy pool and provider documentation
- store/: Data storage layer comments
- tools/: Utility functions and browser automation
- test/: Test file documentation
Preserved: Chinese disclaimer header (lines 10-18) for legal compliance
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-26 23:27:19 +08:00
程序员阿江(Relakkes)
6eef02d08c
feat: IP proxy expiry check
2025-11-25 12:39:10 +08:00
程序员阿江(Relakkes)
ff8c92daad
chore: add copyright to every file
2025-11-18 12:24:02 +08:00
yt210
ef6948b305
Add storage to MongoDB
2025-11-06 10:40:30 +08:00
程序员阿江(Relakkes)
03e384bbe2
refactor: remove stealth injection in CDP mode
2025-10-19 15:32:03 +08:00
程序员阿江(Relakkes)
ae7955787c
feat: kuaishou support url link
2025-10-18 07:40:10 +08:00
程序员阿江(Relakkes)
2cf143cc7c
fix : #730
2025-09-26 18:10:30 +08:00
程序员阿江(Relakkes)
2bce3593f7
feat: support time delay for all platforms
2025-09-02 16:43:09 +08:00
未来可欺
0b81240aed
Upgrade httpx to 0.28.1 and change the keyword argument proxies to proxy
2025-07-31 22:48:02 +08:00
korruz
07a6e387ea
refactor: move format_proxy_info to utils and update crawler classes to use it
2025-07-29 14:16:24 +08:00
程序员阿江(Relakkes)
13b00f7a36
refactor: config update
2025-07-18 23:26:52 +08:00
gaoxiaobei
b913db64bb
refactor(config): move platform-specific configs to separate files
...
- Remove platform-specific configurations from base_config.py
- Create separate config files for each platform in their respective directories
- Update import statements in core files to use new platform-specific config modules
- Clean up unused and deprecated configuration options
2025-07-18 17:27:37 +08:00
程序员阿江(Relakkes)
848df2b491
feat: other platforms support CDP mode
2025-07-03 17:13:32 +08:00
Relakkes
061d1c15e2
feat: kuaishou search params update
2025-03-11 23:42:34 +08:00
unknown
7e53c4acfc
All_platform_comments_restrict
2024-10-23 16:32:02 +08:00
Relakkes
9fe3e47b0f
chore: add a notice that the code is for learning purposes, strictly prohibiting illegal, commercial, and improper use
2024-10-20 00:43:25 +08:00
Relakkes
c70bd9e071
feat: add search keyword source channel
2024-08-23 08:29:24 +08:00
Relakkes
d3eeccbaac
feat: logger record current search page
2024-06-24 22:24:51 +08:00
Relakkes Yang
a0e5a29af8
fix: weibo bug
2024-06-17 00:25:48 +08:00
HIRO
1d224999af
fix second-level comment crawling bug
2024-06-13 15:57:09 +08:00
HIRO
a001556ba7
Kuaishou: specified creator homepages and second-level comments
2024-06-13 14:49:07 +08:00
nelzomal
eace7d1750
improve base config reading command line arg logic
2024-06-09 18:51:36 +08:00
Relakkes
4bba1447f8
feat: cache impl done
2024-06-02 19:57:13 +08:00
Relakkes
e64df93edd
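feat: xhs and dy now detect Playwright QR code login, so a slider or phone verification is very likely to appear; increase the login state check time to 5 min to leave enough time to pass the verification manually.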
2024-05-15 23:23:30 +08:00
Relakkes
87eb8aa6a7
fix : #230
2024-04-13 20:18:04 +08:00
Tianci-King
1115b0d90c
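feat(core): add the start_page parameter to control the crawler's starting page number; perf(argparse): add the start page number and keyword program arguments to the command-line parser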
2024-04-12 00:52:47 +08:00
leantli
68a60faa7f
chore: simplify the condition check
2024-04-04 00:11:22 +08:00
leantli
133f978477
fix: fix nothing being crawled when the maximum number of videos/posts to crawl is set to a low value
2024-04-03 12:18:23 +08:00
Relakkes
e950e0d6e3
feat: add abstract api client to all platform
2024-03-30 21:27:25 +08:00
Relakkes
59cd9f67a0
feat: support an option to enable or disable comment crawling
2024-03-16 11:52:42 +08:00
Relakkes
e940a41033
refactor: remove the logic for limiting the comment count and filtering specific keywords in comments
2024-01-17 23:02:05 +08:00
Relakkes
894dabcf63
refactor: restructure data storage and separate the storage implementations by type
2024-01-14 22:06:31 +08:00
Relakkes
e31aebbdfb
fix: fix proxy bug
2024-01-13 15:50:02 +08:00
Relakkes
aba9f14f50
refactor: standardize log output
...
feat: Bilibili crawling by specified video ID (bvid)
2023-12-23 01:04:08 +08:00
Relakkes
273c9a316b
fix: fix argument formatting error in log output
2023-12-22 23:10:44 +08:00
peanutsplash
f17a85305e
Add feature: (Bilibili, Kuaishou, Xiaohongshu) maximum number of comments crawled per video/post, and comment keyword filtering
2023-12-13 23:53:12 +08:00
Relakkes
97d7a0c38b
feat: Bilibili comment done
2023-12-09 21:10:01 +08:00
Relakkes
1cec23f73d
feat: proxy IP feature done
2023-12-08 00:10:04 +08:00
Relakkes
a6e877de42
fix: fix Bilibili search field naming bug
...
refactor: change the ping interface to pong everywhere
2023-12-05 22:54:47 +08:00
Relakkes
94b5030ef0
feat: implement Bilibili QR code and cookie login
2023-12-04 00:02:00 +08:00
Relakkes
5c920da288
feat: Kuaishou QR code login
2023-12-02 18:22:55 +08:00
Relakkes
986179b9c9
feat: add the latest IP proxy implementation
2023-12-02 16:14:36 +08:00
Relakkes
33721e5fbd
feat: Kuaishou supports crawling a specified video list
2023-11-27 23:07:04 +08:00
Relakkes
62534d7ee2
fix: remove redundant code from the Kuaishou client
2023-11-26 22:11:06 +08:00
Relakkes
dfb1788141
feat: Kuaishou video comment crawling done; saving data to DB and CSV done
2023-11-26 21:43:39 +08:00
Relakkes
bdf36ccb09
feat: Kuaishou keyword search CSV storage done
2023-11-26 01:05:52 +08:00
Relakkes
512192a93e
feat: search API debugging done
2023-11-25 00:02:33 +08:00
Relakkes
f08b2ceb76
feat: 1. command-line support for Kuaishou 2. Kuaishou Playwright code done
2023-11-24 00:04:33 +08:00
Relakkes
95ca606938
feat: create the Kuaishou directory structure
2023-11-23 23:13:54 +08:00