程序员阿江(Relakkes)
157ddfb21b
i18n: translate all Chinese comments, docstrings, and logger messages to English
...
Comprehensive translation of Chinese text to English across the entire codebase:
- api/: FastAPI server documentation and logger messages
- cache/: Cache abstraction layer comments and docstrings
- database/: Database models and MongoDB store documentation
- media_platform/: All platform crawlers (Bilibili, Douyin, Kuaishou, Tieba, Weibo, Xiaohongshu, Zhihu)
- model/: Data model documentation
- proxy/: Proxy pool and provider documentation
- store/: Data storage layer comments
- tools/: Utility functions and browser automation
- test/: Test file documentation
Preserved: Chinese disclaimer header (lines 10-18) for legal compliance
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-26 23:27:19 +08:00
MEI
ff9a1624f1
fix: params argument and path issues
2025-12-03 10:31:32 +08:00
程序员阿江(Relakkes)
f989ce0788
feat: xhs sign playwright version
2025-11-27 10:53:08 +08:00
程序员阿江(Relakkes)
6eef02d08c
feat: ip proxy expired check
2025-11-25 12:39:10 +08:00
程序员阿江(Relakkes)
ff8c92daad
chore: add copyright to every file
2025-11-18 12:24:02 +08:00
程序员阿江(Relakkes)
a1c5e07df8
fix: xhs sub comment bugfix #769
2025-11-17 11:47:33 +08:00
程序员阿江(Relakkes)
b6caa7a85e
refactor: add xhs creator params
2025-11-10 21:10:03 +08:00
程序员阿江(Relakkes)
b5dab6d1e8
refactor: use xhshow instead of the playwright signing scheme
...
Thanks to the @Cloxl/xhshow open-source project
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-10 18:12:45 +08:00
程序员阿江(Relakkes)
3f5925e326
feat: update xhs sign
2025-10-27 19:06:07 +08:00
程序员阿江(Relakkes)
a9dd08680f
feat: xhs support creator url link
2025-10-18 07:20:09 +08:00
刘小龙
c87df59996
log client modify
2025-09-09 15:27:46 +08:00
程序员阿江(Relakkes)
eb799e1fa7
refactor: xhs extractor
2025-09-02 14:50:32 +08:00
未来可欺
6a10d0d11c
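The original HTTPStatusError handler could not catch exception types such as ConnectError and ReadError. This commit changes the caught type to HTTPError, the base class of httpx request exceptions, so that exceptions raised inside httpx.request (for example a banned IP or a server refusing the connection) are caught, and an interrupted media download no longer interrupts text crawling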
2025-08-06 11:24:51 +08:00
未来可欺
81f2dbe4ab
Add exception handling for the media resource server, see issue #691
2025-08-05 13:11:00 +08:00
未来可欺
0b81240aed
Upgrade httpx to 0.28.1 and change the keyword argument proxies to proxy
2025-07-31 22:48:02 +08:00
未来可欺
93a1c27fff
Fix several runtime bugs found by testing search mode, and set a longer timeout for platforms that can crawl media
2025-07-30 21:19:56 +08:00
程序员阿江(Relakkes)
a4d9aaa34a
refactor: xhs update
2025-07-21 21:26:16 +08:00
chimeElm
26a845581e
Update client.py
...
Fix CRAWLER_MAX_NOTES_COUNT having no effect when crawling a Xiaohongshu author's posts
2025-06-07 02:41:09 +08:00
crpa33
274d64aefc
Handle the case where xhs comment data is unexpectedly empty
...
An error would interrupt my run, and I had no way around it
2025-04-02 11:59:27 +08:00
leantli
e830ada574
feat: xhs comments add xsec_token
2024-12-03 18:25:21 +08:00
Relakkes
43dffeb2d1
feat: optimize fetching of xhs post details
2024-11-26 13:37:53 +08:00
lyx0727
705b810269
fix: xhs captcha handling and redirect link fix
2024-11-01 22:08:46 +08:00
lyx0727
1a37df4d5e
fix: xhs raised an error when a captcha appeared; add manual verification by the user
2024-11-01 18:52:16 +08:00
程序员阿江(Relakkes)
5a27ad089c
Merge pull request #462 from FloRainRJY/xiaohongshu_comment_number_restrict
...
All_platform_comments_restrict
2024-10-24 15:31:13 +08:00
liugangdao
78c09c4ae1
fix: search id stays unchanged when paging
2024-10-24 13:57:27 +08:00
unknown
7e53c4acfc
All_platform_comments_restrict
2024-10-23 16:32:02 +08:00
unknown
19269c66fd
xiaohongshu_comment_number_restrict
2024-10-22 20:33:10 +08:00
Relakkes
03e393949a
fix: update for the xhs post detail issue
2024-10-20 00:59:08 +08:00
Relakkes
9fe3e47b0f
chore: add a learning-purposes-only code notice that strictly prohibits illegal, commercial, and improper use
2024-10-20 00:43:25 +08:00
Relakkes
6dd3420743
fix : #423
2024-09-02 22:57:28 +08:00
Relakkes
65699aa1cb
feat: xhs supports fetching comment like counts
2024-08-24 06:07:33 +08:00
Relakkes
f371675d47
fix: add HTML parsing as a way to fetch xhs notes by specified note ID; the original method stopped working because of xsec_token
2024-08-11 22:37:10 +08:00
Relakkes
573ca9a659
feat: update xhs note details
2024-07-25 00:44:46 +08:00
helloteemo
b95dc2c125
fix: the new Xiaohongshu download code used >3.10 features; lower the required version
2024-07-12 09:50:03 +08:00
helloteemo
6545a15ff3
feature: support downloading Xiaohongshu images and videos
2024-07-11 22:56:30 +08:00
helloteemo
e71690a985
fix: resolve the Xiaohongshu image watermark issue
2024-07-11 17:39:48 +08:00
Relakkes
87eb8aa6a7
fix : #230
2024-04-13 20:18:04 +08:00
leantli
ad01dfba95
feat: lightweight support for crawling Xiaohongshu second-level comments
2024-04-12 17:32:20 +08:00
leantli
81a9946afd
feat: support crawling Xiaohongshu second-level comments
2024-04-11 17:16:13 +08:00
Relakkes
8f02da73ad
fix : #219
...
docs: update README.md
2024-04-08 00:19:50 +08:00
Relakkes
e950e0d6e3
feat: add abstract api client to all platform
2024-03-30 21:27:25 +08:00
Relakkes
67ec49498a
refactor: rename xhs to xiaohongshu
2024-03-30 21:17:33 +08:00
Relakkes
96309dcfee
fix: optimize data fetching for the Xiaohongshu creator feature
2024-03-17 14:50:10 +08:00
Relakkes
41fee4ff4f
feat: Xiaohongshu supports fetching image links in comments #145
2024-03-07 22:30:44 +08:00
jayeeliu@gmail.com
61ba8c5cc7
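feat: Xiaohongshu supports collecting notes and comments by blogger ID; when type=search, the sort order for fetched notes is configurable; Xiaohongshu notes gain video URL and tag fields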
2024-03-02 01:49:42 +08:00
Relakkes
e0f9a487e4
refactor: code optimization
2024-01-16 00:40:07 +08:00
Relakkes
aba9f14f50
refactor: standardize log output
...
feat: crawl Bilibili by specified video ID (bvid)
2023-12-23 01:04:08 +08:00
Relakkes
273c9a316b
fix: fix incorrect argument formatting in log output
2023-12-22 23:10:44 +08:00
Relakkes
2e032d09dc
refactor: xhs add log
2023-12-19 23:05:12 +08:00
Relakkes
a6e877de42
fix: fix Bilibili search Field naming bug
...
refactor: uniformly rename the ping interface to pong
2023-12-05 22:54:47 +08:00