Commit Graph

66 Commits

Author SHA1 Message Date
程序员阿江(Relakkes)
d614ccf247 docs: translate comments and metadata to English
Update Chinese comments, variable descriptions, and metadata across
multiple configuration and core files to English. This improves
codebase accessibility for international developers. Additionally,
removed the sponsorship section from README files.
2026-02-12 05:30:11 +08:00
ouzhuowei
e54463ac78 Fix sub-comment fetch failures interrupting the entire flow
Co-Authored-By: ouzhuowei <190020754@qq.com>
2026-02-10 17:53:30 +08:00
程序员阿江(Relakkes)
c309871485 refactor(xhs): improve login state check logic 2026-02-03 20:49:46 +08:00
程序员阿江(Relakkes)
6625663bde feat: #823 2026-02-03 20:40:15 +08:00
程序员阿江(Relakkes)
157ddfb21b i18n: translate all Chinese comments, docstrings, and logger messages to English
Comprehensive translation of Chinese text to English across the entire codebase:

- api/: FastAPI server documentation and logger messages
- cache/: Cache abstraction layer comments and docstrings
- database/: Database models and MongoDB store documentation
- media_platform/: All platform crawlers (Bilibili, Douyin, Kuaishou, Tieba, Weibo, Xiaohongshu, Zhihu)
- model/: Data model documentation
- proxy/: Proxy pool and provider documentation
- store/: Data storage layer comments
- tools/: Utility functions and browser automation
- test/: Test file documentation

Preserved: Chinese disclaimer header (lines 10-18) for legal compliance

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-26 23:27:19 +08:00
MEI
ff9a1624f1 fix: params argument and path issues 2025-12-03 10:31:32 +08:00
程序员阿江(Relakkes)
f989ce0788 feat: xhs sign playwright version 2025-11-27 10:53:08 +08:00
程序员阿江(Relakkes)
6eef02d08c feat: ip proxy expired check 2025-11-25 12:39:10 +08:00
程序员阿江(Relakkes)
ff8c92daad chore: add copyright to every file 2025-11-18 12:24:02 +08:00
程序员阿江(Relakkes)
a1c5e07df8 fix: xhs sub comment bugfix #769 2025-11-17 11:47:33 +08:00
程序员阿江(Relakkes)
b6caa7a85e refactor: add xhs creator params 2025-11-10 21:10:03 +08:00
程序员阿江(Relakkes)
b5dab6d1e8 refactor: replace the playwright signing scheme with xhshow
Thanks to the open-source project @Cloxl/xhshow

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-10 18:12:45 +08:00
程序员阿江(Relakkes)
3f5925e326 feat: update xhs sign 2025-10-27 19:06:07 +08:00
程序员阿江(Relakkes)
a9dd08680f feat: xhs support creator url link 2025-10-18 07:20:09 +08:00
刘小龙
c87df59996 log client modify 2025-09-09 15:27:46 +08:00
程序员阿江(Relakkes)
eb799e1fa7 refactor: xhs extractor 2025-09-02 14:50:32 +08:00
未来可欺
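6a10d0d11c The previously caught HTTPStatusError does not cover exception types such as ConnectError and ReadError. This commit changes the caught type to HTTPError, the base class of httpx request exceptions, so that any exception raised in httpx.request (for example, a banned IP or a server refusing the connection) is caught, and an interrupted media download no longer interrupts the text crawl 2025-08-06 11:24:51 +08:00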
未来可欺
81f2dbe4ab Add exception handling for the media resource server; see issue #691 2025-08-05 13:11:00 +08:00
未来可欺
0b81240aed Upgrade httpx to 0.28.1 and change the keyword argument proxies to proxy 2025-07-31 22:48:02 +08:00
未来可欺
93a1c27fff Fix several runtime bugs found by testing search mode, and set a longer timeout for platforms that can crawl media 2025-07-30 21:19:56 +08:00
程序员阿江(Relakkes)
a4d9aaa34a refactor: xhs update 2025-07-21 21:26:16 +08:00
chimeElm
26a845581e Update client.py
Fix CRAWLER_MAX_NOTES_COUNT having no effect when crawling a Xiaohongshu author's posts
2025-06-07 02:41:09 +08:00
crpa33
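274d64aefc Handle the case where xhs comment data is unexpectedly empty
The error kept interrupting my runs, and I had no other way around it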
2025-04-02 11:59:27 +08:00
leantli
e830ada574 feat: xhs comments add xsec_token 2024-12-03 18:25:21 +08:00
Relakkes
43dffeb2d1 feat: optimize xhs post detail fetching 2024-11-26 13:37:53 +08:00
lyx0727
705b810269 fix: xhs captcha handling and redirect link fix 2024-11-01 22:08:46 +08:00
lyx0727
1a37df4d5e fix: error when xhs shows a captcha; add manual user verification 2024-11-01 18:52:16 +08:00
程序员阿江(Relakkes)
5a27ad089c Merge pull request #462 from FloRainRJY/xiaohongshu_comment_number_restrict
All_platform_comments_restrict
2024-10-24 15:31:13 +08:00
liugangdao
78c09c4ae1 fix: search id remains unchanged when paging 2024-10-24 13:57:27 +08:00
unknown
7e53c4acfc All_platform_comments_restrict 2024-10-23 16:32:02 +08:00
unknown
19269c66fd xiaohongshu_comment_number_restrict 2024-10-22 20:33:10 +08:00
Relakkes
03e393949a fix: update for the xhs post detail issue 2024-10-20 00:59:08 +08:00
Relakkes
9fe3e47b0f chore: add a learning-purposes notice to the code; illegal, commercial, and improper use is strictly prohibited 2024-10-20 00:43:25 +08:00
Relakkes
6dd3420743 fix: #423 2024-09-02 22:57:28 +08:00
Relakkes
65699aa1cb feat: xhs supports fetching the like count of comments 2024-08-24 06:07:33 +08:00
Relakkes
f371675d47 fix: add an HTML-parsing method for fetching xhs notes by specified ID; the original method stopped working because of xsec_token 2024-08-11 22:37:10 +08:00
Relakkes
573ca9a659 feat: update xhs note details 2024-07-25 00:44:46 +08:00
helloteemo
b95dc2c125 fix: the new Xiaohongshu download code used >3.10 features; lower the required version 2024-07-12 09:50:03 +08:00
helloteemo
6545a15ff3 feature: support Xiaohongshu image and video download 2024-07-11 22:56:30 +08:00
helloteemo
e71690a985 fix: resolve the Xiaohongshu image watermark issue 2024-07-11 17:39:48 +08:00
Relakkes
87eb8aa6a7 fix: #230 2024-04-13 20:18:04 +08:00
leantli
ad01dfba95 feat: lightweight support for crawling Xiaohongshu second-level comments 2024-04-12 17:32:20 +08:00
leantli
81a9946afd feat: support crawling Xiaohongshu second-level comments 2024-04-11 17:16:13 +08:00
Relakkes
8f02da73ad fix: #219
docs: update README.md
2024-04-08 00:19:50 +08:00
Relakkes
e950e0d6e3 feat: add abstract api client to all platform 2024-03-30 21:27:25 +08:00
Relakkes
67ec49498a refactor: rename xhs to xiaohongshu 2024-03-30 21:17:33 +08:00
Relakkes
96309dcfee fix: optimize data fetching for the Xiaohongshu creator feature 2024-03-17 14:50:10 +08:00
Relakkes
41fee4ff4f feat: Xiaohongshu supports fetching image links in comments #145 2024-03-07 22:30:44 +08:00
jayeeliu@gmail.com
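61ba8c5cc7 feat: Xiaohongshu supports collecting notes and comments by blogger ID; when type=search, the sort order used to fetch notes is configurable; Xiaohongshu notes gain video URL and tag fields 2024-03-02 01:49:42 +08:00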
Relakkes
e0f9a487e4 refactor: code optimization 2024-01-16 00:40:07 +08:00