736 Commits

Author SHA1 Message Date
程序员阿江-Relakkes
13b6140f22 Merge pull request #831 from ouzhuowei/fix_redis_and_proxy
Handle missing redisKeys and KuaiDaili proxies without a username and password
2026-02-13 21:18:26 +08:00
ouzhuowei
279c293147 Remove unnecessary comments
Co-Authored-By: ouzhuowei <190020754@qq.com>
2026-02-13 09:54:10 +08:00
ouzhuowei
db47d0e6f4 Handle missing redisKeys and KuaiDaili proxies without a username and password
Co-Authored-By: ouzhuowei <190020754@qq.com>
2026-02-13 09:42:15 +08:00
程序员阿江(Relakkes)
d614ccf247 docs: translate comments and metadata to English
Update Chinese comments, variable descriptions, and metadata across
multiple configuration and core files to English. This improves
codebase accessibility for international developers. Additionally,
removed the sponsorship section from README files.
2026-02-12 05:30:11 +08:00
程序员阿江-Relakkes
257743b016 Merge pull request #828 from ouzhuowei/add_save_data_path
Add arp to the proxy configuration
2026-02-12 04:47:25 +08:00
程序员阿江-Relakkes
dcaa11eeb9 Merge pull request #829 from ouzhuowei/update_sub_comment_error
Handle sub-comment fetch failures that interrupted the entire flow
2026-02-12 04:46:34 +08:00
ouzhuowei
e54463ac78 Handle sub-comment fetch failures that interrupted the entire flow
Co-Authored-By: ouzhuowei <190020754@qq.com>
2026-02-10 17:53:30 +08:00
ouzhuowei
212276bc30 Revert "Add log storage logic"
This reverts commit 30cf16af0c.

Co-Authored-By: ouzhuowei <190020754@qq.com>
2026-02-10 15:03:40 +08:00
ouzhuowei
30cf16af0c Add log storage logic
Co-Authored-By: ouzhuowei <190020754@qq.com>
2026-02-06 12:33:35 +08:00
ouzhuowei
80e9c866a0 Merge branch 'add_save_data_path' into add_log_config
Co-Authored-By: ouzhuowei <190020754@qq.com>
2026-02-06 12:24:57 +08:00
ouzhuowei
90280a261a Add arp to the proxy configuration
Co-Authored-By: ouzhuowei <190020754@qq.com>
2026-02-06 09:58:37 +08:00
程序员阿江-Relakkes
4ad065ce9a Merge pull request #825 from ouzhuowei/add_save_data_path
Add a data save path; when unspecified, data is saved to the data folder by default
2026-02-04 18:03:22 +08:00
ouzhuowei
2a0d1fd69f Adapt the media storage file paths for each platform
Co-Authored-By: ouzhuowei <190020754@qq.com>
2026-02-04 09:48:39 +08:00
程序员阿江(Relakkes)
c309871485 refactor(xhs): improve login state check logic 2026-02-03 20:49:46 +08:00
程序员阿江(Relakkes)
6625663bde feat: #823 2026-02-03 20:40:15 +08:00
程序员阿江(Relakkes)
fb42ab5b60 fix: #826 2026-02-03 20:35:33 +08:00
ouzhuowei
7484156f02 Add a data save path; when unspecified, data is saved to the data folder by default
Co-Authored-By: ouzhuowei <190020754@qq.com>
2026-02-03 11:24:22 +08:00
程序员阿江(Relakkes)
413b5d9034 docs: fix README heading levels, sync Pro section across languages
- Fix h3→h2 for standalone sections (交流群组 "Community Groups", 赞助商展示 "Sponsors", 成为赞助者 "Become a Sponsor", 其他 "Other") in README.md
- Remove WebUI standalone heading (kept as collapsible only)
- Remove WandouHTTP sponsor from EN/ES versions
- Expand Pro section (remove <details> collapse) in EN/ES to match CN
- Add Content Deconstruction Agent to Pro feature list in EN/ES

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-03 00:40:27 +08:00
程序员阿江(Relakkes)
dbbc2c7439 docs: update README.md 2026-02-02 20:25:51 +08:00
程序员阿江-Relakkes
51a7d94de8 Merge pull request #821 from wanzirong/feature/max-concurrency-param
feat: add the --max_concurrency_num parameter to control the number of concurrent crawlers
2026-01-31 00:31:15 +08:00
wanzirong
df39d293de Rename --max_concurrency to --max_concurrency_num for naming consistency 2026-01-30 11:15:06 +08:00
wanzirong
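79048e265e feat: add a parameter to control the number of concurrent crawlers
- Add the --max_concurrency command-line parameter
- Used to control the number of concurrent crawlers
- Default value is 1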
2026-01-30 11:15:05 +08:00
程序员阿江-Relakkes
94553fd818 Merge pull request #817 from wanzirong/dev
feat: add a command-line parameter to control the number of comments crawled
2026-01-21 16:49:13 +08:00
wanzirong
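90f72536ba refactor: simplify command-line parameter naming
- Rename --max_comments_per_post to --max_comments_count_singlenotes to match the config option name
- Remove the --xhs_sort_type parameter (not needed for now)
- Keep the code simple and drop unnecessary features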
2026-01-21 16:30:07 +08:00
wanzirong
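f7d27ab43a feat: add command-line parameter support
- Add the --max_comments_per_post parameter to control how many comments are crawled per post
- Add the --xhs_sort_type parameter to control the Xiaohongshu sort order
- Fix how Xiaohongshu's core.py imports CRAWLER_MAX_COMMENTS_COUNT_SINGLENOTES:
  switch from a direct import to access through the config module so that the command-line parameter takes effect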
2026-01-21 16:23:47 +08:00
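
The import fix in f7d27ab43a rests on a general Python behavior: `from module import NAME` copies the value at import time, while `module.NAME` is looked up on every access. The sketch below illustrates that behavior only; `demo_config`, `apply_cli_override`, and the two `limit_via_*` helpers are hypothetical names, and the real `config` module and `core.py` are not reproduced here.

```python
import sys
import types

# Hypothetical stand-in for the project's config module (assumption).
config = types.ModuleType("demo_config")
config.CRAWLER_MAX_COMMENTS_COUNT_SINGLENOTES = 10
sys.modules["demo_config"] = config

# Direct import: binds the current value to a separate name at import time.
from demo_config import CRAWLER_MAX_COMMENTS_COUNT_SINGLENOTES as direct_value


def apply_cli_override(value: int) -> None:
    """Simulate a command-line parser writing an override into the config module."""
    config.CRAWLER_MAX_COMMENTS_COUNT_SINGLENOTES = value


def limit_via_direct_import() -> int:
    """Return the value captured by the direct import; later overrides are invisible."""
    return direct_value


def limit_via_module() -> int:
    """Look the attribute up on the module at call time, so overrides are visible."""
    import demo_config
    return demo_config.CRAWLER_MAX_COMMENTS_COUNT_SINGLENOTES


apply_cli_override(50)
print(limit_via_direct_import())  # still the import-time value
print(limit_via_module())         # reflects the override
```

Reading the setting through the module object keeps a single source of truth, which is why an override written after import only reaches code that uses module access.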
程序员阿江(Relakkes)
be5b786a74 docs: update docs 2026-01-19 12:23:04 +08:00
程序员阿江-Relakkes
04fb716a44 Merge pull request #815 from 2470370075g-ux/fix-typo
Fix typo
2026-01-18 22:24:57 +08:00
WangXX
1f89713b90 Fix typo 2026-01-18 22:22:31 +08:00
程序员阿江-Relakkes
00a9e19139 Merge pull request #809 from orbisai0security/fix-cve-2023-50447-requirements.txt
[Security] Fix CRITICAL vulnerability: CVE-2023-50447
2026-01-13 14:40:23 +08:00
orbisai0security
8a2c349d67 fix: resolve critical vulnerability CVE-2023-50447
Automatically generated security fix
2026-01-12 15:10:10 +00:00
程序员阿江(Relakkes)
4de2a325a9 feat: ks comment api upgrade to v2 2026-01-09 21:09:39 +08:00
程序员阿江-Relakkes
2517e51ed4 Merge pull request #805 from MissMyDearBear/feature-bear
fix the login status error after scan the QR code
2026-01-09 14:18:16 +08:00
Alen Bear
e3d7fa7bed Merge branch 'NanmiCoder:main' into feature-bear 2026-01-09 14:14:37 +08:00
bear
a59b385615 fix the login status error after scan the QR code 2026-01-09 14:11:47 +08:00
程序员阿江-Relakkes
7c240747b6 Merge pull request #807 from DoiiarX/main
feat(database): add PostgreSQL support and fix Windows subprocess encoding
2026-01-09 10:53:57 +08:00
Doiiars
70a6ca55bb feat(database): add PostgreSQL support and fix Windows subprocess encoding 2026-01-09 00:41:59 +08:00
程序员阿江(Relakkes)
57b688fea4 feat: webui support light theme 2026-01-06 11:16:48 +08:00
程序员阿江(Relakkes)
ee4539c8fa chore: stop tracking .DS_Store 2026-01-06 11:11:49 +08:00
程序员阿江(Relakkes)
c895f53e22 fix: #803 2026-01-05 22:29:34 +08:00
程序员阿江(Relakkes)
99db95c499 fix: 'utf-8' codec can't decode error 2026-01-04 10:48:15 +08:00
程序员阿江-Relakkes
483c5ec8c6 Merge pull request #802 from Cae1anSou/fix/douyin-concurrent-comments
fix: fetch Douyin comments concurrently after each page instead of waiting for all pages
2026-01-03 22:38:26 +08:00
Caelan_Windows
c56b8c4c5d fix(douyin): fetch comments concurrently after each page instead of waiting for all pages
- Moved batch_get_note_comments call inside the pagination loop
- Comments are now fetched immediately after each page of videos is processed
- This allows real-time observation of comment crawling progress
- Improves data availability by not waiting for all video data to be collected first
2026-01-03 01:47:24 +08:00
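
The change described in c56b8c4c5d can be pictured with a minimal asyncio sketch: comments for each page are fetched inside the pagination loop rather than after all pages are collected. Only the name `batch_get_note_comments` comes from the commit message; its signature here, `fetch_video_page`, `crawl`, and the `log` list are assumptions made for illustration, not the project's actual code.

```python
import asyncio


async def fetch_video_page(page: int) -> list[str]:
    """Hypothetical stand-in for fetching one page of video IDs."""
    await asyncio.sleep(0)
    return [f"video-{page}-{i}" for i in range(2)]


async def batch_get_note_comments(video_ids: list[str], log: list[str]) -> None:
    """Fetch comments for all videos of one page concurrently (assumed signature)."""

    async def fetch_one(video_id: str) -> None:
        await asyncio.sleep(0)  # placeholder for the real network call
        log.append(f"comments:{video_id}")

    await asyncio.gather(*(fetch_one(v) for v in video_ids))


async def crawl(max_pages: int, log: list[str]) -> None:
    for page in range(1, max_pages + 1):
        video_ids = await fetch_video_page(page)
        log.append(f"page:{page}")
        # Called inside the loop: comments arrive right after each page.
        await batch_get_note_comments(video_ids, log)


events: list[str] = []
asyncio.run(crawl(2, events))
print(events)
```

With the call inside the loop, comment records for page 1 appear in `events` before page 2 is requested, which matches the real-time progress the commit message describes.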
程序员阿江(Relakkes)
a47c119303 docs: update 2025-12-30 17:10:13 +08:00
程序员阿江(Relakkes)
157ddfb21b i18n: translate all Chinese comments, docstrings, and logger messages to English
Comprehensive translation of Chinese text to English across the entire codebase:

- api/: FastAPI server documentation and logger messages
- cache/: Cache abstraction layer comments and docstrings
- database/: Database models and MongoDB store documentation
- media_platform/: All platform crawlers (Bilibili, Douyin, Kuaishou, Tieba, Weibo, Xiaohongshu, Zhihu)
- model/: Data model documentation
- proxy/: Proxy pool and provider documentation
- store/: Data storage layer comments
- tools/: Utility functions and browser automation
- test/: Test file documentation

Preserved: Chinese disclaimer header (lines 10-18) for legal compliance

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-26 23:27:19 +08:00
程序员阿江(Relakkes)
1544d13dd5 docs: update README.md 2025-12-26 22:41:32 +08:00
程序员阿江(Relakkes)
55d8c7783f feat: webo full context support 2025-12-26 19:22:24 +08:00
程序员阿江(Relakkes)
ff1b681311 fix: weibo get note image fixed 2025-12-26 00:47:20 +08:00
程序员阿江(Relakkes)
11500ef57a fix: #799 2025-12-24 11:45:07 +08:00
程序员阿江(Relakkes)
b9663c6a6d fix: #798 2025-12-22 17:44:35 +08:00
程序员阿江(Relakkes)
1a38ae12bd docs: update README.md 2025-12-19 00:23:55 +08:00