程序员阿江(Relakkes)
d614ccf247
docs: translate comments and metadata to English
...
Update Chinese comments, variable descriptions, and metadata across
multiple configuration and core files to English. This improves
codebase accessibility for international developers. Additionally,
removed the sponsorship section from README files.
2026-02-12 05:30:11 +08:00
程序员阿江-Relakkes
4ad065ce9a
Merge pull request #825 from ouzhuowei/add_save_data_path
...
Add a data save path option; when unspecified, data is saved to the data folder by default
2026-02-04 18:03:22 +08:00
ouzhuowei
2a0d1fd69f
Add media storage file path adaptation for each platform
...
Co-Authored-By: ouzhuowei <190020754@qq.com>
2026-02-04 09:48:39 +08:00
程序员阿江(Relakkes)
fb42ab5b60
fix: #826
2026-02-03 20:35:33 +08:00
程序员阿江(Relakkes)
4de2a325a9
feat: ks comment api upgrade to v2
2026-01-09 21:09:39 +08:00
Doiiars
70a6ca55bb
feat(database): add PostgreSQL support and fix Windows subprocess encoding
2026-01-09 00:41:59 +08:00
程序员阿江(Relakkes)
c895f53e22
fix: #803
2026-01-05 22:29:34 +08:00
程序员阿江(Relakkes)
157ddfb21b
i18n: translate all Chinese comments, docstrings, and logger messages to English
...
Comprehensive translation of Chinese text to English across the entire codebase:
- api/: FastAPI server documentation and logger messages
- cache/: Cache abstraction layer comments and docstrings
- database/: Database models and MongoDB store documentation
- media_platform/: All platform crawlers (Bilibili, Douyin, Kuaishou, Tieba, Weibo, Xiaohongshu, Zhihu)
- model/: Data model documentation
- proxy/: Proxy pool and provider documentation
- store/: Data storage layer comments
- tools/: Utility functions and browser automation
- test/: Test file documentation
Preserved: Chinese disclaimer header (lines 10-18) for legal compliance
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-26 23:27:19 +08:00
程序员阿江(Relakkes)
6e858c1a00
feat: excel store with other platform
2025-11-28 15:12:36 +08:00
hsparks.codes
46ef86ddef
feat: Add Excel export functionality and unit tests
...
Features:
- Excel export with formatted multi-sheet workbooks (Contents, Comments, Creators)
- Professional styling: blue headers, auto-width columns, borders, text wrapping
- Smart export: empty sheets automatically removed
- Support for all platforms (xhs, dy, ks, bili, wb, tieba, zhihu)
Testing:
- Added pytest framework with asyncio support
- Unit tests for Excel store functionality
- Unit tests for store factory pattern
- Shared fixtures for test data
- Test coverage for edge cases
Documentation:
- Comprehensive Excel export guide (docs/excel_export_guide.md)
- Updated README.md and README_en.md with Excel examples
- Updated config comments to include excel option
Dependencies:
- Added openpyxl>=3.1.2 for Excel support
- Added pytest>=7.4.0 and pytest-asyncio>=0.21.0 for testing
This contribution adds immediate value for users who need data analysis
capabilities and establishes a testing foundation for future development.
2025-11-28 04:44:12 +01:00
程序员阿江(Relakkes)
ff8c92daad
chore: add copyright to every file
2025-11-18 12:24:02 +08:00
yangtao210
58eb89f073
Merge branch 'NanmiCoder:main' into main
2025-11-07 17:44:09 +08:00
yt210
b61ec54a72
Optimize the MongoDB config retrieval logic and relocate the storage base class.
2025-11-07 17:42:28 +08:00
程序员阿江-Relakkes
05a1782746
Merge pull request #764 from yangtao210/main
...
Add storage to MongoDB
2025-11-06 06:10:49 -05:00
yt210
ef6948b305
Add storage to MongoDB
2025-11-06 10:40:30 +08:00
程序员阿江(Relakkes)
889fa01466
fix: repair bili word cloud image generation
2025-11-02 13:25:31 +08:00
persist-1
0d0af57a01
fix(store): fix bug where improper use of 'crawler_type_var' produced incorrect csv/json save filenames
2025-09-10 23:47:05 +08:00
persist-1
40de0e47e5
fix(store): replace async for loop with async with statement to fix zhihu database session management
2025-09-08 00:29:04 +08:00
persist-1
684a16ed9a
fix(database): fix model field types to support a wider range of data formats;
...
Fix the xhs comment storage method, switching from batch processing to per-item processing
2025-09-07 04:10:49 +08:00
persist-1
e92c6130e1
fix(store): fix AsyncFileWriter import in store implementations
...
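Refactor the Xiaohongshu store implementation, replacing the store_comments method with store_comment, which handles a single comment
Add the AsyncFileWriter utility class import for multiple platforms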
2025-09-06 04:41:37 +08:00
persist-1
be306c6f54
refactor(database): refactor database storage implementation to use SQLAlchemy ORM instead of raw SQL operations
...
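- Remove the old async_db.py and async_sqlite_db.py implementations
- Add SQLAlchemy ORM models and database session management
- Consolidate each platform's store implementation into its _store_impl.py file
- Add support for database initialization
- Update .gitignore and pyproject.toml dependency configuration
- Improve file storage paths and naming conventions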
2025-09-06 04:10:20 +08:00
Czs-HF
48da268bc5
fix: add formatted output to Douyin JSON storage
...
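- Add the indent=4 parameter in the DouyinJsonStoreImplement.save_data_to_json method
- Make Douyin JSON output consistent with Xiaohongshu's format for better readability
- Resolve the issue of all JSON file content being written on a single line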
2025-08-16 12:52:37 +08:00
未来可欺
a7cc18ec7d
Revise some documentation
2025-07-30 18:58:10 +08:00
未来可欺
ecddfbe02c
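Rename the .py files in the store folder that end in _video or _image so they all end in _media.py, so that no platform is left with a standalone _video.py-only or _image.py-only implementation. All future video and image storage code goes in this file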
2025-07-30 18:32:08 +08:00
未来可欺
173bc08a9d
Add logic for storing Douyin videos and images, rename the ENABLE_GET_IMAGES parameter in config.py to ENABLE_GET_MEIDAS, and slightly adjust the storage logic accordingly
2025-07-30 18:24:08 +08:00
翟持江
e6db6be1ca
Update __init__.py: add logic to extract images from Douyin notes
2025-07-30 10:45:38 +08:00
persist-1
19df1734f1
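chore: add Chinese display support for the --help argument and a music_download_url field to the douyin_aweme table
...
- Add Chinese display support for command-line arguments to improve the user experience
- Add a music_download_url field to the douyin_aweme table to store the download link for a video's music
- Update the related database schema files (tables.sql, sqlite_tables.sql)
- Implement music download URL extraction and integrate it into the data storage flow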
2025-07-24 22:39:53 +08:00
买定不离手
3365095c62
fix: improve SQLite SQL statement adaptation for the Bilibili and Douyin platforms
...
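- Update store/bilibili/bilibili_store_sql.py to optimize the SQLite SQL statements and query logic for the Bilibili platform
- Update store/douyin/douyin_store_sql.py to fix SQL statement compatibility issues in SQLite data storage for the Douyin platform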
2025-07-14 03:51:19 +08:00
买定不离手
1298022410
refactor: update each platform's store module initialization to support SQLite
...
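- Update store/bilibili/__init__.py to import the SQLite store implementation classes and related modules
- Update store/douyin/__init__.py to integrate the SQLite data storage interface for the Douyin platform
- Update store/kuaishou/__init__.py to add the import declaration for the Kuaishou platform's SQLite store module
- Update store/tieba/__init__.py to bring in the SQLite database operations module for the Tieba platform
- Update store/weibo/__init__.py to integrate the SQLite storage module for the Weibo platform
- Update store/xhs/__init__.py to import the SQLite data storage implementation for the Xiaohongshu platform
- Update store/zhihu/__init__.py to integrate the SQLite database storage module for the Zhihu platform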
2025-07-14 03:51:08 +08:00
买定不离手
6f274d476b
feat: add SQLite store implementation files for each platform
...
- Add store/bilibili/bilibili_store_impl.py: Bilibili SQLite store implementation
- Add store/douyin/douyin_store_impl.py: Douyin SQLite store implementation
- Add store/kuaishou/kuaishou_store_impl.py: Kuaishou SQLite store implementation
- Add store/tieba/tieba_store_impl.py: Tieba SQLite store implementation
- Add store/weibo/weibo_store_impl.py: Weibo SQLite store implementation
- Add store/xhs/xhs_store_impl.py: Xiaohongshu SQLite store implementation
- Add store/zhihu/zhihu_store_impl.py: Zhihu SQLite store implementation
2025-07-14 03:36:36 +08:00
买定不离手
fb938f38aa
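feat: update each platform's store SQL files to support SQLite
...
- Update store/kuaishou/kuaishou_store_sql.py: SQLite adaptation for the Kuaishou platform
- Update store/tieba/tieba_store_sql.py: SQLite adaptation for the Tieba platform
- Update store/weibo/weibo_store_sql.py: SQLite adaptation for the Weibo platform
- Update store/xhs/xhs_store_sql.py: SQLite adaptation for the Xiaohongshu platform
- Update store/zhihu/zhihu_store_sql.py: SQLite adaptation for the Zhihu platform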
2025-07-14 03:36:20 +08:00
Relakkes
fd33813f8f
feat: add like_count field to bilibili for issue #623
2025-06-20 15:50:38 +08:00
Relakkes
d55d8b1efa
feat: Douyin supports obtaining video links and cover images, for issue #620
2025-06-14 23:59:08 +08:00
Bowenwin
66843f216a
finish_all_for_expand_bili
2025-05-22 22:26:30 +08:00
Bowenwin
59619fff0a
finish_all
2025-05-22 22:06:06 +08:00
Bowenwin
44e3d370ff
fix_words
2025-05-22 20:31:48 +08:00
Bowenwin
a356358c21
get_fans_and_get_followings
2025-05-19 19:57:36 +08:00
翟持江
b675547aab
Update __init__.py: add extra fields to bilibili video info, uploader (UP) info, and comment info
2025-04-19 02:29:22 +08:00
Relakkes
30d0e733d5
feat: douyin adds comment images
2025-01-15 14:50:05 +08:00
HuiLong
d929ad16ae
fix xhs get gender
2024-12-28 20:24:37 +08:00
Relakkes
79bf9fc05d
chore: add xhs field comment for issue #526
2024-12-26 18:28:23 +08:00
liudongkai
33e7ef016d
feat: xhs adds a random wait interval in non-proxy mode and stores the xsec_token field in db storage mode
2024-12-05 21:10:31 +08:00
Relakkes
8ab4c67443
feat: Douyin supports comment like counts #495
2024-11-16 00:37:48 +08:00
Relakkes
9fe3e47b0f
chore: add a learning-purposes-only code notice strictly prohibiting illegal, commercial, and improper use
2024-10-20 00:43:25 +08:00
Relakkes
da8f1c62b8
feat: Zhihu supports crawling creator homepage data (answers, articles, videos)
2024-10-16 21:02:27 +08:00
Relakkes
b7e57da0d2
feat: Zhihu support (keywords, comments)
2024-09-08 00:00:04 +08:00
tooyang
0c1adb75fe
fix: improve the JSON content save format, with indentation support
2024-09-04 11:24:12 +08:00
Relakkes Yang
acb29add28
feat: Baidu Tieba supports crawling posts from creator homepages
2024-08-24 11:03:23 +08:00
Relakkes
8adb593ba6
temp commit
2024-08-24 09:12:03 +08:00
Relakkes
65699aa1cb
feat: xhs supports fetching comment like counts
2024-08-24 06:07:33 +08:00