程序员阿江(Relakkes)
4de2a325a9
feat: ks comment api upgrade to v2
2026-01-09 21:09:39 +08:00
Doiiars
70a6ca55bb
feat(database): add PostgreSQL support and fix Windows subprocess encoding
2026-01-09 00:41:59 +08:00
程序员阿江(Relakkes)
c895f53e22
fix: #803
2026-01-05 22:29:34 +08:00
程序员阿江(Relakkes)
157ddfb21b
i18n: translate all Chinese comments, docstrings, and logger messages to English
...
Comprehensive translation of Chinese text to English across the entire codebase:
- api/: FastAPI server documentation and logger messages
- cache/: Cache abstraction layer comments and docstrings
- database/: Database models and MongoDB store documentation
- media_platform/: All platform crawlers (Bilibili, Douyin, Kuaishou, Tieba, Weibo, Xiaohongshu, Zhihu)
- model/: Data model documentation
- proxy/: Proxy pool and provider documentation
- store/: Data storage layer comments
- tools/: Utility functions and browser automation
- test/: Test file documentation
Preserved: Chinese disclaimer header (lines 10-18) for legal compliance
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-26 23:27:19 +08:00
程序员阿江(Relakkes)
6e858c1a00
feat: excel store with other platform
2025-11-28 15:12:36 +08:00
hsparks.codes
46ef86ddef
feat: Add Excel export functionality and unit tests
...
Features:
- Excel export with formatted multi-sheet workbooks (Contents, Comments, Creators)
- Professional styling: blue headers, auto-width columns, borders, text wrapping
- Smart export: empty sheets automatically removed
- Support for all platforms (xhs, dy, ks, bili, wb, tieba, zhihu)
Testing:
- Added pytest framework with asyncio support
- Unit tests for Excel store functionality
- Unit tests for store factory pattern
- Shared fixtures for test data
- Test coverage for edge cases
Documentation:
- Comprehensive Excel export guide (docs/excel_export_guide.md)
- Updated README.md and README_en.md with Excel examples
- Updated config comments to include excel option
Dependencies:
- Added openpyxl>=3.1.2 for Excel support
- Added pytest>=7.4.0 and pytest-asyncio>=0.21.0 for testing
This contribution adds immediate value for users who need data analysis
capabilities and establishes a testing foundation for future development.
2025-11-28 04:44:12 +01:00
程序员阿江(Relakkes)
ff8c92daad
chore: add copyright to every file
2025-11-18 12:24:02 +08:00
yangtao210
58eb89f073
Merge branch 'NanmiCoder:main' into main
2025-11-07 17:44:09 +08:00
yt210
b61ec54a72
Optimize MongoDB config retrieval logic; move the storage base class.
2025-11-07 17:42:28 +08:00
程序员阿江-Relakkes
05a1782746
Merge pull request #764 from yangtao210/main
...
Add MongoDB storage support
2025-11-06 06:10:49 -05:00
yt210
ef6948b305
Add MongoDB storage support
2025-11-06 10:40:30 +08:00
程序员阿江(Relakkes)
889fa01466
fix: repair bili word cloud generation
2025-11-02 13:25:31 +08:00
persist-1
0d0af57a01
fix(store): fix bug where improper use of 'crawler_type_var' produced incorrect csv/json output filenames
2025-09-10 23:47:05 +08:00
persist-1
40de0e47e5
fix(store): replace async for loop with async with statement to fix zhihu database session management
2025-09-08 00:29:04 +08:00
persist-1
684a16ed9a
fix(database): fix model field types to support a wider range of data formats
...
Fix the xhs comment storage method, switching from batch processing to per-comment processing
2025-09-07 04:10:49 +08:00
persist-1
e92c6130e1
fix(store): fix AsyncFileWriter import in store implementations
...
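Refactor the Xiaohongshu store implementation, replacing the store_comments method with store_comment, which handles a single comment
Add the AsyncFileWriter utility class import for multiple platforms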
2025-09-06 04:41:37 +08:00
persist-1
be306c6f54
refactor(database): refactor database storage to use SQLAlchemy ORM instead of raw SQL operations
...
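- Remove the old async_db.py and async_sqlite_db.py implementations
- Add SQLAlchemy ORM models and database session management
- Consolidate each platform's store implementation into its _store_impl.py file
- Add support for database initialization
- Update .gitignore and the pyproject.toml dependency configuration
- Improve file storage paths and naming conventions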
2025-09-06 04:10:20 +08:00
Czs-HF
48da268bc5
fix: add formatted output to Douyin JSON storage
...
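- Add the indent=4 parameter in the DouyinJsonStoreImplement.save_data_to_json method
- Make Douyin JSON output consistent with Xiaohongshu's format to improve readability
- Fix the issue where the entire JSON file was written on a single line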
2025-08-16 12:52:37 +08:00
未来可欺
a7cc18ec7d
Update some documentation
2025-07-30 18:58:10 +08:00
未来可欺
ecddfbe02c
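Rename the .py files in the store folder whose names end in _video or _image so that they end in _media.py, so that no platform has only a standalone _video.py or _image.py implementation. All future code for storing videos or images goes in this file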
2025-07-30 18:32:08 +08:00
未来可欺
173bc08a9d
Add logic for storing Douyin videos and images, rename the ENABLE_GET_IMAGES parameter in config.py to ENABLE_GET_MEIDAS, and slightly adjust the storage logic accordingly
2025-07-30 18:24:08 +08:00
翟持江
e6db6be1ca
Update __init__.py: add logic for extracting images from Douyin notes
2025-07-30 10:45:38 +08:00
persist-1
19df1734f1
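chore: add Chinese display support for --help and a music_download_url field to the douyin_aweme table
...
- Add Chinese display support for command-line arguments to improve the user experience
- Add a music_download_url field to the douyin_aweme table to store the download link for a video's music
- Update the related database schema files (tables.sql, sqlite_tables.sql)
- Implement music download URL extraction and integrate it into the data storage flow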
2025-07-24 22:39:53 +08:00
买定不离手
3365095c62
fix: improve SQLite SQL statement adaptation for the Bilibili and Douyin platforms
...
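- Update store/bilibili/bilibili_store_sql.py: improve the SQLite SQL statements and query logic for the Bilibili platform
- Update store/douyin/douyin_store_sql.py: fix SQL statement compatibility issues in SQLite data storage for the Douyin platform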
2025-07-14 03:51:19 +08:00
买定不离手
1298022410
refactor: update each platform's store module initialization to support SQLite
...
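- Update store/bilibili/__init__.py: import the SQLite store implementation classes and related modules
- Update store/douyin/__init__.py: integrate the SQLite data storage interface for the Douyin platform
- Update store/kuaishou/__init__.py: add the import declaration for the Kuaishou SQLite store module
- Update store/tieba/__init__.py: bring in the SQLite database operation module for the Tieba platform
- Update store/weibo/__init__.py: integrate the SQLite store module for the Weibo platform
- Update store/xhs/__init__.py: import the SQLite data store implementation for the Xiaohongshu platform
- Update store/zhihu/__init__.py: integrate the SQLite database store module for the Zhihu platform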
2025-07-14 03:51:08 +08:00
买定不离手
6f274d476b
feat: add SQLite store implementation files for each platform
...
- Add store/bilibili/bilibili_store_impl.py: Bilibili SQLite store implementation
- Add store/douyin/douyin_store_impl.py: Douyin SQLite store implementation
- Add store/kuaishou/kuaishou_store_impl.py: Kuaishou SQLite store implementation
- Add store/tieba/tieba_store_impl.py: Tieba SQLite store implementation
- Add store/weibo/weibo_store_impl.py: Weibo SQLite store implementation
- Add store/xhs/xhs_store_impl.py: Xiaohongshu SQLite store implementation
- Add store/zhihu/zhihu_store_impl.py: Zhihu SQLite store implementation
2025-07-14 03:36:36 +08:00
买定不离手
fb938f38aa
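feat: update each platform's store SQL files to support SQLite
...
- Update store/kuaishou/kuaishou_store_sql.py: SQLite adaptation for the Kuaishou platform
- Update store/tieba/tieba_store_sql.py: SQLite adaptation for the Tieba platform
- Update store/weibo/weibo_store_sql.py: SQLite adaptation for the Weibo platform
- Update store/xhs/xhs_store_sql.py: SQLite adaptation for the Xiaohongshu platform
- Update store/zhihu/zhihu_store_sql.py: SQLite adaptation for the Zhihu platform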
2025-07-14 03:36:20 +08:00
Relakkes
fd33813f8f
feat: add like_count field to bilibili for issue #623
2025-06-20 15:50:38 +08:00
Relakkes
d55d8b1efa
feat: Douyin supports obtaining video links and cover images, for issue #620
2025-06-14 23:59:08 +08:00
Bowenwin
66843f216a
finish_all_for_expand_bili
2025-05-22 22:26:30 +08:00
Bowenwin
59619fff0a
finish_all
2025-05-22 22:06:06 +08:00
Bowenwin
44e3d370ff
fix_words
2025-05-22 20:31:48 +08:00
Bowenwin
a356358c21
get_fans_and_get_followings
2025-05-19 19:57:36 +08:00
翟持江
b675547aab
Update __init__.py: add extra fields to bilibili video info, uploader info, and comment info
2025-04-19 02:29:22 +08:00
Relakkes
30d0e733d5
feat: douyin adds comment images
2025-01-15 14:50:05 +08:00
HuiLong
d929ad16ae
fix xhs get gender
2024-12-28 20:24:37 +08:00
Relakkes
79bf9fc05d
chore: add xhs field comment for issue #526
2024-12-26 18:28:23 +08:00
liudongkai
33e7ef016d
feat: xhs adds a random wait interval in non-proxy mode; db storage mode now stores the xsec_token field
2024-12-05 21:10:31 +08:00
Relakkes
8ab4c67443
feat: Douyin supports comment like counts #495
2024-11-16 00:37:48 +08:00
Relakkes
9fe3e47b0f
chore: add a learning-purposes-only code disclaimer strictly prohibiting illegal, commercial, and improper use
2024-10-20 00:43:25 +08:00
Relakkes
da8f1c62b8
feat: Zhihu supports crawling creator homepage data (answers, articles, videos)
2024-10-16 21:02:27 +08:00
Relakkes
b7e57da0d2
feat: Zhihu support (keywords, comments)
2024-09-08 00:00:04 +08:00
tooyang
0c1adb75fe
fix: improve the saved JSON format with indentation support
2024-09-04 11:24:12 +08:00
Relakkes Yang
acb29add28
feat: Baidu Tieba supports crawling posts from creator homepages
2024-08-24 11:03:23 +08:00
Relakkes
8adb593ba6
temp commit
2024-08-24 09:12:03 +08:00
Relakkes
65699aa1cb
feat: xhs supports fetching comment like counts
2024-08-24 06:07:33 +08:00
Relakkes
ab7d8142af
feat: weibo supports specifying creator homepages
2024-08-24 05:52:11 +08:00
Relakkes
c70bd9e071
feat: add source channel for search keywords
2024-08-23 08:29:24 +08:00
Relakkes
e7c6c57aa6
fix: kuaishou store does not implement store_creator
2024-08-08 16:36:33 +08:00
Relakkes
3f42368c02
feat: Baidu Tieba done
2024-08-08 14:19:32 +08:00