Mirror of https://github.com/NanmiCoder/MediaCrawler.git (synced 2026-06-09 11:27:26 +08:00)
feat: Add Excel export functionality and unit tests

Features:
- Excel export with formatted multi-sheet workbooks (Contents, Comments, Creators)
- Professional styling: blue headers, auto-width columns, borders, text wrapping
- Smart export: empty sheets automatically removed
- Support for all platforms (xhs, dy, ks, bili, wb, tieba, zhihu)

Testing:
- Added pytest framework with asyncio support
- Unit tests for Excel store functionality
- Unit tests for store factory pattern
- Shared fixtures for test data
- Test coverage for edge cases

Documentation:
- Comprehensive Excel export guide (docs/excel_export_guide.md)
- Updated README.md and README_en.md with Excel examples
- Updated config comments to include excel option

Dependencies:
- Added openpyxl>=3.1.2 for Excel support
- Added pytest>=7.4.0 and pytest-asyncio>=0.21.0 for testing

This contribution adds immediate value for users who need data analysis capabilities and establishes a testing foundation for future development.
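
The styling and empty-sheet behaviour listed in the commit message can be illustrated with a minimal openpyxl sketch. This is not the code from this commit: the function name `build_workbook`, the colour value, the width cap, and the input shape (a dict mapping sheet names to lists of row dicts) are all assumptions made for illustration.

```python
from openpyxl import Workbook
from openpyxl.styles import Alignment, Border, Font, PatternFill, Side
from openpyxl.utils import get_column_letter

# Assumed styling values; the commit message only says "blue headers" and "borders".
HEADER_FILL = PatternFill(start_color="4472C4", end_color="4472C4", fill_type="solid")
HEADER_FONT = Font(bold=True, color="FFFFFF")
THIN = Side(style="thin")
CELL_BORDER = Border(left=THIN, right=THIN, top=THIN, bottom=THIN)
MAX_WIDTH = 50  # assumed cap so long text wraps instead of widening the column


def build_workbook(sheets):
    """Build a styled workbook from {sheet_name: [row_dict, ...]} (hypothetical helper).

    Sheets without rows are skipped. Returns None when every sheet is empty,
    because openpyxl cannot save a workbook that has no sheets.
    """
    non_empty = {name: rows for name, rows in sheets.items() if rows}
    if not non_empty:
        return None

    wb = Workbook()
    wb.remove(wb.active)  # drop the default sheet that Workbook() creates

    for name, rows in non_empty.items():
        ws = wb.create_sheet(title=name)
        headers = list(rows[0].keys())
        ws.append(headers)
        for row in rows:
            ws.append([row.get(h) for h in headers])

        # Header row: bold white text on a blue fill.
        for cell in ws[1]:
            cell.font = HEADER_FONT
            cell.fill = HEADER_FILL

        # Borders and wrapping on every cell; record the longest value per column.
        widths = [0] * len(headers)
        for row_cells in ws.iter_rows():
            for i, cell in enumerate(row_cells):
                cell.border = CELL_BORDER
                cell.alignment = Alignment(wrap_text=True, vertical="top")
                text = "" if cell.value is None else str(cell.value)
                widths[i] = max(widths[i], len(text))

        # Approximate auto-width: longest value plus padding, capped at MAX_WIDTH.
        for i, width in enumerate(widths, start=1):
            ws.column_dimensions[get_column_letter(i)].width = min(width + 2, MAX_WIDTH)

    return wb
```

A caller would check for `None` and otherwise call `wb.save(...)` with a path of its choosing. The width calculation is only an approximation, since openpyxl does not measure rendered text.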
@@ -212,6 +212,10 @@ python main.py --help
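Multiple data storage options are supported:
- **CSV files**: save to CSV (under the `data/` directory)
- **JSON files**: save to JSON (under the `data/` directory)
- **Excel files**: save to formatted Excel files (under the `data/` directory) ✨ New
  - Multi-sheet support (Contents, Comments, Creators)
  - Professional formatting (header styling, auto-width columns, borders)
  - Easy to analyze and share
- **Database storage**
  - Use the `--init_db` parameter to initialize the database (no other optional arguments are needed with `--init_db`)
  - **SQLite database**: lightweight, serverless, suited to personal use (recommended)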
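
The commit message mentions a store factory, and the usage example selects a backend with `--save_data_option`. One way such an option could map to a store class is sketched below. The class names, the `dump` method, and the `STORES` registry are hypothetical and are not taken from the project; only CSV and JSON are shown so the sketch stays within the standard library.

```python
import csv
import io
import json


class CsvStore:
    """Hypothetical store that serializes rows as CSV text."""

    def dump(self, rows):
        if not rows:
            return ""
        buf = io.StringIO()
        writer = csv.DictWriter(buf, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)
        return buf.getvalue()


class JsonStore:
    """Hypothetical store that serializes rows as JSON text."""

    def dump(self, rows):
        return json.dumps(rows, ensure_ascii=False, indent=2)


# Assumed registry keyed by the value passed to --save_data_option.
STORES = {"csv": CsvStore, "json": JsonStore}


def create_store(save_data_option):
    """Return a store instance for the given option, or raise ValueError."""
    try:
        return STORES[save_data_option]()
    except KeyError:
        supported = ", ".join(sorted(STORES))
        raise ValueError(
            f"unsupported save_data_option {save_data_option!r}; expected one of: {supported}"
        ) from None
```

Keeping the mapping in one dictionary means a new backend, such as an Excel store, only needs one extra registry entry, and the factory can be unit-tested without touching the crawler.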
@@ -224,6 +228,9 @@ python main.py --help

### Usage examples:
```shell
# Store data in Excel (recommended for data analysis) ✨ New
uv run main.py --platform xhs --lt qrcode --type search --save_data_option excel

# Initialize the SQLite database (no other optional arguments are needed with '--init_db')
uv run main.py --init_db sqlite
# Store data in SQLite (recommended for individual users)