Relakkes
|
fd33813f8f
|
feat: add like_count field to bilibili for issue #623
|
2025-06-20 15:50:38 +08:00 |
|
Relakkes
|
31bcdb191f
|
docs: update README.md
|
2025-06-16 13:58:09 +08:00 |
|
Relakkes
|
d55d8b1efa
|
feat: Douyin supports obtaining video links and cover images. for issue #620
|
2025-06-14 23:59:08 +08:00 |
|
Relakkes
|
ed1dc7916a
|
docs: update README.md
|
2025-06-08 15:56:02 +08:00 |
|
程序员阿江(Relakkes)
|
6323e2d45b
|
Merge pull request #616 from chimeElm/main
Fix CRAWLER_MAX_NOTES_COUNT not taking effect when crawling a Xiaohongshu author's posts
|
2025-06-07 14:43:37 +08:00 |
|
chimeElm
|
26a845581e
|
Update client.py
Fix CRAWLER_MAX_NOTES_COUNT not taking effect when crawling a Xiaohongshu author's posts
|
2025-06-07 02:41:09 +08:00 |
|
Relakkes
|
23c8f8f87b
|
docs: add english license
|
2025-06-01 23:20:11 +08:00 |
|
Relakkes
|
1e7b950d3e
|
Revert "chore: remove sponor"
This reverts commit 242c06c345.
|
2025-05-26 22:35:18 +08:00 |
|
Relakkes
|
242c06c345
|
chore: remove sponor
|
2025-05-25 11:54:38 +08:00 |
|
程序员阿江(Relakkes)
|
ff41faeb00
|
Merge pull request #608 from Bowenwin/bili_expand
Bili_function_expand
|
2025-05-22 23:14:58 +08:00 |
|
Bowenwin
|
66843f216a
|
finish_all_for_expand_bili
|
2025-05-22 22:26:30 +08:00 |
|
Bowenwin
|
59619fff0a
|
finish_all
|
2025-05-22 22:06:06 +08:00 |
|
Bowenwin
|
44e3d370ff
|
fix_words
|
2025-05-22 20:31:48 +08:00 |
|
程序员阿江(Relakkes)
|
7ed6621933
|
Merge pull request #603 from Bowenwin/fix_words
Fix words
|
2025-05-19 23:16:12 +08:00 |
|
Bowenwin
|
703a6e84cb
|
fix_words
|
2025-05-19 20:07:20 +08:00 |
|
Bowenwin
|
144b8bec6a
|
fix_words
|
2025-05-19 20:04:00 +08:00 |
|
Bowenwin
|
a356358c21
|
get_fans_and_get_followings
|
2025-05-19 19:57:36 +08:00 |
|
Relakkes
|
654260cbce
|
docs: update README.md
|
2025-05-13 18:42:58 +08:00 |
|
Relakkes
|
79a9824f6a
|
fix: modify dy schema
|
2025-04-30 16:47:13 +08:00 |
|
Relakkes
|
67d31bf42a
|
fix: dy update fp params
|
2025-04-30 13:26:22 +08:00 |
|
程序员阿江(Relakkes)
|
2a41b684ad
|
Merge pull request #590 from 2513502304/main
Enhancement for issue #589
|
2025-04-20 14:14:55 +08:00 |
|
翟持江
|
af5a393a7a
|
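Update core.py: removed the try-catch statement added by another contributor. That try-catch altered the code's final logic and broke it, so it could only crawl the current day's data and could not move on to the next day. (The original logic relies on the try-catch catching an exception in order to advance to the next day; do not add further exception handling or a finally clause to this statement!)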
|
2025-04-19 04:34:24 +08:00 |
|
翟持江
|
b675547aab
|
Update __init__.py: add extra fields to the bilibili video info, uploader (UP) info, and comment info
|
2025-04-19 02:29:22 +08:00 |
|
翟持江
|
ec97001451
|
Update tables.sql
|
2025-04-19 02:22:22 +08:00 |
|
翟持江
|
9935a07279
|
Add files via upload
|
2025-04-19 02:18:52 +08:00 |
|
Relakkes
|
cb2ae6cbab
|
docs: add sponsor
|
2025-04-16 16:49:49 +08:00 |
|
Relakkes
|
0d715a9f32
|
fix: bili qrcode login fix
|
2025-04-08 21:11:40 +08:00 |
|
Relakkes
|
660fd18a95
|
fix: dy login fix
|
2025-04-08 20:58:04 +08:00 |
|
程序员阿江(Relakkes)
|
afbd4ec1bf
|
Merge pull request #572 from crpa33/main
Prevent unexpected None data from terminating the process
|
2025-04-02 13:34:41 +08:00 |
|
crpa33
|
274d64aefc
|
Handle unexpectedly empty xhs comment data
Any error interrupts the run, and I had no other way around it
|
2025-04-02 11:59:27 +08:00 |
|
crpa33
|
a39b571d27
|
Log output: handle errors when building the task list for the video search page
|
2025-04-02 11:57:28 +08:00 |
|
crpa33
|
413d91a520
|
Log output: author is banned or has an error
|
2025-04-02 11:52:36 +08:00 |
|
crpa33
|
eaf14721f8
|
Log output: comprehension error caused by NoneType
|
2025-04-02 11:48:36 +08:00 |
|
crpa33
|
2c4af2337e
|
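douyin: skip to the next keyword when the search page is empty
Skip on an empty page even if the expected page count has not been reached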
|
2025-03-27 23:32:21 +08:00 |
|
crpa33
|
3c72fc48b0
|
Guard against the case where author is None but was not detected
|
2025-03-27 23:22:47 +08:00 |
|
crpa33
|
6b6e2b8ba0
|
Fix comprehension error caused by NoneType
|
2025-03-27 23:18:01 +08:00 |
|
Relakkes
|
dfddfa7fdc
|
docs: update README.md
|
2025-03-23 20:35:11 +08:00 |
|
Relakkes
|
daaea7155b
|
feat: add uv tool to manage project
|
2025-03-23 18:13:13 +08:00 |
|
Relakkes
|
8030d2a02f
|
docs: removed sponsor
|
2025-03-13 15:07:54 +08:00 |
|
Relakkes
|
061d1c15e2
|
feat: kuaishou search params update
|
2025-03-11 23:42:34 +08:00 |
|
Relakkes
|
f2cf864c27
|
fix: zhihu article url error #564
|
2025-03-03 18:18:41 +08:00 |
|
Relakkes
|
b43d6b7b91
|
chore: update config
|
2025-02-12 10:58:48 +08:00 |
|
Relakkes
|
66a7ab1db8
|
refactor: bilibili default to get without time data
|
2025-02-12 10:58:15 +08:00 |
|
Relakkes
|
678ce1bfac
|
fix: bilibili bugfix
|
2025-02-10 17:13:37 +08:00 |
|
Relakkes
|
457205efd8
|
docs: add sponsor
|
2025-02-08 15:28:18 +08:00 |
|
程序员阿江(Relakkes)
|
38f2b36bf5
|
Merge pull request #542 from 2513502304/main
Update core.py: for tasks of crawl type `detail` and `creator`, add the same `bilibili_store.update_up_info` call used by the `search` task to store uploader (UP) info
|
2025-01-20 19:30:10 +08:00 |
|
翟持江
|
0364b23b5b
|
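Update core.py: for tasks of crawl type detail and creator, add the same bilibili_store.update_up_info call used by the search task to store uploader (UP) info
As in the `search` function, once `get_video_info_task` has been called, both the `bilibili_video` and `bilibili_up_info` data are available.
Previously, `get_specified_videos` saved only the specified `bilibili_video` data in the `detail` task and did not save the `bilibili_up_info` data. `get_creator_videos` in the `creator` task also calls `get_specified_videos` to fetch all videos of the specified creator, so it likewise did not save the `bilibili_up_info` data.
Adding a single line, `await bilibili_store.update_up_info(video_detail)`, to `get_specified_videos` is therefore enough to produce the same number of data files as the `search` task, with no missing uploader profile information.
Tested:
- Previously, only the `search` task produced `*_creator.csv`, `*_contents.csv`, and `*_comments.csv`; the `detail` and `creator` tasks lacked the `*_creator.csv` file.
- After this commit, all three modes produce the same number of data files.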
|
2025-01-19 19:55:18 +08:00 |
|
程序员阿江(Relakkes)
|
4b63ea68ec
|
Merge pull request #538 from 2513502304/main
feat: bilibili support date range filter
|
2025-01-17 19:43:57 +08:00 |
|
翟持江
|
2d93ec5a82
|
Update core.py: fix incorrect indentation
|
2025-01-15 18:33:12 +08:00 |
|
翟持江
|
8741952cb5
|
Update requirements.txt: add the pandas module; datetime is part of the Python standard library and does not need to be added
|
2025-01-15 18:27:40 +08:00 |
|