download-youtube-subtitle
Due to changes in the YouTube API, you need to upgrade to 2.0.0; see Install and Run.
Download Youtube Subtitle
Download YouTube subtitles (closed captions, CC, or srt) as txt or json.
Features
A Python version of algolia/youtube-captions-scraper: fetches user-submitted YouTube captions, falling back to auto-generated ones.
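The fallback can be sketched as follows. This is a minimal illustration, not the package's actual code. It assumes each caption track is a dict with a `vssId` key, where `.ja` marks a user-submitted Japanese track and `a.ja` an auto-generated one. `pick_track` is a hypothetical helper name.

```python
from typing import Any, Dict, List, Optional


def pick_track(tracks: List[Dict[str, Any]], lang: str) -> Optional[Dict[str, Any]]:
    """Prefer a user-submitted track, else fall back to an auto-generated one."""
    # Assumption: ".<lang>" is user-submitted, "a.<lang>" is auto-generated.
    for wanted in (f".{lang}", f"a.{lang}"):
        for track in tracks:
            if track.get("vssId") == wanted:
                return track
    return None  # no track for this language at all


tracks = [
    {"vssId": "a.en", "name": "English (auto-generated)"},
    {"vssId": ".ja", "name": "Japanese"},
]
print(pick_track(tracks, "ja"))  # user-submitted track
print(pick_track(tracks, "en"))  # only an auto-generated track exists
print(pick_track(tracks, "fr"))  # nothing available
```

Trying the user-submitted identifier first means a hand-written track always wins when both kinds exist for the same language.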
Example
save as txt
dl-youtube-cc https://www.youtube.com/watch?v=wgNiGj1nGYE --translation ja
or
dl-youtube-cc wgNiGj1nGYE --translation ja
This will be saved as Version1.5SpecialProgramGenshinImpact.txt:
https://www.youtube.com/watch?v=wgNiGj1nGYE
---------00:00----------
從前,有一對雙胞胎結伴在宇宙中旅行
昔々、宇宙を一緒に旅している双子のペアがいました
---------00:05----------
但有一天,他們前路遇阻
しかしある日、彼らの道は封鎖されました
---------00:07----------
被一個未知的神明生生分離
未知の神によって隔てられている
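The txt layout above (the video URL, then a timestamp separator followed by the original line and its translation) can be reproduced with a short sketch. It assumes entries shaped like the JSON export, with `start`, `text`, and an optional `translate_text`, and assumes fractional seconds are truncated. `format_timestamp` and `render_txt` are hypothetical names, and the dash counts are copied from the sample rather than from the package source.

```python
from typing import Dict, List


def format_timestamp(start: str) -> str:
    """Turn a start time in seconds (given as a string) into mm:ss."""
    total = int(float(start))  # assumption: fractions are dropped, not rounded
    minutes, seconds = divmod(total, 60)
    return f"{minutes:02d}:{seconds:02d}"


def render_txt(url: str, merged: List[Dict[str, str]]) -> str:
    """Render caption entries in the separator/original/translation layout."""
    lines = [url]
    for entry in merged:
        lines.append("-" * 9 + format_timestamp(entry["start"]) + "-" * 10)
        lines.append(entry["text"])
        # The translated line is optional, since translation can be disabled.
        if entry.get("translate_text"):
            lines.append(entry["translate_text"])
    return "\n".join(lines)


sample = [
    {"start": "0", "text": "first line", "translate_text": "first translation"},
    {"start": "5.056", "text": "second line"},
]
print(render_txt("https://www.youtube.com/watch?v=wgNiGj1nGYE", sample))
```

With this scheme a caption starting past the one-hour mark would show more than 59 minutes, since no hour field is used.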
save as json
dl-youtube-cc wgNiGj1nGYE --translation ja --to_json=True
This will be saved as Version1.5SpecialProgramGenshinImpact.json:
{
  "original": [
    {
      "start": "0",
      "dur": "5.056",
      "text": "Once upon a time, two twins traveled together throughout the universe."
    },
    // continue
  ],
  "translation": [
    {
      "start": "0",
      "dur": "5.056",
      "text": "昔々、2人の双子が一緒に宇宙を旅していました。"
    },
    // continue
  ],
  "merged": [
    {
      "start": "0",
      "dur": "5.056",
      "text": "Once upon a time, two twins traveled together throughout the universe.",
      "translate_text": "昔々、2人の双子が一緒に宇宙を旅していました。"
    },
    // continue
  ]
}
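The relationship between the three arrays can be illustrated with a small sketch that rebuilds `merged` from `original` and `translation`. It assumes entries are paired by identical `start` values; the package's real merge strategy is not shown in this README, and `merge_captions` is a hypothetical name.

```python
import json
from typing import Dict, List


def merge_captions(original: List[Dict[str, str]],
                   translation: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Attach to each original entry the translated text with the same start."""
    by_start = {entry["start"]: entry["text"] for entry in translation}
    merged = []
    for entry in original:
        row = dict(entry)  # copy so the input list is left untouched
        translated = by_start.get(entry["start"])
        if translated is not None:
            row["translate_text"] = translated
        merged.append(row)
    return merged


data = {
    "original": [{
        "start": "0",
        "dur": "5.056",
        "text": "Once upon a time, two twins traveled together throughout the universe.",
    }],
    "translation": [{
        "start": "0",
        "dur": "5.056",
        "text": "昔々、2人の双子が一緒に宇宙を旅していました。",
    }],
}
data["merged"] = merge_captions(data["original"], data["translation"])
print(json.dumps(data["merged"], ensure_ascii=False, indent=2))
```

Passing `ensure_ascii=False` keeps the Japanese text readable instead of escaping it as `\uXXXX` sequences.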
Use caption_num and caption_num_second for full control
All available captions are listed. Use --caption_num and --caption_num_second to choose which captions are shown as the original and the translation transcript.
>> dl-youtube-cc "wgNiGj1nGYE" --caption_num=0 --caption_num_second=3 --output_file="0,3-zh,es.txt"
INFO: available caption(s):
INFO: ✔ as original #0. .zh-Hant 中文(繁體字)
INFO: ⭕ #1. .zh-Hans 中文(簡體字)
INFO: ⭕ #2. .id 印尼文
INFO: ✔ as translation #3. .es 西班牙文
INFO: ⭕ #4. .fr 法文
INFO: ⭕ #5. .ru 俄文
INFO: ⭕ #6. .en-US 英文(美國)
INFO: ⭕ #7. .th 泰文
INFO: ⭕ #8. .vi 越南文
INFO: ⭕ #9. .pt 葡萄牙文
INFO: ⭕ #10. .de 德文
INFO: ✔ marks chosen one in 0-index
INFO: given by --caption_num default to 0 as original
INFO: Save to 0,3-zh,es.txt
Install and Run
pip install download-youtube-subtitle
or
pip install download-youtube-subtitle --user
dl-youtube-cc -h
To upgrade, uninstall the old version first, then install again:
pip uninstall download-youtube-subtitle -y
Run in the CLI
dl-youtube-cc -h
This prints the following help text:
NAME
dl-youtube-cc - download youtube closed caption(subtitles) by videoID
SYNOPSIS
dl-youtube-cc VIDEOID <flags>
DESCRIPTION
Examples:
dl-youtube-cc -h # to see this helpful infomation
dl-youtube-cc wgNiGj1nGYE --translation 'ja' # use japanese translation, see ./lang_code for full list
dl-youtube-cc wgNiGj1nGYE --caption_num=1 --translation 'ja' # choose the caption num for original transcript and use japanese translation,
dl-youtube-cc wgNiGj1nGYE --caption_num=1 --caption_num_second=2 # manually choose the original and translation transcript from available caption list
dl-youtube-cc wgNiGj1nGYE --translation False # without translation
dl-youtube-cc wgNiGj1nGYE --save_to_file=False # print stuff in console
dl-youtube-cc wgNiGj1nGYE --output_file='test.txt' # print stuff in named file
dl-youtube-cc wgNiGj1nGYE --to_json=True # print stuff in json
POSITIONAL ARGUMENTS
VIDEOID
Type: str
the video link or the id of youtube video, the string after 'v=' in a youtube video link
FLAGS
--translation=TRANSLATION
Type: typing.Union[str, bool]
Default: 'zh-Hans'
which will be displayed as original transcript, default to 'zh-Hans' for simplified Chinese, see ./lang_code.json for full list, or pass False to disable translation
--caption_num=CAPTION_NUM
Type: int
Default: 0
choose the caption which will be displayed as original transcript
--caption_num_second=CAPTION_NUM_SECOND
Type: Optional[int]
Default: None
will surpass translation option, choose the caption which will be displayed as translation transcript
--output_file=OUTPUT_FILE
Type: Optional[str]
Default: None
default to video title
--save_to_file=SAVE_TO_FILE
Type: bool
Default: True
pass False to print in console
--to_json=TO_JSON
Type: bool
Default: False
pass True to export caption to json
--remove_font_tag=REMOVE_FONT_TAG
Type: bool
Default: True
remove font tag
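VIDEOID accepts either a full link or the bare ID, that is, the string after 'v='. A minimal sketch of that normalization, using only the standard library, might look like the following. `extract_video_id` is a hypothetical helper, not the package's own function.

```python
from urllib.parse import parse_qs, urlparse


def extract_video_id(video: str) -> str:
    """Return the ID from a watch URL, or the input unchanged if it is already an ID."""
    parsed = urlparse(video)
    if parsed.scheme in ("http", "https"):
        values = parse_qs(parsed.query).get("v")
        if not values:
            raise ValueError(f"no 'v=' parameter in {video!r}")
        return values[0]
    # No URL scheme: assume the caller passed a bare video ID.
    return video


print(extract_video_id("https://www.youtube.com/watch?v=wgNiGj1nGYE"))
print(extract_video_id("wgNiGj1nGYE"))
```

The ID is returned exactly as written. YouTube video IDs are case-sensitive, so lowercasing one would point at a different or nonexistent video.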
Use in Code
import download_youtube_subtitle.common as common
import download_youtube_subtitle.main as download_youtube_subtitle
# ...
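The module-level API is elided in the snippet above, so rather than guess at function names, one hedged alternative is to drive the documented CLI from Python. The sketch below only assembles the command line from flags listed in the help text. `build_command` is a hypothetical helper, and actually running the command requires the package to be installed.

```python
import shlex
import subprocess
from typing import List, Optional


def build_command(video_id: str,
                  translation: Optional[str] = None,
                  to_json: bool = False,
                  output_file: Optional[str] = None) -> List[str]:
    """Assemble a dl-youtube-cc command from flags documented in the help text."""
    cmd = ["dl-youtube-cc", video_id]
    if translation is not None:
        cmd += ["--translation", translation]
    if to_json:
        cmd.append("--to_json=True")
    if output_file is not None:
        cmd.append(f"--output_file={output_file}")
    return cmd


cmd = build_command("wgNiGj1nGYE", translation="ja", to_json=True)
print(shlex.join(cmd))
# Uncomment to run it for real (needs download-youtube-subtitle installed):
# subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems with file names such as "0,3-zh,es.txt".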
Development
Environment Setup
for conda
pip install 'fire' 'requests' 'IPython' 'sure'
Usage
python main.py -h
python main.py VIDEOID
Tests
cd tests
./run.sh
./test_cli.sh
Ref
deployment – How can I use setuptools to generate a console_scripts entry point which calls python -m mypackage? – Stack Overflow
Packaging Python Projects, Python Packaging User Guide
./nb/notebook2script.py from course-v3/nbs/dl2 at master · fastai/course-v3
Google Style Python Docstrings