download-youtube-subtitle
Mục lục bài viết
download-youtube-subtitle
due to changes of youtube api, you need to UPGRADE to 2.0.0, see Install and Run
Download Youtube Subtitle ![Build Status](https://warehouse-camo.ingress.cmh1.psfhosted.org/056d86ea48a945ad9a39370d52ea2fd0697f3b06/68747470733a2f2f7472617669732d63692e636f6d2f78737468756e6465722f646f776e6c6f61642d796f75747562652d7375627469746c652e7376673f6272616e63683d6d6173746572)
Download youtube subtitles(closed caption, cc) or srt as txt or json.
Features
python version of algolia/youtube-captions-scraper: Fetch youtube user submitted or fallback to auto-generated captions
Example
save as txt
dl-youtube-cc https://www.youtube.com/watch?v=wgnigj1ngye --translation ja
or
dl-youtube-cc wgNiGj1nGYE --translation ja
will saved as Version1.5SpecialProgramGenshinImpact.txt
https://www.youtube.com/watch?v=wgNiGj1nGYE
---------00:00----------
從前,有一對雙胞胎結伴在宇宙中旅行
昔々、宇宙を一緒に旅している双子のペアがいました
---------00:05----------
但有一天,他們前路遇阻
しかしある日、彼らの道は封鎖されました
---------00:07----------
被一個未知的神明生生分離
未知の神によって隔てられている
save as json
dl-youtube-cc wgNiGj1nGYE --translation ja --to_json=True
will saved as Version1.5SpecialProgramGenshinImpact.json
{
"original"
:
[
{
"start"
:
"0"
,
"dur"
:
"5.056"
,
"text"
:
"Once upon a time, two twins traveled together throughout the universe."
},
// continue
],
"translation"
:
[
{
"start"
:
"0"
,
"dur"
:
"5.056"
,
"text"
:
"昔々、2人の双子が一緒に宇宙を旅していました。"
},
// continue
],
"merged"
:
[
{
"start"
:
"0"
,
"dur"
:
"5.056"
,
"text"
:
"Once upon a time, two twins traveled together throughout the universe."
,
"translate_text"
:
"昔々、2人の双子が一緒に宇宙を旅していました。"
},
// continue
]
use caption_num caption_num_second to get full control
All available caption will be displayed, use --caption_num
--caption_num_second
to choose the caption which will be displayed as original or translation transcript.
>> dl-youtube-cc "wgNiGj1nGYE"
--caption_num=
0
--caption_num_second=
3
, --output_file=
"0,3-zh,es.txt"
INFO: available caption(
s)
:
INFO: ✔ as original #0. .zh-Hant 中文(繁體字)
INFO: ⭕ #1. .zh-Hans 中文(簡體字)
INFO: ⭕ #2. .id 印尼文
INFO: ✔ as translation #3. .es 西班牙文
INFO: ⭕ #4. .fr 法文
INFO: ⭕ #5. .ru 俄文
INFO: ⭕ #6. .en-US 英文(美國)
INFO: ⭕ #7. .th 泰文
INFO: ⭕ #8. .vi 越南文
INFO: ⭕ #9. .pt 葡萄牙文
INFO: ⭕ #10. .de 德文
INFO: ✔ marks chosen one in
0
-index
INFO: given by --caption_num default to 0
as original
INFO: Save to 0
,3-zh,es.txt
Install and Run
pip install download-youtube-subtitle
orpip install download-youtube-subtitle --user
dl-youtube-cc -h
or uninstall to reinstall new version
pip uninstall download-youtube-subtitle -y
run in cli
dl-youtube-cc -h
will show the following.
NAME
dl-youtube-cc - download youtube closed caption(subtitles) by videoID
SYNOPSIS
dl-youtube-cc VIDEOID <flags>
DESCRIPTION
Examples:
dl-youtube-cc -h # to see this helpful infomation
dl-youtube-cc wgNiGj1nGYE --translation 'ja' # use japanese translation, see ./lang_code for full list
dl-youtube-cc wgNiGj1nGYE --caption_num=1 --translation 'ja' # choose the caption num for original transcript and use japanese translation,
dl-youtube-cc wgNiGj1nGYE --caption_num=1 --caption_num_second=2 # manually choose the original and translation transcript from available caption list
dl-youtube-cc wgNiGj1nGYE --translation False # without translation
dl-youtube-cc wgNiGj1nGYE --save_to_file=False # print stuff in console
dl-youtube-cc wgNiGj1nGYE --output_file='test.txt' # print stuff in named file
dl-youtube-cc wgNiGj1nGYE --to_json=True # print stuff in json
POSITIONAL ARGUMENTS
VIDEOID
Type: str
the video link or the id of youtube video, the string after 'v=' in a youtube video link
FLAGS
--translation=TRANSLATION
Type: typing.Union[str, bool]
Default: 'zh-Hans'
which will be displayed as original transcript, default to 'zh-Hans' for simplified Chinese, see ./lang_code.json for full list, or pass False to disable translation
--caption_num=CAPTION_NUM
Type: int
Default: 0
choose the caption which will be displayed as original transcript
--caption_num_second=CAPTION_NUM_SECOND
Type: Optional[int]
Default: None
will surpass translation option, choose the caption which will be displayed as translation transcript
--output_file=OUTPUT_FILE
Type: Optional[str]
Default: None
default to video title
--save_to_file=SAVE_TO_FILE
Type: bool
Default: True
pass False to print in console
--to_json=TO_JSON
Type: bool
Default: False
pass True to export caption to json
--remove_font_tag=REMOVE_FONT_TAG
Type: bool
Default: True
remove font tag
Use in Code
import
download_youtube_subtitle.common
as
common
import
download_youtube_subtitle.main
as
download_youtube_subtitle
# ...
Development
Environment Setup
for conda
pip install 'fire'
'requests'
'IPython'
'sure'
Usage
python main.py -h
python main.py VIDEOID
Tests
cd
tests
./run.sh
./test_cli.sh
Ref
deployment – How can I use setuptools to generate a console_scripts entry point which calls python -m mypackage
? – Stack Overflow
Packaging Python Projects — Python Packaging User Guide
./nb/notebook2script.py
from course-v3/nbs/dl2 at master · fastai/course-v3
Google Style Python Docstrings