Commit
Showing 30 changed files with 711 additions and 201 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
.env | ||
media | ||
SDK/Baidu/Voice/access_token.json | ||
*.pyc |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,48 @@ | ||
<center><h1>TextCreateVideo</h1></center> | ||
|
||
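<h2>Introduction</h2> | ||
Inspired by the <a href="https://github.com/guifaChild/text_to_vedio">text_to_vedio</a> project, | ||
this code helps independent content creators produce videos quickly. It currently generates AI images from text only; image-to-image generation and a web page that lets non-technical users configure image generation are planned. The project is built mainly on ChatGPT + Baidu Cloud API + Stable Diffusion + MoviePy. | ||
The overall design is component-based: third-party services are integrated as plugins so they can be hot-swapped. If ChatGPT fails or raises an error, another large model can take over automatically, and the same applies to the other components. | ||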
|
||
<h2>Updates</h2> | ||
[2023/08/01] First release, implementing the full text-to-video pipeline. | ||
|
||
<h2>Usage</h2> | ||
<ol> | ||
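<li>First download Stable Diffusion. On Windows, the recommended option is the desktop Stable Diffusion WebUI launcher built and released for free by the Bilibili creator @秋葉aaaki: <a href="https://www.zmthome.com/site/5432.html">绘世</a>. Mac users can download it from the <a href="https://github.com/AUTOMATIC1111/stable-diffusion-webui">official repository</a>. A side note: it is not worth installing on a Mac M1, where it runs very slowly 😭</li> | ||
<li>My free ChatGPT API quota ran out 😭, so I use the API of a third-party provider, <a href="https://fastgpt.run/">fastgpt</a>. I will post a video on Bilibili showing how to use it. New accounts receive a few yuan of credit, which should cover several thousand to ten thousand calls (not an ad!). If you have your own ChatGPT API access, even better!</li> | ||
<li>The Baidu speech synthesis API is free; a video will be posted on Bilibili.</li> | ||
<li>Fill in the settings required by config.py, then run cli_demo.py to produce the video. All output files are placed in the media folder. Later, main.py will be started asynchronously and configured through a web page.</li> | ||
<p>Keep the input file small; it is best to generate one chapter at a time.</p> | ||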
</ol> | ||
|
||
<h2>Code logic</h2> | ||
<ul> | ||
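<li> | ||
Step 1: split the user's input text at commas or periods | ||
</li> | ||
<li> | ||
<p>Step 2: use ChatGPT to generate prompts</p> | ||
</li> | ||
<li> | ||
<p>Step 3: call the Baidu speech synthesis package to synthesize the audio</p> | ||
</li> | ||
<li> | ||
<p>Step 4: use Stable Diffusion to generate images</p> | ||
</li> | ||
<li> | ||
<p>Step 5: use MoviePy to combine the images and audio into a video</p> | ||
</li> | ||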
</ul> | ||
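<h2>License</h2> | ||
The code in this repository is open-sourced under the Apache-2.0 license. | ||
<h2>Prohibited</h2> | ||
|
||
Commercial use without permission is strictly prohibited. | ||
<hr/> | ||
QQ group: 100419879 | ||
|
||
 | ||
|
||
WeChat group: 864399407 |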
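Step 1 of the README's code logic (splitting the input text at commas or periods) could look roughly like the sketch below. This is an illustration rather than the project's actual implementation: the function name `split_text` and the exact delimiter set (full-width and ASCII commas and periods) are assumptions.

```python
import re
from typing import List

# Assumed delimiter set: full-width and ASCII commas and periods.
_DELIMITERS = re.compile(r"[,,。.]")


def split_text(text: str) -> List[str]:
    """Split text at commas or periods and drop empty fragments."""
    parts = _DELIMITERS.split(text)
    return [part.strip() for part in parts if part.strip()]


if __name__ == "__main__":
    for sentence in split_text("叶无名是一个少林寺的俗家弟子,他天资聪颖,博览群书。"):
        print(sentence)
```

Each fragment would then feed the later steps (prompt, audio, image) one at a time. Note that splitting on the ASCII period also breaks decimals such as `3.14`, so a real splitter may need a stricter pattern.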
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
#!/usr/bin/python | ||
# -*- coding: UTF-8 -*- | ||
# @author:anning | ||
# @email:[email protected] | ||
# @time:2023/08/01 09:33 | ||
# @file:__init__.py.py |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,93 @@ | ||
#!/usr/bin/python | ||
# -*- coding: UTF-8 -*- | ||
# @author:anning | ||
# @email:[email protected] | ||
# @time:2023/08/01 09:33 | ||
# @file:app.py | ||
import json | ||
import os | ||
import uuid | ||
from datetime import datetime | ||
from urllib.parse import urlencode | ||
|
||
import requests | ||
|
||
from config import client_id, client_secret, file_path | ||
|
||
|
||
class Main: | ||
    client_id = client_id | ||
    client_secret = client_secret | ||
|
||
    def create_access_token(self): | ||
        # Request a new access_token from Baidu and cache it, with today's date, in access_token.json. | ||
        url = f"https://aip.baidubce.com/oauth/2.0/token?grant_type=client_credentials&client_id={self.client_id}&client_secret={self.client_secret}" | ||
        payload = "" | ||
        headers = { | ||
            'Content-Type': 'application/json', | ||
            'Accept': 'application/json' | ||
        } | ||
        response = requests.request("POST", url, headers=headers, data=payload) | ||
        print("----------- Sent a request to the Baidu access_token API -----------") | ||
        access_token = response.json() | ||
        access_token.update({"time": datetime.now().strftime("%Y-%m-%d")}) | ||
        with open('access_token.json', 'w') as f: | ||
            json.dump(access_token, f) | ||
        return access_token | ||
|
||
    def get_access_token(self): | ||
        # Reuse the cached token unless it is 29 or more days old. | ||
        if os.path.exists('access_token.json'): | ||
            with open('access_token.json', 'r') as f: | ||
                data = json.load(f) | ||
            time = data.get("time") | ||
            if time and (datetime.now() - datetime.strptime(time, '%Y-%m-%d')).days >= 29: | ||
                return self.create_access_token() | ||
            return data | ||
        return self.create_access_token() | ||
|
||
    def text_to_audio(self, text: str, index: int): | ||
        url = "https://tsn.baidu.com/text2audio" | ||
        text = text.encode('utf8') | ||
        FORMATS = {3: "mp3", 4: "pcm", 5: "pcm", 6: "wav"} | ||
        # The API takes the numeric aue code; FORMAT is only used as the file extension. | ||
        AUE = 6 | ||
        FORMAT = FORMATS[AUE] | ||
        data = { | ||
            # Text to synthesize; must be shorter than 1024 GBK bytes. Keeping each request under 120 bytes (about 60 Chinese characters, letters, or digits) is recommended. | ||
            "tex": text, | ||
            # access_token | ||
            "tok": self.get_access_token().get("access_token"), | ||
            # Unique user identifier, used to count unique visitors (UV). A machine MAC address or IMEI that distinguishes users is recommended; at most 60 characters. | ||
            "cuid": hex(uuid.getnode()), | ||
            # Client type; web clients use the fixed value 1. | ||
            "ctp": "1", | ||
            # Language; only mixed Chinese-English mode is available, so use the fixed value zh. | ||
            "lan": "zh", | ||
            # Speed, 0-15; default 5 (medium). | ||
            "spd": 5, | ||
            # Pitch, 0-15; default 5 (medium). | ||
            "pit": 5, | ||
            # Volume: 0-9 for basic voices, 0-15 for premium voices; default 5 (medium). 0 is the minimum volume, not silence. | ||
            "vol": 5, | ||
            # (Basic voices) 度小宇=1, 度小美=0, 度逍遥 (basic)=3, 度丫丫=4 | ||
            # (Premium voices) 度逍遥 (premium)=5003, 度小鹿=5118, 度博文=106, 度小童=110, 度小萌=111, 度米朵=103, 度小娇=5 | ||
            "per": 5003, | ||
            # 3 = mp3 (default); 4 = pcm-16k; 5 = pcm-8k; 6 = wav (same content as pcm-16k). Note: aue=4 or 6 is the format speech recognition requires, but the audio is not natural human speech, so recognition quality will suffer. | ||
            "aue": AUE | ||
        } | ||
        data = urlencode(data) | ||
        response = requests.post(url, data) | ||
        if response.status_code == 200: | ||
            result_str = response.content | ||
            save_file = str(index) + '.' + FORMAT | ||
            # Create <file_path>/audio (and any missing parents) if it does not exist yet. | ||
            audio = os.path.join(file_path, "audio") | ||
            os.makedirs(audio, exist_ok=True) | ||
            audio_path = os.path.join(audio, save_file) | ||
            with open(audio_path, 'wb') as of: | ||
                of.write(result_str) | ||
            return audio_path | ||
        else: | ||
            return False | ||
|
||
|
||
if __name__ == '__main__': | ||
    m = Main() | ||
    m.text_to_audio("叶无名是一个少林寺的俗家弟子,他天资聪颖,博览群书,精通天文地理和阴阳八卦", "叶无名") |
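`text_to_audio` in app.py treats any HTTP 200 response as audio and writes it to disk. The synthesis endpoint may report failures as a JSON body instead of audio, so a stricter check would inspect the `Content-Type` header before saving. The helper below is a minimal sketch of that idea; the name `is_audio_response` is hypothetical and is not part of the commit.

```python
from typing import Mapping


def is_audio_response(headers: Mapping[str, str]) -> bool:
    """Return True when the Content-Type header announces audio data."""
    content_type = ""
    # Header names are case-insensitive, so compare them in lowercase.
    for name, value in headers.items():
        if name.lower() == "content-type":
            content_type = value
            break
    return content_type.lower().startswith("audio/")


if __name__ == "__main__":
    print(is_audio_response({"Content-Type": "audio/wav"}))
    print(is_audio_response({"Content-Type": "application/json"}))
```

In `text_to_audio`, the condition could then read `response.status_code == 200 and is_audio_response(response.headers)`, so that a JSON error body is never saved with an audio extension.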
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
#!/usr/bin/python | ||
# -*- coding: UTF-8 -*- | ||
# @author:anning | ||
# @email:[email protected] | ||
# @time:2023/08/01 09:33 | ||
# @file:__init__.py.py |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
#!/usr/bin/python | ||
# -*- coding: UTF-8 -*- | ||
# @author:anning | ||
# @email:[email protected] | ||
# @time:2023/08/01 18:08 | ||
# @file:__init__.py.py |