资讯热点聚焦探测工具行业公司行情市场招标地区设计咨询信息滚动工程

您的位置：首页>聚焦 > >正文

云端炼丹,算力白嫖,基于Colab用So-vits制作AI川普演唱《国际歌》

2023-05-17 00:10:56 来源：刘悦技术分享

人工智能AI技术早已深入到人们生活的每一个角落，君不见AI孙燕姿的歌声此起彼伏，不绝于耳，但并不是每个人都拥有一块N卡，没有GPU的日子总是不好过的，但是没关系，山人有妙计，本次我们基于Google的Colab免费云端服务器来搭建深度学习环境，制作AI特朗普，让他高唱《国际歌》。

Colab（全名Colaboratory ），它是Google公司的一款基于云端的基础免费服务器产品，可以在B端，也就是浏览器里面编写和执行Python代码，非常方便，贴心的是，Colab可以给用户分配免费的GPU进行使用，对于没有N卡的朋友来说，这已经远远超出了业界良心的范畴，简直就是在做慈善事业。

配置Colab

Colab是基于Google云盘的产品，我们可以将深度学习的Python脚本、训练好的模型、以及训练集等数据直接存放在云盘中，然后通过Colab执行即可。

【资料图】

首先访问Google云盘：drive.google.com

随后点击新建，选择关联更多应用：

接着安装Colab即可：

至此，云盘和Colab就关联好了，现在我们可以新建一个脚本文件my_sovits.ipynb文件，键入代码：

hello colab

随后，按快捷键 ctrl + 回车，即可运行代码：

这里需要注意的是，Colab使用的是基于Jupyter Notebook的ipynb格式的Python代码。

Jupyter Notebook是以网页的形式打开，可以在网页页面中直接编写代码和运行代码，代码的运行结果也会直接在代码块下显示。如在编程过程中需要编写说明文档，可在同一个页面中直接编写，便于作及时的说明和解释。

随后设置一下显卡类型：

接着运行命令，查看GPU版本：

!/usr/local/cuda/bin/nvcc --version!nvidia-smi

程序返回：

nvcc: NVIDIA (R) Cuda compiler driverCopyright (c) 2005-2022 NVIDIA CorporationBuilt on Wed_Sep_21_10:33:58_PDT_2022Cuda compilation tools, release 11.8, V11.8.89Build cuda_11.8.r11.8/compiler.31833905_0Tue May 16 04:49:23 2023       +-----------------------------------------------------------------------------+| NVIDIA-SMI 525.85.12    Driver Version: 525.85.12    CUDA Version: 12.0     ||-------------------------------+----------------------+----------------------+| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC || Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. ||                               |                      |               MIG M. ||===============================+======================+======================||   0  Tesla T4            Off  | 00000000:00:04.0 Off |                    0 || N/A   65C    P8    13W /  70W |      0MiB / 15360MiB |      0%      Default ||                               |                      |                  N/A |+-------------------------------+----------------------+----------------------+                                                                               +-----------------------------------------------------------------------------+| Processes:                                                                  ||  GPU   GI   CI        PID   Type   Process name                  GPU Memory ||        ID   ID                                                   Usage      ||=============================================================================||  No running processes found                                                 |+-----------------------------------------------------------------------------+

这里建议选择Tesla T4的显卡类型，性能更突出。

至此Colab就配置好了。

配置So-vits

下面我们配置so-vits环境，可以通过pip命令安装一些基础依赖：

!pip install pyworld==0.3.2!pip install numpy==1.23.5

注意jupyter语言是通过叹号来运行命令。

注意，由于不是本地环境，有的时候colab会提醒：

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/Collecting numpy==1.23.5  Downloading numpy-1.23.5-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (17.1 MB)     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 17.1/17.1 MB 80.1 MB/s eta 0:00:00Installing collected packages: numpy  Attempting uninstall: numpy    Found existing installation: numpy 1.22.4    Uninstalling numpy-1.22.4:      Successfully uninstalled numpy-1.22.4Successfully installed numpy-1.23.5WARNING: The following packages were previously imported in this runtime:  [numpy]You must restart the runtime in order to use newly installed versions.

此时numpy库需要重启runtime才可以导入操作。

重启runtime后，需要再重新安装一次，直到系统提示依赖已经存在：

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/Requirement already satisfied: numpy==1.23.5 in /usr/local/lib/python3.10/dist-packages (1.23.5)

随后，克隆so-vits项目，并且安装项目的依赖：

import osimport glob!git clone https://github.com/effusiveperiscope/so-vits-svc -b eff-4.0os.chdir("/content/so-vits-svc")# install requirements one-at-a-time to ignore exceptions!cat requirements.txt | xargs -n 1 pip install --extra-index-url https://download.pytorch.org/whl/cu117!pip install praat-parselmouth!pip install ipywidgets!pip install huggingface_hub!pip install pip==23.0.1 # fix pip version for fairseq install!pip install fairseq==0.12.2!jupyter nbextension enable --py widgetsnbextensionexisting_files = glob.glob("/content/**/*.*", recursive=True)!pip install --upgrade protobuf==3.9.2!pip uninstall -y tensorflow!pip install tensorflow==2.11.0

安装好依赖之后，定义一些前置工具方法：

os.chdir("/content/so-vits-svc") # force working-directory to so-vits-svc - this line is just for safety and is probably not requiredimport tarfileimport osfrom zipfile import ZipFile# taken from https://github.com/CookiePPP/cookietts/blob/master/CookieTTS/utils/dataset/extract_unknown.pydef extract(path):    if path.endswith(\".zip\"):        with ZipFile(path, "r") as zipObj:           zipObj.extractall(os.path.split(path)[0])    elif path.endswith(\".tar.bz2\"):        tar = tarfile.open(path, \"r:bz2\")        tar.extractall(os.path.split(path)[0])        tar.close()    elif path.endswith(\".tar.gz\"):        tar = tarfile.open(path, \"r:gz\")        tar.extractall(os.path.split(path)[0])        tar.close()    elif path.endswith(\".tar\"):        tar = tarfile.open(path, \"r:\")        tar.extractall(os.path.split(path)[0])        tar.close()    elif path.endswith(\".7z\"):        import py7zr        archive = py7zr.SevenZipFile(path, mode="r")        archive.extractall(path=os.path.split(path)[0])        archive.close()    else:        raise NotImplementedError(f\"{path} extension not implemented.\")# taken from https://github.com/CookiePPP/cookietts/tree/master/CookieTTS/_0_download/scripts# megatools download urlswin64_url = \"https://megatools.megous.com/builds/builds/megatools-1.11.1.20230212-win64.zip\"win32_url = \"https://megatools.megous.com/builds/builds/megatools-1.11.1.20230212-win32.zip\"linux_url = \"https://megatools.megous.com/builds/builds/megatools-1.11.1.20230212-linux-x86_64.tar.gz\"# download megatoolsfrom sys import platformimport osimport urllib.requestimport subprocessfrom time import sleepif platform == \"linux\" or platform == \"linux2\":        dl_url = linux_urlelif platform == \"darwin\":    raise NotImplementedError("MacOS not supported.")elif platform == \"win32\":        dl_url = win64_urlelse:    raise NotImplementedError ("Unknown Operating System.")dlname = dl_url.split(\"/\")[-1]if dlname.endswith(\".zip\"):    binary_folder = dlname[:-4] # remove .zipelif dlname.endswith(\".tar.gz\"):    binary_folder = dlname[:-7] # remove .tar.gzelse:    raise NameError("downloaded megatools has unknown archive file extension!")if not os.path.exists(binary_folder):    print("\"megatools\" not found. Downloading...")    if not os.path.exists(dlname):        urllib.request.urlretrieve(dl_url, dlname)    assert os.path.exists(dlname), "failed to download."    extract(dlname)    sleep(0.10)    os.unlink(dlname)    print(\"Done!\")binary_folder = os.path.abspath(binary_folder)def megadown(download_link, filename=".", verbose=False):    \"\"\"Use megatools binary executable to download files and folders from MEGA.nz .\"\"\"    filename = " --path \""+os.path.abspath(filename)+"\"" if filename else \"\"    wd_old = os.getcwd()    os.chdir(binary_folder)    try:        if platform == \"linux\" or platform == \"linux2\":            subprocess.call(f"./megatools dl{filename}{\" --debug http\" if verbose else \"\"} {download_link}", shell=True)        elif platform == \"win32\":            subprocess.call(f"megatools.exe dl{filename}{\" --debug http\" if verbose else \"\"} {download_link}", shell=True)    except:        os.chdir(wd_old) # don"t let user stop download without going back to correct directory first        raise    os.chdir(wd_old)    return filenameimport urllib.requestfrom tqdm import tqdmimport gdownfrom os.path import existsdef request_url_with_progress_bar(url, filename):    class DownloadProgressBar(tqdm):        def update_to(self, b=1, bsize=1, tsize=None):            if tsize is not None:                self.total = tsize            self.update(b * bsize - self.n)        def download_url(url, filename):        with DownloadProgressBar(unit="B", unit_scale=True,                                 miniters=1, desc=url.split("/")[-1]) as t:            filename, headers = urllib.request.urlretrieve(url, filename=filename, reporthook=t.update_to)            print(\"Downloaded to \"+filename)    download_url(url, filename)def download(urls, dataset="", filenames=None, force_dl=False, username="", password="", auth_needed=False):    assert filenames is None or len(urls) == len(filenames), f\"number of urls does not match filenames. Expected {len(filenames)} urls, containing the files listed below.
{filenames}\"    assert not auth_needed or (len(username) and len(password)), f\"username and password needed for {dataset} Dataset\"    if filenames is None:        filenames = [None,]*len(urls)    for i, (url, filename) in enumerate(zip(urls, filenames)):        print(f\"Downloading File from {url}\")        #if filename is None:        #    filename = url.split(\"/\")[-1]        if filename and (not force_dl) and exists(filename):            print(f\"{filename} Already Exists, Skipping.\")            continue        if "drive.google.com" in url:            assert "https://drive.google.com/uc?id=" in url, "Google Drive links should follow the format \"https://drive.google.com/uc?id=1eQAnaoDBGQZldPVk-nzgYzRbcPSmnpv6\".
Where id=XXXXXXXXXXXXXXXXX is the Google Drive Share ID."            gdown.download(url, filename, quiet=False)        elif "mega.nz" in url:            megadown(url, filename)        else:            #urllib.request.urlretrieve(url, filename=filename) # no progress bar            request_url_with_progress_bar(url, filename) # with progress barimport huggingface_hubimport osimport shutilclass HFModels:    def __init__(self, repo = \"therealvul/so-vits-svc-4.0\",             model_dir = \"hf_vul_models\"):        self.model_repo = huggingface_hub.Repository(local_dir=model_dir,            clone_from=repo, skip_lfs_files=True)        self.repo = repo        self.model_dir = model_dir        self.model_folders = os.listdir(model_dir)        self.model_folders.remove(".git")        self.model_folders.remove(".gitattributes")    def list_models(self):        return self.model_folders    # Downloads model;    # copies config to target_dir and moves model to target_dir    def download_model(self, model_name, target_dir):        if not model_name in self.model_folders:            raise Exception(model_name + \" not found\")        model_dir = self.model_dir        charpath = os.path.join(model_dir,model_name)        gen_pt = next(x for x in os.listdir(charpath) if x.startswith(\"G_\"))        cfg = next(x for x in os.listdir(charpath) if x.endswith(\"json\"))        try:          clust = next(x for x in os.listdir(charpath) if x.endswith(\"pt\"))        except StopIteration as e:          print(\"Note - no cluster model for \"+model_name)          clust = None        if not os.path.exists(target_dir):            os.makedirs(target_dir, exist_ok=True)        gen_dir = huggingface_hub.hf_hub_download(repo_id = self.repo,            filename = model_name + \"/\" + gen_pt) # this is a symlink                if clust is not None:          clust_dir = huggingface_hub.hf_hub_download(repo_id = self.repo,              filename = model_name + \"/\" + clust) # this is a symlink          shutil.move(os.path.realpath(clust_dir), os.path.join(target_dir, clust))          clust_out = os.path.join(target_dir, clust)        else:          clust_out = None        shutil.copy(os.path.join(charpath,cfg),os.path.join(target_dir, cfg))        shutil.move(os.path.realpath(gen_dir), os.path.join(target_dir, gen_pt))        return {\"config_path\": os.path.join(target_dir,cfg),            \"generator_path\": os.path.join(target_dir,gen_pt),            \"cluster_path\": clust_out}# Example usage# vul_models = HFModels()# print(vul_models.list_models())# print(\"Applejack (singing)\" in vul_models.list_models())# vul_models.download_model(\"Applejack (singing)\",\"models/Applejack (singing)\")    print(\"Finished!\")

这些方法可以帮助我们下载、解压和加载模型。

音色模型下载和线上推理

接着将特朗普的音色模型和配置文件进行下载，下载地址是：

https://huggingface.co/Nardicality/so-vits-svc-4.0-models/tree/main/Trump18.5k

随后模型文件放到项目的models文件夹，配置文件则放入config文件夹。

接着将需要转换的歌曲上传到和项目平行的目录中。

运行代码：

import osimport globimport jsonimport copyimport loggingimport iofrom ipywidgets import widgetsfrom pathlib import Pathfrom IPython.display import Audio, displayos.chdir("/content/so-vits-svc")import torchfrom inference import infer_toolfrom inference import slicerfrom inference.infer_tool import Svcimport soundfileimport numpy as npMODELS_DIR = \"models\"def get_speakers():  speakers = []  for _,dirs,_ in os.walk(MODELS_DIR):    for folder in dirs:      cur_speaker = {}      # Look for G_****.pth      g = glob.glob(os.path.join(MODELS_DIR,folder,"G_*.pth"))      if not len(g):        print(\"Skipping \"+folder+\", no G_*.pth\")        continue      cur_speaker[\"model_path\"] = g[0]      cur_speaker[\"model_folder\"] = folder      # Look for *.pt (clustering model)      clst = glob.glob(os.path.join(MODELS_DIR,folder,"*.pt"))      if not len(clst):        print(\"Note: No clustering model found for \"+folder)        cur_speaker[\"cluster_path\"] = \"\"      else:        cur_speaker[\"cluster_path\"] = clst[0]      # Look for config.json      cfg = glob.glob(os.path.join(MODELS_DIR,folder,"*.json"))      if not len(cfg):        print(\"Skipping \"+folder+\", no config json\")        continue      cur_speaker[\"cfg_path\"] = cfg[0]      with open(cur_speaker[\"cfg_path\"]) as f:        try:          cfg_json = json.loads(f.read())        except Exception as e:          print(\"Malformed config json in \"+folder)        for name, i in cfg_json[\"spk\"].items():          cur_speaker[\"name\"] = name          cur_speaker[\"id\"] = i          if not name.startswith("."):            speakers.append(copy.copy(cur_speaker))    return sorted(speakers, key=lambda x:x[\"name\"].lower())logging.getLogger("numba").setLevel(logging.WARNING)chunks_dict = infer_tool.read_temp(\"inference/chunks_temp.json\")existing_files = []slice_db = -40wav_format = "wav"class InferenceGui():  def __init__(self):    self.speakers = get_speakers()    self.speaker_list = [x[\"name\"] for x in self.speakers]    self.speaker_box = widgets.Dropdown(        options = self.speaker_list    )    display(self.speaker_box)    def convert_cb(btn):      self.convert()    def clean_cb(btn):      self.clean()    self.convert_btn = widgets.Button(description=\"Convert\")    self.convert_btn.on_click(convert_cb)    self.clean_btn = widgets.Button(description=\"Delete all audio files\")    self.clean_btn.on_click(clean_cb)    self.trans_tx = widgets.IntText(value=0, description="Transpose")    self.cluster_ratio_tx = widgets.FloatText(value=0.0,       description="Clustering Ratio")    self.noise_scale_tx = widgets.FloatText(value=0.4,       description="Noise Scale")    self.auto_pitch_ck = widgets.Checkbox(value=False, description=      "Auto pitch f0 (do not use for singing)")    display(self.trans_tx)    display(self.cluster_ratio_tx)    display(self.noise_scale_tx)    display(self.auto_pitch_ck)    display(self.convert_btn)    display(self.clean_btn)  def convert(self):    trans = int(self.trans_tx.value)    speaker = next(x for x in self.speakers if x[\"name\"] ==           self.speaker_box.value)    spkpth2 = os.path.join(os.getcwd(),speaker[\"model_path\"])    print(spkpth2)    print(os.path.exists(spkpth2))    svc_model = Svc(speaker[\"model_path\"], speaker[\"cfg_path\"],       cluster_model_path=speaker[\"cluster_path\"])        input_filepaths = [f for f in glob.glob("/content/**/*.*", recursive=True)     if f not in existing_files and      any(f.endswith(ex) for ex in [".wav",".flac",".mp3",".ogg",".opus"])]    for name in input_filepaths:      print(\"Converting \"+os.path.split(name)[-1])      infer_tool.format_wav(name)      wav_path = str(Path(name).with_suffix(".wav"))      wav_name = Path(name).stem      chunks = slicer.cut(wav_path, db_thresh=slice_db)      audio_data, audio_sr = slicer.chunks2audio(wav_path, chunks)      audio = []      for (slice_tag, data) in audio_data:          print(f"#=====segment start, "              f"{round(len(data)/audio_sr, 3)}s======")                    length = int(np.ceil(len(data) / audio_sr *              svc_model.target_sample))                    if slice_tag:              print("jump empty segment")              _audio = np.zeros(length)          else:              # Padding \"fix\" for noise              pad_len = int(audio_sr * 0.5)              data = np.concatenate([np.zeros([pad_len]),                  data, np.zeros([pad_len])])              raw_path = io.BytesIO()              soundfile.write(raw_path, data, audio_sr, format=\"wav\")              raw_path.seek(0)              _cluster_ratio = 0.0              if speaker[\"cluster_path\"] != \"\":                _cluster_ratio = float(self.cluster_ratio_tx.value)              out_audio, out_sr = svc_model.infer(                  speaker[\"name\"], trans, raw_path,                  cluster_infer_ratio = _cluster_ratio,                  auto_predict_f0 = bool(self.auto_pitch_ck.value),                  noice_scale = float(self.noise_scale_tx.value))              _audio = out_audio.cpu().numpy()              pad_len = int(svc_model.target_sample * 0.5)              _audio = _audio[pad_len:-pad_len]          audio.extend(list(infer_tool.pad_array(_audio, length)))                res_path = os.path.join("/content/",          f"{wav_name}_{trans}_key_"          f"{speaker[\"name\"]}.{wav_format}")      soundfile.write(res_path, audio, svc_model.target_sample,          format=wav_format)      display(Audio(res_path, autoplay=True)) # display audio file    pass  def clean(self):     input_filepaths = [f for f in glob.glob("/content/**/*.*", recursive=True)     if f not in existing_files and      any(f.endswith(ex) for ex in [".wav",".flac",".mp3",".ogg",".opus"])]     for f in input_filepaths:       os.remove(f)inference_gui = InferenceGui()

此时系统会自动在根目录，也就是content下寻找音乐文件，包含但不限于wav、flac、mp3等等，随后根据下载的模型进行推理，推理之前会自动对文件进行背景音分离以及降噪和切片等操作。

推理结束之后，会自动播放转换后的歌曲。

结语

如果是刚开始使用Colab，默认分配的显存是15G左右，完全可以胜任大多数训练和推理任务，但是如果经常用它挂机运算，能分配到的显卡配置就会渐进式地降低，如果需要长时间并且相对稳定的GPU资源，还是需要付费订阅Colab pro服务，另外Google云盘的免费使用空间也是15G，如果模型下多了，导致云盘空间不足，运行代码也会报错，所以最好定期清理Google云盘，以此保证深度学习任务的正常运行。

标签：