<h1>That was fast! Video speech recognized as text in just a few minutes, in under 10 lines of code</h1>
<p>Converting the audio inside an audio or video file into text was hard to pull off two years ago. Today, it takes just a few minutes.</p>
<p>Word has it that some companies, in order to harvest training data, have already scraped short-video platforms such as Douyin and Kuaishou, extracted the audio into text, and used it as training corpora for their large models.</p>
<p>If you ever need to turn video or audio files into text (for example, to search for the exact moment a line of dialogue appears), give the open-source solution introduced today a try.</p>
<p>Without further ado, let's get to it.</p>
<h2>Whisper</h2>
<p>The solution is Whisper, open-sourced by OpenAI, and of course it is written in Python.
Just install a few packages, write a few lines of code, and after a short wait (depending on your machine and the length of the media) the transcribed text comes out. It is that simple.</p>
<p>GitHub repository: https://github.com/openai/whisper</p>
<h2>Fast-Whisper</h2>
<p>Simple as that already is, it is still not lean enough for programmers, who are famously "lazy". Whisper is easy to install and call, but it still requires separately installing PyTorch, ffmpeg, and even Rust.</p>
<p>Hence the faster, leaner Fast-Whisper. Fast-Whisper is not a thin wrapper around Whisper; it reimplements OpenAI's Whisper model on top of CTranslate2, a fast inference engine for Transformer models.</p>
<p>In short, it is faster than Whisper: the official claim is 4 to 8 times faster. It supports not only GPUs but also CPUs; even my beat-up old Mac can run it.</p>
<p>GitHub repository: https://github.com/SYSTRAN/faster-whisper</p>
<p>Using it takes just two steps.</p>
<ol>
<li>Install the dependency:</li>
</ol>
<pre><code>pip install faster-whisper</code></pre>
<ol start="2">
<li>Write the code:</li>
</ol>
<pre><code>from faster_whisper import
WhisperModel

model_size = "large-v3"

# Run on GPU with FP16
model = WhisperModel(model_size, device="cuda", compute_type="float16")

# or run on GPU with INT8
# model = WhisperModel(model_size, device="cuda", compute_type="int8_float16")
# or run on CPU with INT8
# model = WhisperModel(model_size, device="cpu", compute_type="int8")

segments, info = model.transcribe("audio.mp3", beam_size=5)

print("Detected language '%s' with probability %f" % (info.language, info.language_probability))

for segment in segments:
    print("[%.2fs -&gt; %.2fs] %s" % (segment.start, segment.end, segment.text))</code></pre>
<p>Yes, it really is that simple.</p>
<h2>What can you do with it?</h2>
<p>A friend of mine happened to want to get into short videos, posting inspirational "chicken soup" clips, with the material drawn from interviews with well-known figures. But he did not want to sit through the full videos; he wanted the fastest possible way to pull the text out and read that instead, since reading is much faster than watching a whole video, and text is searchable too.</p>
<p>I told him: if you don't even have the devotion to watch a video all the way through, how do you expect to run a successful account?</p>
<p>Still, I built him a little tool, powered by Fast-Whisper.</p>
<h3>Client</h3>
<p>The client is written in Swift and supports macOS only.</p>
<ol>
<li>Pick a video;</li>
<li>Click "Extract text", which calls the Python API; this takes a while;</li>
<li>The parsed text is loaded, along with each segment's start and end times;</li>
<li>Select a start time and an end time;</li>
<li>Click the "Export" button, and the video clip is exported.</li>
</ol>
<p>(Demo video, duration 00:10)</p>
<h3>Server</h3>
<p>The server side is, of course, Python, wrapped with Flask to expose an HTTP API.</p>
<pre><code>from flask import Flask, request, jsonify
from faster_whisper import WhisperModel

app = Flask(__name__)

model_size = "large-v2"
model = WhisperModel(model_size, device="cpu", compute_type="int8")


@app.route('/transcribe', methods=['POST'])
def transcribe():
    # Get the file path from the request
    file_path = request.json.get('filePath')

    # Transcribe the file
    segments, info = model.transcribe(file_path, beam_size=5, initial_prompt="简体")
    segments_copy = []
    with open('segments.txt', 'w') as file:
        for segment in segments:
            line = "%.2fs|%.2fs|[%.2fs -&gt; %.2fs]|%s" % (
                segment.start, segment.end, segment.start, segment.end, segment.text)
            segments_copy.append(line)
            file.write(line + '\n')

    # Prepare the response
    response_data = {
        "language": info.language,
        "language_probability": info.language_probability,
        "segments": []
    }

    for segment in segments_copy:
        response_data["segments"].append(segment)

    return jsonify(response_data)


if __name__ == '__main__':
    app.run(debug=False)</code></pre>
<p>The above is just a small tool to spark ideas, but it is more than enough for my own use.</p>
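<p>The <code>/transcribe</code> endpoint above returns each segment as a pipe-delimited string ("start|end|[start -&gt; end]|text"). A client consuming the API has to split those lines back apart; here is a minimal sketch (the <code>parse_segment</code> helper and the sample line are my own illustration, not part of the tool):</p>

```python
# Sketch: parse the pipe-delimited segment lines returned by the /transcribe
# endpoint ("start|end|[start -> end]|text"). The helper name and sample line
# are illustrative, not from the original tool.

def parse_segment(line: str) -> dict:
    # maxsplit=3 keeps any '|' characters inside the transcript text intact
    start, end, label, text = line.split("|", 3)
    return {
        "start": float(start.rstrip("s")),  # "3.00s" -> 3.0
        "end": float(end.rstrip("s")),
        "label": label,                     # human-readable "[3.00s -> 7.25s]"
        "text": text,
    }

seg = parse_segment("3.00s|7.25s|[3.00s -> 7.25s]|Hello there")
print(seg["start"], seg["end"], seg["text"])  # 3.0 7.25 Hello there
```

<p>With that, the <code>segments</code> array in the JSON response maps straight to structured values for display or search.</p>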
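<p>The article does not show how the client's "Export" step cuts the clip. One common approach is ffmpeg; the sketch below only builds the command line (the file names, the helper name, and the assumption that ffmpeg is installed are all mine, not from the original tool):</p>

```python
# Hypothetical sketch of the "Export" step: cut [start, end] out of a video
# with ffmpeg using stream copy (fast, no re-encode). This only builds the
# argument list; run it with subprocess.run(cmd, check=True) when ffmpeg
# is available.

def export_clip_cmd(src: str, dst: str, start: float, end: float) -> list:
    return [
        "ffmpeg",
        "-i", src,              # input video
        "-ss", "%.2f" % start,  # clip start, in seconds
        "-to", "%.2f" % end,    # clip end, in seconds
        "-c", "copy",           # copy streams instead of re-encoding
        dst,
    ]

print(" ".join(export_clip_cmd("interview.mp4", "clip.mp4", 3.0, 7.25)))
# ffmpeg -i interview.mp4 -ss 3.00 -to 7.25 -c copy clip.mp4
```

<p>Note that with <code>-c copy</code> the cut snaps to keyframes, so the clip may start slightly before the requested time; re-encoding instead of stream-copying gives frame-accurate cuts at the cost of speed.</p>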