Huawei speech synthesis service gives users a real-time, switchable, multi-timbre voice playback experience

How can you get information promptly and keep reading when you can't operate your phone or keep your eyes on it? Listening is a good option. The speech synthesis service of Huawei's machine learning service uses deep neural network technology to produce highly human-like, smooth, natural speech. Developers can integrate this capability into novel-reading, smart hardware, and map navigation apps to give users a real-time, switchable, multi-timbre voice playback experience.

Speech synthesis for timely content delivery

The speech synthesis service converts text into speech output online and has been deployed worldwide. Its advantages are:

  • Multiple languages and timbres: supports Chinese, English, and mixed Chinese-English text. Two standard male voices and six standard female voices are available.
  • Adjustable speech speed and volume: supports multiple parameter settings, so the speaker's speed and volume can be tuned to the scenario.
  • Flexible integration: both an offline SDK and an online SDK can be integrated quickly, covering speech synthesis needs in different scenarios.

Speech synthesis can be applied to time-sensitive scenarios such as read-aloud, news broadcasts, virtual announcers, map navigation, and notifications. For example, users who are cycling or driving with map navigation cannot keep looking at their phones; spoken directions give clear guidance and help them reach the destination accurately. In driver-side ride-hailing apps, restaurant number-calling systems, and queuing software, orders are announced through speech synthesis so that users can easily receive notifications. Popular e-reading apps offer read-aloud functions, so users can "listen to books". Playback continues even when the screen is locked, which removes the constraints of reading on the subway, on a bus, or while running. Older people and children who have difficulty reading can also use "listening to books" to get around hard-to-read text and to find companionship.

In smart hardware, speech synthesis can be integrated into children's story machines, smart robots, tablets, and other smart devices to make human-computer interaction more natural and friendly. For content creators in short-video apps, voice-overs can be synthesized from text entered in the app, which speeds up short-video production.

Customize the timbre to meet the personalized needs of users

Huawei's speech synthesis service will soon launch a custom timbre feature. Users can record their own voice and have it synthesized in the app, making everyday scenarios such as listening to novels and navigation more fun and friendly. Parents can also tell stories to their children in their own voice, easing the fatigue of parenting while deepening parent-child interaction and companionship.

Development practice

Development preparation
For how to configure the Maven repository and the SDK, refer to the application development introduction on the developer website.

  1. Integrate the SDK packages

    In the app-level build.gradle file, add the TTS SDK dependencies inside the dependencies block (the SDK version number goes after the final colon):
    // Import the base SDK
    implementation 'com.huawei.hms:ml-computer-voice-tts:'
    // Import the offline speech synthesis bee voice package
    implementation 'com.huawei.hms:ml-computer-voice-tts-model-bee:'
    // Import the offline speech synthesis eagle voice package
    implementation 'com.huawei.hms:ml-computer-voice-tts-model-eagle:'
  2. Configure AndroidManifest.xml

    Open the AndroidManifest.xml file in the main folder and, depending on your scenario and needs, add the network and storage read/write permissions before the <application> element:
    <uses-permission android:name="android.permission.INTERNET" />
    <uses-permission android:name="android.permission.WRITE_EXTERNAL_STORAGE" />
  3. Code development (online TTS)

3.1 Create an application-defined activity in which the user selects online or offline TTS, and set the app's authentication information through an api_key or access token

public class MainActivity extends AppCompatActivity {
    @Override
    protected void onCreate(@Nullable Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        MLApplication.getInstance().setAccessToken("your access token");
    }
}

3.2 Create the TTS configuration and the TTS engine, and set parameters as required

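MLTtsEngine mlTtsEngine;
MLTtsConfig mlConfigs;

// The constants and values below are examples; choose the ones that fit your app.
mlConfigs = new MLTtsConfig()
        // Set the language for synthesis.
        .setLanguage(MLTtsConstants.TTS_EN_US)
        // Set the timbre.
        .setPerson(MLTtsConstants.TTS_SPEAKER_FEMALE_EN)
        // Set the speech speed. Range: 0.2–4.0; 1.0 indicates 1x speed.
        .setSpeed(1.0f)
        // Set the volume. Range: 0.2–4.0; 1.0 indicates 1x volume.
        .setVolume(1.0f)
        // Set the synthesis mode (online here).
        .setSynthesizeMode(MLTtsConstants.TTS_ONLINE_MODE);

mlTtsEngine = new MLTtsEngine(mlConfigs);
// Set the volume of the built-in player.
mlTtsEngine.setPlayerVolume(20);
// Pass the TTS callback (see 3.3) to the TTS engine.
mlTtsEngine.setTtsCallback(callback);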
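
The comments above give the same range, 0.2 to 4.0, for both speed and volume. As a minimal sketch, assuming that range, values coming from user settings could be clamped before they are passed to the configuration. TtsParams and clamp are hypothetical names used for illustration; they are not part of the HMS SDK, and how the SDK itself treats out-of-range values is not covered here.

```java
// Hypothetical helper for illustration; not part of the HMS SDK.
final class TtsParams {
    // Range taken from the comments in the configuration sample (assumption).
    static final float MIN = 0.2f;
    static final float MAX = 4.0f;

    private TtsParams() {}

    /** Clamps a speed or volume multiplier into [MIN, MAX]; NaN falls back to 1x. */
    static float clamp(float value) {
        if (Float.isNaN(value)) {
            return 1.0f;
        }
        return Math.max(MIN, Math.min(MAX, value));
    }

    public static void main(String[] args) {
        System.out.println(clamp(5.0f)); // above the range
        System.out.println(clamp(0.1f)); // below the range
        System.out.println(clamp(1.5f)); // already inside the range
    }
}
```

The clamped result would then be used for the speed or volume setting in place of the raw value.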

3.3 Configure the TTS callback to receive and process speech synthesis results

// sendMsg, sendMsg1 and writeTxtToFile are app-defined helpers that are not shown here.
MLTtsCallback callback = new MLTtsCallback() {
    // ID of the task currently being saved, and its output file.
    String task = "";
    String fileName = "audio_" + task;

    @Override
    public void onError(String taskId, MLTtsError err) {
        // Synthesis or playback failed for this task.
        Log.e(TAG, taskId + " " + err);
    }

    @Override
    public void onWarn(String taskId, MLTtsWarn warn) {
        // Non-fatal warning for this task.
        Log.w(TAG, taskId + " Tips:" + warn);
    }

    @Override
    public void onRangeStart(String taskId, int start, int end) {
        // Playback has reached the text range [start, end].
        sendMsg(taskId + " onRangeStart[" + start + "," + end + "]");
        sendMsg1(taskId, start, end);
    }

    @Override
    public void onAudioAvailable(String taskId, MLTtsAudioFragment audioFragment, int offset,
        Pair<Integer, Integer> range, Bundle bundle) {
        // Start a new PCM file whenever a new task begins.
        if (!task.equals(taskId)) {
            task = taskId;
            fileName = "/sdcard/audio_" + task + ".pcm";
        }
        writeTxtToFile(audioFragment.getAudioData(), fileName, true);
    }

    @Override
    public void onEvent(String taskId, int eventId, Bundle bundle) {
        StringBuilder stringBuilder = new StringBuilder();
        stringBuilder.append(taskId).append(" ");
        // Each case ends with break so that only the matching event is logged.
        switch (eventId) {
            case MLTtsConstants.EVENT_PLAY_START:
                stringBuilder.append("onPlayStart ");
                break;
            case MLTtsConstants.EVENT_PLAY_STOP:
                stringBuilder.append("onPlayStop ");
                break;
            case MLTtsConstants.EVENT_PLAY_RESUME:
                stringBuilder.append("onPlayResume ");
                break;
            case MLTtsConstants.EVENT_PLAY_PAUSE:
                stringBuilder.append("onPlayPause ");
                break;
            case MLTtsConstants.EVENT_SYNTHESIS_COMPLETE:
                stringBuilder.append("onSynthesisComplete ");
                break;
            case MLTtsConstants.EVENT_SYNTHESIS_START:
                stringBuilder.append("onSynthesisStart ");
                break;
            case MLTtsConstants.EVENT_SYNTHESIS_END:
                stringBuilder.append("onSynthesisEnd ");
                break;
            default:
                break;
        }
        Log.d(TAG, "onEvent: " + stringBuilder.toString());
    }
};
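
The callback above passes each audio fragment to writeTxtToFile, a helper that the article does not show. The following is a minimal sketch of what such a helper might look like, assuming the third argument means "append to the existing file". The class name AudioFileHelper and the boolean return value are assumptions made for illustration, and the sketch uses only the Java standard library.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical helper; the article's own writeTxtToFile is not shown.
final class AudioFileHelper {
    private AudioFileHelper() {}

    /**
     * Writes raw audio bytes to fileName.
     * append = true adds to the end of the file, false overwrites it.
     * Returns true on success, false if the write failed.
     */
    static boolean writeTxtToFile(byte[] data, String fileName, boolean append) {
        if (data == null || data.length == 0) {
            return true; // nothing to write
        }
        try (OutputStream out = new FileOutputStream(fileName, append)) {
            out.write(data);
            return true;
        } catch (IOException e) {
            System.err.println("write failed: " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) {
        String path = new File(System.getProperty("java.io.tmpdir"), "audio_demo.pcm").getPath();
        writeTxtToFile(new byte[] {1, 2, 3}, path, false); // start a fresh file
        writeTxtToFile(new byte[] {4, 5}, path, true);     // append the next fragment
        System.out.println(new File(path).length() + " bytes written");
    }
}
```

Reopening the file for every fragment keeps the sketch short. An app that receives many fragments per task might prefer to keep one stream open per task and close it when synthesis ends.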

3.4 Call speak to send a synthesis request, and control playback

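String id = mlTtsEngine.speak(text, MLTtsEngine.QUEUE_APPEND);
// Playback control: pause, resume, or stop playback as needed.
mlTtsEngine.pause();
mlTtsEngine.resume();
mlTtsEngine.stop();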

After use, release the engine:
if (mlTtsEngine != null) {
    mlTtsEngine.shutdown();
}

  4. Offline TTS

4.1 Offline TTS requires one additional step: downloading the speaker model package

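private MLLocalModelManager mLocalModelManager;

mLocalModelManager = MLLocalModelManager.getInstance();
// "speaker name" is a placeholder for the offline speaker to use.
MLTtsLocalModel mLocalModel = new MLTtsLocalModel.Factory("speaker name").create();
mLocalModelManager.isModelExist(mLocalModel).addOnSuccessListener(new OnSuccessListener<Boolean>() {
    @Override
    public void onSuccess(Boolean aBoolean) {
        if (aBoolean) {
            // The model is already on the device: synthesize directly.
            mlTtsEngine.speak(text, MLTtsEngine.QUEUE_APPEND);
        } else {
            // The model is missing: download it first (see downloadModel), then speak.
            downloadModel(true);
        }
    }
}).addOnFailureListener(new OnFailureListener() {
    @Override
    public void onFailure(Exception e) {
        Log.e(TAG, e.getMessage());
    }
});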

The model is downloaded as follows:

private void downloadModel(final boolean needSpeak) {
    // Download the model over Wi-Fi only.
    MLModelDownloadStrategy request = new MLModelDownloadStrategy.Factory().needWifi().create();

    MLModelDownloadListener modelDownloadListener = new MLModelDownloadListener() {
        @Override
        public void onProcess(long alreadyDownLength, long totalLength) {
            // showProcess is an app-defined helper (not shown) that displays progress.
            showProcess(alreadyDownLength, "Model download is complete", totalLength);
        }
    };
    mLocalModelManager.downloadModel(mLocalModel, request, modelDownloadListener)
        .addOnSuccessListener(new OnSuccessListener<Void>() {
            @Override
            public void onSuccess(Void aVoid) {
                Log.i(TAG, "downloadModel: " + mLocalModel.getModelName() + " success");
                showToast("downloadModel Success");
                if (needSpeak) {
                    // Speak once the model is available.
                    mlTtsEngine.speak(text, MLTtsEngine.QUEUE_APPEND);
                }
            }
        })
        .addOnFailureListener(new OnFailureListener() {
            @Override
            public void onFailure(Exception e) {
                Log.e(TAG, "downloadModel failed: " + e.getMessage());
            }
        });
}
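
The download listener above reports the bytes downloaded so far and the total size, and hands them to showProcess, which the article does not show. As a minimal sketch, the two values could be turned into a percentage for display. DownloadProgress, percent, and describe are hypothetical names for illustration, not part of the HMS SDK.

```java
// Hypothetical helper; the article's own showProcess is not shown.
final class DownloadProgress {
    private DownloadProgress() {}

    /** Returns progress as a whole percentage in [0, 100]; 0 if the total is unknown. */
    static int percent(long alreadyDownLength, long totalLength) {
        if (totalLength <= 0 || alreadyDownLength <= 0) {
            return 0;
        }
        if (alreadyDownLength >= totalLength) {
            return 100;
        }
        // Divide in floating point so that very large sizes cannot overflow a long.
        return (int) (100.0 * alreadyDownLength / totalLength);
    }

    /** Builds a short progress message suitable for a label or a log line. */
    static String describe(long alreadyDownLength, long totalLength) {
        return "Model download: " + percent(alreadyDownLength, totalLength) + "%";
    }

    public static void main(String[] args) {
        System.out.println(describe(50, 200));
    }
}
```

On Android the resulting text would normally be posted to the UI thread before it updates a view, since listener callbacks may arrive on a background thread.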

Otherwise, offline TTS is used in the same way as online TTS.

Learn more:

  • Visit the official website of the Huawei Developer Alliance
  • Get the development guide documents
  • Huawei mobile services open source repositories: GitHub, Gitee


Keywords: Java

Added by SoulAssassin on Fri, 07 Jan 2022 11:27:04 +0200