Microphone recording

Recently I have been studying speech recognition using Baidu's SDK. The SDK only covers the recognition part, but I also need to save the audio to a file, and to generate that audio file automatically whenever sound comes in.

First, the code:

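import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.IOException;

import javax.sound.sampled.AudioFileFormat;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.DataLine;
import javax.sound.sampled.TargetDataLine;

public class EngineeCore {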

    String filePath = "E:\\voice\\voice_cache.wav";

    AudioFormat audioFormat;
    TargetDataLine targetDataLine;
    // volatile: written by stopRecognize() and read by the capture thread
    volatile boolean flag = true;


    private void stopRecognize() {
        flag = false;
        targetDataLine.stop();
        targetDataLine.close();
    }

    private AudioFormat getAudioFormat() {
        float sampleRate = 16000;
        // 8000,11025,16000,22050,44100
        int sampleSizeInBits = 16;
        // 8,16
        int channels = 1;
        // 1,2
        boolean signed = true;
        // true,false
        boolean bigEndian = false;
        // true,false
        return new AudioFormat(sampleRate, sampleSizeInBits, channels, signed, bigEndian);
    }// end getAudioFormat


    private void startRecognize() {
        try {
            // Get the specified audio format
            audioFormat = getAudioFormat();
            DataLine.Info dataLineInfo = new DataLine.Info(TargetDataLine.class, audioFormat);
            targetDataLine = (TargetDataLine) AudioSystem.getLine(dataLineInfo);

            // Create a thread to capture the microphone
            // data into an audio file and start the
            // thread running. It will run until the
            // sound stops or stopRecognize() is called. This
            // method will return after starting the thread.
            flag = true;
            new CaptureThread().start();
        } catch (Exception e) {
            e.printStackTrace();
        } // end catch
    }// end startRecognize method

    class CaptureThread extends Thread {
        public void run() {
            AudioFileFormat.Type fileType = null;
            File audioFile = new File(filePath);

            fileType = AudioFileFormat.Type.WAVE;
            //Threshold ("weight") above which a byte counts as sound input
            int weight = 2;
            //Number of consecutive quiet fragments, used to decide when to stop
            int downSum = 0;

            ByteArrayInputStream bais = null;
            ByteArrayOutputStream baos = new ByteArrayOutputStream();
            AudioInputStream ais = null;
            try {
                targetDataLine.open(audioFormat);
                targetDataLine.start();
                byte[] fragment = new byte[1024];

                ais = new AudioInputStream(targetDataLine);
                while (flag) {

                    targetDataLine.read(fragment, 0, fragment.length);
                    //Start storing bytes once the last byte of the fragment exceeds weight (sound is coming in); once storing has started, this check is no longer needed
                    if (Math.abs(fragment[fragment.length-1]) > weight || baos.size() > 0) {
                        baos.write(fragment);
                        System.out.println("First:"+fragment[0]+",Last:"+fragment[fragment.length-1]+",length:"+fragment.length);
                        //Check whether the voice has stopped
                        if(Math.abs(fragment[fragment.length-1])<=weight){
                            downSum++;
                        }else{
                            System.out.println("Reset count");
                            downSum=0;
                        }
                        //If the count exceeds 20, there has been no sound for a while (the value can be adjusted)
                        if(downSum>20){
                            System.out.println("Stop recording");
                            break;
                        }

                    }
                }

                //Get recording input stream
                audioFormat = getAudioFormat();
                byte audioData[] = baos.toByteArray();
                bais = new ByteArrayInputStream(audioData);
                ais = new AudioInputStream(bais, audioFormat, audioData.length / audioFormat.getFrameSize());
                //Write the recording to the final audio file
                System.out.println("Start generating voice file");
                AudioSystem.write(ais, AudioFileFormat.Type.WAVE, audioFile);
                downSum = 0;
                stopRecognize();

            } catch (Exception e) {
                e.printStackTrace();
            } finally {
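                //Close the streams (either may still be null if an exception occurred earlier)
                try {
                    if (ais != null) {
                        ais.close();
                    }
                    if (bais != null) {
                        bais.close();
                    }
                    baos.reset();
                } catch (IOException e) {
                    e.printStackTrace();
                }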
            }

        }// end run
    }// end inner class CaptureThread

Next, the test (the main method goes inside the same class):

    public static void main(String[] args) {
        EngineeCore engineeCore = new EngineeCore();
        engineeCore.startRecognize();
    }
}// end class EngineeCore

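When a louder sound reaches the microphone, the absolute value of the first or last byte of the byte array read from targetDataLine becomes larger. Which position to look at depends on parameters of the audio format such as bigEndian: with 16-bit samples each sample occupies two bytes and the high-order byte carries most of the magnitude, so with bigEndian = false it is the last byte of the array, and with bigEndian = true it is the first. When the incoming volume is low, the absolute value drops.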
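The relationship between volume and byte values can be made more explicit by decoding whole samples instead of inspecting a single byte. The following is only a sketch under the assumption of 16-bit signed PCM as produced by getAudioFormat(); the class PcmLevel and its peak method are hypothetical helpers, not part of the program above.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper, not part of the original EngineeCore class.
final class PcmLevel {

    /**
     * Returns the largest absolute sample value found in the first
     * `length` bytes of a buffer of 16-bit signed PCM data.
     */
    static int peak(byte[] pcm, int length, boolean bigEndian) {
        // Ignore a trailing odd byte: every sample needs two bytes.
        ByteBuffer buf = ByteBuffer.wrap(pcm, 0, length - (length % 2))
                .order(bigEndian ? ByteOrder.BIG_ENDIAN : ByteOrder.LITTLE_ENDIAN);
        int peak = 0;
        while (buf.remaining() >= 2) {
            int sample = buf.getShort(); // range -32768..32767
            peak = Math.max(peak, Math.abs(sample));
        }
        return peak;
    }

    public static void main(String[] args) {
        // Two little-endian samples: 1000 (0x03E8) and -2000 (0xF830)
        byte[] fragment = {(byte) 0xE8, 0x03, 0x30, (byte) 0xF8};
        System.out.println(peak(fragment, fragment.length, false)); // prints 2000
    }
}
```

A peak computed this way covers every sample in the fragment, so the threshold would be compared against values in the range 0 to 32768 rather than against a single byte in the range 0 to 128.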

Once the value exceeds weight, recording begins. Audio data read from targetDataLine is saved in a ByteArrayOutputStream. When the volume has stayed at or below the weight value for a period of time, it is assumed that no more sound is coming in, and the recording ends. The byte array is then taken from the ByteArrayOutputStream, converted to audio, and saved in a local file.
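The start and stop decision described above can be separated from the capture loop, which makes it easier to try out different thresholds. This is a minimal sketch of the same counting rule; SilenceDetector, shouldStop and isRecording are hypothetical names introduced here for illustration, and the level passed in can be whatever measure the capture loop uses (the program above uses the absolute value of the last byte).

```java
// Hypothetical helper that mirrors the weight / downSum logic of CaptureThread.
final class SilenceDetector {
    private final int threshold; // level above which a fragment counts as sound
    private final int maxQuiet;  // quiet fragments tolerated before stopping
    private boolean started = false;
    private int quietCount = 0;

    SilenceDetector(int threshold, int maxQuiet) {
        this.threshold = threshold;
        this.maxQuiet = maxQuiet;
    }

    /** Feeds the level of one fragment; returns true when recording should stop. */
    boolean shouldStop(int level) {
        boolean loud = level > threshold;
        if (!started) {
            // Nothing is recorded until the first loud fragment arrives.
            started = loud;
            return false;
        }
        // A loud fragment resets the count, a quiet one increments it.
        quietCount = loud ? 0 : quietCount + 1;
        return quietCount > maxQuiet;
    }

    /** True once the first loud fragment has been seen. */
    boolean isRecording() {
        return started;
    }

    public static void main(String[] args) {
        SilenceDetector detector = new SilenceDetector(2, 20);
        int[] levels = new int[40];
        levels[3] = 50; // a single loud fragment followed by silence
        for (int i = 0; i < levels.length; i++) {
            if (detector.shouldStop(levels[i])) {
                System.out.println("Stop at fragment " + i);
                break;
            }
        }
    }
}
```

In the capture loop, a fragment would be written to the ByteArrayOutputStream whenever isRecording() returns true, and the loop would exit when shouldStop returns true.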

Note: the byte array read from targetDataLine cannot be used directly for speech recognition services such as Baidu's. It has to be converted into an audio file first, and the byte array read back from that audio file is what can be used for speech recognition.
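The conversion in the note can be sketched as a single step: write the raw bytes to a WAV file, then read the file back as a byte array. The class WavRoundTrip and the method saveAndReload are hypothetical names used only for this example. The sketch does not call Baidu's SDK, so whether the resulting bytes are what a particular SDK version expects remains an assumption to check against its documentation.

```java
import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;

import javax.sound.sampled.AudioFileFormat;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;

// Hypothetical helper, not part of the original EngineeCore class.
final class WavRoundTrip {

    /** Writes raw PCM bytes to wavFile as WAVE and returns the file's bytes. */
    static byte[] saveAndReload(byte[] pcm, AudioFormat format, File wavFile) {
        // The stream length is given in frames, not bytes.
        long frames = pcm.length / format.getFrameSize();
        try (AudioInputStream ais =
                     new AudioInputStream(new ByteArrayInputStream(pcm), format, frames)) {
            AudioSystem.write(ais, AudioFileFormat.Type.WAVE, wavFile);
            // The file now holds a WAV header followed by the PCM data.
            return Files.readAllBytes(wavFile.toPath());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Same format as getAudioFormat(): 16 kHz, 16-bit, mono, signed, little-endian
        AudioFormat format = new AudioFormat(16000f, 16, 1, true, false);
        byte[] pcm = new byte[32000]; // one second of silence
        File wavFile = new File(System.getProperty("java.io.tmpdir"), "voice_cache_demo.wav");
        byte[] wav = saveAndReload(pcm, format, wavFile);
        System.out.println("PCM bytes: " + pcm.length + ", WAV bytes: " + wav.length);
    }
}
```

The returned array differs from the raw capture only by the header that AudioSystem.write places in front of the samples, which is the part the raw byte array from targetDataLine lacks.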

Keywords: Java Fragment SDK

Added by nashsaint on Thu, 02 Apr 2020 05:09:23 +0300
