Microphone recording

Recently I have been studying speech recognition using Baidu's SDK. I found that the SDK only covers the recognition part, whereas I also need to save the audio to a file, and to have that file generated automatically whenever sound comes in through the microphone.

First, the code:

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.IOException;
import javax.sound.sampled.AudioFileFormat;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.DataLine;
import javax.sound.sampled.TargetDataLine;

public class EngineeCore {

    String filePath = "E:\\voice\\voice_cache.wav";

    AudioFormat audioFormat;
    TargetDataLine targetDataLine;
    // volatile so the capture thread sees a stop request made from another thread
    volatile boolean flag = true;

    private void stopRecognize() {
        flag = false;
        targetDataLine.stop();
        targetDataLine.close();
    }

    private AudioFormat getAudioFormat() {
        float sampleRate = 16000;   // 8000,11025,16000,22050,44100
        int sampleSizeInBits = 16;  // 8,16
        int channels = 1;           // 1,2
        boolean signed = true;      // true,false
        boolean bigEndian = false;  // true,false
        return new AudioFormat(sampleRate, sampleSizeInBits, channels, signed, bigEndian);
    }// end getAudioFormat

    private void startRecognize() {
        try {
            // Get the specified audio format
            audioFormat = getAudioFormat();
            DataLine.Info dataLineInfo = new DataLine.Info(TargetDataLine.class, audioFormat);
            targetDataLine = (TargetDataLine) AudioSystem.getLine(dataLineInfo);
            // Create a thread to capture the microphone data into an audio
            // file and start it. The thread runs until the input goes quiet
            // or stopRecognize() is called. This method returns right after
            // starting the thread.
            flag = true;
            new CaptureThread().start();
        } catch (Exception e) {
            e.printStackTrace();
        } // end catch
    }// end startRecognize

    class CaptureThread extends Thread {
        public void run() {
            File audioFile = new File(filePath);
            // Threshold above which a fragment counts as sound input
            int weight = 2;
            // Number of consecutive quiet fragments, used to decide when to stop
            int downSum = 0;
            ByteArrayOutputStream baos = new ByteArrayOutputStream();
            AudioInputStream ais = null;
            try {
                targetDataLine.open(audioFormat);
                targetDataLine.start();
                byte[] fragment = new byte[1024];
                while (flag) {
                    int read = targetDataLine.read(fragment, 0, fragment.length);
                    if (read <= 0) {
                        continue;
                    }
                    // With 16-bit little-endian samples, the last byte is the
                    // high-order byte of the last sample in the fragment
                    int last = Math.abs(fragment[read - 1]);
                    // Start storing bytes once the last byte exceeds weight (sound input).
                    // Once storing has started, keep storing regardless of the level.
                    if (last > weight || baos.size() > 0) {
                        baos.write(fragment, 0, read);
                        System.out.println("First:" + fragment[0] + ",Last:" + fragment[read - 1] + ",length:" + read);
                        // Judge whether the voice has stopped
                        if (last <= weight) {
                            downSum++;
                        } else {
                            System.out.println("Reset quiet count");
                            downSum = 0;
                        }
                        // More than 20 quiet fragments in a row means there was no
                        // sound in this period (the value can also be changed)
                        if (downSum > 20) {
                            System.out.println("Stop recording");
                            break;
                        }
                    }
                }
                // Wrap the recorded bytes in an audio input stream
                byte audioData[] = baos.toByteArray();
                ais = new AudioInputStream(new ByteArrayInputStream(audioData), audioFormat,
                        audioData.length / audioFormat.getFrameSize());
                // Write the final saved file
                System.out.println("Start generating voice file");
                AudioSystem.write(ais, AudioFileFormat.Type.WAVE, audioFile);
            } catch (Exception e) {
                e.printStackTrace();
            } finally {
                // Release the microphone line and close the stream
                stopRecognize();
                try {
                    if (ais != null) {
                        ais.close();
                    }
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
        }// end run
    }// end inner class CaptureThread

Next, the test:

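    public static void main(String args[]) {
        EngineeCore engineeCore = new EngineeCore();
        // startRecognize() is the class's only entry point; the capture thread
        // writes the file and releases the line once the input goes quiet
        engineeCore.startRecognize();
    }
}
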
When a louder sound reaches the microphone, the absolute value of the first or last byte of the array read from targetDataLine becomes larger. Which byte to look at depends on parameters of the audio format such as bigEndian: with 16-bit little-endian samples, the last byte of the array is the high-order byte of the last sample. When the incoming volume is low, the absolute value is small.
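To make the relationship between bytes and volume more concrete, here is a minimal sketch that decodes a buffer of 16-bit signed PCM into samples and returns the loudest one. The class name `PcmLevel` and the method `peak` are my own names for illustration, not part of the code in this post or of any SDK, and the sketch assumes 16-bit samples only.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class PcmLevel {
    /** Returns the largest absolute sample value in a buffer of 16-bit signed PCM. */
    static int peak(byte[] buffer, int length, boolean bigEndian) {
        // Ignore a trailing odd byte: each sample takes two bytes.
        ByteBuffer samples = ByteBuffer.wrap(buffer, 0, length - (length % 2))
                .order(bigEndian ? ByteOrder.BIG_ENDIAN : ByteOrder.LITTLE_ENDIAN);
        int peak = 0;
        while (samples.remaining() >= 2) {
            // Widen to int before abs so that -32768 does not overflow.
            peak = Math.max(peak, Math.abs((int) samples.getShort()));
        }
        return peak;
    }

    public static void main(String[] args) {
        // Two little-endian samples: 3 (0x0003) and -1000 (0xFC18).
        byte[] quietThenLoud = {0x03, 0x00, 0x18, (byte) 0xFC};
        System.out.println(peak(quietThenLoud, quietThenLoud.length, false)); // prints 1000
    }
}
```

In this example the last byte is 0xFC, which is -4 as a signed byte, so checking only the last byte also reports the fragment as loud. Scanning every sample is slower but does not depend on where in the fragment the sound happens to fall.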

Recording begins at that point: audio data read from targetDataLine is saved in a ByteArrayOutputStream. When the volume has stayed at or below the weight value for a period of time (more than 20 consecutive fragments in the code above), it is assumed that no more sound is coming in, and recording ends. The byte array is then taken from the ByteArrayOutputStream, converted to audio, and saved to a local file.
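The start and stop rule described above can be separated from the microphone and tried out on plain numbers. The following is only a sketch of that rule under my own naming (`SilenceGate`, `accept`, `finished` and `countStored` are hypothetical helpers, not part of the original class): it takes one level per fragment and decides whether the fragment should be stored and whether recording is over.

```java
class SilenceGate {
    private final int threshold;  // level above which a fragment counts as sound
    private final int maxQuiet;   // quiet fragments tolerated before stopping
    private boolean recording = false;
    private int quietCount = 0;

    SilenceGate(int threshold, int maxQuiet) {
        this.threshold = threshold;
        this.maxQuiet = maxQuiet;
    }

    /** Feeds one fragment's level; returns true if the fragment should be stored. */
    boolean accept(int level) {
        boolean loud = level > threshold;
        if (!recording) {
            if (!loud) {
                return false;      // still waiting for the first sound
            }
            recording = true;      // first loud fragment starts the recording
        }
        quietCount = loud ? 0 : quietCount + 1;
        return true;
    }

    /** True once more than maxQuiet quiet fragments have followed each other. */
    boolean finished() {
        return recording && quietCount > maxQuiet;
    }

    /** Runs the gate over a sequence of levels and counts the stored fragments. */
    static int countStored(int[] levels, int threshold, int maxQuiet) {
        SilenceGate gate = new SilenceGate(threshold, maxQuiet);
        int stored = 0;
        for (int level : levels) {
            if (gate.accept(level)) {
                stored++;
            }
            if (gate.finished()) {
                break;
            }
        }
        return stored;
    }

    public static void main(String[] args) {
        int[] levels = {0, 1, 50, 40, 0, 0, 0, 60};
        System.out.println(countStored(levels, 2, 2)); // prints 5
    }
}
```

With a threshold of 2 and a limit of 2 quiet fragments, the two leading quiet fragments are skipped, the two loud ones and the three quiet ones after them are stored, and the final loud fragment is never reached because recording has already ended.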

Note: the raw byte array read from targetDataLine cannot be passed directly to a speech recognition service such as Baidu's. It first has to be converted into an audio file; the byte array read back from that audio file can then be used for speech recognition.
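As a rough illustration of that round trip, the sketch below writes raw PCM bytes to a temporary WAV file with `AudioSystem.write` and reads the whole file back with `Files.readAllBytes`. The class `WavBytes` and the method `pcmToWavBytes` are names I made up for this example, and the temporary file stands in for the fixed path used earlier. How the resulting bytes are handed to the recognition SDK is not shown, since that depends on the SDK.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.sound.sampled.AudioFileFormat;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;

class WavBytes {
    /** Wraps raw PCM in a WAV file on disk and returns the bytes of that file. */
    static byte[] pcmToWavBytes(byte[] pcm, AudioFormat format) {
        Path wav = null;
        try {
            wav = Files.createTempFile("voice_cache", ".wav");
            long frames = pcm.length / format.getFrameSize();
            try (AudioInputStream ais = new AudioInputStream(
                    new ByteArrayInputStream(pcm), format, frames)) {
                AudioSystem.write(ais, AudioFileFormat.Type.WAVE, wav.toFile());
            }
            // The file now holds a WAV header followed by the PCM data.
            return Files.readAllBytes(wav);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            if (wav != null) {
                wav.toFile().delete();
            }
        }
    }

    public static void main(String[] args) {
        // Same format as getAudioFormat(): 16 kHz, 16-bit, mono, signed, little-endian.
        AudioFormat format = new AudioFormat(16000f, 16, 1, true, false);
        byte[] pcm = new byte[3200]; // 0.1 s of silence
        byte[] wav = pcmToWavBytes(pcm, format);
        System.out.println(new String(wav, 0, 4, StandardCharsets.US_ASCII)); // prints RIFF
    }
}
```

The returned array is longer than the input because the file starts with a WAV header, which is exactly the part the raw microphone bytes are missing.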

Keywords: Java Fragment SDK

Added by nashsaint on Thu, 02 Apr 2020 05:09:23 +0300