1. Overview

In this article, we'll see how to capture audio from a microphone and record the incoming sound in Java, saving it to a WAV file. To capture the incoming sound from a microphone, we use the Java Sound API, part of the Java ecosystem.

The Java Sound API is a powerful API for capturing, processing, and playing back audio, and it consists of four packages. We'll focus on the javax.sound.sampled package, which provides all the interfaces and classes needed to capture incoming audio.

2. What Is the TargetDataLine?

The TargetDataLine is a type of DataLine object that we use to capture and read audio-related data; it captures data from audio capture devices like microphones. The interface provides all the methods necessary for reading and capturing data, and it reads the data from the target data line's buffer.

We can invoke AudioSystem's getLine() method and provide it a DataLine.Info object describing the line we want; getLine() then returns a matching line, which exposes all the transport-control methods for audio. The Oracle documentation explains in detail how the Java Sound API works.

Let’s go through the steps we need to capture audio from a microphone in Java.

3. Steps to Capture Sound

To save captured audio, Java supports the AU, AIFF, AIFC, SND, and WAVE file formats. We'll be using the WAVE (.wav) file format to save our files.

The first step in the process is to initialize the AudioFormat instance. The AudioFormat tells Java how to interpret and handle the bits of information in the incoming sound stream. We use the following AudioFormat class constructor in our example:

AudioFormat(AudioFormat.Encoding encoding, float sampleRate, int sampleSizeInBits, int channels, int frameSize, float frameRate, boolean bigEndian)
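For instance, a 16-bit, 44.1 kHz, mono, signed PCM format could be constructed as follows; these concrete values are illustrative, not necessarily the ones our example application uses:

AudioFormat format = new AudioFormat(
    AudioFormat.Encoding.PCM_SIGNED, // linear signed PCM
    44100.0f,                        // sample rate in Hz
    16,                              // sample size in bits
    1,                               // channels (mono)
    2,                               // frame size in bytes: (16 / 8) * 1 channel
    44100.0f,                        // frame rate (frames per second)
    false);                          // little-endian byte order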

After that, we create a DataLine.Info object. This object holds all the information describing the data line (input). Using the DataLine.Info object, we can create an instance of the TargetDataLine, which will read all the incoming data into an audio stream. To generate the TargetDataLine instance, we use the AudioSystem.getLine() method and pass it the DataLine.Info object:

line = (TargetDataLine) AudioSystem.getLine(info);

Here, line is the TargetDataLine instance, and info is the DataLine.Info instance.

Once created, we can open the line and start it to read all the incoming sound. We can use an AudioInputStream to read the captured data. Finally, we can write this data into a WAV file and close all the streams.
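To see the whole flow in one place, here is a minimal sketch; the file name is illustrative, and error handling is omitted for brevity:

line.open(format);
line.start();

// Wrapping the line in an AudioInputStream lets AudioSystem.write pull data
// from the microphone; the call blocks until another thread stops and closes the line.
AudioInputStream stream = new AudioInputStream(line);
AudioSystem.write(stream, AudioFileFormat.Type.WAVE, new File("capture.wav"));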

To understand this process, we’ll look at a small program to record input sound.

4. Example Application

To see the Java Sound API in action, let's create a simple program. We'll break it down into four parts: building the AudioFormat, building the TargetDataLine, building and filling the AudioInputStream, and finally saving the data to a file.

4.1. Building the AudioFormat

The AudioFormat class defines what kind of data the TargetDataLine instance can capture. So, the first step is to initialize the AudioFormat instance even before we open a new data line. The App class is the main class of the application and makes all the calls. We define the properties of the AudioFormat in a constants class called ApplicationProperties. We build the AudioFormat instance by passing all the necessary parameters:

public static AudioFormat buildAudioFormatInstance() {
    ApplicationProperties aConstants = new ApplicationProperties();
    AudioFormat.Encoding encoding = aConstants.ENCODING;
    float rate = aConstants.RATE;
    int channels = aConstants.CHANNELS;
    int sampleSize = aConstants.SAMPLE_SIZE;
    boolean bigEndian = aConstants.BIG_ENDIAN;

    // frame size in bytes = (bits per sample / 8) * number of channels
    return new AudioFormat(encoding, rate, sampleSize, channels, (sampleSize / 8) * channels, rate, bigEndian);
}
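The ApplicationProperties constants class isn't shown here; a minimal version might look like this, with the concrete values being our assumption (a common CD-quality configuration):

public class ApplicationProperties {
    public final AudioFormat.Encoding ENCODING = AudioFormat.Encoding.PCM_SIGNED;
    public final float RATE = 44100.0f; // samples per second
    public final int CHANNELS = 2;      // stereo
    public final int SAMPLE_SIZE = 16;  // bits per sample
    public final boolean BIG_ENDIAN = true;
}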

Now that we have our AudioFormat ready, we can move ahead and build the TargetDataLine instance.

4.2. Building the TargetDataLine

We use the TargetDataLine interface to read audio data from our microphone. In our example, we get and run the TargetDataLine in the SoundRecorder class. The getTargetDataLineForRecord() method builds the TargetDataLine instance.

We read and process the audio input and dump it into the AudioInputStream object. This is how we create a TargetDataLine instance:

private TargetDataLine getTargetDataLineForRecord() {
    TargetDataLine line;
    DataLine.Info info = new DataLine.Info(TargetDataLine.class, format);
    if (!AudioSystem.isLineSupported(info)) {
        return null;
    }
    try {
        line = (TargetDataLine) AudioSystem.getLine(info);
        line.open(format, line.getBufferSize());
    } catch (final LineUnavailableException ex) {
        // a matching line exists but is already in use or otherwise unavailable
        return null;
    }
    return line;
}
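The recording itself runs on a separate thread (covered next in section 4.3). Assuming SoundRecorder implements Runnable and keeps a thread field, which is our reading of the snippets rather than the exact code, starting and stopping it might look like:

public void start() {
    thread = new Thread(this);
    thread.setName("Capture-Thread");
    thread.start(); // run() opens the line and starts the read loop
}

public void stop() {
    thread = null; // the read loop checks this flag and exits
}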

4.3. Building and Filling the AudioInputStream

So far in our example, we have created an AudioFormat instance, applied it to the TargetDataLine, and opened the data line to read audio data. We have also created a thread that runs the SoundRecorder instance. When the thread runs, we first build a byte output stream and then convert it to an AudioInputStream instance. The parameters we need for building the AudioInputStream instance are:

int frameSizeInBytes = format.getFrameSize();
int bufferLengthInFrames = line.getBufferSize() / 8;
final int bufferLengthInBytes = bufferLengthInFrames * frameSizeInBytes;

Notice in the above code that we divided the line's buffer size by 8. We do so to read the captured data in smaller chunks, so the recorder can pull data from the line as soon as it's available instead of waiting for the whole buffer to fill.

Now that we have initialized all the parameters we need, the next step is to build the byte output stream and then convert the generated output stream (the captured sound data) to an AudioInputStream instance:

// out is a ByteArrayOutputStream created when the thread starts
buildByteOutputStream(out, line, frameSizeInBytes, bufferLengthInBytes);
this.audioInputStream = new AudioInputStream(line);

setAudioInputStream(convertToAudioIStream(out, frameSizeInBytes));
audioInputStream.reset(); // rewind so the stream can be read from the start
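For context, here's a hedged sketch of how these calls might sit inside SoundRecorder's run() method; the surrounding structure (field names, error handling) is our assumption pieced together from the snippets in this section:

@Override
public void run() {
    line = getTargetDataLineForRecord();
    if (line == null) {
        return; // no supported capture line is available
    }

    int frameSizeInBytes = format.getFrameSize();
    int bufferLengthInFrames = line.getBufferSize() / 8;
    final int bufferLengthInBytes = bufferLengthInFrames * frameSizeInBytes;
    final ByteArrayOutputStream out = new ByteArrayOutputStream();

    try {
        buildByteOutputStream(out, line, frameSizeInBytes, bufferLengthInBytes);
        this.audioInputStream = new AudioInputStream(line);
        setAudioInputStream(convertToAudioIStream(out, frameSizeInBytes));
        audioInputStream.reset();
    } catch (IOException e) {
        throw new UncheckedIOException(e);
    }
}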

Before we set the InputStream, we’ll build the byte OutputStream:

public void buildByteOutputStream(final ByteArrayOutputStream out, final TargetDataLine line, int frameSizeInBytes, final int bufferLengthInBytes) throws IOException {
    final byte[] data = new byte[bufferLengthInBytes];
    int numBytesRead;

    line.start();
    // line.read() blocks until the requested bytes are available; the loop
    // exits when stop() sets thread to null or the line signals end of data
    while (thread != null) {
        if ((numBytesRead = line.read(data, 0, bufferLengthInBytes)) == -1) {
            break;
        }
        out.write(data, 0, numBytesRead);
    }
}

We then convert the byte output stream to an AudioInputStream:

public AudioInputStream convertToAudioIStream(final ByteArrayOutputStream out, int frameSizeInBytes) {
    byte[] audioBytes = out.toByteArray();
    ByteArrayInputStream bais = new ByteArrayInputStream(audioBytes);
    AudioInputStream audioStream = new AudioInputStream(bais, format, audioBytes.length / frameSizeInBytes);
    // compute the recording length from the converted stream's frame count
    long milliseconds = (long) ((audioStream.getFrameLength() * 1000) / format.getFrameRate());
    duration = milliseconds / 1000.0;
    return audioStream;
}

4.4. Saving the AudioInputStream to a WAV File

We have created and filled the AudioInputStream and stored it as a member variable of the SoundRecorder class. We'll retrieve this AudioInputStream in the App class using the SoundRecorder instance's getter method and pass it to the WaveDataUtil class (wd below is a WaveDataUtil instance):

wd.saveToFile("/SoundClip", AudioFileFormat.Type.WAVE, soundRecorder.getAudioInputStream());

The WaveDataUtil class has the code to convert the AudioInputStream into a .wav file:

AudioSystem.write(audioInputStream, fileType, myFile);
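As a minimal sketch, saveToFile() might look like the following around that call; the validation and error handling are illustrative, not the exact code:

public boolean saveToFile(String name, AudioFileFormat.Type fileType, AudioInputStream audioInputStream) {
    if (name == null || fileType == null || audioInputStream == null) {
        return false;
    }
    File myFile = new File(name + "." + fileType.getExtension());
    try {
        audioInputStream.reset(); // rewind to the start of the recorded data
        AudioSystem.write(audioInputStream, fileType, myFile);
    } catch (IOException e) {
        return false;
    }
    return true;
}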

5. Conclusion

This article showed a quick example of using the Java Sound API to capture and record audio using a microphone. The entire code for this tutorial is available over on GitHub.
