Microsoft provides an audio transcription feature for the online version of Word that converts audio (recorded or uploaded from a file) directly to text, and even separates the text based on the speaker. Here’s how to use the feature.
To transcribe audio with Word, you must be a Microsoft 365 premium subscriber. If you have the free version and you try to use the feature, you’ll be met with a message asking you to subscribe.
Record and Transcribe Live Audio
You can have Word transcribe audio that you record directly within Word. Sign in to Microsoft 365, and open Word. In the “Home” tab, click the arrow next to “Dictate” and then select “Transcribe” from the menu that appears.
If this is your first time using the feature, you’ll need to give Microsoft permission to access your microphone.
The “Transcribe” pane will open on the right-hand side of the window. Select “Start Recording.”
Once you select it, the timer will start, and you can begin speaking. You won’t see the transcription happen live as you speak because Microsoft found that to be too distracting during its testing.
After you’re finished, click the “Pause” button and then select “Save and Transcribe Now.”
It may take a few minutes for Word to finish transcribing the audio recording and uploading it to OneDrive.
Once this is done, you’ll see the transcription appear in the same pane where you recorded the audio. Each section will have a timestamp, the speaker’s name, and the transcribed text. Microsoft automatically separates the text by speaker.
If Word detects multiple speakers, you’ll see “Speaker 1,” “Speaker 2,” and so on. If Word can’t detect multiple speakers, you’ll just see “Speaker.”
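If it helps to think of the transcript as data, each section can be pictured as a small record holding a timestamp, a speaker label, and the text. The Python sketch below is purely illustrative: the `Segment` class, the `format_segment` function, and the "MM:SS Speaker: text" layout are hypothetical names and assumptions, not a description of how Word stores or displays transcripts internally.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One transcript section: when it starts, who spoke, and what was said."""
    start_seconds: int
    speaker: str
    text: str


def format_segment(segment: Segment) -> str:
    """Render a section as 'MM:SS Speaker: text' (an assumed layout)."""
    minutes, seconds = divmod(segment.start_seconds, 60)
    return f"{minutes:02d}:{seconds:02d} {segment.speaker}: {segment.text}"


# Hypothetical sample data using the default labels Word assigns.
transcript = [
    Segment(0, "Speaker 1", "Thanks for joining the call."),
    Segment(65, "Speaker 2", "Happy to be here."),
]

for segment in transcript:
    print(format_segment(segment))
```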
You might notice that the transcript doesn’t accurately reflect the recorded audio. You can edit a section of the transcript by hovering your mouse over the incorrect text and then selecting the pen icon.
Now you can edit the transcription found in this section. You can also edit the speaker’s name, and you can apply that change to every instance where the speaker (e.g., Speaker 1 or Speaker 2) appears by ticking the box next to “Change All Speaker.” When you’re finished, click the checkmark.
If necessary, you can use the playback controls to revisit the audio recording. This is useful if the transcript is long and you can’t remember exactly who said what. Here’s the function of each button, from left to right:
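The difference between renaming a speaker in one section and renaming that speaker everywhere can be sketched in a few lines of Python. This is only a conceptual model under stated assumptions: `Segment`, `rename_speaker`, and the `change_all` flag are hypothetical names standing in for the checkbox, not anything Word exposes.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Segment:
    """A transcript section with a speaker label and its text."""
    speaker: str
    text: str


def rename_speaker(segments, index, new_name, change_all=False):
    """Rename the speaker of segments[index].

    With change_all=True, every section that shares the old label is
    renamed as well, mirroring a ticked "change all" box.
    """
    old_name = segments[index].speaker
    return [
        replace(seg, speaker=new_name)
        if i == index or (change_all and seg.speaker == old_name)
        else seg
        for i, seg in enumerate(segments)
    ]


# Hypothetical sample transcript.
segments = [
    Segment("Speaker 1", "Welcome, everyone."),
    Segment("Speaker 2", "Thanks for having me."),
    Segment("Speaker 1", "Let's get started."),
]

only_one = rename_speaker(segments, 0, "Alice")
everywhere = rename_speaker(segments, 0, "Alice", change_all=True)
print([s.speaker for s in only_one])
print([s.speaker for s in everywhere])
```

Leaving the flag off changes only the section being edited, which matters when Word has assigned the same label to two different people.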
- Playback speed
- Fast forward
When you’re finished editing the transcript, you can add it to the document by selecting the “Add All To Document” button at the bottom of the pane.
Once selected, the audio recording and the content of the transcript will appear in the document.
Upload and Transcribe an Audio File
If you already have an audio file that you want to transcribe, you can upload it to Word. Sign in to Microsoft 365, and open Word. In the “Home” tab, click the arrow next to “Dictate” and then select “Transcribe” from the menu that appears.
The “Transcribe” pane will open on the right-hand side of the window. Select “Upload Audio.” You can upload these audio file types:
File Explorer (Finder for Mac) will open. Navigate to the location of the audio file, select it, and then click “Open.”
Microsoft will begin transcribing the audio file. Depending on the size of the file, this could take quite a bit of time.
Once Microsoft finishes transcribing the audio file, the text will appear in the pane.
If the transcript doesn’t accurately reflect your audio file, you can edit the text by hovering over the section and clicking the “Pen” icon. If you need to hear the audio again, you can do so by using the audio controls.
Next, edit the text in that section and, if needed, the name of the speaker. To rename the speaker everywhere they appear, tick the “Change All Speaker” box. When finished, click the “Checkmark.”
Once you’ve edited the content of the transcript, click “Add All To Document.”
The audio file and text of the transcript will be added to the Word document.
While not perfect, this feature can potentially save you a lot of time, especially if the speaker in the audio is speaking clearly.