Introducing Topic Detection Feature

Today, we're excited to announce that Deepgram has launched Topic Detection as part of our speech understanding offerings. Topic Detection is based on an unsupervised topic modeling technique that lets developers and customers detect the most important and relevant topics referenced in their conversations.

Turn Recorded Audio Into Insights

A lack of data is rarely the problem anymore: over 2.5 quintillion bytes of data are created every day. Instead, one of the biggest challenges customers face today is finding insights in that data, and organizing, tagging, and leveraging the parts relevant to their brands, prospects, and customers to deliver a great experience to their end users.

Topic Detection has become a must-have feature in ASR and NLU. Developers need advanced solutions to analyze their audio data more deeply based on detected topics and subjects, so they can optimize resources, automate workflows, extract insights, improve search capabilities, and enhance the end-user experience. Typical use cases include:

  • Help the Quality Assurance team analyze conversations by discussed topic, identify trends and patterns, and improve the overall customer experience.

  • Categorize and tag conversations, meetings, and podcasts based on identified topics to enhance search and recommendation capabilities.

  • Extract meaningful and actionable insights from conversations and audio data based on discussed topics and recurring themes.

Identify over 350 topics

Deepgram's Topic Detection feature identifies patterns in the transcript and returns key topics along with the corresponding output text, a confidence score for each topic, and word positions that identify the relevant segment of speech. It is based on topic modeling, an unsupervised machine learning technique that clusters the generated text by detected topic, and it supports over 350 topics. You can enable Topic Detection with detect_topics=true. It is supported for English pre-recorded audio and is available to both our on-prem and hosted customers.
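To make the word positions more concrete, here is a minimal Python sketch of how they could be mapped back to the transcript. The field names start_word and end_word come from the sample response shown later in this post; the helper name words_for_segment, the sample data, and the choice to treat end_word as exclusive are assumptions made for illustration, so check the documentation before relying on the exact boundary.

```python
from typing import Dict, List


def words_for_segment(words: List[str], segment: Dict) -> List[str]:
    """Return the transcript words covered by one topic segment.

    Assumption: start_word is inclusive and end_word is exclusive.
    """
    return words[segment["start_word"]:segment["end_word"]]


# Hypothetical transcript and segment, shaped like the sample response.
transcript_words = "solar power keeps getting cheaper every single year".split()
segment = {
    "topics": [{"topic": "renewable energy", "confidence": 0.8}],
    "start_word": 0,
    "end_word": 4,
}

snippet = " ".join(words_for_segment(transcript_words, segment))
print(snippet)
```

A helper like this makes it possible to show the matching excerpt next to each detected topic, for example in a review tool.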

Implement Topic Detection with Deepgram

To detect topics in audio recordings, add detect_topics=true to your API request.

```bash
curl --request POST \
  --url 'https://api.deepgram.com/v1/listen?detect_topics=true&punctuate=true&tier=enhanced' \
  --header 'Authorization: Token YOUR_DEEPGRAM_API_KEY' \
  --header 'content-type: audio/mp3' \
  --data-binary '@podcast.mp3'
```

```javascript
const fs = require('fs')
const { Deepgram } = require('@deepgram/sdk')

// Your Deepgram API Key
const deepgramApiKey = 'YOUR_DEEPGRAM_API_KEY'
const file = 'YOUR_FILE_LOCATION'
const mimetype = 'YOUR_FILE_MIME_TYPE'

const deepgram = new Deepgram(deepgramApiKey)

// Read the audio file into a buffer and describe it for the SDK
const audio = fs.readFileSync(file)
const source = {
    buffer: audio,
    mimetype: mimetype,
}

deepgram.transcription
  .preRecorded(source, {
    detect_topics: true,
  })
  .then((response) => {
    console.dir(response, { depth: null })
    // Write only the transcript to the console
    //console.dir(response.results.channels[0].alternatives[0].transcript, { depth: null });
  })
  .catch((err) => {
    console.log(err)
  })
```

```python
from deepgram import Deepgram
import asyncio, json, sys

DEEPGRAM_API_KEY = 'YOUR_DEEPGRAM_API_KEY'
FILE = 'YOUR_FILE_LOCATION'
MIMETYPE = 'YOUR_FILE_MIME_TYPE'

async def main():
  deepgram = Deepgram(DEEPGRAM_API_KEY)
  # Open the audio file in binary mode; it is closed when the block exits
  with open(FILE, 'rb') as audio:
    source = {
      'buffer': audio,
      'mimetype': MIMETYPE
    }
    response = await asyncio.create_task(
      deepgram.transcription.prerecorded(
        source,
        {
          'detect_topics': True
        }
      )
    )
  print(json.dumps(response, indent=4))
  # Write only the transcript to the console
  #print(response["results"]["channels"][0]["alternatives"][0]["transcript"])

try:
  # If running in a Jupyter notebook, Jupyter is already running an event loop, so run main with this line instead:
  #await main()
  asyncio.run(main())
except Exception as e:
  # Report the exception type and the line where it surfaced
  exception_type, exception_object, exception_traceback = sys.exc_info()
  line_number = exception_traceback.tb_lineno
  print(f'line {line_number}: {exception_type} - {e}')
```

## Topic Detection Results

When the file has finished processing, you'll receive a JSON response that includes a topics section with the following basic structure:

```json
"topics": [
  {
    "topics": [
      {
        "topic": "renewable energy",
        "confidence": 0.80515814
      },
      {
        "topic": "climate change",
        "confidence": 0.51437885
      }
    ],
    "text": "Even Greenpeace underestimated the rise of solar. When one of the world's largest environmental advocacy groups released an optimistic industry analysis called the energy revolution in twenty ten. It was far more ambitious than any government predictions, and it still got it wrong. Greenpeace estimated that by twenty twenty, the world would have three hundred and thirty five thousand megawatts of installed solar photovoltaic capacity…...",
    "start_word": 0,
    "end_word": 135
  }
]
```

Developers can use the Topic Detection output from the API to build downstream workflows, generate tags based on topics, power analytics tools, build search and recommendation capabilities, or integrate with other applications.
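As one illustration of tag generation, the following Python sketch collapses a list of topic segments into a ranked set of tags. It assumes the segment list has the shape shown in the sample response; the function name tags_from_segments, the min_confidence threshold, and the second segment in the sample data are hypothetical and not part of the Deepgram API.

```python
from typing import Dict, List, Tuple


def tags_from_segments(
    segments: List[Dict], min_confidence: float = 0.5
) -> List[Tuple[str, float]]:
    """Collapse topic segments into (topic, best confidence) tags.

    Topics below min_confidence are dropped; when a topic appears in
    several segments, only its highest confidence is kept.
    """
    best: Dict[str, float] = {}
    for segment in segments:
        for entry in segment.get("topics", []):
            topic, confidence = entry["topic"], entry["confidence"]
            if confidence < min_confidence:
                continue
            if topic not in best or confidence > best[topic]:
                best[topic] = confidence
    # Highest-confidence tags first
    return sorted(best.items(), key=lambda item: item[1], reverse=True)


segments = [
    {   # values taken from the sample response in this post
        "topics": [
            {"topic": "renewable energy", "confidence": 0.80515814},
            {"topic": "climate change", "confidence": 0.51437885},
        ],
        "start_word": 0,
        "end_word": 135,
    },
    {   # hypothetical second segment, for illustration only
        "topics": [
            {"topic": "climate change", "confidence": 0.30},
            {"topic": "renewable energy", "confidence": 0.62},
        ],
        "start_word": 135,
        "end_word": 260,
    },
]

tags = tags_from_segments(segments, min_confidence=0.6)
print(tags)
```

The threshold is a tuning choice: a higher value yields fewer, more reliable tags, while a lower value favors recall for search.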

To learn more about our API, please see the Topic Detection page in our documentation. We welcome your feedback; please share it with us at Product Feedback.

If you have any feedback about this post, or anything else around Deepgram, we'd love to hear from you. Please let us know in our GitHub discussions.
