Date: 2025-03-28

Summary

Home Assistant can use Google Gemini to describe who is at the door based on a snapshot from the video doorbell.

You'll need to install LLM Vision, get a Google Gemini API key, and have the Home Assistant app for notifications.

LLM Vision can be used for other purposes, such as keeping a count of parked cars.

When AI chatbots first appeared, they were limited to text inputs. The only way to get a response out of a chatbot was to type a prompt into it. Nowadays, however, many AI models are multimodal, meaning they can handle much more than just text.

You can now use AI to analyze images, for example by generating detailed descriptions of what an image contains. Home Assistant can harness this ability to describe who is at the door based on an image from your video doorbell, often with hilarious results.

What You'll Need

If you're already using Home Assistant and have your smart video doorbell connected, then you've probably got most of what you need set up. You need to be running Home Assistant with your video doorbell added via an integration. Home Assistant takes a snapshot from your video doorbell and passes it to Google Gemini for analysis. Gemini then generates a description of who is at your door.
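To make the snapshot-to-description flow more concrete, here is a rough Python sketch of what sending an image to Gemini can look like outside Home Assistant. It is not how LLM Vision is implemented; it only illustrates the idea of bundling a text prompt and a JPEG snapshot into one request to Gemini's REST API. The model name, the function names `build_payload` and `describe_snapshot`, and the prompt wording are assumptions made for this illustration, so check Google's current documentation before relying on any of them.

```python
import base64
import json
import urllib.request

# Assumed model name and endpoint version; these change over time.
GEMINI_MODEL = "gemini-2.0-flash"
GEMINI_URL = (
    "https://generativelanguage.googleapis.com/v1beta/models/"
    f"{GEMINI_MODEL}:generateContent"
)


def build_payload(image_bytes: bytes, prompt: str) -> dict:
    """Bundle a text prompt and a JPEG snapshot into one request body."""
    # Images are sent inline as base64-encoded text.
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "contents": [
            {
                "parts": [
                    {"text": prompt},
                    {"inline_data": {"mime_type": "image/jpeg", "data": encoded}},
                ]
            }
        ]
    }


def describe_snapshot(image_bytes: bytes, api_key: str, prompt: str) -> str:
    """Send the snapshot to Gemini and return the generated description."""
    request = urllib.request.Request(
        GEMINI_URL,
        data=json.dumps(build_payload(image_bytes, prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-goog-api-key": api_key},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        body = json.load(response)
    # The first candidate's first part holds the generated text.
    return body["candidates"][0]["content"]["parts"][0]["text"]
```

In this sketch you would read a saved doorbell snapshot as bytes and call `describe_snapshot(snapshot, api_key, "Describe who is at the door.")`. Inside Home Assistant, LLM Vision handles these steps for you, so you never have to write this code yourself.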