One feature added to Microsoft’s AI Copilot in the Edge browser this week is the ability to generate text summaries of videos. But Edge Copilot’s time-saving feature is still fairly limited and only works on pre-processed videos or those with subtitles, as Mikhail Parakhin, Microsoft’s CEO of advertising and web services, explained.
As spotted by MSPowerUser, Parakhin wrote on X on December 7, 2023, in response to a question: “In order for it to work, we need to pre-process the video. If the video has subtitles – we can always fallback on that, if it does not and we didn’t preprocess it yet – then it won’t work.”
In other words, on its own Edge Copilot doesn’t so much summarize videos as it summarizes the text transcripts of the videos. Copilot can also perform a similar function throughout Microsoft 365, including summarizing Teams video meetings and calls for customer service agents — and in both cases, the audio needs to be transcribed first by Microsoft. Copilot on Microsoft Stream can also summarize any video, but again, it requires users to generate a written transcript.
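The fallback Parakhin describes can be sketched in a few lines of Python. This is only an illustration of the logic in his post, not Microsoft's implementation: the `Video` class, its fields, and the `pick_transcript` and `summarize` functions are hypothetical names, and the "summary" is a stand-in for whatever language model call Copilot actually makes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Video:
    # Hypothetical fields; nothing here reflects Microsoft's actual data model.
    title: str
    preprocessed_transcript: Optional[str] = None
    subtitles: Optional[str] = None


def pick_transcript(video: Video) -> Optional[str]:
    """Return the text a summarizer could work from, or None if there is none."""
    # Prefer the pre-processed transcript when one exists.
    if video.preprocessed_transcript is not None:
        return video.preprocessed_transcript
    # Otherwise fall back on subtitles.
    if video.subtitles is not None:
        return video.subtitles
    # No pre-processing and no subtitles: nothing to summarize.
    return None


def summarize(video: Video) -> str:
    text = pick_transcript(video)
    if text is None:
        return "Summary unavailable: no pre-processed transcript or subtitles."
    # Placeholder for a real model call: just echo the first sentence.
    first_sentence = text.split(".")[0].strip()
    return f"Summary of {video.title!r}: {first_sentence}."
```

The point of the sketch is that every path runs through text: a video with neither a pre-processed transcript nor subtitles never reaches the summarization step at all.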
The conversation started after designer Pietro Schirano posted a screen recording of Edge Copilot summarizing a YouTube video about the GTA VI trailer. In this case, Copilot appeared to be doing its job perfectly. The user in the recording presses the “Generate video summary” button in the Copilot sidebar, and mere seconds later, Copilot churns one out, complete with highlights and timestamps.
Of course, many platforms, including YouTube and Vimeo, can automatically generate transcripts and subtitles if users enable the feature. After The Verge asked Parakhin on X if we could assume most publicly available videos (e.g., those on YouTube) weren’t pre-processed, he replied: “Should work for most videos.”
Copilot’s video summaries are just the latest move in the generative AI race between Microsoft, Google, and others. Last month, Google upgraded the YouTube extension for its Bard chatbot to enable it to summarize the content of a video and surface specific information from it. Just this week, Google announced a major Gemini update that has its own issues: the company’s editing may have misrepresented some of the AI’s capabilities in a demo, and it doesn’t always have its facts straight.
Parakhin has been candid on social media about the various stages of Copilot’s evolution. While on a plane on Tuesday morning, the machine learning expert posted on X: “Adding ability for Edge Copilot to use information in videos – on a flight.”