holaphil / The future of video search

The following is the original copy of a post I wrote in May 2012. The same topics came up during a panel I hosted later that month at NYC Internet Week 2012.

“The guts of every video on the internet have been utterly overlooked and ignored by search engines.”

“Don’t judge a book by its cover” is a lesson we've all heard at some point in our lives: it is a mistake to prejudge the worth or value of something by its outward appearance alone.

Getting Past the Cover

In a webinar, Marc Gosschalk, Sr. Analyst, Product Management at comScore Europe, states, “The value of the online video marketplace is booming, with the role of video in the consumer lifecycle becoming increasingly recognized. According to comScore’s Video Metrix, 34.7 million people watched online video content for an average of nearly 17 hours each in November (2010).”

Marc goes on to state, “As the online video market grows, the need for visibility, transparency, and accurate measurement is now more critical than ever.”

So let’s talk about video and visibility, because I think the lack of visibility is a problem for all parties involved – content producers, platforms, and search engines alike. Nielsen reports from last May show Americans setting another record, streaming over 15 billion videos in the month. Total online viewers also increased by nearly 3% from April to top 145 million unique viewers. Cisco predicts that by 2013, 90% of all Web traffic will be generated by video! With this explosion of online video content, why does the level of viewer engagement seem to be inversely proportional to its growth? Is this due to the steadily shrinking attention span of the viewer, or because many of us have no clue what’s inside most videos (i.e. what’s coming next)?

Some Fuzzy Math

Let’s assume that 90% of those 15 billion videos are 30 seconds to three minutes long, and the rest are three to 30 minutes long. If I’ve done my math correctly (taking the midpoint of each range as the average length, 105 seconds and 990 seconds respectively), that leaves us with approximately 2,902,500,000,000 (yes, trillion) seconds of video content. Let’s say that some new action or moment occurs in most videos every 10 seconds; that leaves us with 290,250,000,000 (billion) missed opportunities to search within a video. The guts of every video on the internet have been utterly overlooked and ignored by search engines. We’re left with basic details that have been manually tagged: Title, Description and Keywords, i.e. the book cover.
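For anyone who wants to check the fuzzy math, here is a small sketch of the calculation. It assumes each group of videos is represented by the midpoint of its length range, which is my reading of how the totals come out; the variable names are just for illustration.

```python
# Back-of-the-envelope estimate of searchable "moments" in a month of video.
# Assumption: each length range is represented by its midpoint.

TOTAL_VIDEOS = 15_000_000_000          # videos streamed in the month
SHORT_PERCENT = 90                     # share that is 30 s to 3 min long
SECONDS_PER_MOMENT = 10                # one new action or moment every 10 s

short_videos = TOTAL_VIDEOS * SHORT_PERCENT // 100
long_videos = TOTAL_VIDEOS - short_videos

short_midpoint = (30 + 3 * 60) // 2        # 105 seconds
long_midpoint = (3 * 60 + 30 * 60) // 2    # 990 seconds

total_seconds = short_videos * short_midpoint + long_videos * long_midpoint
moments = total_seconds // SECONDS_PER_MOMENT

print(f"{total_seconds:,} seconds of video")   # 2,902,500,000,000
print(f"{moments:,} searchable moments")       # 290,250,000,000
```

Change the 10-second interval or the 90/10 split and the totals move a lot, which is why this is fuzzy math and not a measurement.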

Searching for Solutions

From my perspective, there are three possible solutions:

Transcription – Converting speech to text and matching the text with the corresponding moment within a video.
Video Chaptering – Calling attention to moments within a video and labeling each with a title, comment, and/or keyword.
Image Recognition – Recognizing changes and/or objects within a video.

None of these solutions exists in a vacuum; each can play a significant role in the future of video search.
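To make the first two ideas a little more concrete, here is a minimal sketch of what searching inside a video could look like once transcript text or chapter labels are tied to timestamps. Everything in it is hypothetical: the Segment structure, the search_moments function, and the sample data are made up for illustration and do not describe how any real search engine or video platform works.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """A hypothetical slice of a video with text attached to it."""
    start: float  # seconds from the beginning of the video
    end: float    # seconds from the beginning of the video
    text: str     # transcribed speech or a chapter label


def tokenize(text):
    """Lowercase the text and return its words as a set."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def search_moments(segments, query):
    """Return the segments whose text contains every word of the query."""
    wanted = tokenize(query)
    if not wanted:
        return []
    return [s for s in segments if wanted <= tokenize(s.text)]


# Made-up sample data: two transcript lines and one chapter label.
segments = [
    Segment(0.0, 10.0, "Welcome to the kitchen, today we bake bread"),
    Segment(10.0, 20.0, "First, mix the flour and the yeast"),
    Segment(20.0, 30.0, "Chapter: kneading the dough"),
]

for hit in search_moments(segments, "flour yeast"):
    print(f"{hit.start:.0f}s-{hit.end:.0f}s: {hit.text}")
```

The point of the sketch is that a match comes back with a start and end time, so a search result can point to a moment inside the video instead of to the video as a whole.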

Marc Gosschalk was right: video needs visibility. Simply tagging keywords and associating them with the entire video is not the best approach to video search. So what is the alternative? What companies are currently trying to solve this problem? How do we spotlight each of those moments (frames) within content and show search sites (e.g. Google, Bing) that it's what is inside that counts?