Last Updated: March 28, 2024

Details for Patent: 9,436,876




Title: Video segmentation techniques
Abstract: A video segmentation system can be utilized to automate segmentation of digital video content. Features corresponding to visual, audio, and/or textual content of the video can be extracted from frames of the video. The extracted features of adjacent frames are compared according to a similarity measure to determine boundaries of a first set of shots or video segments distinguished by abrupt transitions. The first set of shots is analyzed according to certain heuristics to recognize a second set of shots distinguished by gradual transitions. Key frames can be extracted from the first and second set of shots, and the key frames can be used by the video segmentation system to group the first and second set of shots by scene. Additional processing can be performed to associate metadata, such as names of actors or titles of songs, with the detected scenes.
Inventor(s): Carlson; Adam (Seattle, WA), Gray; Douglas Ryan (Redwood City, CA), Kulkarni; Ashutosh Vishwas (Bellevue, WA), Taylor; Colin Jon (Orinda, CA)
Assignee:
Filing Date: Dec 19, 2014
Application Number: 14/577,277
Claims:

1. A computing device, comprising: a processor; memory including instructions that, upon being executed by the processor, cause the computing device to: obtain a digital video; extract a respective pyramid of histograms for at least a subset of frames of the digital video; determine a first plurality of shots of the digital video by: determining respective cosine similarity between the respective pyramid of histograms of adjacent frames of the digital video; and comparing the respective cosine similarity between the respective pyramid of histograms of the adjacent frames to a similarity threshold; determine a second plurality of shots of the digital video by analyzing the first plurality of shots; extract one or more respective key frames for each of the plurality of first shots and the plurality of second shots; generate a graph of the digital video using the respective key frames as nodes of the graph and a respective cost, based at least in part on time and visual similarity, as edges of the graph; and determine a plurality of sub-graphs by performing a minimum cut algorithm on the graph, the plurality of sub-graphs corresponding to scenes of the digital video.

2. The computing device of claim 1, wherein the instructions, upon being executed, to cause the computing device to detect the second plurality of shots include causing the computing device to: determine that at least one shot of the first plurality of shots meets a time threshold; determine that the respective cosine similarity between a first frame of the at least one shot and a second frame of the at least one shot meets a dissimilarity threshold; and determine that a similarity matrix of at least a subset of frames of the at least one shot corresponds to a dissolve pattern.

3. The computing device of claim 2, wherein the instructions, upon being executed, to cause the computing device to determine that the similarity matrix of the subset of frames of the at least one shot corresponds to the dissolve pattern includes causing the computing device to: generate the dissolve pattern; slide the dissolve pattern along a diagonal of the similarity matrix; and match the dissolve pattern to at least one portion of the diagonal.

4. A computer-implemented method for segmenting a video, comprising: obtaining one or more respective features for each frame of a plurality of frames of a video; determining one or more first shots of the video by analyzing similarity between the respective features for adjacent frames of the video by: determining respective cosine similarity between the respective features for the adjacent frames; and comparing the respective cosine similarity between the respective features for the adjacent frames to a similarity threshold; determining one or more second shots of the video by analyzing the first shots; generating a graph of the video, the graph comprising nodes corresponding to the first shots and the second shots, and edges corresponding to a respective cost between the nodes; and determining one or more groupings of the first shots and the second shots by performing one or more cuts of the graphs.

5. The computer-implemented method of claim 4, wherein obtaining the respective features for each frame includes: determining a first histogram for the frame; determining a first plurality of histograms for first portions of the frame; and determining a second plurality of histograms for second portions of the frame.

6. The computer-implemented method of claim 4, wherein determining the second shots includes: determining that at least one shot of the one or more first shots meets a time threshold; determining that a similarity metric between a first frame of the at least one shot and a second frame of the at least one shot meets a dissimilarity threshold; and determining that a similarity matrix of at least a subset of frames of the at least one shot corresponds to a dissolve pattern.

7. The computer-implemented method of claim 4, wherein determining the one or more groupings of the first shots and the second shots includes: obtaining one or more respective key frames for each of the first shots and the second shots, wherein the nodes of the graph correspond to the respective key frames.

8. The computer-implemented method of claim 7, wherein the respective cost is based on a function of time and visual similarity.

9. The computer-implemented method of claim 4, further comprising: obtaining one or more audio features corresponding to the video, wherein the one or more groupings are further based at least in part on the one or more audio features.

10. The computer-implemented method of claim 4, further comprising: obtaining one or more text features corresponding to the video, wherein the one or more groupings are further based at least in part on the one or more text features.

11. The computer-implemented method of claim 4, further comprising: detecting at least one face in at least one grouping of the one or more groupings; determining an identity of the at least one face; and associating the identity with the at least one grouping.

12. The computer-implemented method of claim 4, further comprising: detecting music in at least one grouping of the one or more groupings; determining a title of the music; and associating the title with the at least one grouping.

13. The computer-implemented method of claim 4, further comprising: determining textual data corresponding to at least one grouping of the one or more groupings; and associating the textual data with the at least one grouping.

14. The computer-implemented method of claim 4, further comprising: analyzing visual content of at least one shot of the one or more first shots; and classifying the at least one shot as one of a dissolve shot, a blank shot, a card credit, a rolling credit, an action shot, or a static shot.

15. A non-transitory computer-readable storage medium comprising instructions that, upon being executed by a processor of a computing device, cause the computing device to: obtain one or more respective features for each frame of a plurality of frames of a video; determine one or more first shots of the video by analyzing similarity between the respective features for adjacent frames of the video by: determining respective cosine similarity between the respective features for the adjacent frames; and comparing the respective cosine similarity between the respective features for the adjacent frames to a similarity threshold; determine one or more second shots of the video by analyzing the first shots; and generate a graph of the video, the graph comprising nodes corresponding to the first shots and the second shots, and edges corresponding to a respective cost between the nodes; and determine one or more groupings of the first shots and the second shots by performing one or more cuts of the graph.

16. The non-transitory computer-readable storage medium of claim 15, wherein the instructions, upon being executed, further cause the computing device to: associate metadata with at least one grouping of the one or more groupings; and enable a user to navigate to the at least one grouping based on the metadata.

17. The non-transitory computer-readable storage medium of claim 16, wherein the metadata corresponds to at least one of an identity of an actor appearing in the at least one grouping, title of music playing in the at least one grouping, a representation of an object in the at least one grouping, a location corresponding to the at least one grouping, or textual data corresponding to the at least one grouping.

18. The non-transitory computer-readable storage medium of claim 15, wherein the instructions, upon being executed, further cause the computing device to: associate respective metadata with a plurality of the groupings; and extract at least one grouping of the plurality of the groupings based on the respective meta associated with the at least one grouping.

19. The non-transitory computer-readable storage medium of claim 15, wherein the instructions, upon being executed, further cause the computing device to: generate a second video by removing at least one grouping of the one or more groupings.
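
Illustrative sketch (not part of the patent text): claims 1, 4, and 5 describe detecting abrupt shot boundaries by extracting a pyramid of histograms per frame and comparing the cosine similarity of adjacent frames to a threshold. The Python below is one possible reading under stated assumptions: frames are small grayscale grids, the pyramid uses 1x1, 2x2, and 4x4 grids, and the bin count and the 0.8 threshold are arbitrary. None of these values or function names come from the patent.

```python
import math


def histogram(pixels, bins=4, max_value=256):
    """Count grayscale intensities (0..max_value-1) into equal-width bins."""
    counts = [0] * bins
    for p in pixels:
        counts[p * bins // max_value] += 1
    return counts


def pyramid_of_histograms(frame, grids=(1, 2, 4), bins=4):
    """Concatenate histograms of the whole frame and of finer grid cells.

    `frame` is a list of rows of grayscale values. The grid sizes are an
    assumption; the claims only require a frame-level histogram plus
    histograms for two sets of frame portions.
    """
    h, w = len(frame), len(frame[0])
    features = []
    for g in grids:
        for gr in range(g):
            rows = frame[gr * h // g:(gr + 1) * h // g]
            for gc in range(g):
                cell = [p for row in rows
                        for p in row[gc * w // g:(gc + 1) * w // g]]
                features.extend(histogram(cell, bins))
    return features


def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def abrupt_boundaries(frames, threshold=0.8):
    """Return each index i where frame i starts a new shot (hard cut)."""
    feats = [pyramid_of_histograms(f) for f in frames]
    return [i for i in range(1, len(feats))
            if cosine_similarity(feats[i - 1], feats[i]) < threshold]
```

For example, given three uniformly dark 4x4 frames followed by two uniformly bright ones, `abrupt_boundaries` returns `[3]`, the index of the first bright frame.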

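Illustrative sketch (not part of the patent text): claims 2, 3, and 6 describe recognizing a gradual transition by generating a dissolve pattern, sliding it along the diagonal of a frame-to-frame similarity matrix, and matching it against portions of that diagonal. The claims do not say what the pattern looks like, so the template below is hypothetical: similarity is assumed to fall off linearly with frame distance inside a dissolve. The window size, the tolerance, and the mean-absolute-error match score are likewise assumptions.

```python
def dissolve_pattern(window):
    """Hypothetical template: similarity decays linearly with frame distance."""
    return [[1.0 - abs(i - j) / (window - 1) for j in range(window)]
            for i in range(window)]


def match_dissolve(similarity, window=5, tolerance=0.05):
    """Slide the template along the main diagonal of `similarity`.

    `similarity` is a square list-of-lists of pairwise frame similarities.
    Returns every offset k where the window-sized block starting at (k, k)
    differs from the template by at most `tolerance` on average.
    """
    pattern = dissolve_pattern(window)
    matches = []
    for k in range(len(similarity) - window + 1):
        error = sum(abs(similarity[k + i][k + j] - pattern[i][j])
                    for i in range(window) for j in range(window))
        if error / (window * window) <= tolerance:
            matches.append(k)
    return matches


# Synthetic shot: four static frames, a linear blend, then four static frames.
# `blend` is the assumed mixing position of each frame between the two shots.
blend = [0, 0, 0, 0, 0.25, 0.5, 0.75, 1, 1, 1, 1]
similarity = [[1.0 - abs(a - b) for b in blend] for a in blend]
print(match_dissolve(similarity))  # [3]
```

In this synthetic case only the window starting at frame 3 spans the full blend, so it is the only offset that matches. Real similarity matrices are noisier, and a practical matcher would likely need a looser tolerance and some merging of neighboring matches.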

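Illustrative sketch (not part of the patent text): claims 1, 7, and 8 describe building a graph whose nodes are key frames and whose edges carry a cost based on time and visual similarity, then cutting the graph into scenes. The Python below swaps in a much simpler technique than a general minimum cut algorithm: it only considers splits of the temporally ordered key frames into two contiguous groups and picks the split with the least total edge weight crossing it. The edge weight (cosine similarity damped by an exponential in the time gap) and the `tau` constant are assumptions, and the weight is treated as an affinity, so a cheap cut separates dissimilar or distant key frames.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def edge_weight(kf_a, kf_b, tau=30.0):
    """Assumed affinity: visual similarity damped by the time gap in seconds."""
    (t_a, f_a), (t_b, f_b) = kf_a, kf_b
    return cosine_similarity(f_a, f_b) * math.exp(-abs(t_a - t_b) / tau)


def build_graph(key_frames):
    """Dense weight matrix over key frames given as (timestamp, features)."""
    n = len(key_frames)
    return [[edge_weight(key_frames[i], key_frames[j]) if i != j else 0.0
             for j in range(n)] for i in range(n)]


def best_contiguous_cut(weights):
    """Return (k, cost) for the cheapest split into nodes [0, k) and [k, n)."""
    n = len(weights)
    if n < 2:
        raise ValueError("need at least two key frames to cut")

    def cut_cost(k):
        # Total weight of edges with one end on each side of the split.
        return sum(weights[i][j] for i in range(k) for j in range(k, n))

    k = min(range(1, n), key=cut_cost)
    return k, cut_cost(k)


def split_scenes(key_frames):
    """Split key frames (sorted by time) into two scene groupings of indices."""
    k, _ = best_contiguous_cut(build_graph(key_frames))
    return list(range(k)), list(range(k, len(key_frames)))
```

For six key frames at 10-second intervals where the first three share one feature vector and the last three share an orthogonal one, `split_scenes` returns `([0, 1, 2], [3, 4, 5])`. An unnormalized cut like this tends to favor slicing off a single node at either end when similarities are less clear-cut, which is one reason a real system might use a normalized or recursive cut instead.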