19.4.5.7 Video Analysis -- Captions, Text, Video Text

Video Analysis. Captions. Video Indexing. See also Audio-Video Analysis for Indexing and Classification. For mostly general images (e.g. signs), see also Text Detection, Find Text in General Scenes, Scene Text, Color Documents.

Kim, H.K.,
Efficient Automatic Text Location Method and Content-Based Indexing and Structuring of Video Database,
JVCIR(7), No. 4, December 1996, pp. 336-344. 9704
BibRef

Jain, A.K.[Anil K.], Yu, B.[Bin],
Automatic Text Location in Images and Video Frames,
PR(31), No. 12, December 1998, pp. 2055-2076. BibRef 9812
Earlier:
WWW Version. ICPR98(Vol II: 1497-1499).
IEEE DOI Link 9808
BibRef

Viswanathan, M.[Mahesh], Beigi, H.S.M.[Homayoon S.M.], Dharanipragada, S.[Satya], Maali, F.[Fereydoun], Tritschler, A.[Alain],
Multimedia Document Retrieval Using Speech and Speaker Recognition,
IJDAR(2), No. 4, 1999, pp. xx-yy. 0008
BibRef

Li, H.P.[Hui-Ping], Doermann, D.[David], Kia, O.[Omid],
Automatic Text Detection and Tracking in Digital Video,
IP(9), No. 1, January 2000, pp. 147-156.
IEEE DOI Link 0001
BibRef
And: UMD--TR3962, December 1998. Neural Networks and Wavelets.
WWW Version.
WWW Version. BibRef

Li, H.P., Doermann, D.,
A Video Text Detection System Based on Automated Training,
ICPR00(Vol II: 223-226).
IEEE DOI Link 0009
BibRef

Doermann, D.[David], Li, H.P.[Hui-Ping],
Automatic Identification of Text in Digital Video Key Frames,
ICPR98(Vol I: 129-132).
IEEE DOI Link 9808
BibRef

Wu, V.[Victor], Manmatha, R.[Raghavan], Riseman, E.M.[Edward M.],
TextFinder: An Automatic System to Detect and Recognize Text in Images,
PAMI(21), No. 11, November 1999, pp. 1224-1229.
IEEE Abstract.
IEEE DOI Link 9912
BibRef
And:
TextFinder,
UMassCS TR 99-40, June, 1999.
Postscript Version. Extraction of text from images (e.g. ads). BibRef

Wu, V., Manmatha, R.,
Extracting Text From Greyscale Images,
UMassCS TR 95-88, November, 1995.
Postscript Version. BibRef 9511

Wu, V., Manmatha, R., Riseman, E.M.,
Finding Text In Images,
UMassCS TR 97-09, February, 1997.
Postscript Version. BibRef 9702

Zhong, Y.[Yu], Zhang, H.J.[Hong-Jiang], Jain, A.K.[Anil K.],
Automatic Caption Localization in Compressed Video,
PAMI(22), No. 4, April 2000, pp. 385-392.
IEEE Abstract.
IEEE DOI Link 0006
BibRef
Earlier: ICIP99(II: 96-100).
IEEE Abstract. BibRef

Kim, K.I.[Kwang In], Jung, K.C.[Kee-Chul], Park, S.H.[Se Hyun], Kim, H.J.[Hang Joon],
Support vector machine-based text detection in digital video,
PR(34), No. 2, February 2001, pp. 527-529.
WWW Version. 0011
BibRef

Lee, C.W.[Chang Woo], Jung, K.C.[Kee-Chul], Kim, H.J.[Hang Joon],
Automatic text detection and removal in video sequences,
PRL(24), No. 15, November 2003, pp. 2607-2623.
WWW Version. 0308
See also Text scanner with text detection technology on image sequences. BibRef

Welsh, S.[Stephen], Conway, D.[Damian],
Encoding Video Narration as Text,
RealTimeImg(6), No. 5, October 2000, pp. 391-405. 0011
BibRef

Syeda-Mahmood, T.F., Srinivasan, S., Amir, A., Ponceleon, D., Blanchard, B., Petkovic, D.,
CueVideo: a system for cross-modal search and browse of video databases,
CVPR00(II: 786-787).
IEEE Abstract.
IEEE DOI Link 0403
BibRef

Crandall, D.[David], Antani, S.[Sameer], Kasturi, R.[Rangachar],
Extraction of special effects caption text events from digital video,
IJDAR(5), No. 2-3, April 2003, pp. 138-157.
HTML Version. 0308
BibRef
Earlier: A2, A1, A3:
Robust Extraction of Text in Video,
ICPR00(Vol I: 831-834).
IEEE DOI Link 0009
BibRef

Crandall, D., Kasturi, R.,
Robust detection of stylized text events in digital video,
ICDAR01(865-869).
IEEE DOI Link 0109
BibRef

Mariano, V.Y., Kasturi, R.,
Locating Uniform-colored Text in Video Frames,
ICPR00(Vol IV: 539-542).
IEEE DOI Link 0009
BibRef

Gandhi, T.[Tarak], Kasturi, R.[Rangachar], Antani, S.[Sameer],
Application of Planar Motion Segmentation for Scene Text Extraction,
ICPR00(Vol I: 445-449).
IEEE DOI Link 0009
BibRef

Wernicke, A.[Axel], Lienhart, R.[Rainer],
On the Segmentation of Text in Videos,
ICME00(WP5). 0007
BibRef

Kasturi, R., Gargi, U.[Ullas], Antani, S.[Sameer],
Indexing Text Events in Digital Video Databases,
ICPR98(Vol I: 916-918).
IEEE DOI Link 9808
BibRef

Adams, W.H., Iyengar, G.[Giridharan], Lin, C.Y.[Ching-Yung], Naphade, M.R.[Milind Ramesh], Neti, C.[Chalapathy], Nock, H.J.[Harriet J.], Smith, J.R.[John R.],
Semantic Indexing of Multimedia Content Using Visual, Audio, and Text Cues,
JASP(2003), No. 2, February 2003, pp. 170.
HTML Version. 0304
BibRef

Lyu, M.R., Song, J.[Jiqiang], Cai, M.[Min],
A comprehensive method for multilingual video text detection, localization, and extraction,
CirSysVideo(15), No. 2, February 2005, pp. 243-255.
IEEE Abstract. 0501
BibRef

de Jong, F.M.G., Westerveld, T., de Vries, A.P.,
Multimedia Search Without Visual Analysis: The Value of Linguistic and Contextual Information,
CirSysVideo(17), No. 3, March 2007, pp. 365-371.
IEEE DOI Link 0703
BibRef

Dimitrova, N.[Nevenka], Agnihotri, L.[Lalitha], Wei, G.[Gang],
Video Classification Using Object Tracking,
IJIG(1), No. 3, July 2001, pp. 487-505. 0107
BibRef

Martino, J.A.[Jacquelyn Annette], Dimitrova, N.[Nevenka], Elenbaas, J.H.[Jan Hermanus], Rutgers, J.[Job],
Histogram method for characterizing video content,
US_Patent6,473,095, Oct 29, 2002
WWW Version. BibRef 0210

Wei, G.[Gang], Agnihotri, L.[Lalitha], Dimitrova, N.[Nevenka],
TV Program Classification Based on Face and Text Processing,
ICME00(III: 1345-1348). 0007
BibRef

Agnihotri, L., Dimitrova, N.,
Text Detection for Video Analysis,
CBAIVL99(xx-yy). BibRef 9900

Wang, J.[Jian], Zhou, Y.H.[Yuan-Hua],
An Unsupervised Approach for Video Text Localization,
IEICE(E89-D), No. 4, April 2006, pp. 1582-1585.
WWW Version. 0604
BibRef

Wang, F.[Feng], Ngo, C.W.[Chong-Wah], Pong, T.C.[Ting-Chuen],
Structuring low-quality videotaped lectures for cross-reference browsing by video text analysis,
PR(41), No. 10, October 2008, pp. 3257-3269.
WWW Version. 0808
Topic detection; Video text analysis; Super-resolution reconstruction; Synchronization of lecture videos and electronic slides BibRef

Wang, Y.[Yong], Mei, T.[Tao], Gong, S.G.[Shao-Gang], Hua, X.S.[Xian-Sheng],
Combining global, regional and contextual features for automatic image annotation,
PR(42), No. 2, February 2009, pp. 259-266.
WWW Version. 0810
Global and regional features; Textual context; Cross media relevance model; Latent semantic analysis; Image annotation BibRef

Wang, Y.[Yong], Gong, S.G.[Shao-Gang],
Refining image annotation using contextual relations between words,
CIVR07(425-432).
WWW Version. 0707
BibRef

Mei, T.[Tao], Wang, Y.[Yong], Hua, X.S.[Xian-Sheng], Gong, S.G.[Shao-Gang], Li, S.P.[Shi-Peng],
Coherent image annotation by learning semantic distance,
CVPR08(1-8).
IEEE DOI Link 0806
BibRef

Jiang, Y.G.[Yu-Gang], Ngo, C.W.[Chong-Wah],
Visual word proximity and linguistics for semantic video indexing and near-duplicate retrieval,
CVIU(113), No. 3, March 2009, pp. 405-414.
Elsevier DOI Link
WWW Version. 0902
Visual ontology; Linguistic similarity; Soft-weighting; CEMD matching; Near-duplicate keyframe; Semantic concept BibRef

Wei, X.Y.[Xiao-Yong], Ngo, C.W.[Chong-Wah], Jiang, Y.G.[Yu-Gang],
Selection of Concept Detectors for Video Search by Ontology-Enriched Semantic Spaces,
MultMed(10), No. 6, October 2008, pp. 1085-1096.
IEEE DOI Link 0905
BibRef

Wei, X.Y.[Xiao-Yong], Jiang, Y.G.[Yu-Gang], Ngo, C.W.[Chong-Wah],
Concept-Driven Multi-Modality Fusion for Video Search,
CirSysVideo(21), No. 1, January 2011, pp. 62-73.
IEEE DOI Link 1103
BibRef

Jiang, Y.G.[Yu-Gang], Ngo, C.W.[Chong-Wah], Yang, J.[Jun],
Towards optimal bag-of-features for object categorization and semantic video retrieval,
CIVR07(494-501).
WWW Version. 0707
BibRef

Tsai, T.H.[Tsung-Han], Chen, Y.C.[Yung-Chien], Fang, C.L.[Chih-Lun],
2DVTE: A two-directional videotext extractor for rapid and elaborate design,
PR(42), No. 7, July 2009, pp. 1496-1510.
Elsevier DOI Link
WWW Version. 0903
Text detection; Localization; Extraction; Edge detection; Seed-filling BibRef

Zhao, X., Lin, K.H., Fu, Y., Hu, Y., Liu, Y., Huang, T.S.,
Text From Corners: A Novel Approach to Detect Text and Caption in Videos,
IP(20), No. 3, March 2011, pp. 790-799.
IEEE DOI Link 1103
BibRef


Vilaplana, V.[Veronica], Marques, F.[Ferran], Leon, M.[Miriam], Gasull, A.[Antoni],
Object detection and segmentation on a hierarchical region-based image representation,
ICIP10(3933-3936).
IEEE DOI Link 1009
BibRef
Earlier: A3, A1, A4, A2:
Caption text extraction for indexing purposes using a hierarchical region-based image model,
ICIP09(1869-1872).
IEEE DOI Link 0911
BibRef

Zhang, D.Q.[Dong-Qing], Bhagavathy, S.[Sitaram], Llach, J.[Joan],
Temporally consistent caption detection in videos using a spatiotemporal 3D method,
ICIP09(1881-1884).
IEEE DOI Link 0911
BibRef

Gupta, S.[Sonal], Mooney, R.J.[Raymond J.],
Using closed captions to train activity recognizers that improve video retrieval,
VCL-ViSU09(30-37).
IEEE DOI Link 0906
BibRef

Haubold, A.[Alexander], Natsev, A.P.[Apostol Paul],
Web-based information content and its application to concept-based video retrieval,
CIVR08(437-446). 0807
BibRef

Zhang, J.[Jing], Goldgof, D.[Dmitry], Kasturi, R.[Rangachar],
A new edge-based text verification approach for video,
ICPR08(1-4).
IEEE DOI Link 0812
BibRef

Bai, H.L.[Hong-Liang], Sun, J.[Jun], Naoi, S.[Satoshi], Katsuyama, Y.[Yutaka], Hotta, Y.[Yoshinobu], Fujimoto, K.[Katsuhito],
Video caption duration extraction,
ICPR08(1-4).
IEEE DOI Link 0812
BibRef

Chen, J.D.[Jin-Dong], Saund, E.[Eric], Wang, Y.Z.[Yi-Zhou],
Image objects and multi-scale features for annotation detection,
ICPR08(1-5).
IEEE DOI Link 0812
BibRef

Shivakumara, P.[Palaiahnakote], Huang, W.H.[Wei-Hua], Tan, C.L.[Chew Lim],
Efficient video text detection using edge features,
ICPR08(1-4).
IEEE DOI Link 0812
BibRef

Huang, W.H.[Wei-Hua], Shivakumara, P.[Palaiahnakote], Tan, C.L.[Chew Lim],
Detecting moving text in video using temporal information,
ICPR08(1-4).
IEEE DOI Link 0812
BibRef

Kim, D.[Daehyun], Sohn, K.H.[Kwang-Hoon],
Static text region detection in video sequences using color and orientation consistencies,
ICPR08(1-4).
IEEE DOI Link 0812
BibRef

Jung, C.K.[Cheol-Kon], Lee, S.Y.[Su Young], Kim, J.[Joongkyu],
Robust detection of key captions for sports video understanding,
ICIP08(2520-2523).
IEEE DOI Link 0810
BibRef

Aytar, Y.[Yusuf], Shah, M.[Mubarak], Luo, J.B.[Jie-Bo],
Utilizing semantic word similarity measures for video retrieval,
CVPR08(1-8).
IEEE DOI Link 0806
BibRef

Mensink, T.[Thomas], Verbeek, J.[Jakob],
Improving People Search Using Query Expansions: How Friends Help to Find People,
ECCV08(II: 86-99).
Springer DOI Link 0810
On the web or captioned news images. BibRef

Mensink, T.[Thomas], Verbeek, J.[Jakob], Csurka, G.[Gabriela],
Learning structured prediction models for interactive image labeling,
CVPR11(833-840).
IEEE DOI Link 1106
BibRef
Earlier:
Trans Media Relevance Feedback for Image Autoannotation,
BMVC10(xx-yy).
HTML Version. 1009
BibRef

Kuettel, D.[Daniel], Guillaumin, M.[Matthieu], Ferrari, V.[Vittorio],
Combining Image-Level and Segment-Level Models for Automatic Annotation,
MMMod12(16-28).
Springer DOI Link 1201
BibRef

Guillaumin, M.[Matthieu], Mensink, T.[Thomas], Verbeek, J.[Jakob], Schmid, C.[Cordelia],
TagProp: Discriminative metric learning in nearest neighbor models for image auto-annotation,
ICCV09(309-316).
IEEE DOI Link 0909
BibRef
Earlier:
Automatic face naming with caption-based supervision,
CVPR08(1-8).
IEEE DOI Link 0806
BibRef

Mathe, S.[Stefan], Fazly, A.[Afsaneh], Dickinson, S.[Sven], Stevenson, S.[Suzanne],
Learning the abstract motion semantics of verbs from captioned videos,
SLAM08(1-8).
IEEE DOI Link 0806
BibRef

Stone, Z.[Zak], Zickler, T.E.[Todd E.], Darrell, T.J.[Trevor J.],
Autotagging Facebook: Social network context improves photo annotation,
InterNet08(1-8).
IEEE DOI Link 0806
BibRef

Aradhye, H., Myers, G.,
Exploiting Videotext Events for Improved Videotext Detection,
ICDAR07(894-898).
IEEE DOI Link 0709
BibRef

Wachenfeld, S.[Steffen], Fleischer, S.[Stefan], Jiang, X.Y.[Xiao-Yi],
A Multiple Classifier Approach for the Recognition of Screen-Rendered Text,
CAIP07(921-928).
Springer DOI Link 0708
BibRef

Wachenfeld, S.[Steffen], Klein, H.U.[Hans-Ulrich], Jiang, X.Y.[Xiao-Yi],
Recognition of Screen-Rendered Text,
ICPR06(II: 1086-1089).
IEEE DOI Link 0609
BibRef

Wang, Y.W.[Yao-Wei], Su, L.M.[Li-Min], Ye, Q.X.[Qi-Xiang],
A Robust Caption Detecting Algorithm on MPEG Compressed Video,
MCAM07(195-202).
Springer DOI Link 0706
BibRef

Wang, Y.K.[Yuan-Kai], Chen, J.M.[Jian-Ming],
Detecting Video Texts Using Spatial-Temporal Wavelet Transform,
ICPR06(IV: 754-757).
IEEE DOI Link 0609
BibRef

Ravulapalli, S.I.[Sun-Il], Sarkar, S.[Sudeep],
Association of Sound to Motion in Video using Perceptual Organization,
ICPR06(I: 1216-1219).
IEEE DOI Link 0609
BibRef

Velivelli, A.[Atulya], Huang, T.S.[Thomas S.],
Automatic Video Annotation by Mining Speech Transcripts,
SLAM06(115).
IEEE DOI Link 0609
BibRef

Su, Y.M.[Yih-Ming], Hsieh, C.H.[Chaur-Heh],
A Novel Caption Extraction Scheme for Various Sports Captions,
ICPR06(II: 1054-1057).
IEEE DOI Link 0609
BibRef

Jamieson, M.[Michael], Dickinson, S.[Sven], Stevenson, S.[Suzanne], Wachsmuth, S.[Sven],
Using Language to Drive the Perceptual Grouping of Local Image Features,
CVPR06(II: 2102-2109).
IEEE DOI Link 0606
Learning using features and captions. BibRef

Misra, C.[Chinmaya], Sural, S.[Shamik],
Content Based Image and Video Retrieval Using Embedded Text,
ACCV06(II: 111-120).
Springer DOI Link 0601
BibRef

Natarajan, P., Elmieh, B., Schwartz, R., Makhoul, J.,
Videotext OCR using hidden Markov models,
ICDAR01(947-951).
IEEE DOI Link 0109
BibRef

Lefevre, S., Vincent, N.,
Caption localisation in video sequences by fusion of multiple detectors,
ICDAR05(I: 106-110).
IEEE DOI Link 0508
BibRef

Miyamori, H., Nakamura, S., Tanaka, K.,
Automatic Indexing of Broadcast Content Using its Live Chat on the Web,
ICIP05(III: 1248-1251).
IEEE DOI Link 0512
BibRef

Kidron, E.[Einat], Schechner, Y.Y.[Yoav Y.], Elad, M.[Michael],
Pixels that Sound,
CVPR05(I: 88-95).
IEEE DOI Link 0507
Combine images with the sounds. Not just talking faces. BibRef

Xie, L.X.[Le-Xing], Kennedy, L.S., Chang, S.F.[Shih-Fu], Divakaran, A., Sun, H., Lin, C.Y.,
Discovering meaningful multimedia patterns with audio-visual concepts and associated text,
ICIP04(IV: 2383-2386).
IEEE DOI Link 0505
BibRef

Kutics, A., Nakagawa, A., Arai, S., Tanaka, H., Ohtsuka, S.,
Relating words and image segments on multiple layers for effective browsing and retrieval,
ICIP04(IV: 2203-2206).
IEEE DOI Link 0505
BibRef

Nakagawa, A., Kutics, A., Tanaka, K., Nakajima, M.,
Combining words and object-based visual features in image retrieval,
CIAP03(354-359).
IEEE Abstract. 0310
BibRef

Kutics, A., Nakagawa, A., Nakajima, M.,
Image retrieval via connecting words to salient objects,
ICIP03(III: 17-20).
IEEE Abstract. 0312
BibRef

Declerck, T.[Thierry], Kuper, J.[Jan], Saggion, H.[Horacio], Samiotou, A.[Anna], Wittenburg, P.[Peter], Contreras, J.[Jesus],
Contribution of NLP to the Content Indexing of Multimedia Documents,
CIVR04(610-618).
WWW Version. 0505
BibRef

Wang, R.R.[Rong-Rong], Jin, W.[Wanjun], Wu, L.D.[Li-De],
A novel video caption detection approach using multi-frame integration,
ICPR04(I: 449-452).
IEEE DOI Link 0409
BibRef

Nakamura, A., Yamamoto, K.,
Caption text recognition in video frames by MAP matching,
ICDAR03(650-655).
IEEE Abstract. 0311
BibRef

Luo, B.[Bo], Tang, X.[Xiaoou], Liu, J.Z.[Jian-Zhuang], Zhang, H.J.[Hong-Jiang],
Video caption detection and extraction using temporal information,
ICIP03(I: 297-300).
IEEE Abstract. 0312
BibRef

Hauptmann, A.G., Jin, R., Ng, T.D.,
Multi-modal information retrieval from broadcast video using OCR and speech recognition,
JCDL02(160-161). BibRef 0200

Cai, M.[Min], Song, J.[Jiqiang], Lyu, M.R.,
A new approach for video text detection,
ICIP02(I: 117-120).
IEEE Abstract. 0210
BibRef

Aradhye, H., Dorai, C.,
Augmented Edit Distance Based Temporal Contiguity Analysis for Improved Videotext Recognition,
MMSP01(xx-yy). BibRef 0100

Dorai, C., Aradhye, H., Shim, J.C.,
End-to-End Videotext Recognition for Multimedia Content Analysis,
ICME01(xx-yy).
PDF Version. BibRef 0100

Aradhye, H., Dorai, C., Shim, J.C.,
Study of Embedded Font Context and Kernel Space Methods for Improved Videotext Recognition,
ICIP01(II: 825-828).
IEEE Abstract. 0108
BibRef

Shim, J.C.[Jae-Chang], Dorai, C.[Chitra], Bolle, R.M.[Ruud M.],
Automatic Text Extraction from Video for Content-Based Annotation and Retrieval,
ICPR98(Vol I: 618-620).
IEEE DOI Link 9808
BibRef

Zhang, D.Q.[Dong-Qing], Rajendran, R.K., Chang, S.F.[Shih-Fu],
General and domain-specific techniques for detecting and recognizing superimposed text in video,
ICIP02(I: 593-596).
IEEE Abstract. 0210
BibRef

Zhang, D.Q.[Dong-Qing], Chang, S.F.[Shih-Fu],
A bayesian framework for fusing multiple word knowledge models in videotext recognition,
CVPR03(II: 528-533).
IEEE Abstract. 0307
BibRef

Sung, S.H.[Si-Hun], Chun, W.S.[Woo-Sung],
Knowledge-based numeric open caption recognition for live sportscast,
ICPR02(II: 822-825).
IEEE DOI Link 0211
BibRef

Hua, X.S.[Xian-Sheng], Yin, P.[Pei], Zhang, H.J.[Hong-Jiang],
Efficient video text recognition using multiple frame integration,
ICIP02(II: 397-400).
IEEE Abstract. 0210
BibRef

Mita, T., Hori, O.,
Improvement of video text recognition by character selection,
ICDAR01(1089-1093).
IEEE DOI Link 0109
BibRef

Miene, A., Hermes, T., Ioannidis, G.,
Extracting textual inserts from digital videos,
ICDAR01(1079-1083).
IEEE DOI Link 0109
BibRef

Lim, Y.K., Choi, S.H., Lee, S.W.,
Text Extraction in MPEG Compressed Video for Content-based Indexing,
ICPR00(Vol IV: 409-412).
IEEE DOI Link 0009
BibRef

Lazarescu, M.[Mihai], Venkatesh, S.[Svetha], Caelli, T.M.[Terry M.], West, G.A.W.[Geoff A.W.],
Combining NL Processing and Video Data to Query American Football,
ICPR98(Vol II: 1238-1240).
IEEE DOI Link 9808
BibRef

Cheung, C.H., Po, L.M.,
Text-Driven Automatic Frame Generation Using MPEG-4 Synthetic/Natural Hybrid Coding for 2-D Head-and-Shoulder Scene,
ICIP97(II: 69-72).
IEEE DOI Link BibRef 9700

Srihari, R.K.,
Combining text and image information in content-based retrieval,
ICIP95(I: 326-329).
IEEE DOI Link 9510
BibRef

Chapter on Implementations and Applications, Databases, QBIC, Video Analysis, Hardware and Software, Inspection continues in News Video Analysis, Cut Detection, Summaries, Indexing.


Last update: Feb 8, 2012 at 11:25:05