19.4.5.4.3 Audio-Video Analysis for Indexing and Classification

Audio Video. Image Database. Video Indexing. See also Video Analysis -- Captions, Text.

Saraceno, C.[Caterina], Leonardi, R.[Riccardo],
Indexing audiovisual databases through joint audio and video processing,
IJIST(9), No. 5, 1999, pp. 320-331. BibRef 9900
Earlier:
Identification of Successive Correlated Camera Shots Using Audio and Video Information,
ICIP97(III: 166-169).
IEEE DOI Link BibRef
And:
Audio-visual processing for scene change detection,
CIAP97(II: 124-131).
WWW Version. 9709
BibRef

Li, D.G.[Dong-Ge], Sethi, I.K.[Ishwar K.], Dimitrova, N.[Nevenka], McGee, T.[Tom],
Classification of general audio data for content-based retrieval,
PRL(22), No. 5, April 2001, pp. 533-544.
HTML Version. 0105
BibRef

Tsekeridou, S.[Sofia], Pitas, I.[Ioannis],
Content-based video parsing and indexing based on audio-visual interaction,
CirSysVideo(11), No. 4, April 2001, pp. 522-535.
IEEE Top Reference. 0104
BibRef
Earlier:
Speaker dependent video indexing based on audio-visual interaction,
ICIP98(I: 358-362).
IEEE DOI Link 9810
BibRef

Tsekeridou, S.[Sofia], Krinidis, S.[Stelios], Pitas, I.[Ioannis],
Scene Change Detection Based on Audio-Visual Analysis and Interaction,
WTRCV01(214). 0103
BibRef

Kyperountas, M., Kotropoulos, C., Pitas, I.[Ioannis],
Enhanced Eigen-Audioframes for Audiovisual Scene Change Detection,
MultMed(9), No. 4, 2007, pp. 785-797.
IEEE DOI Link 0905
BibRef

Gauvain, J.L.[Jean-Luc], Lamel, L.[Lori], Adda, G.[Gilles],
Audio Partitioning and Transcription for Broadcast Data Indexation,
MultToolApp(14), No. 2, June 2001, pp. 187-200. 0106
BibRef

Amir, A.[Arnon], Srinivasan, S.[Savitha], Efrat, A.[Alon],
Search the Audio, Browse the Video: A Generic Paradigm for Video Collections,
JASP(2003), No. 2, February 2003, pp. 209.
HTML Version. 0304
BibRef

Beal, M.J.[Matthew J.], Jojic, N.[Nebojsa], Attias, H.[Hagai],
A graphical model for audiovisual object tracking,
PAMI(25), No. 7, July 2003, pp. 828-836.
IEEE Abstract. IEEE Top Reference. 0307
BibRef
Earlier: A1, A3, A2:
Audio-Video Sensor Fusion with Probabilistic Graphical Models,
ECCV02(I: 736 ff.).
HTML Version. 0205
Uses 2 microphones and a camera to track a moving object in clutter and noise. BibRef

Li, Y.[Ying], Narayanan, S.S.[Shrikanth S.], Kuo, C.C.J.[C.C. Jay],
Adaptive Speaker Identification with Audio-Visual Cues for Movie Content Analysis,
PRL(25), No. 7, May 2004, pp. 777-791.
WWW Version. 0405
BibRef

Li, Y.[Ying], Narayanan, S.S.[Shrikanth S.], Kuo, C.C.J.[C.C. Jay],
Content-Based Movie Analysis and Indexing Based on Audio-Visual Cues,
CirSysVideo(14), No. 8, August 2004, pp. 1073-1085.
IEEE Abstract. IEEE Top Reference. 0409
BibRef
Earlier:
Movie Content Analysis, Indexing and Skimming Via Multimodal Information,
VideoMining03(Chapter 5). BibRef

Li, Y.[Ying], Kuo, C.C.J.[C.C. Jay],
A robust video scene extraction approach to movie content abstraction,
IJIST(13), No. 5, 2003, pp. 236-244.
WWW Version. 0312
BibRef

Wu, P.[Peng], Li, Y.[Ying], Tretter, D.[Daniel],
Scalable video summarization,
US_Patent7,047,494, May 16, 2006
WWW Version. BibRef 0605

Gong, Y.H.[Yi-Hong],
Summarizing Audiovisual Contents of a Video Program,
JASP(2003), No. 2, February 2003, pp. 160.
HTML Version. 0304
BibRef

Gong, Y.H.[Yi-Hong], Liu, X.[Xin],
Method and system for segmentation, classification, and summarization of video images,
US_Patent7,016,540, Mar 21, 2006
WWW Version. BibRef 0603
And: US_Patent7,151,852, Dec 19, 2006
WWW Version. BibRef
And:
Video Shot Segmentation and Classification,
ICPR00(Vol I: 860-863).
IEEE DOI Link
HTML Version. 0009
BibRef

Wang, H.L.[Hua-Lu], Divakaran, A.[Ajay], Vetro, A.[Anthony], Chang, S.F.[Shih-Fu], Sun, H.F.[Hui-Fang],
Survey of compressed-domain features used in audio-visual indexing and analysis,
JVCIR(14), No. 2, June 2003, pp. 150-183.
WWW Version. 0306
Survey, Image Retrieval. BibRef

Naphade, M.R.[Milind R.],
On supervision and statistical learning for semantic multimedia analysis,
JVCIR(15), No. 3, September 2004, pp. 348-369.
WWW Version. 0711
Factor graphs; Sum product algorithm; Active learning; Hidden Markov models; Dynamic Bayesian networks; Support vector machines BibRef

Naphade, M.R., Kozintsev, I.V., Huang, T.S.,
A factor graph framework for semantic video indexing,
CirSysVideo(12), No. 1, January 2002, pp. 40-52.
IEEE Top Reference. 0202
BibRef

Naphade, M.R., Kozintsev, I.V., Huang, T.S., Ramchandran, K.,
A factor graph framework for semantic indexing and retrieval in video,
CBAIVL00(35-39). 0008
BibRef

Naphade, M.R.[Milind R.], Huang, T.S.[Thomas S.],
Detecting Semantic Concepts Using Context and Audio/Visual Features,
EventVideo01(92-98).
IEEE DOI Link 0106
BibRef
Earlier:
Recognizing High-level Audio-visual Concepts Using Context,
ICIP01(III: 46-49).
IEEE Abstract. IEEE Top Reference. 0108
BibRef
Earlier:
Semantic Video Indexing Using a Probabilistic Framework,
ICPR00(Vol III: 79-84).
IEEE DOI Link
HTML Version. 0009
BibRef
And:
A Probabilistic Framework for Semantic Indexing and Retrieval in Video,
ICME00(MP9). 0007
BibRef
And:
Inferring Semantic Concepts for Video Indexing and Retrieval,
ICIP00(Vol III: 766-769).
IEEE Abstract. IEEE Top Reference. 0008
BibRef

Naphade, M.R., Kristjansson, T., Frey, B.J., Huang, T.S.,
Probabilistic multimedia objects (multijects): a novel approach to video indexing and retrieval in multimedia systems,
ICIP98(III: 536-540).
IEEE DOI Link 9810
BibRef

Gillet, O., Essid, S., Richard, G.,
On the Correlation of Automatic Audio and Visual Segmentations of Music Videos,
CirSysVideo(17), No. 3, March 2007, pp. 347-355.
IEEE DOI Link 0703
BibRef

Xie, X., Lu, L., Jia, M., Li, H., Seide, F., Ma, W.Y.,
Mobile Search With Multimodal Queries,
PIEEE(96), No. 4, April 2008, pp. 589-601.
IEEE DOI Link 0804
Text, image, audio queries. BibRef

Matos, N.[Nuno], Pereira, F.[Fernando],
Automatic creation and evaluation of MPEG-7 compliant summary descriptions for generic audiovisual content,
SP:IC(23), No. 8, September 2008, pp. 581-598.
WWW Version. 0804
Automatic audiovisual summarization; Generic content; MPEG-7 summary description; Arousal modeling; Motion intensity; Shot cut density; Sound energy BibRef

Kiranyaz, S., Gabbouj, M.,
Generic content-based audio indexing and retrieval framework,
VISP(153), No. 3, June 2006, pp. 285-297.
WWW Version. 0608
See also Novel multimedia retrieval technique: progressive query (why wait?). BibRef

Monaci, G., Jost, P., Vandergheynst, P., Mailhe, B., Lesage, S., Gribonval, R.,
Learning Multimodal Dictionaries,
IP(16), No. 9, September 2007, pp. 2272-2283.
IEEE DOI Link 0709
Integrating audio-visual info. BibRef

Covell, M., Baluja, S., Fink, M.,
Detecting Ads in Video Streams Using Acoustic and Visual Cues,
Computer(39), No. 12, December 2006, pp. 135-137.
IEEE DOI Link 0612
BibRef

Zhang, T.[Tong],
Using background audio change detection for segmenting video,
US_Patent7,266,287, Sep 4, 2007
WWW Version. BibRef 0709

Kotti, M., Ververidis, D., Evangelopoulos, G., Panagakis, I., Kotropoulos, C., Maragos, P., Pitas, I.,
Audio-Assisted Movie Dialogue Detection,
CirSysVideo(18), No. 11, November 2008, pp. 1618-1627.
IEEE DOI Link 0811
BibRef

Cristani, M.[Marco], Bicego, M.[Manuele], Murino, V.[Vittorio],
Audio-Visual Event Recognition in Surveillance Video Sequences,
MultMed(9), No. 2, February 2007, pp. 257-267.
IEEE DOI Link 0905
BibRef
Earlier:
Audio-Visual Foreground Extraction for Event Characterization,
SLAM06(116).
IEEE DOI Link 0609
BibRef
Earlier:
Audio-Video Integration for Background Modelling,
ECCV04(Vol II: 202-213).
WWW Version. 0405
BibRef

Zeng, Z.H.[Zhi-Hong], Tu, J.L.[Ji-Lin], Liu, M.[Ming], Huang, T.S.[Thomas S.], Pianfetti, B.[Brian], Roth, D.[Dan], Levinson, S.[Stephen],
Audio-Visual Affect Recognition,
MultMed(9), No. 2, February 2007, pp. 424-428.
IEEE DOI Link 0905
BibRef

Zeng, Z.H.[Zhi-Hong], Tu, J.L.[Ji-Lin], Pianfetti, B.M., Huang, T.S.,
Audio-Visual Affective Expression Recognition Through Multistream Fused HMM,
MultMed(10), No. 4, June 2008, pp. 570-577.
IEEE DOI Link 0905
BibRef

Zeng, Z.H.[Zhi-Hong], Tu, J.L.[Ji-Lin], Pianfetti, B.[Brian], Liu, M.[Ming], Zhang, T.[Tong], Zhang, Z.Q.[Zhen-Qiu], Huang, T.S.[Thomas S.], Levinson, S.[Stephen],
Audio-Visual Affect Recognition through Multi-Stream Fused HMM for HCI,
CVPR05(II: 967-972).
IEEE DOI Link 0507
BibRef


Philippeau, J.[Jeremy], Pinquier, J.[Julien], Joly, P.[Philippe], Carrive, J.[Jean],
Dynamic organization of audiovisual database using a user-defined similarity measure based on low-level features,
ICIP08(33-36).
IEEE DOI Link 0810
BibRef

Zeng, Z.[Zhi], Liang, W.[Wei], Li, H.P.[He-Ping], Zhang, S.W.[Shu-Wu],
A Novel Video Classification Method Based on Hybrid Generative/Discriminative Models,
SSPR08(705-713).
Springer DOI Link 0812
Using audio. BibRef

Zhu, Y.Y.[Ying-Ying], Ming, Z.[Zhong], Huang, Q.A.[Qi-Ang],
SVM-Based Audio Classification for Content-Based Multimedia Retrieval,
MCAM07(474-482).
Springer DOI Link 0706
BibRef

Goldmann, L., Samour, A., Karaman, M., Sikora, T.,
Extracting High Level Semantics by Means of Speech, Audio, and Image Primitives in Surveillance Applications,
ICIP06(2397-2400).
IEEE DOI Link 0610
BibRef

Luo, J.[Jie], Caputo, B.[Barbara], Zweig, A.[Alon], Bach, J.H.[Jörg-Hendrik], Anemüller, J.[Jörn],
Object Category Detection Using Audio-Visual Cues,
CVS08(xx-yy).
Springer DOI Link 0805
BibRef

Caputo, B., Wallraven, C., Nilsback, M.E.,
Object categorization via local kernels,
ICPR04(II: 132-135).
IEEE DOI Link 0409
BibRef

Divakaran, A., Peker, K.A., Radhakrishnan, R., Xiong, Z.Y.[Zi-You], Cabasson, R.,
Video Summarization using MPEG-7 Motion Activity and Audio Descriptors,
VideoMining03(Chapter 4). BibRef 0300

Schauer, C., Gross, H.M.,
A Computational Model of Early Auditory-Visual Integration,
DAGM03(362-369).
HTML Version. 0310
BibRef

Fu, T.Y.[Tie-Yan], Liu, X.X.[Xiao Xing], Liang, L.H.[Lu Hong], Pi, X.B.[Xiao-Bo], Nefian, A.V.,
Audio-visual speaker identification using coupled hidden Markov models,
ICIP03(III: 29-32).
IEEE Abstract. IEEE Top Reference. 0312
BibRef

Yemez, Y.[Yücel], Kanak, A., Erzin, E., Tekalp, A.M.,
Multimodal speaker identification with audio-video processing,
ICIP03(III: 5-8).
IEEE Abstract. IEEE Top Reference. 0312
BibRef

Sugano, M., Isaksson, R., Nakajima, Y., Yanagihara, H.,
Shot genre classification using compressed audio-visual features,
ICIP03(II: 17-20).
IEEE Abstract. IEEE Top Reference. 0312
BibRef

Moncrieff, S., Venkatesh, S., Dorai, C.,
Horror film genre typing and scene labeling via audio analysis,
ICME03(I: 193-196). BibRef 0300

Moncrieff, S., Dorai, C., Venkatesh, S.,
Affect computing in film through sound energy dynamics,
ACMMM01(525-527). BibRef 0100

Wachsmuth, S., Sagerer, G.,
Integrated analysis of speech and images as a probabilistic decoding process,
ICPR02(II: 588-592).
IEEE DOI Link 0211
BibRef

Kulesh, V., Petrushin, V.A., Sethi, I.K.,
Video clip recognition using joint audio-visual processing model,
ICPR02(I: 500-503).
IEEE DOI Link 0211
BibRef

Miyamori, H.,
Improving accuracy in behaviour identification for content-based retrieval by using audio and video information,
ICPR02(II: 826-830).
IEEE DOI Link 0211
BibRef

de Santo, M., Percannella, G., Sansone, C., Vento, M.,
Classifying audio of movies by a multi-expert system,
CIAP01(386-391).
IEEE Top Reference. 0210
BibRef

Albiol, A., Torres, L., Delp, E.J.,
Video preprocessing for audiovisual indexing,
Southwest02(57-61).
IEEE Top Reference. 0208
BibRef

Bakker, E.M.[Erwin M.], Lew, M.S.[Michael S.],
Semantic Video Retrieval Using Audio Analysis,
CIVR02(271-277).
HTML Version. 0208
BibRef

Kim, K.[Kyungsu], Choi, J.[Junho], Kim, N.[Namjung], Kim, P.K.[Pan-Koo],
Extracting Semantic Information from Basketball Video Based on Audio-Visual Features,
CIVR02(278-288).
HTML Version. 0208
BibRef

Fisher, J.W.[John W.], Darrell, T.J.[Trevor J.],
Probabilistic Models and Informative Subspaces for Audiovisual Correspondence,
ECCV02(III: 592 ff.).
HTML Version. 0205
BibRef

Chu, S.M.[Stephen M.], Huang, T.S.[Thomas S.],
Audio-Visual Speech Fusion Using Coupled Hidden Markov Models,
MSCSAS07(1-2).
IEEE DOI Link 0706
BibRef

Naphade, M.R.[Milind R.], Garg, A.[Ashutosh], Huang, T.S.[Thomas S.],
Audio-Visual Event Detection using Duration Dependent Input Output Markov Models,
CBAIVL01(30).
IEEE DOI Link 0110
BibRef

Alatan, A.A.,
Automatic Multi-modal Dialogue Scene Indexing,
ICIP01(III: 374-377).
IEEE Abstract. IEEE Top Reference. 0108
BibRef

Smith, M.A.[Michael A.], Kanade, T.[Takeo],
Video Skimming and Characterization through the Combination of Image and Language Understanding Techniques,
CVPR97(775-781).
IEEE Abstract. IEEE Top Reference.
WWW Version. 9704
BibRef
And: DARPA97(357-366). BibRef
And: CMU-CS-TR-97-111, February 1997. Language from audio is used to produce a skim.
Postscript Version. BibRef

Smith, M.A.[Michael A.], Kanade, T.[Takeo],
Video Skimming for Quick Browsing based on Audio and Image Characterization,
CMU-CS-TR-95-186, July 1995.
Postscript Version. BibRef 9507

Sundaram, H.[Hari], Chang, S.F.[Shih-Fu],
Video Scene Segmentation Using Video and Audio Features,
ICME00(TP10). 0007
BibRef

Smith, J.R.[John R.], Li, C.S.[Chung-Sheng],
Adaptive Synthesis in Progressive Retrieval of Audio-Visual Data,
ICME00(MP5). 0007
BibRef

Toklu, C., Liou, S.P.,
Image and Audio Sequence Visualization and Interaction Mechanisms for Structured Video Browsing and Editing,
ICIP00(Vol II: 263-266).
IEEE Abstract. IEEE Top Reference. 0008
BibRef

Jiang, H.[Hao], Lin, T.[Tong], Zhang, H.J.[Hong-Jiang],
Video Segmentation with the Assistance of Audio Content Analysis,
ICME00(WP5). 0007
BibRef

Sugano, M., Nakajima, Y., Yanagihara, H.,
Automated MPEG audio-video summarization and description,
ICIP02(I: 956-959).
IEEE Abstract. IEEE Top Reference. 0210
BibRef

Pandit, M., Kittler, J.V., Li, Y., Chilton, E.,
A Comparative Study of Different Segmentation Approaches for Audio Track Indexing,
ICPR00(Vol II: 467-470).
IEEE DOI Link
HTML Version. 0009
BibRef

Huang, J.C.[Jin-Cheng], Liu, Z.[Zhu], Wang, Y.[Yao],
Integration of audio and visual information for content-based video segmentation,
ICIP98(III: 526-529).
IEEE DOI Link 9810
BibRef

Saraceno, C., Leonardi, R.,
Identification of story units in audio-visual sequences by joint audio and video processing,
ICIP98(I: 363-367).
IEEE DOI Link 9810
BibRef



Last update: Nov 16, 2009 at 19:35:14