23.2.2.2.2 Find Text in Documents

Document analysis. These entries generally deal with documents designed to carry text.

Fuller, P.[Paul],
Character reader,
US_Patent4,292,621, Sep 29, 1981
WWW Version. BibRef 8109

Amano, T.[Tomio],
Method for detecting character strings,
US_Patent5,033,104, Jul 16, 1991
WWW Version. Text in documents. BibRef 9107

Chen, S., Haralick, R.M., Phillips, I.T.,
Extraction of Text Words in Document Images Based on a Statistical Characterization,
JEI(5), No. 1, January 1996, pp. 25-36. BibRef 9601

Chen, F.R., Bloomberg, D.S., Wilcox, L.D.,
Detection and Location of Multicharacter Sequences in Lines of Imaged Text,
JEI(5), No. 1, January 1996, pp. 37-49. BibRef 9601
And:
Spotting Phrases in Lines of Imaged Text,
SPIE(2422), February 1995, pp. 256-269. BibRef

Aas, K.[Kjersti], Eikvil, L.[Line],
Text Page Recognition Using Grey-Level Features and Hidden Markov-Models,
PR(29), No. 6, June 1996, pp. 977-985.
WWW Version. 9606
BibRef

Aas, K.[Kjersti], Eikvil, L.[Line], Andersen, T.[Tove],
Text recognition from grey level images using hidden Markov models,
CAIP95(503-508).
Springer DOI Link 9509
BibRef

Shinghal, R.,
A Hybrid Algorithm for Contextual Text Recognition,
PR(16), No. 2, 1983, pp. 261-267.
WWW Version. 9611
BibRef

Lu, Z.Y.[Zhao-Yang],
Detection of text regions from digital engineering drawings,
PAMI(20), No. 4, April 1998, pp. 431-439.
IEEE DOI Link 0401
BibRef

Tan, C.L., Ng, P.O.,
Text Extraction Using Pyramid,
PR(31), No. 1, January 1998, pp. 63-72.
WWW Version. 9802
BibRef

Hwang, W.L.[Wen L.], Chang, F.[Fu],
Character extraction from documents using wavelet maxima,
IVC(16), No. 5, April 27 1998, pp. 307-315.
WWW Version. 0401
BibRef

Strouthopoulos, C., Papamarkos, N.,
Text Identification for Document Image Analysis Using a Neural Network,
IVC(16), No. 12-13, 24 August 1998, pp. 879-896.
WWW Version. BibRef 9808

Parodi, P.[Pietro], Fontana, R.[Roberto],
Efficient and flexible text extraction from document pages,
IJDAR(2), No. 2/3, 1999, pp. 67-79. 9912
BibRef

Parodi, P., Piccioli, G.,
An Efficient Preprocessing of Mixed-Content Document Images for OCR Systems,
ICPR96(III: 778-782).
IEEE DOI Link 9608
(Univ. di Genova, I) BibRef

Parodi, P.[Pietro], Piccioli, G.[Giulia],
A Fast and Flexible Statistical Method for Text Extraction in Document Pages,
CVPR96(619-624).
IEEE DOI Link BibRef 9600

Liang, J., Phillips, I.T., Haralick, R.M.,
Consistent Partition and Labelling of Text Blocks,
PAA(3), No. 2, 2000, pp. 196-208. 0010
BibRef

Xiao, Y.[Yi], Yan, H.[Hong],
Text region extraction in a document image based on the Delaunay tessellation,
PR(36), No. 3, March 2003, pp. 799-809.
WWW Version. 0301
See also Location of title and author regions in document images based on the Delaunay triangulation. BibRef

Nishida, H.[Hirobumi], Suzuki, T.[Takeshi],
Correcting Show-Through Effects on Scanned Color Document Images by Multiscale Analysis,
PR(36), No. 12, December 2003, pp. 2835-2847.
WWW Version. 0310
BibRef
Earlier:
Correcting Show-Through Effects on Document Images by Multiscale Analysis,
ICPR02(III: 65-68).
IEEE DOI Link 0211
See also Adaptive Inverse Halftoning for Scanned Document Images Through Multiresolution and Multiscale Analysis. BibRef

Kumar, S.I.[Sun-Il], Gupta, R., Khanna, N.[Nitin], Chaudhury, S.[Santanu], Joshi, S.D.[Shiv Dutt],
Text Extraction and Document Image Segmentation Using Matched Wavelets and MRF Model,
IP(16), No. 8, August 2007, pp. 2117-2128.
IEEE DOI Link 0709
BibRef
Earlier: A1, A3, A4, A5, Only:
Locating text in images using matched wavelets,
ICDAR05(II: 595-599).
IEEE DOI Link 0508
BibRef

Mukherjee, D.[Debargha],
Enhancing text-like edges in digital images,
US_Patent7,433,535, Oct 7, 2008
WWW Version. BibRef 0810

Liu, Z.Y.[Zong-Yi], Zhou, H.N.[Han-Ning], Yang, N.[Ning],
Semi-supervised learning for text-line detection,
PRL(31), No. 11, 1 August 2010, pp. 1260-1273.
Elsevier DOI Link 1008
Document segmentation; Semi-supervised learning; Text-line detection; Language adaptiveness BibRef

Zhao, M.[Ming], Li, S.T.[Shu-Tao], Kwok, J.[James],
Text detection in images using sparse representation with discriminative dictionaries,
IVC(28), No. 12, December 2010, pp. 1590-1599.
Elsevier DOI Link 1003
Text detection; Sparse representation; Discriminative dictionary BibRef

Marinai, S.[Simone],
Text retrieval from early printed books,
IJDAR(14), No. 2, June 2011, pp. 117-129.
WWW Version. 1106
BibRef

Peng, X.J.[Xu-Jun], Setlur, S.[Srirangaraj], Govindaraju, V.[Venu], Ramachandrula, S.[Sitaram],
Using a boosted tree classifier for text segmentation in hand-annotated documents,
PRL(33), No. 7, 1 May 2012, pp. 943-950.
Elsevier DOI Link 1203
Classification; Text separation; Document analysis; Decision tree BibRef

Peng, X.J.[Xu-Jun], Setlur, S.[Srirangaraj], Govindaraju, V.[Venu], Sitaram, R.[Ramachandrula],
Handwritten Text Separation from Annotated Machine Printed Documents Using Markov Random Fields,
IJDAR(16), No. 1, March 2013, pp. 1-16.
WWW Version. 1303
BibRef
Earlier:
Text Separation from Mixed Documents Using a Tree-Structured Classifier,
ICPR10(241-244).
IEEE DOI Link 1008
Award, ICPR. See also Preprocessing of Low-Quality Handwritten Documents Using Markov Random Fields. BibRef

Peng, X.J.[Xu-Jun], Setlur, S.[Srirangaraj], Govindaraju, V.[Venu], Sitaram, R.[Ramachandrula], Bhuvanagiri, K.[Kiran],
Markov Random Field Based Text Identification from Annotated Machine Printed Documents,
ICDAR09(431-435).
IEEE DOI Link 0907
BibRef

Pan, Z.T.[Zhao-Tai], Shen, H.F.[Hui-Feng], Lu, Y.[Yan], Li, S.P.[Shi-Peng], Yu, N.H.[Neng-Hai],
A Low-Complexity Screen Compression Scheme for Interactive Screen Sharing,
CirSysVideo(23), No. 6, 2013, pp. 949-960.
IEEE DOI Link 1307
BibRef
Earlier: A1, A2, A3, A5, A4:
A low-complexity screen compression scheme,
VCIP12(1-6).
IEEE DOI Link 1302
H.264 intra coding; multiple block modes. Text vs. images. BibRef


Ha, S.J.[Seong Jong], Jin, B.[Bora], Cho, N.I.[Nam Ik],
Fast text line extraction in document images,
ICIP12(797-800).
IEEE DOI Link 1302
BibRef

Marder, M.[Mattias], Geva, A.[Amir], Ruan, Y.P.[Yao-Ping],
Lightweight searchable screen video recording,
VCIP12(1-6).
IEEE DOI Link 1302
Video monitoring of computer screens. BibRef

Zagoris, K.[Konstantinos], Pratikakis, I.[Ioannis], Antonacopoulos, A.[Apostolos], Gatos, B.[Basilis], Papamarkos, N.[Nikos],
Handwritten and Machine Printed Text Separation in Document Images Using the Bag of Visual Words Paradigm,
FHR12(103-108).
IEEE DOI Link 1302
BibRef

Lin, X.R.[Xiao-Rong], Guo, C.Y.[Chien-Yang], Chang, F.[Fu],
Classifying Textual Components of Bilingual Documents with Decision-Tree Support Vector Machines,
ICDAR11(498-502).
IEEE DOI Link 1111
BibRef

Wang, X.[Xiufei], Huang, L.[Lei], Liu, C.P.[Chang-Ping],
A Novel Method for Embedded Text Segmentation Based on Stroke and Color,
ICDAR11(151-155).
IEEE DOI Link 1111
BibRef

Fan, J.[Jian],
Text Segmentation of Consumer Magazines in PDF Format,
ICDAR11(794-798).
IEEE DOI Link 1111
BibRef

Dinh, T.N.[Toan Nguyen], Park, J.H.[Jong-Hyun], Lee, G.S.[Guee-Sang],
Text localization using image cues and text line information,
ICIP10(2261-2264).
IEEE DOI Link 1009
BibRef

Nirmala, S., Nagabhushan, P.,
Foreground Text Extraction in Color Document Images for Enhanced Readability,
PReMI09(387-392).
Springer DOI Link 0912
BibRef

Wang, X.F.[Xiu-Fei], Huang, L.[Lei], Liu, C.P.[Chang-Ping],
A New Block Partitioned Text Feature for Text Verification,
ICDAR09(366-370).
IEEE DOI Link 0907
Text vs. non-text, text line extraction. BibRef

Strouthopoulos, C.[Charalambos], Nikolaidis, A.[Athanasios],
A robust technique for text extraction in mixed-type binary documents,
ICPR08(1-4).
IEEE DOI Link 0812
BibRef

Ar, I.[Ilktan], Karsligil, M.E.[M. Elif],
Text Area Detection in Digital Documents Images Using Textural Features,
CAIP07(555-562).
Springer DOI Link 0708
BibRef

Nakano, Y., Kashio, K., Yoshida, T.,
HMM-Based Approach for Text Region Detection in Coded Video Bitstreams,
ICIP06(3209-3212).
IEEE DOI Link 0610
BibRef

Lucas, S.M.,
ICDAR 2005 text locating competition results,
ICDAR05(I: 80-84).
IEEE DOI Link 0508
BibRef

Coutinho, D.P.[David Pereira], Figueiredo, M.A.T.[Mário A.T.],
Information Theoretic Text Classification Using the Ziv-Merhav Method,
IbPRIA05(II:355).
Springer DOI Link 0509
BibRef

Zhang, X.[Xian], Zhu, X.Y.[Xiao-Yan],
A New Type of Feature: Loose N-Gram Feature in Text Categorization,
IbPRIA07(I: 378-385).
Springer DOI Link 0706
BibRef
Earlier:
Extended Bi-gram Features in Text Categorization,
IbPRIA05(II:379).
Springer DOI Link 0509
BibRef

Gllavata, J.[Julinda], Ewerth, R.[Ralph], Stefi, T.[Teuta], Freisleben, B.[Bernd],
Unsupervised Text Segmentation Using Color and Wavelet Features,
CIVR04(216-224).
WWW Version. 0505
BibRef

Gllavata, J., Ewerth, R., Freisleben, B.,
Text detection in images based on unsupervised classification of high-frequency wavelet coefficients,
ICPR04(I: 425-428).
IEEE DOI Link 0409
BibRef

Sabari Raju, S., Pati, P.B.[Peeta Basa], Ramakrishnan, A.G.,
Gabor filter based block energy analysis for text extraction from digital document images,
DIAL04(233-243).
IEEE DOI Link 0404
BibRef

Bai, Z.L.[Zhen-Long], Huo, Q.A.[Qi-Ang],
A goal-oriented verification-based approach for target text line extraction from a document image captured by a pen scanner,
ICPR04(II: 574-577).
IEEE DOI Link 0409
BibRef

Pinto, J.R.C.[João R. Caldas], Pina, P.[Pedro], Bandeira, L.[Lourenço], Pimentel, L.[Luís], Ramalho, M.[Mário],
Underline Removal on Old Documents,
ICIAR04(II: 226-233).
WWW Version. 0409
BibRef

Bai, Z.L.[Zhen-Long], Huo, Q.A.[Qi-Ang],
Underline detection and removal in a document image using multiple strategies,
ICPR04(II: 578-581).
IEEE DOI Link 0409
BibRef
Earlier:
An approach to extracting the target text line from a document image captured by a pen scanner,
ICDAR03(76-80).
IEEE Abstract. 0311
BibRef

Lu, Y.[Yue], Tan, C.L.,
Constructing area Voronoi diagram in document images,
ICDAR05(I: 342-346).
IEEE DOI Link 0508
BibRef

Lu, Y.[Yue], Wang, Z.[Zhe], Tan, C.L.[Chew Lim],
Word Grouping in Document Images Based on Voronoi Tessellation,
DAS04(147-157).
WWW Version. 0505
BibRef

Hu, Y.[Yi], Nagao, T.,
Matching of characters in scene images by using local shape feature vectors,
CIAP03(207-212).
IEEE Abstract. 0310
BibRef

Kim, E.Y.[Eun Yi], Chang, J.S.[Jae Sik], Kim, H.J.[Hang Joon],
Automatic text location using cluster-based template matching,
ICPR02(III: 423-426).
IEEE DOI Link 0211
BibRef

Bres, S.[Stéphane], Eglin, V.[Véronique], Gagneux, A.,
Unsupervised clustering of text entities in heterogeneous grey level documents,
ICPR02(III: 224-227).
IEEE DOI Link 0211
BibRef

Sin, B.K.[Bong-Kee], Kim, S.K.[Seon-Kyu], Cho, B.J.[Beom-Joon],
Locating characters in scene images using frequency features,
ICPR02(III: 489-492).
IEEE DOI Link 0211
BibRef

Kim, S.K.[Seon-Kyu], Sin, B.K.[Bong-Kee], Lee, S.W.[Seong-Whan],
Character spotting using image-based stochastic models,
ICDAR01(60-63).
IEEE DOI Link 0109
BibRef

Okun, O., Yan, Y.[Yu], Pietikainen, M.,
Robust text detection from binarized document images,
ICPR02(III: 61-64).
IEEE DOI Link 0211
BibRef

Pietikäinen, M.[Matti], Okun, O.[Oleg],
Edge-based method for text detection from complex document images,
ICDAR01(286-291).
IEEE DOI Link 0109
BibRef
And:
Text Extraction from Grey Scale Page Images by Simple Edge Detectors,
SCIA01(P-W3B). 0206
BibRef

Chen, X.R.[Xiang-Rong], Zhang, H.J.[Hong-Jiang],
Photo time-stamp detection and recognition,
ICDAR03(319-322).
IEEE Abstract. 0311
BibRef

Yuan, Q., Tan, C.L.,
Text extraction from gray scale document images using edge information,
ICDAR01(302-306).
IEEE DOI Link 0109
BibRef

Rennie, J.D.M.[Jason D. M.], Rifkin, R.[Ryan],
Improving Multiclass Text Classification with the Support Vector Machine,
MIT AIM-2001-026, October 2001.
WWW Version. 0205
BibRef

Rennie, J.D.M.[Jason D.M.],
Improving Multi-class Text Classification with Naive Bayes,
MIT AI-TR-2001-004, September 2001.
WWW Version. 0205
BibRef

Zagoris, K., Papamarkos, N., Chamzas, C.,
Web Document Image Retrieval System Based on Word Spotting,
ICIP06(477-480).
IEEE DOI Link 0610
BibRef

Strouthopoulos, C., Papamarkos, N., Atsalakis, A., Chamzas, C.,
Locating Text in Color Documents,
ICIP01(I: 1066-1069).
IEEE DOI Link 0108
BibRef

Dimov, D.,
Using an Exact Performance of Hough Transform for Image Text Segmentation,
ICIP01(I: 778-781).
IEEE DOI Link 0108
BibRef

Lin, L.[Lin], Tan, C.L.[Chew Lim],
Text extraction from name cards with complex design,
ICDAR05(II: 977-980).
IEEE DOI Link 0508
BibRef

Lee, H.J.[Hsi-Jian], Lee, S.H.[Shan-Hung],
Design of a Chinese name card understanding system,
ICDAR05(II: 981-985).
IEEE DOI Link 0508
BibRef

Park, T., Kim, D., Chung, K.,
Orientation and Scale Invariant Text Region Extraction in WWW Images,
MVA98(xx-yy). BibRef 9800

Li, Y., Jain, A.K.[Anil K.],
Classification of Text Documents,
ICPR98(II: 1295-1297).
IEEE DOI Link 9808
BibRef

Li, J.[Jia], Gray, R.M.,
Text and picture segmentation by the distribution analysis of wavelet coefficients,
ICIP98(III: 790-794).
IEEE DOI Link 9810
BibRef

Zhou, J., Lopresti, D.P.,
Extracting Text from WWW Images,
ICDAR97(248-252).
IEEE DOI Link 9708
BibRef

Gao, J.B.[Jing-Bo], Li, X.Y.[Xin-You], Tang, Z.S.[Ze-Sheng],
Segmentation of stick text based on sub connected area analysis,
ICDAR97(417-421).
IEEE DOI Link 9708
BibRef

Cavnar, W., Trenkle, J.,
N-Gram-Based Text Categorization,
SDAIR94(161-169). BibRef 9400

Le Bourgeois, F., Bublinski, Z., Emptoz, H.,
A fast and efficient method for extracting text paragraphs and graphics from unconstrained documents,
ICPR92(II:272-276).
IEEE DOI Link 9208
BibRef

Filipski, A.,
Recognition of hand-lettered characters in the GTX 5000 drawing processor,
CVPR89(686-691).
IEEE DOI Link 0403
BibRef

Chapter on OCR, Document Analysis and Character Recognition Systems continues in
Text Detection, Find Text in General Scenes, Documents, Scene Text.


Last update: Nov 18, 2014 at 16:40:01