Page Segmentation, General Evaluations

Page Segmentation. Document Analysis. Application, Document Layout. Covers the segmentation part more than the analysis of the structure.

Nadler, M.,
Document Segmentation and Coding Techniques,
CVGIP(28), No. 2, November 1984, pp. 240-262.
WWW Version. Survey, Page Segmentation. BibRef 8411

Pavlidis, T.[Theo], Zhou, J.Y.[Jiang-Ying],
Page Segmentation and Classification,
GMIP(54), No. 6, November 1992, pp. 484-496. Survey, Page Segmentation. BibRef 9211

Pavlidis, T.[Theo],
Page Segmentation by White Streams,
ICDAR91(945-953). BibRef 9100

Zlatopolsky, A.A.,
Automated Document Segmentation,
PRL(15), No. 7, July 1994, pp. 699-704. BibRef 9407

Leng, G.W., Mital, D.P., Yong, T.S., Kang, T.K.,
A Differential-Processing Extraction Approach to Text and Image Segmentation,
EngAAI(7), No. 6, December 1994, pp. 639-651. BibRef 9412

Jain, A.K., Zhong, Y.,
Page Segmentation Using Texture Analysis,
PR(29), No. 5, May 1996, pp. 743-770.
WWW Version. 9605
Page segmentation using texture discrimination masks,
ICIP95(III: 308-311).
IEEE DOI Link 9510

Jain, A.K., Bhattacharjee, S.,
Text Segmentation Using Gabor Filters for Automatic Document Processing,
MVA(5), 1992, pp. 169-184. BibRef 9200

Jain, A.K., Bhattacharjee, S.K., Chen, Y.,
On texture in document images,
IEEE Abstract. 0403

Venkateswarlu, N.B., Boyle, R.D.,
New segmentation techniques for document image analysis,
IVC(13), No. 7, September 1995, pp. 573-583.
WWW Version. 0401

Shih, F.Y., Chen, S.S.,
Adaptive Document Block Segmentation and Classification,
SMC-B(26), No. 5, October 1996, pp. 797-802.
IEEE Top Reference. Segments using run-length smoothing, then applies a rule-based classification into text, graphics, and picture. BibRef 9610

Chen, S., Haralick, R.M., Phillips, I.T.,
Extraction of Text Lines and Text Blocks on Document Images Based on Statistical Modeling,
IJIST(7), No. 4, Winter 1996, pp. 343-356. 9612

Patel, D.,
Page Segmentation for Document Image-Analysis Using a Neural-Network,
OptEng(35), No. 7, July 1996, pp. 1854-1861. 9608

Patel, D., Stonham, T.J.,
Texture image classification and segmentation using RANK-order clustering,
IEEE DOI Link 9208

Payne, J.S., Stonham, T.J., Patel, D.,
Document segmentation using texture analysis,
IEEE DOI Link 9410

Etemad, K., Doermann, D., Chellappa, R.,
Multiscale Segmentation of Unstructured Document Pages Using Soft Decision Integration,
PAMI(19), No. 1, January 1997, pp. 92-96.
IEEE DOI Link 9702
Multiscale Document Page Segmentation Using Soft Decision Integration,
UMDTR3444, 1995.
WWW Version. BibRef
Page Segmentation Using Decision Integration and Wavelet Packets,
IEEE DOI Link Classify regions of the page image into text or images. BibRef

Etemad, K.[Kamran],
Multi-Scale Discriminant Analysis and Recognition of Signals and Images,
Ph.D.Thesis, April 1996. BibRef 9604 UMDTR3629. The goal is to find efficient multi-scale representations that yield maximum between-class separations and minimum within-class scatters.
WWW Version. Also for Faces. BibRef

Chen, J.L.,
A Simplified Approach to the HMM Based Texture Analysis and Its Application to Document Segmentation,
PRL(18), No. 10, October 1997, pp. 993-1007. 9802
Markov model texture analysis. BibRef

Kise, K.[Koichi], Sato, A.[Akinori], Iwata, M.[Motoi],
Segmentation of Page Images Using the Area Voronoi Diagram,
CVIU(70), No. 3, June 1998, pp. 370-382.
DOI Link For evaluation: See also Empirical Performance Evaluation Methodology and Its Application to Page Segmentation Algorithms. BibRef 9806

Hobby, J.D.[John D.],
Matching Document Images with Ground Truth,
IJDAR(1), No. 1, Spring 1998, pp. xx-yy. BibRef 9800
Earlier: ICDAR97(Tu-2B) 9708
In program, not in proceedings. BibRef

Cinque, L., Lombardi, L., Manzini, G.,
A Multiresolution Approach for Page Segmentation,
PRL(19), No. 2, February 1998, pp. 217-225. 9808
See also Shape-Description and Recognition by a Multiresolution Approach. BibRef

Cantoni, V., Cinque, L., Lombardi, L., Manzini, G.,
Page Segmentation Using a Pyramidal Architecture,
CAMP97(Session 6). BibRef 9700

Cinque, L., Levialdi, S., Lombardi, L., Tanimoto, S.,
Segmentation of page images having artifacts of photocopying and scanning,
PR(35), No. 5, May 2002, pp. 1167-1177.
WWW Version. 0202

Cinque, L., Forino, L., Levialdi, S., Lombardi, L., Tanimoto, S.,
Understanding the page logical structure,
IEEE DOI Link 9909

Cinque, L., Levialdi, S., Malizia, A., de Rosa, F.,
DAN: An Automatic Segmentation and Classification Engine for Paper Documents,
DAS02(491 ff.).
HTML Version. 0303

Cinque, L., Levialdi, S., Malizia, A.,
A system for the automatic layout segmentation and classification of digital documents,
IEEE Abstract. 0310

Liu, J.M., Tang, Y.Y.,
Distributed Autonomous Agents For Chinese Document Image Segmentation,
PRAI(12), No. 1, February 1998, pp. 97-118. 9806
See also Adaptive Image Segmentation With Distributed Behavior-Based Agents. BibRef

de Queiroz, R.L.,
Processing JPEG Compressed Images and Documents,
IP(7), No. 12, December 1998, pp. 1661-1672.
IEEE DOI Link 9812

de Queiroz, R.L.,
Processing JPEG-Compressed Images,
ICIP97(II: 334-337).
IEEE DOI Link BibRef 9700

de Queiroz, R.L., Eschbach, R.,
Fast Segmentation of the JPEG Compressed Documents,
JEI(7), No. 2, April 1998, pp. 367-377. 9807

de Queiroz, R.L., and Eschbach, R.,
Segmentation of Compressed Documents,
ICIP97(III: 70-73).
IEEE DOI Link BibRef 9700

de Queiroz, R.L.[Ricardo L.],
Compression of Compound Documents,
IEEE DOI Link BibRef 9900

Antonacopoulos, A.[Apostolos],
Page Segmentation Using the Description of the Background,
CVIU(70), No. 3, June 1998, pp. 350-369.
DOI Link BibRef 9806

Jain, A.K., Yu, B.,
Document Representation and Its Application to Page Decomposition,
PAMI(20), No. 3, March 1998, pp. 294-308.
IEEE DOI Link 9805
Generates a structured version of the document for editing, storage, retrieval, and analysis. Performs skew correction, segmentation, and labeling (text, table, image, drawing, and ruler). Some review of approaches. BibRef

Jain, A.K., Yu, B.,
Page segmentation using document model,
IEEE DOI Link 9708

Yang, J.C.Y.[James Ching-Yu], Tsai, W.H.[Wen-Hsiang],
Document image segmentation and quality improvement by moiré pattern analysis,
SP:IC(15), No. 9, July 2000, pp. 781-797.
WWW Version. 0008

Mao, S.[Song], Kanungo, T.[Tapas],
Empirical Performance Evaluation Methodology and Its Application to Page Segmentation Algorithms,
PAMI(23), No. 3, March 2001, pp. 242-256.
IEEE DOI Link 0103
Survey, Page Segmentation. Evaluation, Page Segmentation. Creates separate test and training data and a computable performance metric, finds optimal parameters for the different algorithms, then evaluates them. Compares Voronoi (Kise) (See also Segmentation of Page Images Using the Area Voronoi Diagram.), Docstrum (O'Gorman) (See also Document Spectrum for Page Layout Analysis, The.), and Caere (commercial system) (See also Caere.); these three have about the same performance and are better than ScanSoft (commercial system) (See also ScanSoft.), which is better than the older X-Y cut (See also Prototype Document Image Analysis System for Technical Journals, A.). Similar conclusion in later analysis: See also Performance Evaluation and Benchmarking of Six-Page Segmentation Algorithms. BibRef

Mao, S.[Song], Kanungo, T.[Tapas],
Software Architecture of PSET: A Page Segmentation Evaluation Toolkit,
IJDAR(4), No. 3, 2002, pp. 205-217.
HTML Version. 0205
Earlier: UMD--TR4190, September 2000.
WWW Version. Evaluation, Page Segmentation. BibRef

Mao, S.[Song], Kanungo, T.[Tapas],
A Methodology for Empirical Performance Evaluation of Page Segmentation Algorithms,
UMD--TR4093, December 1999.
WWW Version. BibRef 9912

Mao, S., Kanungo, T.,
Automatic Training of Page Segmentation Algorithms: An Optimization Approach,
ICPR00(Vol IV: 531-534).
IEEE DOI Link 0009

Kanungo, T., Mao, S.[Song],
Stochastic language models for style-directed layout analysis of document images,
IP(12), No. 5, May 2003, pp. 583-596.
IEEE DOI Link 0307

Amin, A.[Adnan], Shiu, R.[Ricky],
Page Segmentation And Classification Utilizing Bottom-up Approach,
IJIG(1), No. 2, April 2001, pp. 345-361. 0104

Deng, S.[Shulan], Latifi, S.[Shahram], Regentova, E.E.[Emma E.],
Document segmentation using polynomial spline wavelets,
PR(34), No. 12, December 2001, pp. 2533-2545.
WWW Version. 0110

Regentova, E.E., Latifi, S., Chen, D., Taghva, K., Yao, D.,
Document analysis by processing JBIG-encoded images,
IJDAR(7), No. 4, September 2005, pp. 260-272.
Springer DOI Link 0512

Diligenti, M.[Michelangelo], Frasconi, P.[Paolo], Gori, M.[Marco],
Hidden Tree Markov Models for Document Image Classification,
PAMI(25), No. 4, April 2003, pp. 520-524.
IEEE Abstract. 0304
Learning. Learn the concept of a set of documents of similar structure. BibRef

Diligenti, M., Gori, M., Maggini, M., Scarselli, F.,
Classification of HTML documents by Hidden Tree-Markov Models,
IEEE DOI Link 0109

Haji, M.M., Katebi, S.D.,
An Efficient Text Segmentation Technique Based on Naive Bayes Classifier,
GVIP(05), No. V7, 2005, pp. 21-30.
HTML Version. BibRef 0500

Wang, Y.L.[Ya-Lin], Phillips, I.T.[Ihsin T.], Haralick, R.M.[Robert M.],
Document zone content classification and its performance evaluation,
PR(39), No. 1, January 2006, pp. 57-73.
WWW Version. 0512
Evaluation, Page Segmentation. BibRef
A Study on the Document Zone Content Classification Problem,
DAS02(212 ff.).
HTML Version. 0303
A method for document zone content classification,
ICPR02(III: 196-199).
IEEE DOI Link 0211
Earlier: A1, A3, A2:
Zone content classification and its performance evaluation,
IEEE DOI Link 0109
See also Table structure understanding and its performance evaluation. BibRef

Leydier, Y.[Yann], Le Bourgeois, F.[Frank], Emptoz, H.[Hubert],
Text search for medieval manuscript images,
PR(40), No. 12, December 2007, pp. 3552-3567.
WWW Version. 0709
Omnilingual Segmentation-Free Word Spotting for Ancient Manuscripts Indexation,
ICDAR05(I: 533-537).
IEEE DOI Link 0508
Serialized unsupervised classifier for adaptative color image segmentation: application to digitized ancient manuscripts,
ICPR04(I: 494-497).
IEEE DOI Link 0409
Word-spotting; Medieval manuscripts BibRef

Le Bourgeois, F.[Frank], Kaileh, H.[Hala],
Automatic Metadata Retrieval from Ancient Manuscripts,
WWW Version. 0505

Allier, B., Emptoz, H.,
Segmentation and typography extraction in document images using geodesic active regions,
ICPR04(I: 409-412).
IEEE DOI Link 0409

Leydier, Y.[Yann], Ouji, A.[Asma], Le Bourgeois, F.[Frank], Emptoz, H.[Hubert],
Towards an omnilingual word retrieval system for ancient manuscripts,
PR(42), No. 9, September 2009, pp. 2089-2105.
Elsevier DOI Link 0905
Document indexing; Word-spotting; Word retrieval; Ancient documents; Segmentation-free; Omnilingual BibRef

Ouji, A.[Asma], Leydier, Y.[Yann], Le Bourgeois, F.[Frank],
Chromatic / Achromatic Separation in Noisy Document Images,
IEEE DOI Link 1111

Shafait, F.[Faisal], Keysers, D.[Daniel], Breuel, T.M.[Thomas M.],
Performance Evaluation and Benchmarking of Six-Page Segmentation Algorithms,
PAMI(30), No. 6, June 2008, pp. 941-954.
IEEE DOI Link 0804
Survey, Page Segmentation. Evaluation, Page Segmentation. BibRef
Performance Comparison of Six Algorithms for Page Segmentation,
Springer DOI Link 0602
Pixel-Accurate Representation and Evaluation of Page Segmentation in Document Images,
ICPR06(I: 872-875).
IEEE DOI Link 0609
Also uses a dummy program (no segmentation) to set a minimum performance level. Evaluates X-Y Cut (See also Prototype Document Image Analysis System for Technical Journals, A.), Run Length Smearing (See also Document Analysis System.), Whitespace Analysis (See also Two Geometric Algorithms for Layout Analysis.), Constrained textline detection, Docstrum (See also Document Spectrum for Page Layout Analysis, The.), and Voronoi (See also Segmentation of Page Images Using the Area Voronoi Diagram.). The last two, Docstrum and Voronoi, are generally the best choice. For similar analysis: See also Empirical Performance Evaluation Methodology and Its Application to Page Segmentation Algorithms. BibRef

Nagy, G.[George], Seth, S.C.[Sharad C.], Viswanathan, M.[Mahesh],
Comment: Projection Methods Require Black Border Removal,
PAMI(31), No. 4, April 2009, pp. 762-762.
IEEE DOI Link 0903
Flaw in page segmentation evaluation. See also Performance Evaluation and Benchmarking of Six-Page Segmentation Algorithms. Relative to evaluation of: See also Prototype Document Image Analysis System for Technical Journals, A. BibRef

Shafait, F.[Faisal], Keysers, D.[Daniel], Breuel, T.M.[Thomas M.],
Response to 'Projection Methods Require Black Border Removal',
PAMI(31), No. 4, April 2009, pp. 763-764.
IEEE DOI Link 0903
See also Performance Evaluation and Benchmarking of Six-Page Segmentation Algorithms. BibRef

Shafait, F.[Faisal], Breuel, T.M.[Thomas M.],
The Effect of Border Noise on the Performance of Projection-Based Page Segmentation Methods,
PAMI(33), No. 4, April 2011, pp. 846-851.
IEEE DOI Link 1103
Page segmentation is usually sensitive to border noise. BibRef

Stamatopoulos, N.[Nikolaos], Gatos, B.[Basilis], Perantonis, S.J.[Stavros J.],
A method for combining complementary techniques for document image segmentation,
PR(42), No. 12, December 2009, pp. 3158-3168.
Elsevier DOI Link 0909
Document image segmentation; Combination method; Document image analysis; Segmentation BibRef

Coustaty, M.[Mickael], Pareti, R.[Rudolf], Vincent, N.[Nicole], Ogier, J.M.[Jean-Marc],
Towards historical document indexing: extraction of drop cap letters,
IJDAR(14), No. 3, September 2011, pp. 243-254.
WWW Version. 1109
Earlier: A1, A4, A2, A3:
Drop Caps Decomposition for Indexing a New Letter Extraction Method,
IEEE DOI Link 0907

Coustaty, M.[Mickael], Bouju, A.[Alain], Bertet, K.[Karell], Louis, G.[Georges],
Using Ontologies to Reduce the Semantic Gap between Historians and Image Processing Algorithms,
IEEE DOI Link 1111

Coustaty, M.[Mickael], Ogier, J.M.[Jean-Marc],
Discrimination of Old Document Images Using Their Style,
IEEE DOI Link 1111

Nguyen, T.T.H.[Thi Thuong Huyen], Coustaty, M.[Mickaël], Ogier, J.M.[Jean-Marc],
Bags of Strokes Based Approach for Classification and Indexing of Drop Caps,
IEEE DOI Link 1111

Nguyen, G.[Giap], Coustaty, M.[Mickael], Ogier, J.M.[Jean-Marc],
Stroke feature extraction for lettrine indexing,
IEEE DOI Link 1007

Jouili, S.[Salim], Coustaty, M.[Mickael], Tabbone, S.A.[Salvatore A.], Ogier, J.M.[Jean-Marc],
NAVIDOMASS: Structural-based Approaches Towards Handling Historical Documents,
IEEE DOI Link 1008

Liu, M.Y.[Meng-Yang], Li, C.S.[Chong-Shou], Zhu, W.B.[Wen-Bin], Lim, A.[Andrew],
A Morphology-Based Border Noise Removal Method for Camera-Captured Label Images,
Springer DOI Link 1404

Deryagin, D.,
Unified Performance Evaluation for OCR Zoning: Calculating Page Segmentation's Score, That Includes Text Zones, Tables and Non-text Objects,
IEEE DOI Link 1312
image segmentation BibRef

Lebourgeois, F., Drira, F., Gaceb, D., Duong, J.,
Fast Integral MeanShift: Application to Color Segmentation of Document Images,
IEEE DOI Link 1312
computational complexity BibRef

Stamatopoulos, N.[Nikolaos], Louloudis, G.[Georgios], Gatos, B.[Basilis],
Efficient Transcript Mapping to Ease the Creation of Document Image Segmentation Ground Truth with Text-Image Alignment,
IEEE DOI Link 1011
Ground truth creation. BibRef

Antonacopoulos, A.[Apostolos], Pletschacher, S.[Stefan], Bridson, D.[David], Papadopoulos, C.[Christos],
ICDAR 2009 Page Segmentation Competition,
IEEE DOI Link 0907

Antonacopoulos, A., Gatos, B., Bridson, D.,
Page Segmentation Competition,
IEEE DOI Link 0709
ICDAR2005 page segmentation competition,
ICDAR05(I: 75-79).
IEEE DOI Link 0508
ICDAR 2003 page segmentation competition,
IEEE Abstract. 0311

Peng, L.R.[Liang-Rui], Chen, M.[Ming], Liu, C.S.[Chang-Song], Ding, X.Q.[Xiao-Qing], Zheng, J.R.[Ji-Rong],
An automatic performance evaluation method for document page segmentation,
IEEE DOI Link 0109

Fumera, G., Pillai, I., Roli, F.,
Classification with reject option in text categorisation systems,
IEEE Abstract. 0310

Ma, H.F.[Huan-Feng], Doermann, D.,
Gabor filter based multi-class classifier for scanned document images,
IEEE Abstract. 0311

Allier, B.[Bénédicte], Emptoz, H.[Hubert],
Type extraction and character prototyping using gabor filters,
IEEE Abstract. 0311
Character prototyping in document images using Gabor filters,
ICIP03(I: 537-540).
IEEE Abstract. 0312
And: SCIA03(28-35).
WWW Version. 0310

Laurence, D.[Duffy], Le Bourgeois, F.[Frank], Emptoz, H.[Hubert],
Logical structure analysis by typographic characteristics extraction,
CIAP97(II: 639-646).
WWW Version. 9709

Allier, B., Duong, J., Gagneux, A., Mallet, P., Emptoz, H.,
Texture feature characterization for logical pre-labeling,
IEEE Abstract. 0311

Liu, L.J.[Li-Jie], Dong, Y.[Yan], Song, X.M.[Xiao-Mu], Fan, G.L.[Guo-Liang],
An entropy-based segmentation algorithm for computer-generated document images,
ICIP03(I: 541-544).
IEEE Abstract. 0312

Leedham, G., Yan, C.[Chen], Takru, K., Tan, J.H.N.[Joie Hadi Nata], Mian, L.[Li],
Comparison of some thresholding algorithms for text/background segmentation in difficult document images,
IEEE Abstract. 0311

Leedham, G., Varma, S., Patankar, A., Govindaraju, V.,
Separating text and background in degraded document images: A comparison of global thresholding techniques for multi-stage thresholding,
IEEE Top Reference. 0209

Kise, K., Miki, Y., Matsumoto, K.,
Stippling data on backgrounds of pages-toward seamless integration of paper and electronic documents,
IEEE Abstract. 0311

Kise, K., Yanagida, O., Takamatsu, S.,
Page Segmentation Based on Thinning of Background,
ICPR96(III: 788-792).
IEEE DOI Link 9608
(Osaka Prefecture Univ., J) BibRef

Kise, K., Yamaoka, M., Babaguchi, N., Tezuka, Y.,
Model based system for analyzing document images,
IEEE DOI Link 9208

Suvichakorn, A.[Aimamorn], Watcharabusaracum, S.[Sarin], Sinthupinyo, W.[Wasin],
Simple Layout Segmentation of Gray-Scale Document Images,
DAS02(245 ff.).
HTML Version. 0303

Caillault, E., Viard-Gaudin, C., Ahmad, A.R.,
MS-TDNN with global discriminant trainings,
ICDAR05(II: 856-860).
IEEE DOI Link 0508
NN HMM. BibRef

Golenzer, J., Viard-Gaudin, C., Lallican, P.M.,
Finding regions of interest in document images by planar HMM,
ICPR02(III: 415-418).
IEEE DOI Link 0211

Sivaramakrishnam, R., Phillips, I.T., Ha, J., Subramanium, S., Haralick, R.M.,
Zone Classification in a Document Using the Method of Feature Vector Generation,
ICDAR95(541-544). Pixel based, multiple classes. BibRef 9500

Cheng, H.[Hui], Fan, Z.G.[Zhi-Gang],
Background identification based segmentation and multilayer tree representation of document images,
ICIP02(III: 1005-1008).
IEEE DOI Link 0210

Blumenstein, M., Verma, B.,
Analysis of segmentation performance on the CEDAR benchmark database,
IEEE DOI Link 0109

Yang, Y.D.[Yu-Dong], Zhang, H.J.[Hong-Jiang],
HTML page analysis based on visual cues,
IEEE DOI Link 0109

Mukherjee, D.P.[Dipti Prasad], Acton, S.T.[Scott T.],
Document Page Segmentation using Multiscale Clustering,
IEEE DOI Link BibRef 9900

He, S., Abe, N.,
A Clustering-Based Approach to the Separation of Text Strings from Mixed Text/Graphics Documents,
ICPR96(III: 706-710).
IEEE DOI Link 9608
(National Univers. of Singapore, SGP) BibRef

Randen, T.[Trygve], and Husøy, J.H.[John Håkon],
Segmentation of text/image documents using texture approaches,
Proc. NOBIM-konferansen-94, Asker (Norway), June 1994, pp. 60-67.
HTML Version. BibRef 9406

Fischer, S., Amin, A., and Drivas, D.,
Segmentation of the Yellow Pages,
ICDAR95(605-609). BibRef 9500

Randriamasy, S., Vincent, L.,
Benchmarking Page Segmentation Algorithms,
IEEE Abstract. BibRef 9400

Higashino, J., Fujisawa, H., Nakano, Y., Ejiri, M.,
A Knowledge-Based Segmentation Method for Document Understanding,
ICPR86(745-748). Top-down layout analysis using FDL. BibRef 8600

Makino, H.,
Representation and Segmentation of Document Images,
CVPR83(291-295). BibRef 8300

