23.2.2.2.1 Page Segmentation, General, Evaluations

Page Segmentation. Document Analysis. Application, Document Layout. This section covers the segmentation step more than the analysis of document structure.

Nadler, M.,
Document Segmentation and Coding Techniques,
CVGIP(28), No. 2, November 1984, pp. 240-262. Survey, Page Segmentation. BibRef 8411

Pavlidis, T.[Theo], Zhou, J.Y.[Jiang-Ying],
Page Segmentation and Classification,
GMIP(54), No. 6, November 1992, pp. 484-496. Survey, Page Segmentation. BibRef 9211

Pavlidis, T.[Theo],
Page Segmentation by White Streams,
ICDAR91(945-953). BibRef 9100

Zlatopolsky, A.A.,
Automated Document Segmentation,
PRL(15), No. 7, July 1994, pp. 699-704. BibRef 9407

Leng, G.W., Mital, D.P., Yong, T.S., Kang, T.K.,
A Differential-Processing Extraction Approach to Text and Image Segmentation,
EngAAI(7), No. 6, December 1994, pp. 639-651. BibRef 9412

Jain, A.K., Zhong, Y.,
Page Segmentation Using Texture Analysis,
PR(29), No. 5, May 1996, pp. 743-770.
WWW Version. 9605 BibRef
Earlier:
Page segmentation using texture discrimination masks,
ICIP95(III: 308-311).
WWW Version. 9510 BibRef

Jain, A.K., Bhattacharjee, S.,
Text Segmentation Using Gabor Filters for Automatic Document Processing,
MVA(5), 1992, pp. 169-184. BibRef 9200

Jain, A.K., Bhattacharjee, S.K., Chen, Y.,
On texture in document images,
CVPR92(677-680).
IEEE Abstract. IEEE Top Reference. 0403 BibRef

Venkateswarlu, N.B., Boyle, R.D.,
New segmentation techniques for document image analysis,
IVC(13), No. 7, September 1995, pp. 573-583.
WWW Version. 0401 BibRef

Shih, F.Y., Chen, S.S.,
Adaptive Document Block Segmentation and Classification,
SMC-B(26), No. 5, October 1996, pp. 797-802.
IEEE Top Reference. Segments the page by run-length smoothing, then applies rule-based classification into text, graphics, and picture blocks. BibRef 9610

Chen, S., Haralick, R.M., Phillips, I.T.,
Extraction of Text Lines and Text Blocks on Document Images Based on Statistical Modeling,
IJIST(7), No. 4, Winter 1996, pp. 343-356. 9612 BibRef

Patel, D.,
Page Segmentation for Document Image-Analysis Using a Neural-Network,
OptEng(35), No. 7, July 1996, pp. 1854-1861. 9608 BibRef

Patel, D., Stonham, T.J.,
Texture image classification and segmentation using RANK-order clustering,
ICPR92(III:92-95).
WWW Version. 9208 BibRef

Payne, J.S., Stonham, T.J., Patel, D.,
Document segmentation using texture analysis,
ICPR94(B:380-382).
WWW Version. 9410 BibRef

Etemad, K., Doermann, D., Chellappa, R.,
Multiscale Segmentation of Unstructured Document Pages Using Soft Decision Integration,
PAMI(19), No. 1, January 1997, pp. 92-96.
IEEE Abstract. IEEE Top Reference.
WWW Version. 9702 BibRef
And:
Multiscale Document Page Segmentation Using Soft Decision Integration,
UMDTR3444, 1995.
WWW Version. BibRef
Earlier:
Page Segmentation Using Decision Integration and Wavelet Packets,
ICPR94(B:345-349).
WWW Version. Classify regions of the page image into text or images. BibRef

Etemad, K.[Kamran],
Multi-Scale Discriminant Analysis and Recognition of Signals and Images,
Ph.D. Thesis, April 1996. BibRef 9604 UMDTR3629. The goal is to find efficient multi-scale representations that yield maximum between-class separation and minimum within-class scatter.
WWW Version. Also for Faces. BibRef

Chen, J.L.,
A Simplified Approach to the HMM Based Texture Analysis and Its Application to Document Segmentation,
PRL(18), No. 10, October 1997, pp. 993-1007. 9802 Markov model texture analysis. BibRef

Kise, K.[Koichi], Sato, A.[Akinori], Iwata, M.[Motoi],
Segmentation of Page Images Using the Area Voronoi Diagram,
CVIU(70), No. 3, June 1998, pp. 370-382.
WWW Version. For evaluation: See also Empirical Performance Evaluation Methodology and Its Application to Page Segmentation Algorithms. BibRef 9806

Hobby, J.D.[John D.],
Matching Document Images with Ground Truth,
IJDAR(1), No. 1, Spring 1998, pp. xx-yy. BibRef 9800
Earlier: ICDAR97(Tu-2B) 9708 BibRef

Cinque, L., Lombardi, L., Manzini, G.,
A Multiresolution Approach for Page Segmentation,
PRL(19), No. 2, February 1998, pp. 217-225. 9808 See also Shape-Description and Recognition by a Multiresolution Approach. BibRef

Cantoni, V., Cinque, L., Lombardi, L., Manzini, G.,
Page Segmentation Using a Pyramidal Architecture,
CAMP97(Session 6). BibRef 9700

Cinque, L., Levialdi, S., Lombardi, L., Tanimoto, S.,
Segmentation of page images having artifacts of photocopying and scanning,
PR(35), No. 5, May 2002, pp. 1167-1177.
WWW Version. 0202 BibRef

Cinque, L., Forino, L., Levialdi, S., Lombardi, L., Tanimoto, S.,
Understanding the page logical structure,
CIAP99(1003-1008).
WWW Version. 9909 BibRef

Cinque, L., Levialdi, S., Malizia, A., de Rosa, F.,
DAN: An Automatic Segmentation and Classification Engine for Paper Documents,
DAS02(491 ff.).
HTML Version. 0303 BibRef

Cinque, L., Levialdi, S., Malizia, A.,
A system for the automatic layout segmentation and classification of digital documents,
CIAP03(201-206).
IEEE Abstract. IEEE Top Reference. 0310 BibRef

Liu, J.M., Tang, Y.Y.,
Distributed Autonomous Agents For Chinese Document Image Segmentation,
PRAI(12), No. 1, February 1998, pp. 97-118. 9806 See also Adaptive Image Segmentation With Distributed Behavior-Based Agents. BibRef

de Queiroz, R.L.,
Processing JPEG Compressed Images and Documents,
IP(7), No. 12, December 1998, pp. 1661-1672.
WWW Version. 9812 BibRef

de Queiroz, R.L.,
Processing JPEG-Compressed Images,
ICIP97(II: 334-337).
WWW Version. BibRef 9700

de Queiroz, R.L., Eschbach, R.,
Fast Segmentation of the JPEG Compressed Documents,
JEI(7), No. 2, April 1998, pp. 367-377. 9807 BibRef

de Queiroz, R.L., Eschbach, R.,
Segmentation of Compressed Documents,
ICIP97(III: 70-73).
WWW Version. BibRef 9700

de Queiroz, R.L.[Ricardo L.],
Compression of Compound Documents,
ICIP99(I:209-213).
IEEE Abstract. IEEE Top Reference. BibRef 9900

Antonacopoulos, A.[Apostolos],
Page Segmentation Using the Description of the Background,
CVIU(70), No. 3, June 1998, pp. 350-369.
WWW Version. BibRef 9806

Jain, A.K., Yu, B.,
Document Representation and Its Application to Page Decomposition,
PAMI(20), No. 3, March 1998, pp. 294-308.
IEEE Abstract. IEEE Top Reference.
WWW Version. 9805 Generates a structured version of the document for editing, storage, retrieval, and analysis. Performs skew correction, segmentation, and labeling (text, table, image, drawing, and ruler). Includes some review of approaches. BibRef

Jain, A.K., Yu, B.,
Model-Based Document Representation: Application to Page Segmentation,
ICDAR97(Mo-2B) 9708 BibRef

Yang, J.C.Y.[James Ching-Yu], Tsai, W.H.[Wen-Hsiang],
Document image segmentation and quality improvement by moiré pattern analysis,
SP:IC(15), No. 9, July 2000, pp. 781-797.
WWW Version. 0008 BibRef

Mao, S.[Song], Kanungo, T.[Tapas],
Empirical Performance Evaluation Methodology and Its Application to Page Segmentation Algorithms,
PAMI(23), No. 3, March 2001, pp. 242-256.
IEEE Abstract. IEEE Top Reference.
WWW Version. 0103 Survey, Page Segmentation. Evaluation, Page Segmentation. Creates separate training and test data, defines a computable performance metric, finds optimal parameters for each algorithm, and then evaluates. Compares Voronoi (Kise) (See also Segmentation of Page Images Using the Area Voronoi Diagram.), Docstrum (O'Gorman) (See also Document Spectrum for Page Layout Analysis, The.), and Caere (commercial system) (See also Caere.). These three have about the same performance and are better than ScanSoft (commercial system) (See also ScanSoft.), which is better than the older X-Y cut (See also Prototype Document Image Analysis System for Technical Journals, A.). A later analysis reaches a similar conclusion: See also Performance Evaluation and Benchmarking of Six Page Segmentation Algorithms. BibRef

Mao, S.[Song], Kanungo, T.[Tapas],
Software Architecture of PSET: A Page Segmentation Evaluation Toolkit,
IJDAR(4), No. 3, 2002, pp. 205-217.
HTML Version. 0205 BibRef
Earlier: UMD--TR4190, September 2000.
WWW Version. Evaluation, Page Segmentation. BibRef

Mao, S.[Song], Kanungo, T.[Tapas],
A Methodology for Empirical Performance Evaluation of Page Segmentation Algorithms,
UMD--TR4093, December 1999.
WWW Version. BibRef 9912

Mao, S., Kanungo, T.,
Automatic Training of Page Segmentation Algorithms: An Optimization Approach,
ICPR00(Vol IV: 531-534).
WWW Version.
HTML Version. 0009 BibRef

Kanungo, T., Mao, S.[Song],
Stochastic language models for style-directed layout analysis of document images,
IP(12), No. 5, May 2003, pp. 583-596.
WWW Version. 0307 BibRef

Amin, A., Shiu, R.,
Page Segmentation and Classification Utilizing Bottom-Up Approach,
IJIG(2), No. 1, 2001, pp. 345-362. BibRef 0100

Deng, S.[Shulan], Latifi, S.[Shahram], Regentova, E.E.[Emma E.],
Document segmentation using polynomial spline wavelets,
PR(34), No. 12, December 2001, pp. 2533-2545.
WWW Version. 0110 BibRef

Regentova, E.E., Latifi, S., Chen, D., Taghva, K., Yao, D.,
Document analysis by processing JBIG-encoded images,
IJDAR(7), No. 4, September 2005, pp. 260-272.
WWW Version. 0512 BibRef

Diligenti, M.[Michelangelo], Frasconi, P.[Paolo], Gori, M.[Marco],
Hidden Tree Markov Models for Document Image Classification,
PAMI(25), No. 4, April 2003, pp. 520-524.
IEEE Abstract. IEEE Top Reference. 0304 Learning. Learn the concept of a set of documents of similar structure. BibRef

Diligenti, M., Gori, M., Maggini, M., Scarselli, F.,
Classification of HTML documents by Hidden Tree-Markov Models,
ICDAR01(849-853).
WWW Version. 0109 BibRef

Haji, M.M., Katebi, S.D.,
An Efficient Text Segmentation Technique Based on Naive Bayes Classifier,
GVIP(05), No. V7, 2005, pp. 21-30.
HTML Version. BibRef 0500

Wang, Y.[Yalin], Phillips, I.T.[Ihsin T.], Haralick, R.M.[Robert M.],
Document zone content classification and its performance evaluation,
PR(39), No. 1, January 2006, pp. 57-73.
WWW Version. 0512 Evaluation, Page Segmentation. BibRef
Earlier:
A Study on the Document Zone Content Classification Problem,
DAS02(212 ff.).
HTML Version. 0303 BibRef
And:
A method for document zone content classification,
ICPR02(III: 196-199).
WWW Version. 0211 BibRef
Earlier: A1, A3, A2:
Zone content classification and its performance evaluation,
ICDAR01(540-544).
WWW Version. 0109 See also Table structure understanding and its performance evaluation. BibRef

Leydier, Y.[Yann], Le Bourgeois, F.[Frank], Emptoz, H.[Hubert],
Text search for medieval manuscript images,
PR(40), No. 12, December 2007, pp. 3552-3567.
WWW Version. 0709 BibRef
Earlier:
Omnilingual segmentation-free word spotting for ancient manuscripts indexation,
ICDAR05(I: 533-537).
WWW Version. 0508 BibRef
Earlier:
Serialized unsupervised classifier for adaptative color image segmentation: application to digitized ancient manuscripts,
ICPR04(I: 494-497).
WWW Version. 0409 Word-spotting; Medieval manuscripts. BibRef

Le Bourgeois, F.[Frank], Kaileh, H.[Hala],
Automatic Metadata Retrieval from Ancient Manuscripts,
DAS04(75-89).
WWW Version. 0505 BibRef

Allier, B., Emptoz, H.,
Segmentation and typography extraction in document images using geodesic active regions,
ICPR04(I: 409-412).
WWW Version. 0409 BibRef

Shafait, F.[Faisal], Keysers, D.[Daniel], Breuel, T.M.[Thomas M.],
Performance Evaluation and Benchmarking of Six Page Segmentation Algorithms,
PAMI(30), No. 6, June 2008, pp. 941-954.
WWW Version. 0804 Survey, Page Segmentation. Evaluation, Page Segmentation. BibRef
Earlier:
Performance Comparison of Six Algorithms for Page Segmentation,
DAS06(368-379).
WWW Version. 0602 BibRef
And:
Pixel-Accurate Representation and Evaluation of Page Segmentation in Document Images,
ICPR06(I: 872-875).
WWW Version. 0609 Also uses a dummy program (no segmentation) to set a minimum performance level. Compares X-Y Cut (See also Prototype Document Image Analysis System for Technical Journals, A.), Run Length Smearing (See also Document Analysis System.), Whitespace Analysis (See also Two Geometric Algorithms for Layout Analysis.), constrained textline detection, Docstrum (See also Document Spectrum for Page Layout Analysis, The.), and Voronoi (See also Segmentation of Page Images Using the Area Voronoi Diagram.). The last two, Docstrum and Voronoi, are generally the best choice. For a similar analysis, see also Empirical Performance Evaluation Methodology and Its Application to Page Segmentation Algorithms. BibRef


Antonacopoulos, A., Gatos, B., Bridson, D.,
Page Segmentation Competition,
ICDAR07(1279-1283).
WWW Version. 0709 BibRef

Peng, L.R.[Liang-Rui], Chen, M.[Ming], Liu, C.S.[Chang-Song], Ding, X.Q.[Xiao-Qing], Zheng, J.R.[Ji-Rong],
An automatic performance evaluation method for document page segmentation,
ICDAR01(134-137).
WWW Version. 0109 BibRef

Fumera, G., Pillai, I., Roli, F.,
Classification with reject option in text categorisation systems,
CIAP03(582-587).
IEEE Abstract. IEEE Top Reference. 0310 BibRef

Ma, H.[Huanfeng], Doermann, D.,
Gabor filter based multi-class classifier for scanned document images,
ICDAR03(968-972).
IEEE Abstract. IEEE Top Reference. 0311 BibRef

Allier, B.[Bénédicte], Emptoz, H.[Hubert],
Type extraction and character prototyping using gabor filters,
ICDAR03(799-803).
IEEE Abstract. IEEE Top Reference. 0311 BibRef
And:
Character prototyping in document images using Gabor filters,
ICIP03(I: 537-540).
IEEE Abstract. IEEE Top Reference. 0312 BibRef
And: SCIA03(28-35).
WWW Version. 0310 BibRef

Laurence, D.[Duffy], Le Bourgeois, F.[Frank], Emptoz, H.[Hubert],
Logical structure analysis by typographic characteristics extraction,
CIAP97(II: 639-646).
WWW Version. 9709 BibRef

Allier, B., Duong, J., Gagneux, A., Mallet, P., Emptoz, H.,
Texture feature characterization for logical pre-labeling,
ICDAR03(567-571).
IEEE Abstract. IEEE Top Reference. 0311 BibRef

Liu, L.J.[Li-Jie], Dong, Y.[Yan], Song, X.M.[Xiao-Mu], Fan, G.L.[Guo-Liang],
An entropy-based segmentation algorithm for computer-generated document images,
ICIP03(I: 541-544).
IEEE Abstract. IEEE Top Reference. 0312 BibRef

Antonacopoulos, A., Gatos, B., Bridson, D.,
ICDAR2005 page segmentation competition,
ICDAR05(I: 75-79).
WWW Version. 0508 BibRef
Earlier:
ICDAR 2003 page segmentation competition,
ICDAR03(688-692).
IEEE Abstract. IEEE Top Reference. 0311 BibRef

Leedham, G., Yan, C.[Chen], Takru, K., Tan, J.H.N.[Joie Hadi Nata], Mian, L.[Li],
Comparison of some thresholding algorithms for text/background segmentation in difficult document images,
ICDAR03(859-864).
IEEE Abstract. IEEE Top Reference. 0311 BibRef

Leedham, G., Varma, S., Patankar, A., Govindaraju, V.,
Separating text and background in degraded document images: A comparison of global thresholding techniques for multi-stage thresholding,
FHR02(244-249).
IEEE Top Reference. 0209 BibRef

Kise, K., Miki, Y., Matsumoto, K.,
Stippling data on backgrounds of pages-toward seamless integration of paper and electronic documents,
ICDAR03(1213-1217).
IEEE Abstract. IEEE Top Reference. 0311 BibRef

Kise, K., Yanagida, O., Takamatsu, S.,
Page Segmentation Based on Thinning of Background,
ICPR96(III: 788-792).
WWW Version. 9608 (Osaka Prefecture Univ., J) BibRef

Kise, K., Yamaoka, M., Babaguchi, N., Tezuka, Y.,
Model based system for analyzing document images,
ICPR92(II:647-650).
WWW Version. 9208 BibRef

Suvichakorn, A.[Aimamorn], Watcharabusaracum, S.[Sarin], Sinthupinyo, W.[Wasin],
Simple Layout Segmentation of Gray-Scale Document Images,
DAS02(245 ff.).
HTML Version. 0303 BibRef

Caillault, E., Viard-Gaudin, C., Ahmad, A.R.,
MS-TDNN with global discriminant trainings,
ICDAR05(II: 856-860).
WWW Version. 0508 NN HMM. BibRef

Golenzer, J., Viard-Gaudin, C., Lallican, P.M.,
Finding regions of interest in document images by planar HMM,
ICPR02(III: 415-418).
WWW Version. 0211 BibRef

Sivaramakrishnan, R., Phillips, I.T., Ha, J., Subramanium, S., Haralick, R.M.,
Zone Classification in a Document Using the Method of Feature Vector Generation,
ICDAR95(541-544). Pixel based, multiple classes. BibRef 9500

Cheng, H.[Hui], Fan, Z.G.[Zhi-Gang],
Background identification based segmentation and multilayer tree representation of document images,
ICIP02(III: 1005-1008).
IEEE Abstract. IEEE Top Reference. 0210 BibRef

Blumenstein, M., Verma, B.,
Analysis of segmentation performance on the CEDAR benchmark database,
ICDAR01(1142-1146).
WWW Version. 0109 BibRef

Yang, Y.D.[Yu-Dong], Zhang, H.J.[Hong-Jiang],
HTML page analysis based on visual cues,
ICDAR01(859-864).
WWW Version. 0109 BibRef

Mukherjee, D.P.[Dipti Prasad], Acton, S.T.[Scott T.],
Document Page Segmentation using Multiscale Clustering,
ICIP99(I:234-238).
IEEE Abstract. IEEE Top Reference. BibRef 9900

He, S., Abe, N.,
A Clustering-Based Approach to the Separation of Text Strings from Mixed Text/Graphics Documents,
ICPR96(III: 706-710).
WWW Version. 9608 (National Univ. of Singapore, SGP) BibRef

Randen, T.[Trygve], Husøy, J.H.[John Håkon],
Segmentation of text/image documents using texture approaches,
Proc. NOBIM-konferansen-94, Asker (Norway), June 1994, pp. 60-67.
HTML Version. BibRef 9406

Fischer, S., Amin, A., Drivas, D.,
Segmentation of the Yellow Pages,
ICDAR95(605-609). BibRef 9500

Randriamasy, S., Vincent, L.,
Benchmarking Page Segmentation Algorithms,
CVPR94(411-416).
IEEE Abstract. IEEE Top Reference. BibRef 9400

Higashino, J., Fujisawa, H., Nakano, Y., Ejiri, M.,
A Knowledge-Based Segmentation Method for Document Understanding,
ICPR86(745-748). Top-down layout analysis using FDL. BibRef 8600

Makino, H.,
Representation and Segmentation of Document Images,
CVPR83(291-295). BibRef 8300

Chapter on OCR, Document Analysis and Character Recognition Systems continues in
Find Text in Documents.


Last update: Jun 25, 2008 at 13:37:57