23.2 Analysis Systems Applied to Documents

Document Analysis. Application, Document Analysis. For a comparison of some of these techniques, see also Evaluation of Binarization Methods for Document Images.

Deutsch, S.,
A Note on Some Statistics Concerning Typewritten or Printed Material,
IT(3), No. 2, June 1957, pp. 147-149. BibRef 5706

Schurmann, J., Bartneck, N., Bayer, T., Franke, J., Mandler, E., and Oberlander, M.,
Document Analysis: From Pixels to Contents,
PIEEE(80), No. 7, July 1992, pp. 1101-1119.
IEEE Top Reference. In Special issue on OCR. BibRef 9207

Bayer, T., Franke, J., Kressel, U., Mandler, E., Oberlaender, M., Schuermann, J.,
Towards the Understanding of Printed Documents,
SDIA92(xx-yy). BibRef 9200

Hershey, A.V.[Allen V.],
A Computer System for Scientific Typography,
CGIP(1), No. 4, December 1972, pp. 373-385.
WWW Version. BibRef 7212

Johnston, E.G.[Emily G.],
Printed Text Discrimination,
CGIP(3), No. 1, March 1974, pp. 83-89.
WWW Version. 0501
BibRef

Gard, R.L.,
Digital Picture Processing Techniques for the Publishing Industry,
CGIP(5), No. 2, June 1976, pp. 151-171.
WWW Version. BibRef 7606

Wong, K.Y., Casey, R.G., and Wahl, F.M.,
Document Analysis System,
IBMRD(26), No. 6, November 1982, pp. 647-656. BibRef 8211

Inagaki, K., Kato, T., Hiroshima, T., Sakai, T.,
MACSYM: A Hierarchical Parallel Image Processing System for Event Driven Pattern Understanding of Documents,
PR(17), No. 1, 1984, pp. 85-108.
WWW Version. BibRef 8400

Baird, H.S., and Thompson, K.,
Reading Chess,
PAMI(12), No. 6, June 1990, pp. 552-559.
IEEE Abstract. IEEE Top Reference.
WWW Version. BibRef 9006
Earlier: CVWS87(277-282). Skew Correction. Text Analysis. Using several basic ideas and techniques, this system reads the text of chess games and extracts the meaning. 98% of the games are read correctly, implying a much higher accuracy at the character/word level. Corrects for the skew of the printing. BibRef

Baird, H.S.[Henry S.], Fortune, S.J.[Steven J.], Jones, S.E.[Susan E.],
Image segmenting apparatus and methods,
US_Patent5,430,808, July 4, 1995.
WWW Version. BibRef 9507
Earlier: A1, A3, A2:
Image Segmentation by Shape-Directed Covers,
ICPR90(I: 820-825).
IEEE DOI Link Application, Document Analysis. BibRef

Srikantan, G., Srihari, S.N.,
A Study Relating Image Sampling Rate and Image Pattern Recognition,
CVPR94(709-712).
IEEE Abstract. IEEE Top Reference. BibRef 9400

Akiyama, T., and Hagita, N.,
Automated entry system for printed documents,
PR(23), No. 11, 1990, pp. 1141-1154.
WWW Version. Japanese and English, Headlines, text lines, graphics. BibRef 9000

Masuda, I., Hagita, N., Akiyama, T., Takahashi, T., Naito, S.,
Approach to Smart Document Reader System,
CVPR85(550-557). BibRef 8500

Brandt, J.W., Jain, A.K., Algazi, V.R.,
Medial Axis Representation and Encoding of Scanned Documents,
JVCIR(2), 1991, pp. 151-165. BibRef 9100

Story, G.A., O'Gorman, L., Fox, D., Schaper, L.L., and Jagadish, H.V.,
The RightPages Image-Based Electronic Library for Alerting and Browsing,
Computer(25), No. 9, September 1992, pp. 17-26. BibRef 9209

O'Gorman, L.,
Image and document processing techniques for the RightPages electronic library system,
ICPR92(II:260-263).
IEEE DOI Link 9208
BibRef

Dengel, A.R.[Andreas R.], Bleisinger, R., Hoch, R., Fein, F.[Frank], Hönes, F.[Frank],
From Paper to Office Document Standard Representation,
Computer(25), No. 7, July 1992, pp. 63-67. BibRef 9207

Dengel, A.R.[Andreas R.],
ANASTASIL: A System for Low-Level and High-Level Geometric Analysis of Printed Documents,
SDIA92(xx-yy). BibRef 9200

Maio, D., and Rizzi, S.,
MAP Learning and Clustering in Autonomous Systems,
PAMI(15), No. 12, December 1993, pp. 1286-1297.
IEEE Abstract. IEEE Top Reference.
WWW Version. BibRef 9312

Dengel, A.R., and Barth, G.,
High Level Document Analysis Guided by Geometric Aspects,
PRAI(2), No. 4, December 1988, pp. 641-656. Hierarchical document model, document tree. BibRef 8812

de Silva, G.L., Hull, J.J.,
Proper Noun Detection in Document Images,
PR(27), No. 2, February 1994, pp. 311-320.
WWW Version. BibRef 9402

Chen, F.R.[Francine R.], Bloomberg, D.S.[Dan S.],
Summarization of Imaged Documents without OCR,
CVIU(70), No. 3, June 1998, pp. 307-320.
WWW Version. BibRef 9806
Earlier:
Extraction of Indicative Summary Sentences from Imaged Documents,
ICDAR97(Mo-4B) 9708
BibRef
Earlier: A2, A1:
Document Image Summarization without OCR,
ICIP96(II: 229-232).
IEEE DOI Link BibRef

Chen, F.R.[Francine R.], Bloomberg, D.S.[Dan S.],
Extraction Of Thematically Relevant Text From Images,
SDAIR96(XX) Xerox Palo Alto Research Center. BibRef 9600

Spitz, A.L.[A. Lawrence], Wilcox, L.D.[Lynn D.],
Method and apparatus for classifying documents,
US_Patent5,414,781, May 9, 1995
WWW Version. BibRef 9505

Ozaki, M.[Masaharu],
Method and apparatus for document element classification by analysis of major white region geometry,
US_Patent5,574,802, Nov 12, 1996
WWW Version. BibRef 9611

McLean, G.F.,
Geometric Correction of Digitized Art,
GMIP(58), No. 2, March 1996, pp. 142-154. BibRef 9603

Yamashita, A., Amano, T., Hirayama, Y., Itoh, N., Katoh, S., Mano, T., and Toyokawa, K.,
A document recognition system and its applications,
IBMRD(40), No. 3, May 1996, pp. 341-352.
WWW Version. BibRef 9605

Maderlechner, G.[Gerd], Suda, P.[Peter], Bruckner, T.,
Classification of Documents by Form and Content,
PRL(18), No. 11-13, November 1997, pp. 1225-1231. 9806
BibRef

Nishida, H.[Hirobumi],
A Note on Practical Uses of Gray-Scale Image Analysis in Document Recognition,
PRL(19), No. 9, 31 July 1998, pp. 889-897. BibRef 9807

Nishida, H.[Hirobumi],
Boundary Extraction from Gray-Scale Document Images Based on Surface Data Structures,
GMIP(60), No. 1, January 1998, pp. 35-45. BibRef 9801
Earlier:
Boundary Feature Extraction From Gray-Scale Document Images,
ICDAR97(Mo-3B) 9708
BibRef

Chauvet, P., Lopez Krahe, J., Taflin, E., Maître, H.,
System for an intelligent office document analysis, recognition and description,
SP(32), No. 1-2, 1993, pp. 161-190. BibRef 9300

Kundu, S.[Sukhamay],
A better fitness measure of a text-document for a given set of keywords,
PR(33), No. 5, May 2000, pp. 841-848.
WWW Version. 0003
BibRef

Kenney, A.R.[Anne R.], and Rieger, O.Y.[Oya Y.], (editors)
Moving Theory into Practice: Digital Imaging for Libraries and Archives,
Mountain View, CA: Research Libraries Group, 2000. ISBN 0-9700225-0-6. A how-to book on moving documents to the digital world. (Not on analyzing them.) BibRef 0001

Lee, W.L.[Win-Long], Fan, K.C.[Kuo-Chin],
Document image preprocessing based on optimal Boolean filters,
SP(80), No. 1, January 2000, pp. 45-55. 0005
BibRef

Caere,
Company Information.
WWW Version. Vendor, OCR. OCR, document analysis, etc.

ScanSoft,
Company Information.
WWW Version. OCR, Document analysis, etc. Vendor, OCR.

Wenzel, C.[Claudia], Maus, H.[Heiko],
Leveraging corporate context within knowledge-based document analysis and understanding,
IJDAR(3), No. 4, 2001, pp. 248-260.
HTML Version. 0106
BibRef

Chan, W.[Woei], Coghill, G.[George],
Text analysis using local energy,
PR(34), No. 12, December 2001, pp. 2523-2532.
WWW Version. 0110
BibRef

Chang, F.[Fu],
Retrieving Information from Document Images: Problems and Solutions,
IJDAR(4), No. 1, 2001, pp. 46-55.
HTML Version. 0111
BibRef

Le Cun, Y.L.[Yann L.], Bottou, L.[Leon], Bengio, Y.[Yoshua], Haffner, P.,
Gradient-Based Learning Applied to Document Recognition,
PIEEE(86), No. 11, November 1998, pp. 2278-2324.
IEEE Top Reference. BibRef 9811

Aiello, M.[Marco], Monz, C.[Christof], Todoran, L.[Leon], Worring, M.[Marcel],
Document understanding for a broad class of documents,
IJDAR(5), No. 1, 2002, pp. 1-16.
HTML Version. 0211
BibRef

Juola, P.[Patrick],
Document categorization and evaluation via cross-entrophy,
US_Patent6,397,205, May 28, 2002
WWW Version. BibRef 0205

Klein, B.[Bertin], Dengel, A.R.[Andreas R.],
Problem-adaptable document analysis and understanding for high-volume applications,
IJDAR(6), No. 3, March 2004, pp. 167-180.
Springer DOI Link 0406
BibRef
Earlier: A2, A1:
smartFIX: A Requirements-Driven System for Document Analysis and Understanding,
DAS02(433 ff.).
HTML Version. 0303
BibRef

Dengel, A.R.,
Learning of Pattern-Based Rules for Document Classification,
ICDAR07(123-127).
IEEE DOI Link 0709
BibRef

ReadSoft International,
2007. Document processing, OCR.
WWW Version. Vendor, Document Analysis. Vendor, OCR.

Aseervatham, S.[Sujeevan], Bennani, Y.[Younes],
Semi-structured document categorization with a semantic kernel,
PR(42), No. 9, September 2009, pp. 2067-2076.
Elsevier DOI Link
WWW Version. 0905
Mercer kernel; Support vector machine; Text categorization; Semantic similarity; Semi-structured data BibRef


Chaudhury, K.[Krishnendu], Jain, A.[Ankur], Thirthala, S.[Sriram], Sahasranaman, V.[Vivek], Saxena, S.[Shobhit], Mahalingam, S.[Selvam],
Google Newspaper Search: Image Processing and Analysis Pipeline,
ICDAR09(621-625).
IEEE DOI Link 0907
Scanned older newspapers. BibRef

Terasawa, K., Tanaka, Y.,
Locality Sensitive Pseudo-Code for Document Images,
ICDAR07(73-77).
IEEE DOI Link 0709
BibRef

Seki, M.[Minenobu], Fujio, M.[Masakazu], Nagasaki, T.[Takeshi], Shinjo, H.[Hiroshi], Marukawa, K.[Katsumi],
Information Management System Using Structure Analysis of Paper/Electronic Documents and Its Applications,
ICDAR07(689-693).
IEEE DOI Link 0709
BibRef

Boutemedjet, S.[Sabri], Ziou, D.[Djemel],
Visual Aspect: A Unified Content-Based Collaborative Filtering Model for Visual Document Recommendation,
ICIAR06(I: 685-696).
Springer DOI Link 0610
BibRef

Wen, D.[Di], Ding, X.Q.[Xiao-Qing],
Applying Preattentive Visual Guidance in Document Image Analysis,
IWICPAS06(328-338).
Springer DOI Link 0608
BibRef

Simske, S.J.[Steven J.], Arnabat, J.[Jordi],
Document Analysis System for Automating Workflows,
DAS06(588-592).
Springer DOI Link 0602
BibRef
Earlier:
User-Directed Analysis of Scanned Images,
DocEng03(212-221). November 2003. Segmentation, pixel classification, scene analysis, text processing, document capture, zoning, click and select.
WWW Version. BibRef
And: Hewlett-Packard Labs, TR-233, 2003.
HTML Version. BibRef

Fan, J., Lin, X., Simske, S.J.,
A comprehensive image processing suite for book re-mastering,
ICDAR05(I: 447-451).
IEEE DOI Link 0508
BibRef

Belaïd, A.[Abdel], Alusse, A.[André],
Toward File Consolidation by Document Categorization,
DAS06(437-448).
Springer DOI Link 0602
BibRef

Qiang, Q.[Qi], He, Q.[Qinming],
A Multiclass Classification Framework for Document Categorization,
DAS06(474-483).
Springer DOI Link 0602
BibRef

McCullough, J., Arnabat, J., Martinez, O., Dobbins, S.,
Commercial Quality Text: What Does it Take?,
DPP03(36-37).
WWW Version. BibRef 0300

Nagy, G., Lopresti, D.P.,
Interactive Document Processing and Digital Libraries,
DIAL06(2-11).
IEEE DOI Link 0604
BibRef

Lopresti, D.P.[Daniel P.],
Exploiting WWW Resources in Experimental Document Analysis Research,
DAS02(532 ff.).
HTML Version. 0303
BibRef

Baird, H.S.[Henry S.], Popat, K.[Kris],
Human Interactive Proofs and Document Image Analysis,
DAS02(507 ff.).
HTML Version. 0303
BibRef

Mikheev, A.[Artem], Vincent, L.[Luc], Hawrylycz, M.[Mike], Bottou, L.[Léon],
Electronic Document Publishing Using DjVu,
DAS02(480 ff.).
HTML Version. 0303
BibRef

Bottou, L.[Leon], Haffner, P.[Patrick], Le Cun, Y.L.[Yann L.],
Efficient conversion of digital documents to multilayer raster formats,
ICDAR01(444-448).
IEEE DOI Link 0109
BibRef

Bottou, L.[Leon], Haffner, P.[Patrick], Howard, P.G.[Paul G.], Le Cun, Y.L.[Yann L.],
Color Documents on the Web with DjVu,
ICIP99(I:239-243).
IEEE Abstract. IEEE Top Reference. BibRef 9900

Leung, C.C., Kwok, P.C.K., Chan, F.H.Y., Tsui, W.K.,
Normalization of contrast in document images using generalized fuzzy operator with least square method,
ICPR02(III: 115-118).
IEEE DOI Link 0211
BibRef

Fataicha, Y., Cheriet, M., Nie, J.Y., Suen, C.Y.,
Content analysis in document images: a scale space approach,
ICPR02(III: 335-338).
IEEE DOI Link 0211
BibRef

Spitz, A.L.,
Progress in document reconstruction,
ICPR02(I: 464-467).
IEEE DOI Link 0211
BibRef

Torkkola, K.,
Discriminative features for document classification,
ICPR02(I: 472-475).
IEEE DOI Link 0211
BibRef

Breuel, T.M., Janssen, W.C., Popat, K., Baird, H.S.,
Paper to PDA,
ICPR02(I: 476-479).
IEEE DOI Link 0211
BibRef

da Cunha Cavalcanti, G.D., de Barros Carvalho, E.C.,
An architecture for document management,
ICIP02(III: 973-976).
IEEE Abstract. IEEE Top Reference. 0210
BibRef

Harlfinger, D., Kotzabassi, S.,
Hidden in Greek Manuscripts,
ICIP01(Hidden in Greek Manuscripts). 0108
Invited Talk. Not in proceedings. BibRef

Redeke, I.,
Image and Graphic Reader,
ICIP01(I: 806-809).
IEEE Abstract. IEEE Top Reference. 0108
BibRef

Roussel, N., Hitz, O., Ingold, R.,
Web-based cooperative document understanding,
ICDAR01(368-373).
IEEE DOI Link 0109
BibRef

Zaghetto, A.[Alexandre], de Queiroz, R.L.[Ricardo L.],
Iterative pre- and post-processing for MRC layers of scanned documents,
ICIP08(1009-1012).
IEEE DOI Link 0810
BibRef

de Queiroz, R.L.,
Pre-Processing for MRC Layers of Scanned Images,
ICIP06(3093-3096).
IEEE DOI Link 0610
BibRef
Earlier:
On Data-filling Algorithms for MRC Layers,
ICIP00(Vol II: 586-589).
IEEE Abstract. IEEE Top Reference. 0008
BibRef

Yamada, K., Ishikawa, K., Nakajima, N.,
A Method of Analyzing the Handling of Paper Documents in Motion Images,
ICPR00(Vol IV: 413-416).
IEEE DOI Link
HTML Version. 0009
BibRef

Yang, Y., Yan, H.,
A Robust Document Processing System Combining Image Segmentation with Content-based Document Compression,
ICPR00(Vol IV: 519-522).
IEEE DOI Link
HTML Version. 0009
BibRef

Pavlidis, T.,
A New Paper/computer Interface: Two-dimensional Symbologies,
ICPR00(Vol II: 145-151).
IEEE DOI Link
HTML Version. 0009
BibRef

Srihari, S.N., and Zack, G.W.,
Document Image Analysis,
ICPR86(434-436). BibRef 8600

Kim, W.Y., Yuan, P.,
A Practical Pattern Recognition System for Translation, Scale and Rotation Invariance,
CVPR94(391-396).
IEEE Abstract. IEEE Top Reference. BibRef 9400

Huttenlocher, D.P.[Daniel P.], Rucklidge, W.J.[William J.],
DigiPaper: A Versatile Color Document Image Representation,
ICIP99(I:219-223).
IEEE Abstract. IEEE Top Reference. BibRef 9900

Tayeb-Bey, S., Saidi, A., Emptoz, H.,
Analysis and Conversion of Documents,
ICPR98(Vol II: 1089-1091).
IEEE DOI Link 9808
BibRef

Nakajima, N.[Noboru], Tanaka, N.[Naoya], Yamada, K.[Keiji],
Document Reconstruction and Recognition from an Image Sequence,
ICPR98(Vol I: 922-925).
IEEE DOI Link 9808
BibRef

Li, Y., Lalonde, M., Reiher, E., Rizand, J.F., Zhu, C.J.,
A Knowledge-Based Image Understanding Environment for Document Processing,
ICDAR97(We-2C) 9708
BibRef

Kauniskangas, H.[Hannu], Sauvola, J.[Jaakko],
An Automated Defect Management for Document Images,
ICPR98(Vol II: 1288-1294).
IEEE DOI Link 9808
BibRef

Eglin, V.[Véronique], Emptoz, H.[Hubert],
Logarithmic Spiral Grid and Gaze Control for the Development of Strategies of Visual Segmentation on a Document,
ICDAR97(Poste) 9708
BibRef

Blaesius, K.H., Grawemeyer, B., John, I., Kuhn, N.,
Knowledge-Based Document Analysis,
ICDAR97(Poste) 9708
BibRef

Wenzel, C.,
Supporting Information Extraction from Printed Documents by Lexico-Semantic Pattern Matching,
ICDAR97(Poste) 9708
BibRef

Chang, F., Chiu, T.F., Chou, T.R., Lee, M.C., Lu, Y.C., Shuai, T.Y., Tan, T.M., Wu, J.J., Young, C.S.,
A Document Analysis and Recognition System,
ICDAR97(Poste) 9708
BibRef

Miyamoto, T., Ishitani, Y., Seino, K., Nakamura, T., Tanabe, Y.,
Analysis of Required Elements for Next-Generation Document Reader on the Basis of User Requirements,
ICDAR97(Tu-3C) 9708
BibRef

Bunke, H., Gonin, R., Moeri, D.,
A Tool For Versatile And User-Friendly Document Correction,
ICDAR97(Tu-3C) 9708
BibRef

Yamazaki, Y., Komatsu, N.,
A Proposal for a Text-Indicated Writer Verification Method,
ICDAR97(Poste) 9708
BibRef

Bapst, F., Zramdini, A.[Abdelwahab], Ingold, R.[Rolf],
A Scenario Model Advocating User-Driven Adaptive Document Recognition Systems,
ICDAR97(Poste) 9708
BibRef

Buddrus, F., Bellavia, M.,
Surfing on ODBMS (Maintaining WWW Documents with O2),
ICDAR97(Poste) 9708
BibRef

O'Keefe, S.E.M., and Austin, J.,
Document Feature Recognition using a Mesh of Associative Memories,
BMVC96(Poster Session 1). 9608
BibRef
Earlier:
Application of an Associative Memory to the Analysis of Document Fax Images,
BMVC94(xx-yy).
PDF Version. 9409
University of York BibRef

Yamada, M., and Hasuike, K.,
Document Image Processing Based on Enhanced Border Following Algorithm,
ICPR90(II: 231-236).
IEEE DOI Link BibRef 9000

Kubota, K., Iwaki, O., Arakawa, H.,
Document Understanding System,
ICPR84(612-614). BibRef 8400

Nagy, G., Seth, S.,
Hierarchical Representation of Optically Scanned Documents,
ICPR84(347-349). BibRef 8400

Doster, W.,
Different States of a Document's Content on Its Way from the Gutenbergian World to the Electronic World,
ICPR84(872-874). BibRef 8400

Moulinier, I.[Isabelle], Raskinis, G.[Gailius], Ganascia, J.G.[Jean-Gabriel],
Text Categorization: A Symbolic Approach,
SDAIR96(XX) University of Paris. Vytautas Magnus University. BibRef 9600

Kutlu, G.[Gokhan], Draper, B.A.[Bruce A.], Moss, E.B.[Eliot B.], Riseman, E.M.[Edward M.],
Support Tools for Visual Information Management,
SDAIR96(XX) University of Massachusetts.
Postscript Version. BibRef 9600

Searls, D.B., Taylor, S.L.,
Document Image Analysis Using Logic-Grammar-Based Syntactic Pattern Recognition,
SDIA92(xx-yy). 0905
BibRef

Chapter on OCR, Document Analysis and Character Recognition Systems continues in
Document Analysis Systems, General, Survey, Evaluation.


Last update: Nov 16, 2009 at 19:35:14