23.2.2 Form and Layout Analysis

Document Analysis. Form Analysis. Application, Document Layout.

23.2.2.1 Extract Data from Specific Forms

Document Analysis. Form Analysis.

Maderlechner, G.,
Symbolic Subtraction from Fixed Formatted Graphics and Text from Filled in Forms,
MVA(3), 1990, pp. 457-459. BibRef 9000

Taylor, S.L., Fritzson, R., and Pastor, J.A.,
Extraction of Data from Preprinted Forms,
MVA(5), 1992, pp. 211-222. BibRef 9200

Casey, R., Ferguson, D., Mohiuddin, K., and Walach, E.,
Intelligent Forms Processing System,
MVA(5), 1992, pp. 143-155. BibRef 9200

Garris, M.D., Dimmick, D.L.,
Form Design for High-Accuracy Optical Character-Recognition,
PAMI(18), No. 6, June 1996, pp. 653-656.
IEEE Abstract. IEEE Top Reference.
WWW Version. 9607 BibRef
Earlier:
Evaluating Form Designs for Optical Character Recognition,
NISTIR5364, February 1994. How to design a form to make it easier for OCR. BibRef

Garris, M.D.,
Correlated Run Length Algorithm (CURL) for Detecting Form Structure within Digitized Documents,
SDAIR94(413-424). BibRef 9400

Lin, J.Y., Lee, C.W., Chen, Z.,
Identification of Business Forms Using Relationships Between Adjacent Frames,
MVA(9), No. 2, 1996, pp. 56-64.
HTML Version. 9609 Use relations between frames (blocks of the form). Convert to a graph, then to a 1-D string for matching. BibRef

Yu, B.[Bin], Jain, A.K.,
A Generic System for Form Dropout,
PAMI(18), No. 11, November 1996, pp. 1127-1134.
IEEE Abstract. IEEE Top Reference.
WWW Version. 9612 BibRef
Earlier:
A Form Dropout System,
ICPR96(III: 701-705).
WWW Version. 9608 (Michigan State Univ., USA) Getting the entered text out of the form. BibRef

Glasgow, B., Mandell, A., Binney, D., Ghemri, L., Fisher, D.,
Mita: An Information Extraction Approach to the Analysis of Free-Form Text in Life-Insurance Applications,
AIMag(19), No. 1, Spring 1998, pp. 59-71. 9804 BibRef

Fan, K.C., Lu, J.M., Chen, G.D.,
A Feature Point Clustering Approach to the Recognition of Form Documents,
PR(31), No. 9, September 1998, pp. 1205-1220.
WWW Version. 9808 BibRef

Tseng, L.Y., Chen, R.C.,
Recognition and Data Extraction of Form Documents Based on 3 Types of Line Segments,
PR(31), No. 10, October 1998, pp. 1525-1540.
WWW Version. 9808 BibRef
Earlier:
The Recognition of Form Documents Based on Three Types of Line Segments,
ICDAR97(Mo-2C) 9708 BibRef

Chen, J.L., Lee, H.J.,
An Efficient Algorithm for Form Structure Extraction Using Strip Projection,
PR(31), No. 9, September 1998, pp. 1353-1368.
WWW Version. 9808 BibRef
Earlier:
A Novel Form Structure Extraction Method Using Strip Projection,
ICPR96(III: 823-827).
WWW Version. 9608 (National Chiao Tung Univ., ROC) BibRef

Chen, J.L.[Jiun-Lin], Lee, H.J.[Hsi-Jian],
Field data extraction for form document processing using a gravitation-based algorithm,
PR(34), No. 9, September 2001, pp. 1741-1750.
WWW Version. 0108 BibRef

Lee, H.J.[Hsi-Jian], Chen, J.L.[Jiun-Lin],
Field-Data Grouping for Form Document Processing Using a Gravitation-Based Algorithm,
ICPR98(Vol II: 1095-1097).
WWW Version. 9808 BibRef

Cracknell, C., Downton, A.C., Du, L.,
An object-oriented descriptive language to facilitate advanced handwritten form processing,
IVC(16), No. 12-13, 24 August 1998, pp. 843-853.
WWW Version. BibRef 9808
Earlier:
TABS: A New Software Framework for Document Image Processing, Analysis and Understanding,
ICDAR97(We-2C) 9708 BibRef
And:
Hierarchical recognition of structured hand-printed documents using rule-trees,
BMVC97(xx-yy).
HTML Version. 0209 BibRef

Du, L., Downton, A.C., Lucas, S.M., Al-Badr, B.,
Generalized Contextual Recognition of Hand-Printed Documents Using Semantic Trees With Lazy Evaluation,
ICDAR97(Mo-4B) 9708 BibRef

Cracknell, C., Downton, A.C.,
TABS: Script-Based Software Framework for Research in Image Processing, Analysis and Understanding,
VISP(145), No. 3, June 1998, pp. 194-202. 9808 BibRef

Downton, A.C.[Andy C.], Cracknell, C.,
Document Image Understanding of Handwritten Forms Using Rule-Trees,
ICPR98(Vol I: 936-938).
WWW Version. 9808 BibRef

Cesarini, F.[Francesca], Gori, M.[Marco], Marinai, S.[Simone], Soda, G.[Giovanni],
INFORMYS: A Flexible Invoice Like Form Reader System,
PAMI(20), No. 7, July 1998, pp. 730-745.
IEEE Abstract. IEEE Top Reference.
WWW Version. 9808 Extract text from accounting documents. BibRef

Cesarini, F., Francesconi, E., Gori, M., Soda, G.,
Using Physical and Logical Constraints for Invoice Understanding,
PAA(3), No. 2, 2000, pp. 182-195. 0010 BibRef

Cesarini, F., Francesconi, E., Gori, M., Soda, G.,
Analysis and understanding of multi-class invoices,
IJDAR(6), No. 2, 2003, pp. 102-114.
WWW Version. 0310 BibRef

Cesarini, F., Francesconi, E., Gori, M., Marinai, S., Sheng, J.Q., Soda, G.,
Rectangle Labelling for an Invoice Understanding System,
ICDAR97(Tu-2B) 9708 BibRef

Ishitani, Y.,
Flexible and Robust Model Matching based on Association Graph for Form Image Understanding,
PAA(3), No. 2, 2000, pp. 104-119. 0010 BibRef

Ishitani, Y.,
Document transformation system from papers to XML data based on pivot XML document method,
ICDAR03(250-255).
IEEE Abstract. IEEE Top Reference. 0311 BibRef

Ishitani, Y.,
Model-based information extraction method tolerant of OCR errors for document images,
ICDAR01(908-915).
WWW Version. 0109 BibRef
Earlier:
Document Layout Analysis Based on Emergent Computation,
ICDAR97(Mo-2B) 9708 BibRef

Ming, D.[Delie], Liu, J.[Jian], Tian, J.[Jinwen],
Research on Chinese financial invoice recognition technology,
PRL(24), No. 1-3, January 2003, pp. 489-497.
HTML Version. 0211 BibRef

Ramdane, S.[Saïd], Taconet, B.[Bruno], Zahour, A.[Abderrazak],
Classification of forms with handwritten fields by planar hidden Markov models,
PR(36), No. 4, April 2003, pp. 1045-1060.
WWW Version. 0304 BibRef

Xi, D.[Dihua], Lee, S.W.[Seong-Whan],
Extraction of reference lines and items from form document images with complicated background,
PR(38), No. 2, February 2005, pp. 289-305.
WWW Version. 0411 BibRef
Earlier:
Reference line extraction from form documents with complicated backgrounds,
ICDAR03(1080-1084).
IEEE Abstract. IEEE Top Reference. 0311 BibRef

Romanowski, C.J., Nagi, R.,
On Comparing Bills of Materials: A Similarity/Distance Measure for Unordered Trees,
SMC-A(35), No. 2, March 2005, pp. 249-260.
IEEE Abstract. IEEE Top Reference. 0501 BibRef

Vinciarelli, A.[Alessandro],
Noisy Text Categorization,
PAMI(27), No. 12, December 2005, pp. 1882-1895.
WWW Version. 0512 BibRef
Earlier: ICPR04(II: 554-557).
WWW Version. 0409 BibRef

Milewski, R.[Robert], Govindaraju, V.[Venu],
Binarization and cleanup of handwritten text from carbon copy medical form images,
PR(41), No. 4, April 2008, pp. 1308-1315.
WWW Version. 0801 BibRef
Earlier:
Extraction of Handwritten Text from Carbon Copy Medical Form Images,
DAS06(106-116).
WWW Version. 0602 BibRef
Earlier:
Medical word recognition using a computational semantic lexicon,
FHR02(401-406).
IEEE Top Reference. 0209 BibRef

Milewski, R., Setlur, S., Govindaraju, V.,
A lexicon reduction strategy in the context of handwritten medical forms,
ICDAR05(II: 1146-1150).
WWW Version. 0508 BibRef


Rosman, G., Tzadok, A., Tal, D.,
A New Physically Motivated Warping Model for Form Drop-Out,
ICDAR07(774-778).
WWW Version. 0709 BibRef

Mace, S.[Sebastian],
Context-Driven Constraint Multiset Grammars with Incremental Parsing for On-line Structured Document Interpretation,
ICDAR07(442-446).
WWW Version. 0709 BibRef

Bulacu, M.[Marius], van Koert, R.[Rutger], Schomaker, L.[Lambert], van der Zant, T.[Tijn],
Layout Analysis of Handwritten Historical Documents for Searching the Archive of the Cabinet of the Dutch Queen,
ICDAR07(357-361).
WWW Version. 0709 BibRef

Hamza, H.[Hatem], Belaid, Y.[Yolande], Belaid, A.[Abdel],
A Case-Based Reasoning Approach for Invoice Structure Extraction,
ICDAR07(327-331).
WWW Version. 0709 BibRef
And:
A Case-Based Reasoning Approach for Unknown Class Invoice Processing,
ICIP07(V: 353-356).
WWW Version. 0709 BibRef

Chen, S., Mao, S., Thoma, G.R.,
Simultaneous Layout Style and Logical Entity Recognition in a Heterogeneous Collection of Documents,
ICDAR07(118-122).
WWW Version. 0709 BibRef

Cao, H., Govindaraju, V.,
Vector Model Based Indexing and Retrieval of Handwritten Medical Forms,
ICDAR07(88-92).
WWW Version. 0709 BibRef

Cao, H.[Huaigu], Govindaraju, V.[Venu],
Handwritten Carbon Form Preprocessing Based on Markov Random Field,
CVPR07(1-7).
WWW Version. 0706 BibRef

Nagasaki, T.[Takeshi], Marukawa, K.[Katsumi], Kagehiro, T.[Tatsuhiko], Sako, H.[Hiroshi],
A Coupon Classification Method Based on Adaptive Image Vector Matching,
ICPR06(III: 280-283).
WWW Version. 0609 BibRef

Liu, D.[David], Chen, D.T.[Da-Tong], Chen, T.H.[Tsu-Han],
Unsupervised Image Layout Extraction,
ICIP06(1113-1116). 0610
WWW Version. BibRef
And:
Latent Layout Analysis for Discovering Objects in Images,
ICPR06(II: 468-471).
WWW Version. 0609 BibRef

Liu, D.[David], Chen, T.H.[Tsu-Han],
Semantic-Shift for Unsupervised Object Detection,
BP06(16).
WWW Version. 0609 BibRef

Taghva, K.[Kazem], Beckley, R.[Russell], Coombs, J.[Jeffrey],
The Effects of OCR Error on the Extraction of Private Information,
DAS06(348-357).
WWW Version. 0602 BibRef

Flaster, M.[Michael], Hillyer, B.[Bruce], Ho, T.K.[Tin Kam],
Exploratory Analysis System for Semi-structured Engineering Logs,
DAS06(291-301).
WWW Version. 0602 BibRef

Klein, B.[Bertin], Agne, S.[Stefan], Dengel, A.R.[Andreas R.],
On Benchmarking of Invoice Analysis Systems,
DAS06(312-323).
WWW Version. 0602 BibRef
Earlier:
Results of a Study on Invoice-Reading Systems in Germany,
DAS04(451-462).
WWW Version. 0505 BibRef

Agne, S., Dengel, A.R., Klein, B.,
Evaluating SEE-a benchmarking system for document page segmentation,
ICDAR03(634-638).
IEEE Abstract. IEEE Top Reference. 0311 BibRef

Tuganbaev, D., Pakhchanian, A., Deryagin, D.,
Universal data capture technology from semi-structured forms,
ICDAR05(I: 458-462).
WWW Version. 0508 BibRef

Shima, Y., Ohya, H., Yasuda, M.,
A form dropout method based on line-elimination and image-subtraction,
ICDAR05(I: 126-130).
WWW Version. 0508 BibRef

Viola, P.[Paul], Rinker, J.[James], Law, M.[Martin],
Automatic Fax Routing,
DAS04(484-495).
WWW Version. 0505 BibRef

Biagioli, C.[Carlo], Francesconi, E.[Enrico], Spinosa, P.[Pierluigi], Taddei, M.[Mirco],
XML Documents Within a Legal Domain: Standards and Tools for the Italian Legislative Environment,
DAS04(413-424).
WWW Version. 0505 BibRef

Cascini, G.[Gaetano], Fantechi, A.[Alessandro], Spinicci, E.[Emilio],
Natural Language Processing of Patents and Technical Documentation,
DAS04(508-520).
WWW Version. 0505 BibRef

Hadjar, K.[Karim], Ingold, R.[Rolf],
Physical Layout Analysis of Complex Structured Arabic Documents Using Artificial Neural Nets,
DAS04(170-178).
WWW Version. 0505 BibRef

Atkins, C.B.,
Adaptive photo collection page layout,
ICIP04(V: 2897-2900).
WWW Version. 0505 BibRef

Tam, V., Setiono, R., Santoso, A.,
Applying the conjugate gradient method for text document categorization,
ICPR04(II: 558-561).
WWW Version. 0409 BibRef

Belaid, Y., Belaid, A.,
Morphological tagging approach in document analysis of invoices,
ICPR04(I: 469-472).
WWW Version. 0409 BibRef

Belaid, A., Belaid, Y., Valverde, L.N., Kebairi, S.,
Adaptive technology for mail-order form segmentation,
ICDAR01(689-693).
WWW Version. 0109 BibRef

Downton, A.C., Lucas, S.M., Patoulas, G., Beccaloni, G.W., Scoble, M.J., Robinson, G.S.,
Computerising natural history card archives,
ICDAR03(354-358).
IEEE Abstract. IEEE Top Reference. 0311 BibRef

Downton, A.C., Tams, A.C., Wells, G.J., Holmes, A.C., Lucas, S.M., Beccaloni, G.W., Scoble, M.J., Robinson, G.S.,
Constructing Web-based legacy index card archives-architectural design issues and initial data acquisition,
ICDAR01(854-858).
WWW Version. 0109 BibRef

Sako, H., Seki, M., Furukawa, N., Ikeda, H., Imaizumi, A.,
Form reading based on form-type identification and form-data recognition,
ICDAR03(926-930).
IEEE Abstract. IEEE Top Reference. 0311 BibRef

Shimamura, T., Zhu, B.L.[Bi-Lan], Masuda, A., Onuma, M., Sakurada, T., Nakagawa, M.,
A prototype of an active form system,
ICDAR03(921-925).
IEEE Abstract. IEEE Top Reference. 0311 BibRef

Yan, H.P.[He-Ping], Wang, Z.[Zhiyan], Guo, S.[Sen],
An evaluation system for string extraction in the airline coupon project,
ICDAR05(II: 930-934).
WWW Version. 0508 BibRef

Zhao, S.[Shanheng], Wang, Z.[Zhiyan],
A high accuracy rate commercial flight coupon recognition system,
ICDAR03(82-86).
IEEE Abstract. IEEE Top Reference. 0311 BibRef

Thoma, G.R.[George R.], Ford, G.[Glenn], Le, D.[Daniel], Li, Z.R.[Zhi-Rong],
Text Verification in an Automated System for the Extraction of Bibliographic Data,
DAS02(423 ff.).
HTML Version. 0303 BibRef

Wnek, J.[Janusz],
Machine Learning of Generalized Document Templates for Data Extraction,
DAS02(457 ff.).
HTML Version. 0303 BibRef

Murshed, N.[Nabeel],
Automatic Reading of Traffic Tickets,
DAS02(66 ff.).
HTML Version. 0303 BibRef

Wong, W.S.[Wing Seong], Sherkat, N., Allen, T.,
Contextual focus for improved recognition of hand-filled forms,
ICDAR01(748-752).
WWW Version. 0109 BibRef
And:
Use of colour in form layout analysis,
ICDAR01(942-946).
WWW Version. 0109 BibRef

Hirano, T., Okada, Y., Yoda, F.,
Field extraction method from existing forms transmitted by facsimile,
ICDAR01(738-742).
WWW Version. 0109 BibRef

Shinjo, H., Hadano, E., Marukawa, K., Shima, Y., Sako, H.,
A recursive analysis for form cell recognition,
ICDAR01(694-698).
WWW Version. 0109 BibRef

Zheng, Y.F.[Ye-Feng], Liu, C.S.[Chang-Song], Ding, X.Q.[Xiao-Qing], Pan, S.Y.[Shi-Yan],
Form frame line detection with directional single-connected chain,
ICDAR01(699-703).
WWW Version. 0109 BibRef

Llados, J., Lumbreras, F., Chapaprieta, V., Queralt, J.,
ICAR: Identity Card Automatic Reader,
ICDAR01(470-474).
WWW Version. 0109 BibRef

Chhabra, A.K.,
Anatomy of a hand-filled form reader,
WACV94(195-204).
IEEE Abstract. IEEE Top Reference. 0403 BibRef

Fan, K.C.[Kuo-Chin], Wang, Y.K.[Yuan-Kai], Chang, M.L.[Mei-Lin],
Form Document Identification Using Line Structure Based Features,
ICDAR01(704-708).
WWW Version. 0109 BibRef
Earlier: A1, A3 Only: ICPR98(Vol II: 1098-1100).
WWW Version. 9808 BibRef

Trupin, E.[Eric], Ribert, A.[Arnaud], Diana, S., Heroux, P.,
Classification Method Study for Automatic Form Class Identification,
ICPR98(Vol I: 926-928).
WWW Version. 9808 BibRef

Duygulu, P.[Pinar], Dincel, E.[Ebru], Atalay, V.[Volkan],
A Heuristic Algorithm for Hierarchical Representation of Form Documents,
ICPR98(Vol I: 929-931).
WWW Version. 9808 BibRef

Shinjo, H., Nakashima, K., Koga, M., Marukawa, K., Shima, Y., Hadano, E.,
A Method for Connecting Disappeared Junction Patterns on Frame Lines in Form Documents,
ICDAR97(Poste) 9708 BibRef

Bayer, T.A., Mogg-Schneider, H.U.,
A Generic System for Processing Invoices,
ICDAR97(Poste) 9708 BibRef

Shamilian, J.H., Wood, T.L., Baird, H.S.,
A Retargetable Table Reader,
ICDAR97(Mo-3C) 9708 BibRef

Bohnacker, U., Schacht, J., Yuecel, T.,
Matching Form Lines Based on a Heuristic Search,
ICDAR97(Mo-2C) 9708 BibRef

Yoo, J.Y., Kim, M.K., Kwon, Y.B.,
Line Removal and Restoration of Handwritten Characters on the Form Documents,
ICDAR97(Mo-3B) 9708 BibRef

Mao, J.C.[Jian-Chang], Lorie, R., Mohiuddin, M.[Moidin],
A System for Automatically Reading IATA Flight Coupons,
ICDAR97(Mo-3C) 9708 BibRef

Mao, J.C.[Jian-Chang], Abayan, M., Mohiuddin, M.[Moidin],
A Model-Based Form Processing Sub-System,
ICPR96(III: 691-695).
WWW Version. 9608 (IBM Almaden Res. Center, USA) BibRef

Mao, J.C.[Jian-Chang], Mohiuddin, K.,
Form dropout using distance transformation,
ICIP95(III: 328-331).
WWW Version. 9510 BibRef

Arai, H., Odaka, K.,
Form Processing Based on Background Region Analysis,
ICDAR97(Mo-3C) 9708 BibRef

Tang, Y.Y., Liu, J.,
Information Acquisition and Storage of Forms in Document Processing,
ICDAR97(Mo-3C) 9708 BibRef

Safari, R., Narasimhamurthi, N., Shridhar, M., Ahmadi, M.,
Form Registration: A Computer Vision Approach,
ICDAR97(Poste) 9708 BibRef

Aksak, I., Feist, C., Kiiko, V., Knoefel, R., Matsello, V., Oganovskij, V., Schlesinger, M., Schlesinger, D., Stanke, G.,
Extraction of filled-in data from colour forms,
CAIP97(98-105).
WWW Version. 9709 BibRef

Aksak, I., Feist, C., Kijko, V., Knoefel, R., Matsello, V., Oganovskij, V., Schlesinger, M., Schlesinger, D., Stanke, G.,
Detection of the objects with given shape on the grey-valued pictures,
CAIP97(551-558).
WWW Version. 9709 BibRef

Ting, A., Leung, M.,
Business Form Classification Using Strings,
ICPR96(II: 690-694).
WWW Version. 9608 (School of Applied Science, SGP) BibRef

Kosiba, D., Kasturi, R.,
Automatic Invoice Interpretation: Invoice Structure Analysis,
ICPR96(III: 721-725).
WWW Version. 9608 (The Pennsylvania State Univ., USA) BibRef

Hirayama, H.,
Analyzing Form Images By Using Line-Shared-Adjacent Cell Relations,
ICPR96(III: 768-772).
WWW Version. 9608 (IBM Research, J) BibRef

Shimotsuji, S., Asano, M.,
Form Identification Based On Cell Structure,
ICPR96(III: 793-797).
WWW Version. 9608 (Toshiba Corp., J) BibRef

Lorie, R., Riyaz, V., Truong, T.,
A System for Automated Data Entry from Forms,
ICPR96(III: 686-690).
WWW Version. 9608 (IBM Almaden Res. Ctr., USA) BibRef

Garris, M.D.[Michael D.], Grother, P.J.[Patrick J.],
Generalized Form Registration Using Structure-Based Techniques,
SDAIR96(XX) National Institute of Standards and Technology. BibRef 9600

Ihle, T.[Torsten], Schirmer, H.[Helmut], Fuchs, S.[Siegfried],
Interpretation of printed forms for blind people,
CAIP95(550-555).
WWW Version. 9509 BibRef

Leedham, C.G., Monger, D.,
Evaluation of an Interactive Tool for Handwritten Form Description,
ICDAR95(1185-1188). BibRef 9500

Taylor, S.L.[S. Liebowitz], Fritzson, R.,
Registration and Region Extraction of Data from Forms,
ICPR92(I: 173-176).
WWW Version. BibRef 9200

Latanzio, B., Garzotto, A.,
Reliable Recognition of Handwritten Marks in Checkboxes,
SDAIR96(XX) Swiss Life Information Systems Research. BibRef 9600

Turolla, E., Belaïd, Y., Belaïd, A.,
Line and cell searching in tables or forms,
CIAP95(509-514).
WWW Version. 9509 BibRef

Chapter on OCR, Document Analysis and Character Recognition Systems continues in
Specific Examples: Extract Titles, Table of Contents, Citation, Information from Papers and Books.


Last update: Jun 25, 2008 at 13:37:57