23.2.2.2.8 Document Layout, Structure Analysis, Web Documents, Online Documents

Document Analysis. Application, Document Layout.

Ashraf, F., Ozyer, T., Alhajj, R.,
Employing Clustering Techniques for Automatic Information Extraction From HTML Documents,
SMC-C(38), No. 5, September 2008, pp. 660-673.
IEEE DOI Link 0810
BibRef

Carullo, M.[Moreno], Binaghi, E.[Elisabetta], Gallo, I.[Ignazio],
An online document clustering technique for short web contents,
PRL(30), No. 10, 15 July 2009, pp. 870-876.
Elsevier DOI Link
WWW Version. 0906
Online clustering; Short documents analysis; Similarity measures BibRef

Carullo, M.[Moreno], Binaghi, E.[Elisabetta], Gallo, I.[Ignazio], Lamberti, N.[Nicola],
Clustering of short commercial documents for the web,
ICPR08(1-4).
IEEE DOI Link 0812
BibRef


Hassan, T.[Tamir],
User-Guided Wrapping of PDF Documents Using Graph Matching Techniques,
ICDAR09(631-635).
IEEE DOI Link 0907
PDF does not have the structure given by HTML. BibRef

Ghosh, S.[Saptarshi], Mitra, P.[Pabitra],
Combining content and structure similarity for XML document classification using composite SVM kernels,
ICPR08(1-4).
IEEE DOI Link 0812
BibRef

Hirano, T.[Takashi], Okano, Y.[Yuichi], Okada, Y.[Yasuhiro], Yoda, F.[Fumio],
Text and Layout Information Extraction from Document Files of Various Formats Based on the Analysis of Page Description Language,
ICDAR07(262-266).
IEEE DOI Link 0709
BibRef

Burget, R.,
Layout Based Information Extraction from HTML Documents,
ICDAR07(624-628).
IEEE DOI Link 0709
BibRef

Guo, H., Mahmud, J., Borodin, Y., Stent, A., Ramakrishnan, I.,
A General Approach for Partitioning Web Page Content Based on Geometric and Style Information,
ICDAR07(929-933).
IEEE DOI Link 0709
BibRef

Yoshida, M., Nakagawa, H.,
Web Document Parsing: A New Approach to Modeling Layout-Language Relations,
ICDAR07(203-207).
IEEE DOI Link 0709
BibRef

Ferilli, S., Biba, M., Basile, T.M.A., Esposito, F.,
Incremental machine learning techniques for document layout understanding,
ICPR08(1-4).
IEEE DOI Link 0812
BibRef

Esposito, F., Ferilli, S., di Mauro, N., Basile, T.M.A.,
Incremental Learning of First Order Logic Theories for the Automatic Annotations of Web Documents,
ICDAR07(1093-1097).
IEEE DOI Link 0709
BibRef

Esposito, F., Ferilli, S., Basile, T.M.A., di Mauro, N.,
Automatic Content-based Indexing of Digital Documents through Intelligent Processing Techniques,
DIAL06(204-219).
IEEE DOI Link 0604
BibRef
Earlier:
Intelligent document processing,
ICDAR05(II: 1100-1104).
IEEE DOI Link 0508
BibRef

Watai, Y.[Yasuyuki], Yamasaki, T.[Toshihiko], Aizawa, K.[Kiyoharu],
View-Based Web Page Retrieval using Interactive Sketch Query,
ICIP07(VI: 357-360).
IEEE DOI Link 0709
BibRef

Ma, J.C.[Jun-Chang], Gu, Z.M.[Zhi-Min],
A Shared Fragments Analysis System for Large Collections of Web Pages,
DAS06(390-401).
Springer DOI Link 0602
BibRef

Liu, W.Y.[Wen-Yin], Huang, G.[Guanglin], Liu, X.Y.[Xiao-Yue], Deng, X.[Xiaotie], Min, Z.[Zhang],
Phishing Web page detection,
ICDAR05(II: 560-564).
IEEE DOI Link 0508
BibRef

Feng, J., Haffner, P., Gilbert, M.,
A learning approach to discovering Web page semantic structures,
ICDAR05(II: 1055-1059).
IEEE DOI Link 0508
BibRef

Chao, H.[Hui], Lin, X.F.[Xiao Fan],
Capturing the layout of electronic documents for reuse in variable data printing,
ICDAR05(II: 940-944).
IEEE DOI Link 0508
BibRef

Chao, H.[Hui], Fan, J.[Jian],
Layout and Content Extraction for PDF Documents,
DAS04(213-224).
WWW Version. 0505
BibRef

Behera, A., Lalanne, D., Ingold, R.,
Enhancement of layout-based identification of low-resolution documents using geometrical color distribution,
ICDAR05(I: 468-472).
IEEE DOI Link 0508
BibRef

Mekhaldi, D.[Dalila], Lalanne, D.[Denis], Ingold, R.[Rolf],
From searching to browsing through multimodal documents linking,
ICDAR05(II: 924-928).
IEEE DOI Link 0508
BibRef
Earlier:
Unity Is Strength: Coupling Media for Thematic Segmentation,
DAS04(559-562).
WWW Version. 0505
BibRef

Rigamonti, M., Bloechle, J.L., Hadjar, K., Lalanne, D., Ingold, R.,
Towards a canonical and structured representation of PDF documents through reverse engineering,
ICDAR05(II: 1050-1054).
IEEE DOI Link 0508
BibRef

Hadjar, K., Rigamonti, M., Lalanne, D., Ingold, R.,
Xed: a new tool for extracting hidden structures from electronic documents,
DIAL04(212-224).
IEEE DOI Link 0404
BibRef

Hadjar, K., Ingold, R.,
Logical labeling of Arabic newspapers using artificial neural nets,
ICDAR05(I: 426-430).
IEEE DOI Link 0508
BibRef

Schenker, A.[Adam], Bunke, H.[Horst], Last, M.[Mark], Kandel, A.[Abraham],
A Graph-Based Framework for Web Document Mining,
DAS04(401-412).
WWW Version. 0505
BibRef

Schenker, A.[Adam], Last, M.[Mark], Bunke, H.[Horst], Kandel, A.[Abraham],
Classification of web documents using a graph model,
ICDAR03(240-244).
IEEE Abstract. IEEE Top Reference. 0311
BibRef

Vitali, F.[Fabio], di Iorio, A.[Angelo], Campori, E.V.[Elisa Ventura],
Rule-Based Structural Analysis of Web Pages,
DAS04(425-437).
WWW Version. 0505
BibRef

Hu, J.Y.[Jian-Ying], Bagga, A.,
Identifying story and preview images in news web pages,
ICDAR03(640-644).
IEEE Abstract. IEEE Top Reference. 0311
BibRef

Ramachandran, S., Kashi, R.,
An architecture for ink annotations on web documents,
ICDAR03(256-260).
IEEE Abstract. IEEE Top Reference. 0311
BibRef

Gagneux, A., Emptoz, H.,
Web site: a structured document,
ICDAR03(1158-1162).
IEEE Abstract. IEEE Top Reference. 0311
BibRef

Mukherjee, S., Yang, G.[Guizhen], Tan, W.F.[Wen-Fang], Ramakrishnan, I.V.,
Automatic discovery of semantic structures in HTML documents,
ICDAR03(245-249).
IEEE Abstract. IEEE Top Reference. 0311
BibRef

Alam, H., Kumar, A., Nakamura, M., Rahman, F., Tarnikova, Y., Wilcox, C.[Che],
Structured and unstructured document summarization: Design of a commercial summarizer using Lexical chains,
ICDAR03(1147-1152).
IEEE Abstract. IEEE Top Reference. 0311
BibRef

Rahman, F., Alam, H.,
A commercial Web based digital library for sharing and distributing documents,
DIAL04(93-103).
IEEE DOI Link 0404
BibRef

Alam, H., Hartono, R., Kumar, A., Rahman, F., Tarnikova, Y., Wilcox, C.[Che],
Web page summarization for handheld devices: a natural language approach,
ICDAR03(1153-1158).
IEEE Abstract. IEEE Top Reference. 0311
BibRef

Rahman, A.F.R., Alam, H., Hartono, R., Ariyoshi, K.,
Automatic summarization of Web content to smaller display devices,
ICDAR01(1064-1068).
IEEE DOI Link 0109
BibRef

Serradura, L., Slimane, M., Vincent, N.,
Web sites thematic classification using hidden Markov models,
ICDAR01(1094-1098).
IEEE DOI Link 0109
BibRef

Penn, G., Hu, J.Y.[Jian-Ying], Luo, H.[Hengbin], McDonald, R.,
Flexible Web document analysis for delivery to narrow-bandwidth devices,
ICDAR01(1074-1078).
IEEE DOI Link 0109
BibRef

Anjewierden, A.,
AIDAS: incremental logical structure discovery in PDF documents,
ICDAR01(374-378).
IEEE DOI Link 0109
BibRef

Athitsos, V., Swain, M.J., Frankel, C.,
Distinguishing photographs and graphics on the World Wide Web,
CBAIVL97(10).
IEEE DOI Link 9706
BibRef

Chapter on OCR, Document Analysis and Character Recognition Systems continues in
Document Retrieval Systems, Databases and Issues, Libraries.


Last update: Nov 16, 2009 at 19:35:14