What is the difference between structured and unstructured documents?

Structured documents, also called fixed forms, contain a set of information whose formatting, number of fields, and layout are identical from one document instance to the next. Most questionnaires and application forms are fixed forms. They are usually distributed as blank forms, ideally with constrained text boxes and "fill-in-the-bubble" responses, and are filled out by hand. A standard document definition may not always produce the desired results on fixed forms, for example when a fax machine shifts the text or when image quality is poor. In those cases, a FlexiLayout or machine learning (trained field extraction) may give better results.
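To illustrate why a shifted page can break fixed-form extraction, here is a minimal Python sketch. It is not the product's API. The `Word` type, the `ZONES` coordinates, and the `extract_fixed` and `shift` helpers are all hypothetical names invented for this example. It assumes each recognized word comes with the position of its top-left corner on the page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    text: str
    x: float  # left edge of the word on the page
    y: float  # top edge of the word on the page


# Hypothetical fixed zones: field name -> (left, top, right, bottom).
ZONES = {
    "name": (100, 50, 300, 70),
    "date": (100, 90, 300, 110),
}


def extract_fixed(words, zones=ZONES):
    """Return the text whose top-left corner lies inside each fixed zone."""
    result = {}
    for field, (left, top, right, bottom) in zones.items():
        inside = [w for w in words
                  if left <= w.x <= right and top <= w.y <= bottom]
        inside.sort(key=lambda w: (w.y, w.x))  # reading order
        result[field] = " ".join(w.text for w in inside)
    return result


def shift(words, dx, dy):
    """Simulate a scan or fax that moved the whole page content."""
    return [Word(w.text, w.x + dx, w.y + dy) for w in words]


page = [Word("Jane", 110, 55), Word("Doe", 160, 55), Word("2024-01-31", 110, 95)]

clean = extract_fixed(page)
shifted = extract_fixed(shift(page, 0, 30))
```

On the unshifted page both fields are found. After a vertical shift of 30 units every word falls outside its zone and both fields come back empty, which is the kind of failure described above.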

Unstructured documents present information in a free format. Examples include contracts, letters, orders, and bills of lading. A FlexiLayout can locate the desired data on such a document by searching for keywords and then using coordinates relative to them. NLP (natural language processing) can also be used to process unstructured documents such as contracts or deeds.
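The keyword-and-relative-coordinates idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how a FlexiLayout is actually defined. The `Word` type, the `extract_by_keyword` function, and the `max_gap` and `row_tolerance` parameters are hypothetical. It assumes each recognized word carries the position of its top-left corner.

```python
from typing import NamedTuple, Optional


class Word(NamedTuple):
    text: str
    x: float  # left edge of the word on the page
    y: float  # top edge of the word on the page


def extract_by_keyword(words, keyword, max_gap=200, row_tolerance=5) -> Optional[str]:
    """Return the text to the right of `keyword` on the same row, or None."""
    # Step 1: find the anchor keyword anywhere on the page.
    anchor = next((w for w in words if w.text.lower() == keyword.lower()), None)
    if anchor is None:
        return None
    # Step 2: search a region defined relative to the anchor,
    # not relative to the page origin.
    found = [w for w in words
             if abs(w.y - anchor.y) <= row_tolerance
             and 0 < w.x - anchor.x <= max_gap]
    found.sort(key=lambda w: w.x)
    return " ".join(w.text for w in found) or None


letter = [
    Word("Dear", 50, 40), Word("customer", 100, 40),
    Word("Total:", 60, 300), Word("1,250.00", 140, 300), Word("USD", 220, 300),
]

# The same content moved elsewhere on the page.
moved = [Word(w.text, w.x + 35, w.y + 80) for w in letter]

value = extract_by_keyword(letter, "Total:")
value_moved = extract_by_keyword(moved, "Total:")
```

Because the search region is measured from the keyword, the same value is returned whether or not the content has moved on the page.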

Documents can also be semi-structured, containing a mix of structured and unstructured data.



