Question
Is it possible to create a custom XML output?
Answer
Yes. When you export to XML, you can either use the default XML schema or create a custom one.
Default XML Schema
The default XML schema is defined in the FineReader10-schema-v1.xsd file, which is located in the Inc folder (Start > Programs > ABBYY FineReader Engine 12 > Installation Folders > Include Files Folder). For a description of the XML tags, see the XML Schema Description.
Modifying the Default XML Schema
The following properties of the XMLExportParams object control which optional elements are written to the default XML output:
WriteCharacterRecognitionVariants
WriteCharAttributes
WriteCharFormatting
WriteNondeskewedCoordinates
WriteWordRecognitionVariants
For details, see the FineReader Engine Help, section "API Reference → Parameter Objects → Export Parameters → XMLExportParams".
Custom XML Schema
You can also produce XML in a custom format by writing the file yourself, for example with the .NET StreamWriter class, instead of using the built-in XML export.
The code sample below illustrates this approach.
The sample code does the following:
Saves each picture from a recognized document to a separate image file;
Writes an XML file that looks like this:
...
<RasterPicture>
<D:\pictures\1.jpg>
</RasterPicture>
...
Visual Basic Sample Code
Imports System.IO

' Engine is assumed to be an initialized FineReader Engine object that is accessible from this procedure
Private Sub Export(ByVal FRDocument As FREngine.FRDocument, ByVal filePath As String)
    ' Create (or overwrite) the XML file at filePath and open it for writing
    Dim fs As New FileStream(filePath, FileMode.Create, FileAccess.Write)
    ' Wrap the file stream in a StreamWriter, which writes UTF-8 text by default
    Dim s As New StreamWriter(fs)
    ' Write the XML declaration
    s.WriteLine("<?xml version='1.0' encoding='UTF-8'?>")
    Dim imagesFolderName As String
    imagesFolderName = …
    Dim imagesPath As String
    Dim Blocks As FREngine.LayoutBlocks
    ' Go through every block on every page of the document
    For PagesIndex As Integer = 0 To FRDocument.Pages.Count - 1
        Blocks = FRDocument.Pages(PagesIndex).Layout.Blocks
        For BlocksIndex As Integer = 0 To Blocks.Count - 1
            ' Process raster picture blocks only
            If Blocks(BlocksIndex).Type = FREngine.BlockTypeEnum.BT_RasterPicture Then
                s.WriteLine("<RasterPicture>")
                ' Clip the page image to the region of the picture block
                Dim ImageModification As FREngine.ImageModification
                ImageModification = Engine.CreateImageModification
                ImageModification.AddClipRegion(Blocks(BlocksIndex).Region)
                imagesPath = …
                ' Save the clipped picture in the same format as the source image
                FRDocument.Pages(PagesIndex).ImageDocument.ColorImage.WriteToFile(imagesPath,
                    FRDocument.Pages(PagesIndex).ImageDocument.SourceImageFileFormat,
                    ImageModification)
                ' Write the path of the saved picture
                s.WriteLine("<" & imagesPath & ">")
                s.WriteLine("</RasterPicture>")
            End If
        Next BlocksIndex
    Next PagesIndex
    ' Flush the writer and close the file
    s.Close()
End Sub
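The XML-writing part of this approach can also be tried without the Engine. The following is a minimal Python sketch, not part of the FineReader Engine API: the function name build_picture_index and the Pictures root element are hypothetical, and the list of image paths stands in for the files saved by the sample above. It assumes the consumer needs well-formed XML, so it differs from the sample output in two ways: the path is stored as escaped element text (a file path is not a legal XML element name), and all entries are wrapped in a single root element.

```python
from xml.sax.saxutils import escape


def build_picture_index(image_paths):
    """Return an XML document with one <RasterPicture> element per image path.

    The <Pictures> root element is a hypothetical name chosen for this sketch.
    """
    lines = ['<?xml version="1.0" encoding="UTF-8"?>', "<Pictures>"]
    for path in image_paths:
        # escape() replaces &, < and > so that any path yields well-formed XML
        lines.append("  <RasterPicture>" + escape(path) + "</RasterPicture>")
    lines.append("</Pictures>")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example paths only; in the sample above they come from imagesPath
    print(build_picture_index([r"D:\pictures\1.jpg", r"D:\pictures\a&b.jpg"]))
```

Storing the path as text keeps the file readable by any standard XML parser. If the consumer expects a different layout, only the two string templates need to change.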