PDF Text to Variable Node

The PDF Text to Variable Node creates variables from text content stored within a PDF. You first define the areas (regions) to extract text from, using a template PDF document as a reference. As documents are processed, each variable is populated with the text found at the location set for that variable.

Creating Variable Locations

To set the regions in a PDF, launch the Text to Variable tool and select a template document to use as a reference when creating the locations to extract text from. The template document is for reference only.

  1. Click on "Select Template PDF" to launch the file chooser.
  2. Select the PDF document you would like to use as your template document.
    • Note: For best results, the template must be exactly the same as the documents that will be processed through the flow.
  3. Use the document preview window to create the areas to search for text.
  4. Drag the cursor over the area you wish to extract text from. This will display an Add Variable dialog with a preview of the text that is extracted.
    • Note: If no text is found, the document contains no text content. To extract text properly, you will need to place an OCR node BEFORE this node in the flow.
  5. Set the variable options in the Add Variable dialog and click OK to create the new variable.
  6. To add additional text areas, click on the Plus Sign on the toolbar and repeat the process until all areas have been selected.
  7. Once you have finished creating all areas, click on OK to close the dialog and finish the variable creation process.
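
The idea behind a variable location can be sketched in a few lines of Python. This is only an illustration of region-based text selection, not how PAS is implemented: the Word and Region types, the coordinate convention (origin at the top left, units in points), and the sample data are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    """A piece of text with its bounding box (hypothetical structure)."""
    text: str
    x: float       # left edge
    y: float       # top edge
    width: float
    height: float


@dataclass(frozen=True)
class Region:
    """The rectangle drawn on the template document (hypothetical structure)."""
    x: float
    y: float
    width: float
    height: float

    def contains(self, word: Word) -> bool:
        # A word counts only if its whole bounding box lies inside the region.
        return (self.x <= word.x
                and word.x + word.width <= self.x + self.width
                and self.y <= word.y
                and word.y + word.height <= self.y + self.height)


def text_in_region(words, region):
    """Join the words inside the region in reading order (top to bottom, left to right)."""
    inside = [w for w in words if region.contains(w)]
    inside.sort(key=lambda w: (w.y, w.x))
    return " ".join(w.text for w in inside)


# Made-up page content for illustration.
page_words = [
    Word("Invoice", 50, 40, 60, 12),
    Word("No.", 115, 40, 25, 12),
    Word("10472", 145, 40, 40, 12),
    Word("Total", 50, 700, 40, 12),
]
invoice_region = Region(x=110, y=35, width=90, height=22)

print(text_in_region(page_words, invoice_region))  # prints "No. 10472"
```

An empty result from a sketch like this corresponds to the case in the note under step 4, where the selected area has no text content.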

Variable Settings

Variable Name: The name of the variable to be created

Type: The data type of the variable. PAS currently supports the following user-defined data types for variables: String, Numbers, Dates, and Boolean (see Variable Formatting).

Format: Sets the default format for the variable. The format can also be set dynamically at output time using Variable Formatting

Required: Sets whether a value is required in the variable for the flow to continue. When set to true, Trouble Handling is triggered if no variable is created on this node. When set to false, an empty string is passed through if no value is found.
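
The Required behavior can be pictured with a small Python sketch. It is a model of the rule described above, not PAS code: the resolve_variable function and the TroubleHandlingError exception are hypothetical names standing in for the node's logic and the flow's Trouble Handling path.

```python
class TroubleHandlingError(Exception):
    """Hypothetical stand-in for routing the document to Trouble Handling."""


def resolve_variable(name, extracted_text, required):
    """Return the value for a variable, applying the Required rule."""
    value = extracted_text or ""
    if value:
        return value
    if required:
        # Required is true and nothing was found: stop the flow for this document.
        raise TroubleHandlingError(f"Required variable '{name}' has no value")
    # Required is false: pass an empty string through.
    return ""


print(repr(resolve_variable("invoice_no", "10472", required=True)))  # prints '10472'
print(repr(resolve_variable("po_number", None, required=False)))     # prints ''
```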

Regular Expression: Allows you to use a regular expression to perform more advanced search queries.

Examples:
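
As an illustration only, the following Python sketch shows the kind of narrowing a regular expression can do on the text found in a region. The sample text, the patterns, and the first_match helper are made up for this example, and Python's re module is used for convenience; the regular expression dialect accepted by the node may differ.

```python
import re

# Made-up text, as it might appear in the Template text preview.
template_text = "Invoice No. 10472  Date: 03/05/2024  Total: $1,250.00"


def first_match(pattern, text):
    """Return the first substring matching the pattern, or an empty string."""
    match = re.search(pattern, text)
    return match.group(0) if match else ""


# Five consecutive digits: the invoice number.
print(first_match(r"\d{5}", template_text))                      # prints "10472"
# Two digits, slash, two digits, slash, four digits: the date.
print(first_match(r"\d{2}/\d{2}/\d{4}", template_text))          # prints "03/05/2024"
# Digits with optional thousands groups and two decimals: the amount.
print(first_match(r"\d{1,3}(?:,\d{3})*\.\d{2}", template_text))  # prints "1,250.00"
```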

Template text: Preview of the text found in the area selected

Variable Formatting

Variables require a default format to be set in order to be used by the workflow. You can use one of the predefined formats in the drop-down or write your own using the following formatting symbols:

String

NO formatting options, plain text strings only

Numbers

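0 = digit placeholder, padded with zeros (format 000 displays 1 as 001)

# = digit, no placeholder (format ### displays 1 as 1)

. = decimal separator (format #.# displays 0.5 as 0.5)

, = grouping separator (format #,### displays 1000 as 1,000)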
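
To make the number symbols concrete, here is a minimal Python sketch that applies them. It is an approximation written for this page, not the formatter PAS uses: the format_number function is hypothetical, and details such as half-up rounding, always showing at least one integer digit, and grouping in threes are assumptions.

```python
from decimal import Decimal, ROUND_HALF_UP


def format_number(value, pattern):
    """Format a number using the 0, #, '.' and ',' symbols (simplified sketch)."""
    int_pattern, _, frac_pattern = pattern.partition(".")
    min_int_digits = int_pattern.count("0")    # each 0 forces a digit
    max_frac_digits = len(frac_pattern)        # 0 and # both allow a digit
    min_frac_digits = frac_pattern.count("0")  # each 0 forces a digit
    use_grouping = "," in int_pattern

    # Round to the number of decimals the pattern allows (rounding mode is assumed).
    quantum = Decimal(1).scaleb(-max_frac_digits)
    number = Decimal(str(value)).quantize(quantum, rounding=ROUND_HALF_UP)
    sign = "-" if number < 0 else ""
    int_digits, _, frac_digits = format(abs(number), "f").partition(".")

    # Pad the integer part with leading zeros where the pattern asks for them.
    int_digits = int_digits.zfill(max(min_int_digits, 1))
    if use_grouping:
        groups = []
        while len(int_digits) > 3:
            groups.insert(0, int_digits[-3:])
            int_digits = int_digits[:-3]
        groups.insert(0, int_digits)
        int_digits = ",".join(groups)

    # Drop optional (#) trailing zeros from the fraction.
    while len(frac_digits) > min_frac_digits and frac_digits.endswith("0"):
        frac_digits = frac_digits[:-1]

    return sign + int_digits + ("." + frac_digits if frac_digits else "")


print(format_number(1, "000"))       # prints "001"
print(format_number(1, "###"))       # prints "1"
print(format_number(0.5, "#.#"))     # prints "0.5"
print(format_number(1000, "#,###"))  # prints "1,000"
```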

Dates

yy = two-digit year

yyyy = four-digit year

m = month (1-12)

mm = two-digit month (01-12)

mmm = three-letter month abbreviation (01=Jan)

mmmm = full month name (01=January)

d = day of month (1-31)

dd = two-digit day of month (01 - 31)

H = hours (0-23)

HH = two-digit hour (00 - 23) (AM/PM NOT allowed)

M = minutes (0-59)

MM = two-digit minute (00 - 59)

s = seconds (0-59)

ss = two-digit second (00 - 59)

h = hour (0 - 12 | 12-hour AM/PM format)

hh = two-digit hour (00 - 12 | 12-hour AM/PM format)

tt = AM/PM based on the time

zz = time zone (Pacific Daylight Time = PDT)

zzzz = time zone (Pacific Daylight Time)
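
Note that lowercase m is the month and uppercase M is the minute. The Python sketch below shows how the date symbols combine into a pattern. It is an illustration under stated assumptions, not the PAS formatter: format_date is a hypothetical function, the time zone symbols (zz, zzzz) are left out, English month names are hard-coded, and midnight and noon are shown as 12 on the 12-hour clock.

```python
import re
from datetime import datetime

MONTHS = ("January", "February", "March", "April", "May", "June",
          "July", "August", "September", "October", "November", "December")


def _hour12(d):
    # 12-hour clock; 0 and 12 both display as 12 (an assumption of this sketch).
    return d.hour % 12 or 12


# One small function per formatting symbol.
TOKENS = {
    "yyyy": lambda d: f"{d.year:04d}",
    "yy": lambda d: f"{d.year % 100:02d}",
    "mmmm": lambda d: MONTHS[d.month - 1],
    "mmm": lambda d: MONTHS[d.month - 1][:3],
    "mm": lambda d: f"{d.month:02d}",
    "m": lambda d: str(d.month),
    "dd": lambda d: f"{d.day:02d}",
    "d": lambda d: str(d.day),
    "HH": lambda d: f"{d.hour:02d}",
    "H": lambda d: str(d.hour),
    "MM": lambda d: f"{d.minute:02d}",
    "M": lambda d: str(d.minute),
    "ss": lambda d: f"{d.second:02d}",
    "s": lambda d: str(d.second),
    "hh": lambda d: f"{_hour12(d):02d}",
    "h": lambda d: str(_hour12(d)),
    "tt": lambda d: "AM" if d.hour < 12 else "PM",
}

# Longest symbols first, so "mmmm" is tried before "mmm", "mm", and "m".
TOKEN_RE = re.compile("|".join(sorted(TOKENS, key=len, reverse=True)))


def format_date(value, pattern):
    """Replace each formatting symbol in the pattern with its value."""
    return TOKEN_RE.sub(lambda match: TOKENS[match.group(0)](value), pattern)


stamp = datetime(2024, 3, 5, 14, 7, 9)
print(format_date(stamp, "yyyy-mm-dd HH:MM:ss"))  # prints "2024-03-05 14:07:09"
print(format_date(stamp, "d mmm yy, h:MM tt"))    # prints "5 Mar 24, 2:07 PM"
print(format_date(stamp, "mmmm d, yyyy"))         # prints "March 5, 2024"
```

Because this sketch treats every matching letter as a symbol, it only suits patterns made of symbols and punctuation; literal words in a pattern would need a quoting rule that is not covered here.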

Boolean

True / False

1 / 0 = binary format

yes / no
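
The three Boolean formats are different spellings of the same two values, as the short Python sketch below illustrates. The parse_boolean and format_boolean functions and the case-insensitive matching are assumptions made for the example, not documented PAS behavior.

```python
# Display pairs for the three formats listed above: (true text, false text).
BOOLEAN_FORMATS = {
    "True / False": ("True", "False"),
    "1 / 0": ("1", "0"),
    "yes / no": ("yes", "no"),
}

TRUE_WORDS = {"true", "1", "yes"}
FALSE_WORDS = {"false", "0", "no"}


def parse_boolean(text):
    """Interpret extracted text as a Boolean (case-insensitive by assumption)."""
    word = text.strip().lower()
    if word in TRUE_WORDS:
        return True
    if word in FALSE_WORDS:
        return False
    raise ValueError(f"Not a recognized Boolean value: {text!r}")


def format_boolean(value, fmt):
    """Render a Boolean using one of the formats above."""
    true_text, false_text = BOOLEAN_FORMATS[fmt]
    return true_text if value else false_text


print(format_boolean(parse_boolean("Yes"), "1 / 0"))       # prints "1"
print(format_boolean(parse_boolean("0"), "True / False"))  # prints "False"
```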

 

 


Qoppa Software's PDF Automation Server for Windows, Linux, Unix, and macOS

Automate PDF Document Workflows through RESTful Web Services & Folder Watching

Copyright © 2002-Present Qoppa Software. All rights reserved.