Automated File Upload Via SFTP

SFTP is one of the channels for uploading data into ABBYY Timeline. You can upload files automatically on a schedule, or manually, using any standard SFTP client that does not require read access to the destination folder, for example command-line OpenSSH on Linux and Mac or WinSCP on Windows.

ABBYY Timeline configures the SFTP destination and shares the location and credentials with the client for the scheduled file push. From there the upload is automated: files arriving on the SFTP server are picked up and either loaded directly into a project or processed by one or more To-Do lists in ETL, which in turn load the data into a project.

A To-Do list, which you create in the Repository (ETL) tool, is essentially a list of data transformation functions. It typically ends by loading the data into a project. If the file does not require any transformation, you can specify that it be loaded directly into a project using the predefined project table mapping.

The client should provide a small manifest file zipped along with each source file, in JSON format (examples below), which includes the instructions as to how the file should be processed.

To set up automatic data upload via SFTP, follow these steps:

  1. Generate your public/private key pair.
    Please refer to your SFTP client documentation for instructions.
  2. Contact ABBYY Timeline support at support@abbyy.com and send the public key.
    Support will get back with the username and SFTP server name.
  3. Create a manifest.json file.
    Important. The manifest file must be named manifest.json, because ABBYY Timeline accepts only manifest files with this name.
    For detailed instructions on creating a manifest file, see 'Manifest file' below.
  4. ZIP together the manifest and your data files.
  5. Upload the ZIP to the SFTP server.
  6. You should receive an email report once the upload and data processing are complete. You can also follow the progress in the History section of your repository or project, depending on which token is specified in the manifest.
    If your data has not loaded and you have not received an email, make sure the manifest file is syntactically valid and named manifest.json.
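
Steps 4 and 5 can be scripted. The following is a minimal sketch, not an official ABBYY tool. It assumes the data files should sit at the top level of the ZIP next to manifest.json, that the OpenSSH sftp client is installed, and that the package goes to the default remote directory. The function names and the parameters user, host and key_file are hypothetical placeholders for the values you receive from support.

```python
import json
import subprocess
import zipfile
from pathlib import Path
from typing import List


def build_package(manifest: dict, data_files: List[Path], zip_path: Path) -> Path:
    """Write manifest.json plus the data files into one ZIP archive."""
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        # The manifest must be stored under exactly this name.
        archive.writestr("manifest.json", json.dumps(manifest, indent=2))
        for data_file in data_files:
            # Assumption: files are stored flat, so the archive name matches
            # the "file" argument used in the manifest commands.
            archive.write(data_file, arcname=data_file.name)
    return zip_path


def upload_package(zip_path: Path, user: str, host: str, key_file: Path) -> None:
    """Push the ZIP with the OpenSSH sftp client in batch mode.

    Only a "put" is issued, so no read access to the destination is needed.
    Assumption: zip_path contains no spaces, and the default remote
    directory is the upload destination.
    """
    subprocess.run(
        ["sftp", "-i", str(key_file), "-b", "-", f"{user}@{host}"],
        input=f"put {zip_path}\n",  # batch commands are read from stdin
        text=True,
        check=True,  # raise CalledProcessError if sftp reports a failure
    )
```

A scheduled job (cron, Task Scheduler, or similar) could call build_package and then upload_package each time a new export is ready.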

Manifest file

The manifest file tells the ABBYY Timeline server what to do with the data files in the ZIP. The file is in JSON format. You can create it in any plain text editor or with one of the many online JSON editors.

The table below describes the nodes that are in the manifest file.

Node

Description

repositoryToken

The encrypted string identifying the repository into which you load the data.

To obtain this string:

  1. Select View > Repository > Details, and then go to the Data Sources tab.
  2. Check Scheduled file upload.

projectToken

The encrypted string identifying the project into which you load the data. To obtain this string:

  1. Select Project > Details, and then go to the Data Sources tab.
  2. Check Scheduled file upload.

email

One or more email addresses, separated by semicolons.

Reports about successful or failed uploads are sent to these addresses. A report is also always sent to the project or repository owner.

commands

The ordered list of commands that are executed once the upload package is received and unzipped.

Each command has an action and optional arguments.

  • Action upload, if the repositoryToken is defined above. Parses the file and puts the data into the table.
    Arguments:
    • file – name of the file from the ZIP.
    • table – name of repository table into which the data from the file will be placed. If the table doesn’t exist, it will be created.
  • Action clone. Copies a repository table and all data in it.
    Arguments:
    • src – name of the table to be copied.
    • dst – name of the table into which the data from the source table will be copied. If the table doesn’t exist, it will be created.
  • Action todo-list. Executes a To-do list from the repository.
    Arguments:
    • table – name of the table on which the list will be executed.
    • list – name of To-do list.
  • Action upload, if the projectToken is defined above. Parses the file and puts the data into the existing project using the mapping defined for this project.
    Arguments:
    • file – name of the file from the ZIP.
    • clearProject – if true, old data from the project will be deleted before new data is loaded.
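
The command rules above can be checked locally before a package is uploaded. The sketch below is one possible pre-flight check, not the validation the server performs. The name check_manifest and the REQUIRED_ARGS table are hypothetical, and two rules are assumptions: that exactly one of repositoryToken and projectToken is given, and that every listed argument except clearProject is required.

```python
from typing import List

# Arguments each action needs, taken from the list above.
# Assumption: all of them are mandatory; the article does not say so explicitly.
REQUIRED_ARGS = {
    "upload": {"file"},
    "clone": {"src", "dst"},
    "todo-list": {"table", "list"},
}


def check_manifest(manifest: dict) -> List[str]:
    """Return the problems found; an empty list means none were detected."""
    problems = []
    has_repo = "repositoryToken" in manifest
    has_project = "projectToken" in manifest
    # Assumption: a manifest targets either a repository or a project.
    if has_repo == has_project:
        problems.append("specify exactly one of repositoryToken or projectToken")

    commands = manifest.get("commands")
    if not isinstance(commands, list) or not commands:
        problems.append("commands must be a non-empty list")
        return problems

    for index, command in enumerate(commands):
        if not isinstance(command, dict):
            problems.append(f"command {index}: must be a JSON object")
            continue
        action = command.get("action")
        if action not in REQUIRED_ARGS:
            problems.append(f"command {index}: unknown action {action!r}")
            continue
        required = set(REQUIRED_ARGS[action])
        if action == "upload" and has_repo:
            # A repository upload also names the target table.
            required.add("table")
        missing = sorted(required - set(command))
        if missing:
            problems.append(f"command {index}: {action} is missing {', '.join(missing)}")
    return problems
```

Returning a list of messages rather than raising on the first error lets a script report every problem in one pass.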

Examples of manifest

The following manifest takes the file myfile.csv from the ZIP, parses it into the repository table mytable, copies that table into yourtable, and then applies the To-do list mylist to yourtable. For more help with To-do lists, refer to the section on ETL in the Cloud. A report is sent to the repository owner, to yourname@company.com, and to yourcolleague@company.com.

{
  "repositoryToken": "k2QZiJkZuH … 8v6f8BpQEdekqjgqNxBw-E0AZUz2kdVA",
  "email": [ "yourname@company.com", "yourcolleague@company.com" ],
  "commands": [
    {
      "action": "upload",
      "file": "myfile.csv",
      "table": "mytable"
    },
    {
      "action": "clone",
      "src": "mytable",
      "dst": "yourtable"
    },
    {
      "action": "todo-list",
      "table": "yourtable",
      "list": "mylist"
    }
  ]
}
    

The next example parses the file events.csv and loads it into the project identified by the token. Old data in the project is deleted before the new data is loaded.

{
  "projectToken": "",
  "email": "yourname@company.com",
  "commands": [
    {
      "action": "upload",
      "file": "events.csv",
      "clearProject": true
    }
  ]
}
    

To check the validity of the JSON, you could use this tool: https://jsonformatter.curiousconcept.com/
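
If you prefer not to paste tokens into an online tool, the same syntax check can be done locally. This is a small sketch using Python's standard json module. The function name manifest_syntax_error is hypothetical, and the check covers JSON syntax only, not whether the nodes and commands are ones ABBYY Timeline accepts.

```python
import json
from typing import Optional


def manifest_syntax_error(text: str) -> Optional[str]:
    """Return None if text parses as JSON, else a message with the error position."""
    try:
        json.loads(text)
    except json.JSONDecodeError as error:
        # lineno and colno point at the place where parsing failed.
        return f"line {error.lineno}, column {error.colno}: {error.msg}"
    return None
```

A typical use is to read manifest.json as text, pass it to the function, and stop the upload script if a message comes back.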
