Identify Properties

Analyzer determines the following data properties for the data you selected and asks you to confirm its analysis of each one:

Character Set (normally bypassed - see note below)
File Format
File Properties (normally bypassed - see note below)

Note: By default, the Data Definition Wizard skips the standard information panels (specifically the Character Set and File Properties panels, which rarely if ever require user edits). These panels are described below for information purposes, but they are normally bypassed unless the “Skip standard information” check box on the Select Data Source panel is unchecked.

Character Set

Analyzer automatically identifies the character set of your data file. There are three common character sets in use:

EBCDIC for IBM mainframe or minicomputers
ASCII for PCs and all other types of computers. When defining ASCII data that is not in the local language, choose the appropriate ASCII language code page from the code page drop-down list to display the data correctly. If the desired code page does not appear in the list, enter the code page number in the code page text box.
UNICODE for PCs (UTF8, UTF16 and UTF16 Big Endian)
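
As a rough illustration of what identifying a character set involves, the Python sketch below guesses among the three families listed above. It is not Analyzer's actual detection logic: the function name `guess_character_set`, the reliance on byte order marks, and the readable-character heuristic (with code pages 037 and 1252 standing in for EBCDIC and ASCII) are all assumptions made for this example.

```python
import codecs


def guess_character_set(data: bytes) -> str:
    """Guess the character set family of raw file bytes (illustrative heuristic only)."""
    # Unicode files often start with a byte order mark (BOM).
    if data.startswith(codecs.BOM_UTF8):
        return "UTF8"
    if data.startswith(codecs.BOM_UTF16_LE):
        return "UTF16"
    if data.startswith(codecs.BOM_UTF16_BE):
        return "UTF16 Big Endian"
    if not data:
        return "ASCII"

    def readable_fraction(codec: str) -> float:
        # Share of decoded characters that are plain letters, digits or spaces.
        text = data.decode(codec, errors="replace")
        good = sum(ch.isascii() and (ch.isalnum() or ch == " ") for ch in text)
        return good / len(text)

    # Assumption: cp037 represents EBCDIC and cp1252 represents a PC code page.
    if readable_fraction("cp037") > readable_fraction("cp1252"):
        return "EBCDIC"
    return "ASCII"


if __name__ == "__main__":
    print(guess_character_set("HELLO WORLD 123".encode("cp037")))
    print(guess_character_set(b"HELLO WORLD 123"))
```

A heuristic like this can be wrong on short or binary-heavy files, which is one reason a confirmation step such as the Character Set panel is useful.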

Accept Analyzer’s analysis and click [Next] to continue.
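
To see why the code page choice matters for ASCII data that is not in the local language, the short Python sketch below decodes the same bytes under two different code pages. The helper name `decode_with_code_page` is hypothetical, and the code page numbers 1251 (Cyrillic) and 1252 (Western European) are only example values; the sketch uses Python's built-in codecs, not anything from Analyzer.

```python
def decode_with_code_page(data: bytes, code_page: int) -> str:
    """Decode single-byte text using a Windows code page number (for example 1251)."""
    # Python names these codecs "cp" followed by the code page number.
    return data.decode(f"cp{code_page}", errors="replace")


if __name__ == "__main__":
    raw = "Привет".encode("cp1251")          # bytes as stored in a Cyrillic file
    print(decode_with_code_page(raw, 1251))  # readable with the matching code page
    print(decode_with_code_page(raw, 1252))  # wrong code page gives unrelated characters
```

The bytes on disk are identical in both calls; only the interpretation changes, which is what selecting a different code page in the list does for display.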

File Format - Local and Arbutus Windows Server

The Data Definition Wizard analyzes the source data file and tests for the following file formats (Local and Windows Server):

Manual Definition

If Analyzer cannot recognize the file format, it recommends Manual Definition. The Wizard then takes you through each of the remaining definition panels.

MS Excel Spreadsheets

The default EXCEL option imports all versions of Excel spreadsheets using OPENXML, allowing you to create a flat file from the selected worksheet. See Defining Excel Spreadsheet Data.

The EXCEL (ODBC) option treats all versions of Excel spreadsheets as a relational database, allowing you to select desired columns, join tables and apply SQL Where and Order clauses. See Defining Excel Spreadsheet Data via ODBC.

Note: To read a specific version of an Excel spreadsheet using this option, the correct version of the Excel 32-bit ODBC driver must be installed on the relevant local machine or Arbutus Windows Server.

MS Excel Ad-Hoc Ranges (.XLS files only)

The EXCEL Ad-Hoc Ranges option in the Wizard automatically detects all versions of MS Excel spreadsheets prior to Excel 2007. You must specify the worksheet or range to be imported. The selected worksheet or range is automatically converted into a tab-delimited file for subsequent processing in Analyzer. See Defining Excel Spreadsheet Ad-Hoc Ranges.
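
The conversion of a selected range into tab-delimited text can be pictured with the Python sketch below. It is a simplified illustration, not the Wizard's implementation: the worksheet is assumed to be already loaded as a list of rows, and the names `column_index` and `range_to_tab_delimited` are hypothetical.

```python
import csv
import io
import re


def column_index(letters: str) -> int:
    """Convert spreadsheet column letters (A, B, ..., AA) to a zero-based index."""
    index = 0
    for ch in letters.upper():
        index = index * 26 + (ord(ch) - ord("A") + 1)
    return index - 1


def range_to_tab_delimited(rows, cell_range: str) -> str:
    """Return the cells inside an A1-style range (such as "B1:C2") as tab-delimited text."""
    match = re.fullmatch(r"([A-Za-z]+)(\d+):([A-Za-z]+)(\d+)", cell_range)
    if match is None:
        raise ValueError(f"not a valid range: {cell_range!r}")
    first_col, last_col = column_index(match[1]), column_index(match[3])
    first_row, last_row = int(match[2]) - 1, int(match[4]) - 1

    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    for row in rows[first_row:last_row + 1]:
        writer.writerow(row[first_col:last_col + 1])
    return out.getvalue()


if __name__ == "__main__":
    sheet = [["id", "name", "amount"], [1, "Ann", 10], [2, "Bob", 7]]
    print(range_to_tab_delimited(sheet, "B1:C2"))
```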

MS Access Databases

The ACCESS option treats all versions of MS Access databases as a relational database, allowing you to select desired columns, join tables and apply SQL Where and Order clauses. See Defining Access Database Tables.

Note: To read a specific version of an Access database using this option, the correct version of the Access 32-bit ODBC driver must be installed on the relevant local machine or Arbutus Windows Server.

Delimited Files

The Wizard presents an initial view of the delimited data, letting you adjust the delimited file properties as necessary. See Defining Delimited Data.
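
Outside Analyzer, a first guess at delimited file properties can be made in a similar spirit with Python's standard `csv.Sniffer`, as sketched below. The function name `guess_delimiter` and the candidate delimiter set are assumptions for this example; they do not describe how the Wizard itself analyzes the file.

```python
import csv


def guess_delimiter(sample: str, candidates: str = ",;\t|") -> str:
    """Guess the field delimiter from a text sample of the first few lines."""
    # csv.Sniffer raises csv.Error if no consistent delimiter can be found.
    dialect = csv.Sniffer().sniff(sample, delimiters=candidates)
    return dialect.delimiter


if __name__ == "__main__":
    sample = "id;name;amount\n1;Ann;10.5\n2;Bob;7.25\n"
    delimiter = guess_delimiter(sample)
    for row in csv.reader(sample.splitlines(), delimiter=delimiter):
        print(row)
```

As with the Wizard's initial view, the guess is only a starting point and should be checked against the data before it is relied on.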

XML Data

The Wizard automatically detects XML data and displays an XML Data Selection dialog. See Defining XML Data.
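
One way to think about detecting XML and choosing what to import is sketched below in Python: try to parse the text, then list the elements that repeat, since repeating elements are the usual candidates for records. This is an assumption-laden illustration using the standard `xml.etree.ElementTree` module; the function name `repeating_elements` is hypothetical and nothing here reflects the contents of the XML Data Selection dialog.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def repeating_elements(xml_text: str):
    """Return {tag: count} for tags that occur more than once, or None if not XML."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return None  # the text is not well-formed XML
    counts = Counter(element.tag for element in root.iter())
    return {tag: count for tag, count in counts.items() if count > 1}


if __name__ == "__main__":
    doc = "<orders><order><id>1</id></order><order><id>2</id></order></orders>"
    print(repeating_elements(doc))
    print(repeating_elements("id,name\n1,Ann"))
```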

PDF Files

The Wizard automatically converts the text stored in the selected PDF file into a new text file. The new text file is then defined as a print image file. For more information see Defining PDF Files.

dBASE Files

The original dBASE file and the internal field definitions are read directly and no new file is created. For more information see Defining dBASE files.
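
The reason no new file is needed is that a dBASE file carries its own field definitions in its header. The Python sketch below reads them, assuming the commonly documented dBASE III header layout (record count at byte 4, header and record lengths at bytes 8 and 10, then 32-byte field descriptors ending with a 0x0D byte). The function name `read_dbf_fields` is hypothetical, and the sketch is not Analyzer's reader.

```python
import struct


def read_dbf_fields(data: bytes):
    """Return (record_count, record_length, fields) from dBASE header bytes.

    Each field is a tuple of (name, type code, length, decimal places).
    """
    # Bytes 4-11: record count (uint32), header length and record length (uint16 each).
    record_count, header_length, record_length = struct.unpack_from("<IHH", data, 4)

    fields = []
    offset = 32  # field descriptors start after the 32-byte file header
    while offset < header_length and data[offset] != 0x0D:
        # 11-byte name, 1-byte type, 4 reserved bytes, then length and decimal count.
        name_raw, type_code, length, decimals = struct.unpack_from("<11sc4xBB", data, offset)
        name = name_raw.split(b"\x00", 1)[0].decode("ascii")
        fields.append((name, type_code.decode("ascii"), length, decimals))
        offset += 32
    return record_count, record_length, fields
```

Given the bytes of a .dbf file read in binary mode, the returned list plays the same role as the field definitions the Wizard picks up automatically.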

Print Image (Report) File

For more information, see Defining Print Image (Report) File Data.

External Definition (Local or Windows Server)

If you want to automatically define a Local or Arbutus Windows Server data source using an existing file definition, the Wizard lets you specify an appropriate external definition (AS/400, COBOL, or PL/1) based on the platform on which the source data resides. For more information, see Defining Data Using External Definitions.

Accept Analyzer’s analysis or choose an alternate option and click [Next] to continue.

Note: For more detailed information on defining data on the Arbutus Windows Server see Defining Data on the Arbutus Windows Server.

SAP Private File Format

The Wizard automatically detects data exported from SAP using the SAP Private File Format.

Users can choose to:

Use local language field descriptions as field names, or
Use standard-delivered SAP German abbreviations as field names

Convert Imported Unicode Data to ASCII

The "Convert Imported Unicode Data to ASCII" check box appears for SQL-based data sources and for EXCEL. Select this option if you want any SQL-based Unicode data that is read to be converted to ASCII. Use this option only if you are confident that the source data contains only North American English characters.
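
The caution about North American English characters can be made concrete with the Python sketch below, which checks sample values for characters that have no ASCII equivalent before any conversion is attempted. The function name `non_ascii_values` and the sample data are hypothetical, and the lossy conversion shown uses Python's own replacement behavior, which may differ from what Analyzer does with such characters.

```python
def non_ascii_values(values):
    """Return the values that contain at least one non-ASCII character."""
    return [value for value in values if not value.isascii()]


if __name__ == "__main__":
    names = ["Smith", "Müller", "Nuñez", "O'Brien"]
    print(non_ascii_values(names))
    # Forcing such a value into ASCII loses information (Python substitutes "?").
    print("Müller".encode("ascii", errors="replace"))
```

If a check like this finds any values, the data contains more than plain English characters and the conversion option is best left unselected.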

Note: The ASCII keyword is automatically inserted into the Import ODBC and OPENEXCEL command syntax generated by the Data Definition Wizard if SET SQLASCII is turned on, if the "Convert Imported Unicode Data to ASCII" table option is selected, or if the "Convert Imported Unicode Data to ASCII" check box is selected in the Data Definition Wizard via either the Select Data Source or Subsystem panel or the File Format panel.

Note: For more information see Convert Imported Unicode Data to ASCII or Set SQLASCII.

File Format - Arbutus zSeries and iSeries Server Data

For data other than DB2, IMS or Adabas (zSeries) and Database (iSeries), the Data Definition Wizard analyzes the file and tests for the following file formats:

Manual Definition

If Analyzer cannot recognize the file format, it recommends Manual Definition. The Wizard then takes you through each of the remaining panels.

Print Image (Report) File

For more information, see Defining Print Image (Report) File Data.

External Definition (zSeries Server only)

If you want to automatically define an Arbutus zSeries Server data source using an existing file definition, the Wizard lets you specify an appropriate external definition (COBOL, PL/1, EASYTRIEVE or DBDs) on the Arbutus zSeries Server. For more information, see Defining Data Using External Definitions.

Note: DBDs can only be selected if the IMS interface is activated.

Accept Analyzer’s analysis or choose an appropriate alternate option and click [Next] to continue.

Note: For more detailed information on defining Arbutus zSeries Server data see Defining Data on the Arbutus zSeries Server. For more detailed information on defining Arbutus iSeries Server data see Defining Data on the Arbutus iSeries Server.