Categorizing Data: Stratify, Age, Classify, Cross Tabulate, and Summarize

These commands categorize data by separating it into specified ranges and totalling specified numeric fields. Analyzer lets you combine sorting and summarizing operations. Choose a command according to the type of categorizing operation you want to perform:

• Stratify data according to numeric ranges
• Age data according to date ranges
• Classify data according to ranges based on unique values in a single character field
• Cross Tabulate character fields into rows and columns while accumulating numeric values
• Summarize data according to ranges based on multiple character or date fields, which can be displayed with selected data from associated fields

You can also summarize data in Views and reports by using the Break Column option to generate subtotals for selected character fields. You can use the break bar in a View to specify the left-most columns as break columns. If you generate a report from the View, Analyzer provides summary subtotals for every unique value in the break columns. You can also specify break column options in the Modify Columns dialog.

Stratify

Stratify produces a summary based on the size of intervals in a range of values.

You must specify the minimum and maximum values that define the range. You can also specify the number of equally-sized intervals or the start and end points of unequally-sized intervals.
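
As an illustration of unequally-sized intervals, the following Python sketch finds the interval a value falls in when the interval start points are listed explicitly. It is not Analyzer syntax; the function name `interval_index` and the rule that the top boundary belongs to the last interval are assumptions made for this example.

```python
from bisect import bisect_right


def interval_index(value, boundaries):
    """Return the index of the interval containing value, or None if outside.

    boundaries is a sorted list such as [0, 100, 1000, 10000], which
    defines the intervals 0-100, 100-1000, and 1000-10000.
    """
    if value < boundaries[0] or value > boundaries[-1]:
        return None  # below the minimum or above the maximum
    if value == boundaries[-1]:
        # Assumption: the maximum itself belongs to the last interval.
        return len(boundaries) - 2
    # bisect_right counts the boundaries that are <= value.
    return bisect_right(boundaries, value) - 1
```

A value equal to an interior boundary is placed in the interval that starts at that boundary.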

Stratify works on unsorted files and lets you quickly scan and summarize data.

Stratify counts the number of records in a file and:

• Divides the records into a specified number of intervals (strata) based on the range of values in a specified numeric field
• Counts the number of records in each interval
• Accumulates the values for one or more numeric fields for each interval
• Calculates the percentage of the total count and of the total value of an accumulated field for each interval
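
The steps above can be sketched in Python for the equal-interval case. This is a minimal illustration, not Analyzer's implementation: the function name `stratify`, the dictionary-based records, and the choice to skip out-of-range records are all assumptions.

```python
def stratify(records, key_field, amount_field, minimum, maximum, intervals):
    """Count and total records in equal-width strata of key_field."""
    width = (maximum - minimum) / intervals
    strata = [
        {"low": minimum + i * width, "high": minimum + (i + 1) * width,
         "count": 0, "total": 0.0}
        for i in range(intervals)
    ]
    for rec in records:
        value = rec[key_field]
        if value < minimum or value > maximum:
            continue  # assumption: out-of-range records are simply skipped
        # Clamp so that a value equal to the maximum lands in the last stratum.
        index = min(int((value - minimum) / width), intervals - 1)
        strata[index]["count"] += 1
        strata[index]["total"] += rec[amount_field]

    # Percentages are relative to the records that fell inside the range.
    grand_count = sum(s["count"] for s in strata)
    grand_total = sum(s["total"] for s in strata)
    for s in strata:
        s["count_pct"] = 100 * s["count"] / grand_count if grand_count else 0.0
        s["total_pct"] = 100 * s["total"] / grand_total if grand_total else 0.0
    return strata
```

The input does not need to be sorted, because each record is assigned to its stratum independently in a single pass.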

For more information on the Stratify command see Stratify.

Age

Age produces a summary based on date intervals. The intervals are measured backwards in time from the current date or from a specified cutoff date.

You must specify the cutoff date that marks the starting date from which intervals are calculated. You can use the default intervals of 0, 30, 60, 90, 120, and 10,000 days or you can specify other intervals. An interval of 10,000 days is used to isolate records with invalid dates.

Age works on unsorted files and lets you quickly scan and summarize data. Age is commonly used to classify invoices by the number of days outstanding from a particular date.

Age counts the number of records in a file and:

• Divides the records into intervals based on date (aging periods)
• Counts the number of records in each interval
• Accumulates the values of one or more numeric fields for each interval
• Calculates the percent of the total count and of the total value of an accumulated field for each interval
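
The aging steps above can be sketched in Python using the default periods. This is an illustration only, not Analyzer's implementation: the function name `age`, the record layout, and the boundary rules (each period runs up to the day before the next one starts, and the last period includes its end point) are assumptions.

```python
from datetime import date

# Default aging periods in days, as listed in the text above.
DEFAULT_PERIODS = (0, 30, 60, 90, 120, 10000)


def age(records, date_field, amount_field, cutoff, periods=DEFAULT_PERIODS):
    """Summarize records into aging periods measured back from cutoff."""
    buckets = [
        {"from_days": low, "to_days": high - 1, "count": 0, "total": 0.0}
        for low, high in zip(periods, periods[1:])
    ]
    # Assumption: the last period includes its end point.
    buckets[-1]["to_days"] = periods[-1]

    outside = 0  # records dated after the cutoff or older than the last period
    for rec in records:
        days = (cutoff - rec[date_field]).days
        for bucket in buckets:
            if bucket["from_days"] <= days <= bucket["to_days"]:
                bucket["count"] += 1
                bucket["total"] += rec[amount_field]
                break
        else:
            outside += 1

    grand_count = sum(b["count"] for b in buckets)
    grand_total = sum(b["total"] for b in buckets)
    for b in buckets:
        b["count_pct"] = 100 * b["count"] / grand_count if grand_count else 0.0
        b["total_pct"] = 100 * b["total"] / grand_total if grand_total else 0.0
    return buckets, outside
```

With a cutoff of 31 January 2024, an invoice dated 1 January 2024 is 30 days old and falls in the 30 to 59 day period.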

For more information on the Age command see Age.

Classify

Classify produces a summary based on unique values in a character field, such as names, credit card numbers or telephone numbers.

Classify works on unsorted files and lets you quickly scan and summarize data. For example, Classify can rapidly generate a trial balance from unsorted ledger transactions.

Classify works more rapidly than Summarize because it does not need to presort the file, but the number of records it can analyze depends on available RAM. Summarize, by contrast, has no practical limit on the number of records it can analyze.

You must specify the character field that Analyzer will analyze to determine the unique classification ranges. The ranges are based on the number of records that correspond to each unique value in this field.

Classify counts the number of records in a file and:

• Divides the records into ranges based on each unique value in a specified character field (classification)
• Counts the number of records in each range
• Accumulates the values for one or more numeric fields for each range
• Calculates the percent of the total count and of the total value of an accumulated field for each range
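
A rough Python sketch of the classification steps above follows. It keeps one running entry per unique value in memory, which loosely mirrors why an operation of this kind needs no presort but is bounded by available memory. The function name `classify` and the record layout are assumptions; this is not Analyzer's implementation.

```python
def classify(records, key_field, amount_field):
    """Count and total records for each unique value of key_field."""
    summary = {}
    for rec in records:
        # One in-memory entry per unique key; input order does not matter.
        entry = summary.setdefault(rec[key_field], {"count": 0, "total": 0.0})
        entry["count"] += 1
        entry["total"] += rec[amount_field]

    grand_count = sum(e["count"] for e in summary.values())
    grand_total = sum(e["total"] for e in summary.values())
    for e in summary.values():
        e["count_pct"] = 100 * e["count"] / grand_count if grand_count else 0.0
        e["total_pct"] = 100 * e["total"] / grand_total if grand_total else 0.0
    # Return the classes in key order for readable output.
    return dict(sorted(summary.items()))
```

Applied to unsorted ledger transactions keyed on an account number field, the result has one row per account with its record count and accumulated amount.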

For more information on the Classify command see Classify.

Cross Tabulate

Cross Tabulate analyzes character fields by setting them in rows and columns.

By cross-tabulating character fields, you can produce various summaries, show details on areas of interest, and accumulate numeric fields. Cross Tabulate can produce results in a file, table or graph.

For example, you can produce a table or graph that shows the number of customers by city. You can also accumulate numeric fields to provide information such as the sales volume per salesperson in each territory.

Cross Tabulate counts the number of records in a file and:

• Counts each row value within each column value
• Accumulates numeric fields for each row value within each column value
• Totals the amounts for each column value
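
The cross-tabulation steps above can be sketched in Python as follows. The function name `cross_tabulate` and the field names in the usage note are hypothetical, and the sketch is an illustration rather than Analyzer's implementation.

```python
from collections import defaultdict


def cross_tabulate(records, row_field, column_field, amount_field):
    """Count and total amount_field for each (row, column) pair."""
    cells = defaultdict(lambda: {"count": 0, "total": 0.0})
    column_totals = defaultdict(float)
    for rec in records:
        row, column = rec[row_field], rec[column_field]
        cells[(row, column)]["count"] += 1
        cells[(row, column)]["total"] += rec[amount_field]
        # Running total of the accumulated field for each column value.
        column_totals[column] += rec[amount_field]
    return dict(cells), dict(column_totals)
```

For the example in the text, calling `cross_tabulate(sales, "salesperson", "territory", "amount")` on a list of sales records would give the sales volume per salesperson in each territory, plus a total per territory.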

For more information on the Cross Tabulate command see Cross Tabulate.

Summarize

Summarize produces a summary based on unique values in one or more character fields, such as names, credit card numbers or telephone numbers.

Summarize is similar to Classify, but it lets you specify more than one field on which to summarize, so you can define intervals more precisely. Summarize is effective for surveying the contents of tables.

Summarize also lets you list the first occurrence of information from a specified field. For example, you could summarize a file on both vendor number and invoice number, accumulate the values of one or more fields, and provide additional information for each interval such as the sales representative’s name.

Summarize can also be used to remove duplicate records from a file.

You must specify one or more character or date fields that Analyzer analyzes to determine the summarization ranges. The ranges are based on the number of records that correspond to each unique combination of values in the specified summarization fields.

All files should be presorted on the key character fields in the intended summarizing sequence. You can sort or index the file before using Summarize or you can use the Presort option.

Summarize counts the number of records in a file and:

• Divides the records into ranges based on each unique value in one or more specified character or date fields
• Counts the number of records in each range
• Accumulates the values for one or more numeric fields for each range
• Displays information from one or more selected fields for each range
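
The summarizing steps above, including the reliance on sorted input, can be sketched in Python. The sketch groups consecutive records that share the same key values, so unsorted input produces fragmented groups unless it is sorted first. The function name `summarize`, its `presort` parameter, and the record layout are assumptions; this is not Analyzer's implementation.

```python
from itertools import groupby
from operator import itemgetter


def summarize(records, key_fields, amount_field, other_fields=(), presort=True):
    """Summarize consecutive records that share the same key field values."""
    key = itemgetter(*key_fields)
    # sorted() is stable, so records keep their original order within a group.
    rows = sorted(records, key=key) if presort else records
    result = []
    for _, group in groupby(rows, key=key):
        group = list(group)
        first = group[0]
        row = {field: first[field] for field in key_fields}
        row["count"] = len(group)
        row["total"] = sum(rec[amount_field] for rec in group)
        for field in other_fields:
            # Additional fields show the first occurrence in each group.
            row[field] = first[field]
        result.append(row)
    return result
```

Because only consecutive records are compared, calling the function with `presort=False` on unsorted input returns a separate row each time a key value reappears.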

For more information on the Summarize command see Summarize.