Size

Menu: Found in the DATA menu

Use Size to determine the appropriate sample sizes for record and monetary unit samples (MUS).

Note: The theory behind statistical sampling is complex. If you are not familiar with the critical judgements required when performing statistical sampling, we recommend that you consult a statistics specialist before using the Size, Sample and Evaluate commands.

Sample Size Options

The appearance of the Size dialog varies according to which sampling option is selected.

Monetary Unit Sample

By default, Analyzer displays the Monetary Unit option and its corresponding parameters.

After entering the Size parameter values, click [Calculate] to display the results. The sample size, interval and maximum tolerable taintings are displayed in the Results area of the dialog.

Record Sample

Select Record to display the record sample option and its corresponding parameters.

After entering the Size parameter values, click [Calculate] to display the results. The sample size, interval and number of tolerable errors are displayed in the Results area of the dialog.

Size reports:

The required sample size
The interval based on the supplied population size (if an interval sample is used)
The maximum amount of error expected in the sample

The Size command automatically creates SAMPSIZE and SAMPINT variables that contain the reported sample size and interval, respectively. Use these variables to create procedures that automatically supply parameters to a subsequent Sample command.

When errors and taintings are found, use the Evaluate command to determine their effect.

Maximum Tolerable Tainting - MUS Samples

Because dollar amounts may be only partially in error, the amount of error for a particular item is referred to as the error tainting. For example, a $100 item that is totally in error has a 100% tainting, whereas a $100 item that actually should have a value of $93 has a 7% tainting. The total tainting for a sample is the sum of all the individual error taintings. As long as this sum is less than the maximum tolerable tainting reported by Size, the results are valid.
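The tainting arithmetic can be sketched in a few lines of Python. This is an illustration only, not part of Analyzer: the function names are hypothetical, and it assumes that tainting is measured against the recorded amount, as in the $100 and $93 example above.

```python
def tainting(recorded: float, actual: float) -> float:
    """Error tainting as a fraction of the recorded amount (assumed basis)."""
    return abs(recorded - actual) / recorded


def total_tainting(items) -> float:
    """Sum the individual taintings for (recorded, actual) pairs."""
    return sum(tainting(recorded, actual) for recorded, actual in items)


def within_tolerance(items, max_tolerable: float) -> bool:
    """True while the summed tainting stays below the tolerable maximum."""
    return total_tainting(items) < max_tolerable


# A $100 item that is totally in error, and one that should be $93.
fully_wrong = tainting(100, 0)      # 1.0, that is, 100%
partly_wrong = tainting(100, 93)    # 0.07, that is, 7%
```

Both taintings are fractions, so multiply by 100 before comparing them with a maximum tolerable tainting expressed as a percentage.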

Maximum Tolerable Errors - Record Samples

As long as the actual number of errors found in the sample is less than or equal to the number of tolerable errors reported by Size, the results are valid.

Sample Sizes

The Size command generates attribute sample sizes. It is not intended to generate sample sizes for variable sampling or estimation sampling.

The Size command produces statistically valid attribute sample sizes for most analyses, unless:

You are sampling very small populations.
Your organization has in-house sampling experts. They will be able to define sample sizes tailored precisely for your needs.
Your organization has mandated using another sampling tool or methodology.

Analyzer generates sample sizes using the Poisson distribution, rather than the binomial distribution. The advantage of the Poisson distribution is that it:

Does not require you to know the population size before you generate a sample size.
Simplifies the calculations required to produce sample sizes and to evaluate sample errors detected.

The Poisson distribution is widely used for calculating sample sizes. It is easier to work with than the binomial distribution, and Poisson distribution tables are readily available when you need to check the calculations.

For record samples, the Poisson distribution generates the same sample size regardless of the size of the population. For typical population sizes of a thousand or more records, the two distributions generate nearly identical sample sizes. For populations of under a thousand records, sample sizes determined with the Poisson distribution tend to be slightly larger and therefore more conservative than sizes determined with the binomial distribution. This is because the binomial distribution adjusts the sample size downward for small populations but the Poisson distribution does not. With very small populations, the fixed sample size generated by the Poisson distribution can actually exceed the population size.

When using Size for record sampling of small populations, recognize that the sample size may be larger than you need. This does not present an obstacle to analysis since it is common practice to manually over-test small populations.

To obtain record sample sizes that do not differ significantly from those obtained using the binomial distribution, do not use the Size command when the sample size generated is greater than ten percent of the population.
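The following Python sketch illustrates why a Poisson-based size ignores the population, and how the ten percent guideline could be checked. It uses the textbook Poisson formula for the case where no errors are expected. This is an assumption made for illustration and is not necessarily the exact computation Analyzer performs. The function names are hypothetical.

```python
import math


def poisson_zero_error_size(confidence_pct: float, upper_limit_pct: float) -> int:
    """Sample size when no errors are expected (textbook Poisson formula).

    Note that the population size does not appear anywhere in the formula.
    """
    risk = 1 - confidence_pct / 100
    return math.ceil(-math.log(risk) / (upper_limit_pct / 100))


def exceeds_ten_percent(sample_size: int, population: int) -> bool:
    """True when the sample is more than ten percent of the population."""
    return sample_size > 0.10 * population


size = poisson_zero_error_size(95, 5)   # 60 records, for any population

for population in (40_000, 500, 50):
    print(population, size, exceeds_ten_percent(size, population))
```

With these inputs the size is 60 records. That is well under ten percent of 40,000 records, more than ten percent of 500 records, and larger than a population of 50 records.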

Parameters

The Size command has the following command parameters:

Monetary Unit Sample (MUS)

Confidence

Specify the reliability you would like the sample to generate. For example, entering 95 in the text box indicates 95% confidence in the sample. That is, it would likely be wrong only one time in 20.

Expected Total Errors

Specify the total dollar amount of errors expected in the population. This increases the sample size to allow for the expected errors.

Materiality

Specify the amount of money considered significant. This is the maximum amount of error you are willing to accept in the population without detection.

Monetary Unit

Indicates that a monetary unit sample (MUS) is to be taken. The likelihood that an item is selected is proportional to its size.

Use fixed interval sampling if you intend to use the Evaluate command to assess errors.

Population

Specify the total absolute value of the field being sampled.

Record Sample

Confidence

Specify the reliability you would like the sample to generate. For example, entering 95 in the text box indicates 95% confidence in the sample. That is, it would likely be wrong only one time in 20.

Expected Error Rate

Specify the percentage of error that you expect in the population. If you know that 20 out of 1000 invoices contain errors, you can set the expected error rate to 2%. The closer this value is to the upper error limit, the larger the sample size will be due to the restrictive conditions.

Population

Specify the record count.

Record

Indicates that the sample is an unbiased record sample. The likelihood that an item is selected is unrelated to its size. Each record has an equal chance of selection.

Upper Error Limit

Specify the maximum percentage of undetected error that you can accept in the population. If this value is very low, Analyzer will need to select a large number of items in order to meet the confidence entered.

Command Mode Syntax

SIZE MONETARY

POPULATION population-size

CONFIDENCE confidence-level

MATERIALITY materiality-level

<ERROR expected-error-amount>

<TO text-file-name>

SIZE {RECORD|ATTRIBUTE}

POPULATION population-size

CONFIDENCE confidence-level

PRECISION precision-level

<ERROR expected-error-rate>

<TO text-file-name>

Examples

To determine the Monetary Unit sample size required to have 90% confidence that the total errors in a population of $60 million do not exceed $1 million, assuming that there are $50,000 in errors in the population:

You need to draw a sample of 150 items. Because there are $50,000 in expected errors, some amount of error in the resulting sample can be tolerated. As long as the total taintings do not exceed 12.54%, your hypothesis is supported with 90% confidence. If you choose to draw this sample using an interval selection method, an interval of $398,701.29 is appropriate.

The following possible taintings and their evaluation are related to the sample described above:

A $950 item recorded as a $1,000 item implies 5% tainting (50/1,000) and is therefore acceptable.
Given the above error and another in which a $15,000 item was recorded as $14,000 (6.66% tainting; 1,000/15,000), the results are still acceptable because the total taintings are only 11.66%.
Given the above two errors and another in which a $100 item was recorded as $98 (2% tainting), the sample may be insufficient to prove your hypothesis because the total tainting is 13.66% (greater than 12.54%). Use the Evaluate command to confirm the effect of the errors.
Given a single error in the sample in which a $100 item was recorded at $80 (20% tainting), the sample is probably insufficient to prove your hypothesis, because this one tainting alone exceeds 12.54%. Use the Evaluate command to confirm the effect of the error.
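The figures in this example can be cross-checked with a short Python sketch. It only restates the arithmetic that the reported numbers are consistent with (population divided by interval, and expected errors divided by interval). It is not a description of how Analyzer derives the interval. The tainting percentages are copied from the list above, and the variable names are illustrative.

```python
population = 60_000_000
expected_errors = 50_000
interval = 398_701.29          # interval reported in the example
max_tolerable_pct = 12.54      # maximum tolerable tainting reported

# The reported sample size and tolerable tainting both follow from the interval.
implied_size = population / interval                      # roughly 150 items
implied_tainting_pct = expected_errors / interval * 100   # roughly 12.54%

# Running total of the first three taintings listed above (in percent).
taintings_pct = [5.0, 6.66, 2.0]
running_total = 0.0
verdicts = []
for t in taintings_pct:
    running_total += t
    verdicts.append(running_total <= max_tolerable_pct)

print(round(implied_size), round(implied_tainting_pct, 2), verdicts)
```

The first two running totals stay within the maximum and the third does not, which matches the evaluations in the list.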

Tip: Whenever there are errors in your sample, you can use the Evaluate command to determine the impact of the errors on your results. You can also use Evaluate to determine if a given monetary sample is appropriate.

To determine the sample size for a record sample required to have 95% confidence that the total errors in a population of 40,000 do not exceed 5% (2,000 errors), assuming a 2% error rate in the population (800 errors):

You need to draw a sample of 184 items. Because there are four tolerable errors, as long as there are four or fewer errors in the sample, your hypothesis is supported with 95% confidence. If you choose to draw the above sample using an interval selection method, an interval of 217.39 is appropriate.

Note: The sample size in this example is significantly larger than the size in the first example. This allows for the errors you would expect to detect.
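As a rough cross-check of the record sample example, the Python sketch below uses one common Poisson-based search: for each candidate number of tolerable errors, it computes the sample size that meets the upper error limit and stops at the first size whose expected errors fit within that number. This procedure is an assumption made for illustration, not Analyzer's documented algorithm, and all function names are hypothetical. It does reproduce the figures reported above.

```python
import math


def poisson_cdf(k: int, lam: float) -> float:
    """P(X <= k) for a Poisson variable with mean lam."""
    term = math.exp(-lam)
    total = term
    for i in range(1, k + 1):
        term *= lam / i
        total += term
    return total


def poisson_factor(k: int, confidence: float) -> float:
    """Mean lam at which P(X <= k) equals 1 - confidence, by bisection."""
    risk = 1 - confidence
    lo, hi = 0.0, 200.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if poisson_cdf(k, mid) > risk:
            lo = mid
        else:
            hi = mid
    return hi


def record_sample_size(confidence_pct, upper_limit_pct, expected_rate_pct,
                       max_errors=100):
    """Return (sample size, tolerable errors) under the assumed search."""
    confidence = confidence_pct / 100
    upper_limit = upper_limit_pct / 100
    expected_rate = expected_rate_pct / 100
    for k in range(max_errors + 1):
        n = math.ceil(poisson_factor(k, confidence) / upper_limit)
        if expected_rate * n <= k:   # expected errors fit within k
            return n, k
    raise ValueError("no sample size found; check the input rates")


size, tolerable = record_sample_size(95, 5, 2)
interval = 40_000 / size
print(size, tolerable, round(interval, 2))
```

Under these assumptions the sketch yields 184 items, four tolerable errors, and an interval of 217.39. Setting the expected error rate to zero gives 60 items with no tolerable errors, which shows how allowing for expected errors increases the sample size.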